Writing a Linear Search algorithm - malformed string representation

I am writing a linear search algorithm. I’ve got the main logic under my belt but it’s the string representation that I haven’t got right. Below is my script and output in my REPL. When I try to search (I call the method seek) for an integer within a list of intergers, the item I test is clearly present but my output shows: “False” as if it is not present.

What can you people tell me about what is wrong with my __str__ and what might you people recommend I use instead? Thanks.

Code:

class LinearSearch:
    # This class accepts an array and an item
    def __init__(self, entered_list=None, item=None):
        self.entered_list = entered_list
        self.item = item
        
    def seek(self, item):
      # Loop through the array and check if the current array element is equal to the item
      for index in self.entered_list:
        if index == item:
            # If it is True, return the index at which the element is found
            return index
        # If the item is never found, return -1
        return False
    
    def __str__(self):
        result = self.seek(self.item)
        if result:
            return str(result) 
        else:
            return "Not Present"

Here is the REPL inputs and outputs:

$ python
Python 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from linear import LinearSearch
>>> list_obj = [1,2,4,5,7,12,256]
>>> item = 7
>>> item_exists = LinearSearch(list_obj,item)
>>> item_exists
<linear.LinearSearch object at 0x7ff5338add50>
>>> item_exists.seek(7)
False
>>> item_exists.seek(256)
False
>>>

My indentation for 2 of the lines are off by a single space. Not sure why my Python linter didn’t catch that. Anyways, I made the correction and the main issue I originally asked about is still present.

Here is my corrected code:

class LinearSearch:
    # This class accepts an array and an item
    def __init__(self, entered_list=None, item=None):
        self.entered_list = entered_list
        self.item = item
        
    def seek(self, item):
    # Loop through the array and check if the current array element is equal to the item
        for index in self.entered_list:
            if index == item:
                # If it is True, return the index at which the element is found
                return index
            # If the item is never found, return -1
            return False
    
    def __str__(self):
        result = self.seek(self.item)
        if result:
            return str(result) 
        else:
            return "Not Present"

I am not seeing what you are seeing. My apologies. Maybe you could be a little bit more helpful by being more specific rather than sassy and sarcastic.

1 Like

In my original attempt at writing this linear search tree algorithm, my for loop looked like this:

        for index in self.entered_list:
            if index == item:
                # If it is True, return the index at which the element is found
                return index
                # If the item is never found, return -1
            else:
                return False

The else operator there I thought was unnecessary so I removed it and reduced the indentation by four spaces for the return False statement, properly adhering to Python syntax. Either way, I was mistaken. Whether I were using braces and semicolons in JavaScript, or purely indentation in Python, I would have had the same misunderstanding with the way I returned the index or False inside the for loop with both languages.

This clarfies although I don’t know why you couldn’t have just said this up front, @Stefan2. I really don’t appreciate your bad attitude. If you can’t show a little bit more respect to a beginner like me who makes a novice mistake, then it might just be best to not make a sassy remark and let other forum members reply with more friendly and constructive comments.

1 Like

When the the condition of the if index == item is not satisfied, it always exits with the return False.
You want to stay in the loop.
So, don’t put that return False inside the loop, but outside.

for index in self.entered_list:
    if index == item:
        # If it is True, return the index at which the element is found
         return index
# If the item is never found, return -1
return False
1 Like

Sorry for the reception you got with this question. I’ve cleaned up those posts. Be sure to flag posts that are rude or off topic, we can moderate them rather than you needing to deal with the user directly.

1 Like

Hi David. I trust that Python community is very welcoming and friendly in general. Thank you for stepping in on this one. I have some additional follow up questions for this Python script I am writing and am hoping that this thread can remain open for others to contribute to the discussion. Although if I understand correctly, it looks like this thread might be locked/unlisted now. @davidism: Are you able to make it accessible again?

I found this thread through Latest topics, so this is not unlisted.

1 Like

Thank you, I see it now too.

To pickup where the discussion last left off:

Thank you for being explicit. This makes sense.

With your suggestion, my script is running much better. My REPL inputs/outputs now looks like this:

>>> from linear import LinearSearch
>>> list_obj = [1,2,4,5,7,12,256]
>>> item = 7
>>> item_exists = LinearSearch(list_obj,item)
>>> item_exists.seek(256)
256
>>> item_exists.seek(2561)
False
>>> 

While I am closer to where I want to be, there is still a knowledge gap to be worked on. Take note of the second last input. When I search (or ‘seek’) for 256, the algorithm finds it, but the output I am expecting is the index value (so should be index 6). Instead it is showing the item (256). Why is my script behaving this way? As far as I can tell, within the for loop, the index is what should be returned. So something still isn’t right. What might you people recommend I try next?

My guess is that you expect that, in for index in self.entered_list, the variable index is taking values that are indexes. Its values are the elements of the list. So, when the if index == iterm is entered, the return index is returning the value found, not its position.

You could do for index, value in enumerate(self.entered_list), then compare if value == iterm and return index

1 Like

This is clear. Thank you.

Using enumerate as you’ve suggested is the technique I need. This fills the gap.

Here is my latest script:

class LinearSearch:
    # This class accepts a list as input
    def __init__(self, entered_list=None): 
        self.entered_list = entered_list
                
    def seek(self, item):
        # Loop through the list and check if the current element is equal to the item
        for index, value in enumerate(self.entered_list):
            if value == item:
            # If value found, return the index 
                return index
        # If the item is never found, return False
        return False

    def __str__(self):
        result = self.seek(self.index)
        if result:
            return str(result) 
        else:
            return "Not Present"

Here is me testing my script in my REPL:

› python
Python 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from linear import LinearSearch
>>> list_obj = [1,2,4,5,7,12,256]
>>> item_exists = LinearSearch(list_obj)
>>> item_exists.seek(256)
6
>>> item_exists.seek(1)
0
>>> item_exists.seek(100000)
False
>>>

As you can see, now the index values are successfully being returned when found but when an item is not present, returns False.

I am much closer to where I want to be.

The only lingering question is how might you people recommend I better form my string representation in this context?

This isn’t exactly what is asked, but please do not call seek inside __str__, otherwise something bad will happen later. Not to mention you have removed the variable you are using there from your class.

Nothing about the design really makes any sense, so there isn’t a clear way to recommend an improvement.

The most obvious problem is that the algorithm does not need to be implemented at all; lists already have it built in as the index method. But I assume the idea is to do this as homework. (If you were given this design and required to do things exactly this way, I would strongly encourage you to drop the course and find a different way to learn Python - what they are teaching you does not make any sense and will not properly prepare you for real-world Python use.)

The next most obvious problem is that it makes no sense to create a class for this task. A class conceptually represents a new data type, not an algorithm. Think “the same kind of thing that int is”, not “the same kind of thing that addition is”. While Python does take the “everything is an object” principle very seriously, so that e.g. functions are objects, you don’t need to create a different kind of function in order to express the idea of a linear search - you just need a function that contains that logic.

See also:

Again, consider how the existing built-in solution is organized: the list is an instance of list, which is a type (the thing that you create in user code using class). The algorithm for searching is implemented as a method, which is similar to a function (it just has a “pre-bound” argument).

The next problem is that it makes no sense to use a string to represent the result. The point of getting a numeric index from a search, is to use it numerically - for example, to index back in to the list and replace the first found element. That isn’t possible with a string; the user would have to convert back to integer. I guess that your idea in using __str__ was to make it so that displaying the LinearSearch instance at the REPL lets you see the search result. But this is only a workaround for the previous wrong decision to create a class in the first place.

Relatedly, the organization of the code does not make sense. The purpose of __str__ is to explain what an object should look like at the REPL. Which is to say, the purpose is not to do any computation - the point is to reflect the existing, current state of the object. By doing things this way, the search gets repeated every time you try to display the instance.

The next problem is that it makes no sense to represent error conditions in the same way as valid results. Checking a string value to see whether it matches 'False' seems easy when you’re checking the result in the REPL, but you will quickly realize that it’s annoying (and non-standard, and easy to get wrong) when you want to use this code as part of a larger project.

Again, the built-in method shows how it’s done properly - by raising an exception:

>>> list_obj = [1,2,4,5,7,12,256]
>>> item = 6
>>> list_obj.index(item)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 6 is not in list

which we can deal with in our own code by the standard mechanism of try: and except::

>>> try:
...     print('the item is at index', list_obj.index(item))
... except ValueError:
...     print('the item is not in the list')
... 
the item is not in the list
1 Like

Your first iteration had this comment:

# Loop through the list and check if the current element is equal to the item

This is not what you get by returning the index when you use enumerate.

From your first attempt you were getting False because the final return was over-indented. If the code had been as below the result would have been as expected:

class LinearSearch:
    # This class accepts an array and an item
    def __init__(self, entered_list=None, item=None):
        self.entered_list = entered_list
        self.item = item
        
    def seek(self, item):
      # Loop through the array and check if the current array element is equal to the item
      for index in self.entered_list:
        if index == item:
            # If it is True, return the index at which the element is found
            return index
      # If the item is never found, return -1
      return False
    
    def __str__(self):
        result = self.seek(self.item)
        if result:
            return str(result) 
        else:
            return "Not Present"


list_obj = [1,2,4,5,7,12,256]

item = 7

item_exists = LinearSearch(list_obj,item)

print(item_exists.seek(item))

As others have pointed out there are issues with that code including:

  1. There’s no reason for this to be a class.
  2. Assuming you are using a class, say for fun and experimentation or because you want to store entered_listwhy is an instance being initiated with ‘item’ when you never use that item?
  3. Probably a bad idea to either return the item or False. Better to return True or False so the method returns a boolean OR return the item and None which can be type hinted as Optional[str]

In any case, you can make your code shorter as in the two examples below where the first follows your own pattern; and the second is how I would write it in which seek returns a bool and my calling code would do something with that return.

class LinearSearch_2:
    def __init__(self, entered_list=None):
        self.entered_list = entered_list
        
    def seek(self, item):
      for member in self.entered_list:
        if member == item:
            return str(member)
      return False
    

class LinearSearch_3:
    def __init__(self, entered_list=[]):
        self.entered_list = entered_list
        
    def seek(self, item):
      return item in self.entered_list

You are right that it is for a home work assignment. Although it might help if I add some context here: It is not a course for college or at a university or for any other brick and morter school like a costly boot camp with ridiculous tuition fees. Instead I am taking a Udemy course (MSRP $99 USD on sale for $10 USD in a recent ‘flash’ sale) titled “JavaScript Algorithms and Data Structures Masterclass” which teaches algorithms in JavaScript. What I do is watch the lesson where the instructor uses an interactive and dynamic visual aid to explain how each algorithm’s features work in general (which is language agnostic), provides the pseudo code (also language agnostic), then asks the student to try writing the script themselves in JavaScript, and finally provides the solution. I follow along and do all of that except I implement in Python instead of JS. Afterwards, I look at the solution in JS and make modifications to my Python code as necessary. If I have a question or run into an issue, I reach out to trusted forums like discuss.python.org where friendly cummunity members provide insight and helpful feedback.

LinearSearch is the very first algorithm. It’s just to get the student’s “feet wet”, so to speak. I realize Python has an index() method for the builtin list() class for completing the same task as I’ve attempted to model in this thread. While using a class may be overkill and pointless in this case, I believe the rest of the course material where the instructor uses animated content to describe algorithms such as:

  • tree traversals
  • graphs,
  • binary heaps
  • the radix sort,
  • Dijkstra’s algorithm

…and countless more advanced variations of said algorithms may be more suitable to use Python classes as compared to writing a silly basic rudimentary array search algorithm as I tried to do which you are all cautioning me to abandon my effort.

For those reading this, if you are interested, I have a personal GitHub repo where I intend on documenting my progress as I work on different algorithms in this course - - here is my foundational attempt at writing a Doubly Linked List. A Doubly Linked List algorithm is composed of eight different methods, of which I have only worked on the first - - the append feature. My project is to sprint through as many more as I can in Python.

I agree that a function is better suited in this case, but what do you think about using classes to implement other algorithms I listed above?

I like the way you’ve used an exception. In one of my improved iterations of my class, I removed the string represetnation and replaced the return False clause with this:

raise ValueError(f"{item} not in {self.entered_list}")

Or simply:

print(f"{item} not in {self.entered_list}")

Both of which are similar to yours, almost the same thing.

You have made it very clear that using a __str__ representation method simply doesn’t make sense in my rudimentary algorithm and that using a class is not the right solution altogether. You summarized:

Now for the crecsendo: The instructor of my Udemy course did not suggest writing a __str__ method. I don’t even know if JavaScript has a comparable type of statement for their classes. I know less abotu JS than I do about Python. So where did I get the idea to write a __str__ method? Among other queries, I asked ChatGPT something to the effect of:

How would I use a string method for an algorithm such using a LinearSearch class so that the output in my shell shows the target item if it exists or False otherwise?

The reason why I asked that question in the first place is that I gather that annotating classes in Python with __str__ representations generally speaking is “best practices” but, as the consensus among veteran and senior Python software programmers on this forum including @sgrey, @dem.ola, @franklinvp, in this situation, it is not.

I understand that ChatGPT is frowned upon, and I anticipate subsequent comments to tell me to also stop doing that going forward, but hey, at least I didn’t use that sorry cesspool of 15 year-old obsolete and rotten content on Stack Overflow. While ChatGPT may be better than Stack Overflow, and while cheap Udemy courses offer content taught by unqualified and amateur instructors, I figure I can trust and rely on this particular Python community to step forward to provide honest feedback as well as to provide closure and the final word.

Based on everything I’ve learned from what all the other forums members have said so far, my final take away is that to demo a basic linear search feature, using a function is the best approach. See here:

def seek(entered_list, item):
    for index, value in enumerate(entered_list):
        if value == item:        
            return index        
    raise ValueError(f"{item} not in {entered_list}")

Here are the REPL inputs/outputs for verification:

$ python
Python 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from linear_func import seek
>>> entered_list = [2,3,4,6,87,89,93]
>>> item = 93
>>> seek(entered_list,item)
6
>>> item = 3
>>> seek(entered_list,item)
1
>>> item = 33
>>> seek(entered_list,item)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/gnull/dev/projects/python/2018-and-2020/algos/basic_linear_search/linear_func.py", line 6, in seek
    raise ValueError(f"{item} not in {entered_list}")
ValueError: 33 not in [2, 3, 4, 6, 87, 89, 93]
>>> 

Thank you all to have participated in this discussion up to this point. I look forward to your comments for those who feel inclined to reply further.

1 Like

As a general tip, don’t ask ChatGPT “how can I do X?”

It will basically never advise that doing X is a bad idea, unless giving instructions for X would violate the AI safety policy.

Treat ChatGPT as a search engine that rewrites what it found into natural language, and occasionally makes things up. It does not do any kind of critical reasoning.

There is no such thing as “adding X to the code is best practice” in a vacuum. You have to take into consideration the purpose of that addition.

It’s important to understand how Stack Overflow is supposed to work. It is not a place where you can go to ask for help with an assignment and get a personalized response. Stack Overflow isn’t a forum like this one, and it’s clearly designed and intended not to be one. It is, instead, a searchable Q&A reference library. When you post a question there, you should think of it as a bug report against the library: a claim that this question is missing (i.e. not a duplicate) and should be answered (and that other people would be able to find it with a search engine, and learn something from an answer). The intended standards are very high. The reason you find a “cesspool” is because the content comes from other random users, most of whom don’t understand how the site is intended to work. There are about 24 million open questions there - that’s more than triple the number of articles on Wikipedia, for nominally a much more restricted scope. So yeah, it’s a huge mess.

1 Like

FWIW, I wholeheartedly echo Karl’s comments.

YMMV, but IMO, data structures and specific mathematical algorithms are all too often overemphasized at an early stage in programming curricula (including the couple formal programming courses I did actually take) relative to their actual practical utility in most programming tasks; by contrast, basic concepts and principles, problem solving strategies, design patterns, code structure, software archtecture, DRY, not reinventing the wheel, and using documentation and online resource is all too often neglected or pushed to much later.

ISTM that this ultimately stems from most teachers of programming being computer scientists, or at least taught themselves by such; these former topics are indeed quite fundamental if you’re going to become a computer scientist (or language designer, or a handful of other relatively niche roles), but are far less so to the average programmer, particularly not as a beginner.

The main utility of implementing algorithms and data structures main utility appears to be as convenient if contrived examples for exploring and internalizing these latter more important concepts, but given your results thus far, it would seem that your course is focused on the “what” and not the “how” of this, and is not really teaching appropriate design patterns beyond rote axioms (e.g. "always supply a __str__—that’s nice, for sure, but doesn’t really matter when a class is not the appropriate construct to use to begin with).

</rant>

It seems you’ve intertwined “algorithms” and “data structures” a bit here. Generally speaking, algorithms are verbs (they do something) while data structures are nouns—they represent a “thing”. Accordingly, in general, algorithms typically map best to functions (though in some specific, more complex cases that involve a lot of non-trivial state, you might want to consider involving classes as well), while data structures are almost invariably represented using classes. TL;DR: action → verb → function; data → noun → class.

FWIW, while SO certainly has its issues for sure (as @kknechtel , a longtime leading SO contributor acknowledges), I’d still trust a relatively recent-ish SO answer with a decent number of votes over a random blog post, article farm or similar day of the week, which is most of the rest of what’s out there aside from the official documentation. Unlike all those other options, SO has the unique advantage of peer review by a large community of subject matter experts, as opposed to simply being one person’s (or machine’s) opinion, written to maximize clicks, length and/or ad/course revenue per unit of effort.

And as for ChatGPT, not only does it have the same issue lacking any human peer review whatsoever (even by an author), but where do you think it got the source data it trained on as far as programming questions and answers are concerned? Stack Overflow and similar sites—and not only the best questions and answers but the rest of the “cesspool” too, and without the benefit of the scores, comments and other indicators you as a human can use to evaluate answer quality. It does a great job mimicking convincing-looking answers, but therein lies the rub—it’s a language model, not a knowledge model, and unlike most real answers, the typically-excellent form of its answers gives no indication about whether the content it contains within is actually true.

Looks pretty good to me at least—that’s basically how I’d implement it, given your constraints. One note is that it works for any sequence type (e.g. tuples)—entered_list need not be an actual list.

Indeed; at least one lawyer found that out the hard way when he asked ChatGPT to come up with cases that supported a particular legal theory he was attempting to use to argue his client’s case. As no such real cases existed, ChatGPT obediently did as it was told and came up with outwardly plausible-looking case citations and summaries, which the lawyer promptly submitted as evidence, only to get in hot water with the judge when the opposing counsel quickly discovered that the cases were entirely fictitious.

1 Like