Help with a Line Counter

I’m almost a complete beginner when it comes to Python, so I was hoping somebody could help me with a pretty specific program. Basically, I want it to count the lines between lines with a specific number.

For example, say you have the data below in a .csv file. I want it to count the lines between lines with a “56” in them.

4, 7, 24, 56, 98, 104, 194

1, 23, 45, 67, 78, 98, 101

3, 24, 35, 65, 67, 89, 99

6, 7, 45, 56, 77, 98, 156

8, 12, 16, 24, 78, 94, 111

5, 10, 21, 32, 56, 78, 82

How would I make the output say the following?

2

1

I’m not even sure if this is possible in Python, but if it is, I’d appreciate any help I could get.

Hello, @Ruthless12, and welcome to Python Software Foundation Discourse!

Yes, it can be done in Python.

We should get some clarification regarding possible special cases. What should the output be if the input is as follows?

1, 23, 45, 67, 78, 98, 101

4, 7, 24, **56,** 98, 104, 194

6, 7, 45, **56,** 77, 98, 156

3, 24, 35, 65, 67, 89, 99

In that case, it should output 1 and 1. It should count all of the lines at the start and end of the file without a 56 in them. Hopefully I’m explaining it well enough.

1, 23, 45, 67, 78, 98, 101 - Since there is no 56 above this line, it would count it as 1 line.

4, 7, 24, **56,** 98, 104, 194 - It would skip this since it has a 56.

6, 7, 45, **56,** 77, 98, 156 - It would skip this since it also has a 56.

3, 24, 35, 65, 67, 89, 99 - It would count this as 1 line.

Then the output would be:

1

1

You mean you want to suppress the zero-counts in the output? In your first example without the zero suppression it would be:

0
2
1
0

In the second example:

1
0
1

BTW: You don’t want someone to write the complete program for you, do you? Why don’t you show what you have already tried and which problems you encountered?

So, would we be correct in stating that wherever you have a run of one or more consecutive lines without a 56, you would like to count the number of consecutive such lines, and when you discover the end of such a run, you want to output that count?

You have double asterisks before and after the 56,. Will they be present in the actual input?

Some Python documentation that might be helpful to you would be:

With the above, you can split each line into elements, resulting in a list for each line. Then you could test the list to find out whether it contains a 56. If asterisks are included in the input, you would need to do a little more than that, but it would still be quite doable.
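As a minimal sketch of that idea (the helper name `line_contains` is made up for illustration, and stripping asterisks is only an assumption about how the bold markers might appear in a real file), splitting first and then testing the list avoids matching 56 inside a larger number such as 156:

```python
def line_contains(line, target):
    """Return True if target is one of the comma-separated fields in line."""
    # strip(" *") removes surrounding spaces and any asterisks; the asterisk
    # handling is an assumption, only needed if they appear in the real data.
    fields = [field.strip(" *") for field in line.strip().split(",")]
    return target in fields


print(line_contains("4, 7, 24, 56, 98, 104, 194", "56"))      # True
print(line_contains("1, 23, 45, 67, 78, 98, 156", "56"))      # False: 156 is not 56
print("56" in "1, 23, 45, 67, 78, 98, 156")                   # True: plain substring match
print(line_contains("4, 7, 24, **56,** 98, 104, 194", "56"))  # True
```

The third call shows why testing the raw string with `in` is risky: it reports a match for any line that merely contains the characters "56".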

Please give it a try and post your code.

EDIT:

Another approach would be to use the csv module. See csv — CSV File Reading and Writing.

I think I understand, @Ruthless12. Is this correct?

The program should start counting from zero.

  1. Read a line
    • If ‘56’ is in the line then…
      • if count > 0, then…
        • print count
      • set count = 0
      • repeat step 1
    • else:
      • add 1 to count
      • repeat step 1

This has two parts:

A. Read the line from the file.
B. Loop through the lines to find ‘56’.

Have you looked for how to do one or both of these parts? Where is this exercise from? Is this for school?

Also, the csv library is not needed to read data from a file. You can use the open() function instead. The csv library is useful, but not needed for basic reading and writing with files.

Agreed. The csv library might be appropriate for more complex data on another occasion.

Here is a link to official reference material for the open() function:

For the current situation, you might do something like this to get started:

input_file = open("data.csv")

Then you will need to read the file and process the lines of data within a loop.
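One possible shape for that loop is sketched below. The function name `read_rows` and the throwaway sample file are assumptions made so the example can run on its own; it uses a `with` block so the file is closed automatically, and it only reads and splits the lines, leaving the counting to you.

```python
import os
import tempfile


def read_rows(path):
    """Return each non-blank line of the file as a list of stripped fields."""
    rows = []
    with open(path) as input_file:   # the file is closed when the block ends
        for line in input_file:      # one line of text per iteration
            line = line.strip()      # drop the trailing newline
            if line:                 # skip blank lines
                rows.append([field.strip() for field in line.split(",")])
    return rows


# Demonstration with a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "data.csv")
    with open(sample_path, "w") as sample:
        sample.write("4, 7, 24, 56\n\n1, 23, 45, 67\n")
    rows = read_rows(sample_path)

print(rows)  # [['4', '7', '24', '56'], ['1', '23', '45', '67']]
```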

Sorry for not replying to everybody. I’ve been having issues with identity theft, so I’ve been a bit preoccupied. I’ll reply to you all soon.

I tried to do something like this, but as I said, I’m very new to Python (basically a complete noob). Anyway, I tried my code with the following sample data using “4” as the user_num value, but I didn’t get the exact results that I wanted. This would be the Data.csv file.

1,2,3,4,5
2,4,6,8,10
1,5,7,8,12
1,4,7,9,10
6,8,14,32,5
7,14,68,43,19
2,4,6,8,12
1,8,9,16,32

This is my code:

from csv import reader

user_num = input("Enter a number: ")

count = 0

with open("Data.csv", "r") as file:
    csv = reader(file)
    for row in csv:
        if user_num in row:
            print("-")
        else:
            count += 1
            print(count)

And this was the output:

-
-
1
-
2
3
-
4

In the end, I’d like it to look like the following. The “-” is just a placeholder for now so I can see what it’s doing.

1
2
1

With the placeholder, it would look like:

-
-
1
-
2
-
1

I hope this all makes sense. Thanks for the help! Also, this isn’t for school or anything. It’s just a thing I want to make for myself. Its purpose is to average the counts in the end. For example, let’s say each row holds the results of a die being tossed 5 times, and I want to calculate the average number of rounds in a row that go by without a 5 being rolled. The program would count how many consecutive rounds didn’t have a 5 each time and record that count. After that, I would take all of the counts and average them. For this example, the counts would be 1, 2, 2, so the average is about 1.67, and you can say that you’re likely to go about two rounds in a row without rolling a 5.

1,1,3,5,5
2,4,3,1,6      One line without a 5 here.
2,5,6,3,1
1,4,3,6,5
2,2,4,3,3      One line without a 5 here.
6,1,6,3,1      One line without a 5 here.
4,4,3,5,1
1,6,2,2,3      One line without a 5 here.
4,2,1,3,2      One line without a 5 here.
3,5,1,2,6
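As a quick check of that arithmetic, here is a small sketch that works on the ten rows above held in memory. It uses `itertools.groupby` and `statistics.mean` from the standard library, which is a different technique from the plain counter loop discussed in this thread, so treat it as a cross-check rather than the intended solution.

```python
from itertools import groupby
from statistics import mean

lines = [
    "1,1,3,5,5", "2,4,3,1,6", "2,5,6,3,1", "1,4,3,6,5", "2,2,4,3,3",
    "6,1,6,3,1", "4,4,3,5,1", "1,6,2,2,3", "4,2,1,3,2", "3,5,1,2,6",
]
rows = [line.split(",") for line in lines]

# groupby bundles consecutive rows that give the same answer to
# "does this row contain a 5?"; keep the sizes of the "no" bundles.
runs = [
    len(list(group))
    for has_five, group in groupby(rows, key=lambda row: "5" in row)
    if not has_five
]

print(runs)                  # [1, 2, 2]
print(round(mean(runs), 2))  # 1.67
```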

Since the counter is always increasing, try working in this step from the pseudocode I posted above:

set count = 0

After that, you can be more selective about when to print the output. With the latest code, the output is printed for every line checked.

This might be a good set of data for testing the program:

6,4,4,2,3
1,1,3,5,5
2,4,3,1,6
2,5,6,3,1
1,4,3,6,5
2,2,4,3,3
6,1,6,3,1
4,4,3,5,1
1,6,2,2,3
4,2,1,3,2
3,5,1,2,6
1,3,4,2,4

Note that neither the first nor the last line contains a 5. It is good to test for edge cases.

If you are counting runs of consecutive lines without a 5, the output should be:

1
1
2
2
1

Plan carefully for when to set the counter to 0 and when to output a count.

Thanks for the reply! I’ve attached my updated code below, but why doesn’t it count the last line when there’s no remaining 5s?


import csv

user_num = input("Enter a number: ")
count = 0

with open("Data.csv", "r") as file:
    reader = csv.reader(file)
    for row in reader:
        if user_num in row:
            if count > 0:
                print(count)
                count = 0
        else:
            count += 1

This is my output:

1
1
2
2

I also used your data:

6,4,4,2,3
1,1,3,5,5
2,4,3,1,6
2,5,6,3,1
1,4,3,6,5
2,2,4,3,3
6,1,6,3,1
4,4,3,5,1
1,6,2,2,3
4,2,1,3,2
3,5,1,2,6
1,3,4,2,4

When the for loop has completed all its iterations, and therefore terminates, all the lines of data have been read. At that point, you need to check just one more time whether count exceeds 0. If it does, you need to display it.

That set of data was designed to reveal that need.

So is this how I would do it? It seems to be working now, but I’d like to verify it with you.


import csv

user_num = input("Enter a number: ")
count = 0

with open("Data.csv", "r") as file:
    reader = csv.reader(file)
    for row in reader:
        if user_num in row:
            if count > 0:
                print(count)
                count = 0
        else:
            count += 1
    if count > 0:
        print(count)

I made a simple data set, and I used 5 as the target number. This was my data set.

1,2,3,4,6
1,2,3,4,5
1,2,3,4,6
1,2,3,4,6
1,2,3,4,5
1,2,3,4,5
1,2,3,4,6
1,2,3,4,5
1,2,3,4,5
1,2,3,4,6

And this was the output.

1
2
1
1

Does everything look right to you?

It also produces correct results on my machine.

Yes, but the most important issue is whether you understand how it works.

See the following:

There’s a yellow rubber duck sitting there, reading your code, eager to learn Python *. Can you “explain it, line-by-line, to the duck”?

You don’t actually need to elaborate here, in writing, how it works, unless you would like us to verify your understanding of it.

*Admittedly, a close look at the code in the picture reveals evidence that it is written in a language other than Python. :wink:

Thanks for the help! I think I understand how it works for the most part. I’m just so new to Python that it took a while to see how it all fit together.

# This imports the csv package.
import csv

# This asks for the user to enter a number to skip.
user_num = input("Enter a number: ")
# This is the default count value.
count = 0

# This tells the program which csv file to use.
with open("Data.csv", "r") as file:
    # This returns a reader object that iterates over rows.
    reader = csv.reader(file)
    # This tells the program to read the rows of the csv file
    for row in reader:
        # This says if the user input is in the current row,
        if user_num in row:
            # Check if count is greater than 0.
            if count > 0:
                # If it is, print the value of count and set count back to 0 to start the loop again.
                print(count)
                count = 0
        # Otherwise add 1 to the value of count and repeat.
        else:
            count += 1
    # Once the loop breaks, this checks if count is still greater than 0.
    if count > 0:
        # If it is, it prints the result. This is for those last lines that don't have the user input in them.
        print(count)

Actually, you are setting count back to 0 to start the counting again, rather than to start the loop again.

Be aware that you are saving the user’s input as a string rather than a number, and that each row is a list of strings rather than numbers. If you plan to work with the data as numbers, you will need to convert it. Also, remember that you can implement your algorithm without using the csv module.
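To illustrate why the string-versus-number distinction can matter, here is a small sketch using the first line of the original sample data, which has a space after each comma. With the default `csv.reader` settings the fields keep that leading space, so a string test for "56" fails, while converting with `int()` (which ignores surrounding whitespace) works and needs no csv module at all. The variable names are chosen only for this example.

```python
import csv

lines = ["4, 7, 24, 56, 98, 104, 194\n"]

# csv.reader accepts any iterable of strings, not only an open file.
row = next(csv.reader(lines))
print(row)          # ['4', ' 7', ' 24', ' 56', ' 98', ' 104', ' 194']
print("56" in row)  # False: the field is ' 56', with a leading space

# Plain split plus int() conversion, without the csv module.
numbers = [int(field) for field in lines[0].split(",")]
target = int("56")  # input() would return a string like "56"
print(numbers)            # [4, 7, 24, 56, 98, 104, 194]
print(target in numbers)  # True
```

If you prefer to stay with strings, `csv.reader(file, skipinitialspace=True)` is another option for data that has a space after each comma.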

Quercus’ post above and mine here might be cases of word choice rather than misunderstanding. Let us know if they clarify your understanding or are simply diction (word choice).

“Once the loop breaks” is more clearly stated as:

  • Once the loop finishes
  • Once the loop ends
    • (but “end” could also refer to a single loop cycle since the loop has a start and an end.)

To ‘break’ a loop is to jump out of it in the middle before it finishes naturally, like this:

for n in range(0,10):
    print(n)
    if n == 5:
        break
print("Loop done")

This loop will stop at 5 and run the code after the loop.

There is also a way to stop a loop cycle and start the next cycle, but I don’t want to add something else to your learning until you fully understand what you’re already working on.

Just for inspiration for further learning and exploring the standard library I am adding other methods.

This one avoids replicating the condition and print after the loop (DRY principle) by adding a special “row” to the reader iterator. In some cases this could lead to clearer solutions, here it is questionable :slight_smile:

import csv
import itertools

input_file_name = "Data.csv"

user_num = input("Enter a number: ")

count = 0
with open(input_file_name) as file:
    reader = csv.reader(file)
    for row in itertools.chain(reader, (None,)):  # append None "row"
        if row is None or user_num in row:        # recognize the None "row"
            if count:
                print(count)
                count = 0
        else:
            count += 1

This one uses collections.Counter which is especially useful for counting unordered items. Also the example shows separation between data processing and output which is normally desirable in larger programs. This separation automatically resolved the printing of the last value for us :slight_smile:

import csv
import collections

input_file_name = "Data.csv"

user_num = input("Enter a number: ")

counters = collections.Counter()        # dictionary of counters
count_index = 0                         # index of the current counter
with open(input_file_name) as file:
    reader = csv.reader(file)
    for row in reader:
        if user_num in row:
            count_index += 1            # move to the next counter
        else:
            counters[count_index] += 1  # increment the current counter

for index, counter in sorted(counters.items()):
    print(counter)

Here collections.Counter is overkill. A list of integers as counters would be easier :slight_smile:
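A rough sketch of that list-based variant follows. The function name `count_runs` is made up for this example, and the rows are passed in as a list of strings instead of an open file so that it runs without Data.csv. The last element of the list is always the counter currently being filled.

```python
import csv


def count_runs(lines, target):
    """Return the lengths of runs of consecutive rows that lack target."""
    counts = [0]                   # the last element is the current counter
    for row in csv.reader(lines):  # any iterable of strings works here
        if target in row:
            if counts[-1] > 0:     # close a non-empty run
                counts.append(0)   # and start a fresh counter
        else:
            counts[-1] += 1        # extend the current run
    if counts[-1] == 0:            # drop an unused trailing counter
        counts.pop()
    return counts


test_lines = [
    "6,4,4,2,3", "1,1,3,5,5", "2,4,3,1,6", "2,5,6,3,1",
    "1,4,3,6,5", "2,2,4,3,3", "6,1,6,3,1", "4,4,3,5,1",
    "1,6,2,2,3", "4,2,1,3,2", "3,5,1,2,6", "1,3,4,2,4",
]

for count in count_runs(test_lines, "5"):
    print(count)  # prints 1, 1, 2, 2, 1 on separate lines
```

As with the Counter version, keeping the counting in a function that returns a list means the final run needs no extra print after the loop, and the result can be averaged or tested directly.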
