Finding the number of words in each line

What I want to do is iterate over the whole text file line by line and get the computer to count how many words are in each line. Then I want the computer to print something like this:
line 1 = 3
line 2 = 8
line 3 = 10
etc.
This is my code:

with open('file2.txt','r') as file:
    lines = file.readlines()
for i in range(len(lines)):
# I want i to run over each line in the text file
    line = lines[i]
    words = line.split()
# I want to split each line into individual strings of words
    word_count = len(words)
# Then the word count is the number of words in each line. 
print(word_count)

But it prints literally the number 1, please help!

Your print is outside the for loop, so it only runs once, after the last line has been processed. Maybe like this:

with open('file2.txt','r') as file:
    lines = file.readlines()
    # Indented to be inside the `with`.
    for i, line in enumerate(lines):
        words = line.split()
        word_count = len(words)
        # Indented to be inside the `for`.
        # Using an f-string we can print the 'line' and '=' with the
        #     information that is changing each time inserted in between.
        print(f'line {i + 1} = {word_count}')

Thank you,

I got an error with using enumerate - is there a way to do this without using that function?

Which error was it?

You could do that part the way you were doing

    # ...
    for i in range(len(lines)):
        line = lines[i]
    # ...
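
For completeness, here is a minimal sketch of the whole loop written without enumerate. The sample lines are made up and stand in for what file.readlines() would return, so the snippet runs without file2.txt; the results list is only there so the output can be checked.

```python
# Made-up sample data standing in for file.readlines().
# In the real script: with open('file2.txt') as file: lines = file.readlines()
lines = ["the quick fox\n", "jumps over the lazy dog today ok\n", "\n"]

results = []
for i in range(len(lines)):
    line = lines[i]
    word_count = len(line.split())  # split() with no argument splits on any whitespace
    message = f"line {i + 1} = {word_count}"
    results.append(message)
    print(message)  # inside the loop, so it runs once per line
```

The important part is the indentation of print: it has to sit inside the for block, otherwise only the last word_count is shown.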

It was an invalid syntax error - I haven’t ever used enumerate before so I am reluctant to use it.

The code that I originally posted doesn’t give me what I want. I wanted the code to iterate through each line and tell me how many words are in each line.

I don’t know why it gave you a SyntaxError. A typo, perhaps?
enumerate has been part of Python for a long time,
so it is probably not because we are using different versions of Python.
Over here it works.

Check your typing, Python is case sensitive. Also look for extra letters or punctuation.

Or post your code here. A beginner often has trouble with details like this. Especially if they are using a bad font. Always use a fixed-width font for writing code.

If you post your code then I can run it here on Python 3.11. Then I can see if it works on my machine.

Hi,

Thank you for your response, I have tried doing it like this instead:

dict_3 = {}
with open('result2.txt') as f:
    listing = list(f)

   
    for line in listing:
        if len(line)>=0 and len(line) in dict_3:
            dict_3[len(line)] += 1 
print(dict_3)

However, my dictionary is just an empty dictionary.

Hi,

if you start off with dict_3 = {}, it is an empty dictionary - no key-value pairs exist.

In the following conditional statement:

if len(line) >= 0 and len(line) in dict_3:
  1. len(line) >= 0 is always true, because a length can never be negative: an empty line has length 0 and any other line has a positive length.

  2. len(line) in dict_3 is always false, because the dictionary starts out empty and nothing is ever added to it, so no length can match one of its keys.

Thus, you will always have the following result in the conditional statement:

True and False

When you combine a True and a False with and, the result is always False, so the body of the if never runs.

What you want is:

dic_3 = {}

with open('result2.txt') as file:
    Lines = file.readlines()

for line_num, line in enumerate(Lines):

    dic_3[str(line_num)] = len(line.split())

print(dic_3)
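
If the aim was instead to tally how many lines share each word count (the += 1 in your attempt suggests that, but this is a guess), a sketch along these lines might help. The sample lines are invented, and dict.get with a default of 0 is used so that a missing key does not need a separate if/else.

```python
# Made-up sample data standing in for file.readlines().
lines = ["a b c\n", "d e f\n", "g h\n", "\n"]

counts = {}
for line in lines:
    n = len(line.split())  # number of words on this line
    # get() returns 0 when n is not a key yet, so the first sighting becomes 1
    counts[n] = counts.get(n, 0) + 1

print(counts)
```

Here the keys are word counts and the values are how many lines have that count, which is a different dictionary from the line-number version above.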

Hi,

Thank you for your response.
I am trying not to use enumerate because I don’t want to install packages, I want to do it myself.
So far I have now written:

opening = open('result2.txt','r')
open1 = (opening.readlines())
open2 = open1.split('\n') # I want to get rid of all blank lines in the text, otherwise the data will be
# skewed by a lot of lines with a length of 0.
dict_3 = {}
for line in open2:
    if line in dict_3:
        dict_3[len(line)] += 1
    else:
        dict_3[len(line)] = 1
sorted1 = sorted(dict_3.items(), key = lambda x:x, reverse = True)
print(sorted1)
#Since this is ordered line by line, I can see how many words are in each line
# Now I want to plot a distribution curve to visually see how the number of words in each line changes
# I will plot a line graph because there are 804 lines and this would not fit on a bar chart

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
plt.plot(range(len(dict_3)), list(dict_3.values()),c='purple')
ax.set_title('The distribution of line lengths across the play') 
ax.set_ylabel('line length')
plt.show()

However, it gives that every line has a length of 1 which is clearly not right.
I also know that I can’t use open2 = open1.split('\n') here, because I get the error AttributeError: 'list' object has no attribute 'split'. However, I do need to remove all the extra blank lines in the text file.
Please can someone help?
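
The AttributeError comes from readlines() already returning a list with one string per line, so there is nothing left to split on '\n'. A minimal sketch of one way to drop the blank lines, assuming a line counts as blank when it contains only whitespace; io.StringIO and the sample text are stand-ins so the snippet runs without result2.txt.

```python
import io

# Made-up text standing in for the contents of result2.txt.
text = "first line here\n\n\nsecond line\n   \nthird\n"

with io.StringIO(text) as f:  # in the real script: open('result2.txt')
    raw_lines = f.readlines()  # already a list of strings, one per line

# strip() removes surrounding whitespace; an empty result is falsy,
# so blank and whitespace-only lines are filtered out.
non_blank = [line for line in raw_lines if line.strip()]

word_counts = [len(line.split()) for line in non_blank]
print(word_counts)
```

Separately, in the loop as posted, if line in dict_3 compares the text of the line against integer keys, so it is never true and every entry is set to 1 by the else branch. That would explain a plot where everything has the value 1.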

Enumerate is a built-in Python function. No installation required.

Ah okay, I still do not want to use it however - I am trying to do it without

This makes no sense. Why are you using the number of characters in the line as a key? Is this supposed to be an index like what you’d get from enumerate? If so, you can’t do this like this. You need to set up an index variable to zero outside of the loop and increment it at the end of each loop.

Because when I print out the dictionary I want it to be number of words in line: line number
Do you mean set up count = 0 then add one?
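
Yes, something like that. A minimal sketch of the manual counter, with invented sample lines in place of the file and hypothetical names (line_number, words_per_line):

```python
# Made-up sample data standing in for file.readlines().
lines = ["one two\n", "three four five\n", "six\n"]

words_per_line = {}
line_number = 1  # set up once, before the loop starts

for line in lines:
    words_per_line[line_number] = len(line.split())
    line_number += 1  # increment at the end of each pass

print(words_per_line)
```

This maps line number to word count. Starting the counter at 1 rather than 0 is a choice made here so the keys match the "line 1 = 3" style from the first post.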

You should do all your imports at the beginning of the program before anything else. This will reduce a few pesky errors.

len(line) is neither of these things: it is the number of characters in the line, not the number of words and not the line number. Try stepping through your code with pdb and seeing what is actually happening as the code runs.

What is pdb?
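
pdb is the debugger that ships with Python's standard library. A rough sketch of how it could be used here, with a made-up sample line: uncommenting the breakpoint() call (available since Python 3.7) pauses the script at that point, where `p line` prints a value, `n` runs the next statement and `c` continues.

```python
line = "To be, or not to be\n"  # made-up sample line

# breakpoint()  # uncomment to pause here in pdb and inspect the variables

char_count = len(line)          # characters, including the trailing newline
word_count = len(line.split())  # whitespace-separated words

print(char_count, word_count)
```

Comparing the two numbers shows why len(line) cannot serve as a word count.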

The following code fulfills this requirement. The best way to keep track of the line number is the built-in function enumerate (it is a function, not a keyword), since it does the counting automatically.

import matplotlib.pyplot as plt

dic_3 = {}
line_num = []
num_words = []

with open('read_words.txt') as file:
    Lines = file.readlines()

for line_no, line in enumerate(Lines):
    line = line.rstrip('\n')  # Remove '\n' character (comment out for testing)

    if '\n' in line:  # Test if newline is included (delete in final version)
        print('\\n')

    word_count = len(line.split())  # Count the words once and reuse the result
    dic_3[str(line_no)] = word_count
    line_num.append(line_no)
    num_words.append(word_count)

plt.plot(line_num, num_words, 'b')   # Blue line joining the points
plt.plot(line_num, num_words, 'ro')  # Red dot at each line's word count
plt.xlabel('Line #')
plt.ylabel('# of Words')
plt.title('Line # vs. # of Words')
plt.show()

To test this script, I used the following text file:

Hello, everyone? How is everyone doing?
Today is a good day to start our lesson on how to read the number of words in a string.
We will be making use of the keywords with, for, len, and split.
Does anyone have any questions?

I obtained the following result:

[Figure: plot of line number vs. number of words]