Extract content rows without letters from a txt file

sam3 · November 28, 2024, 7:41am

Hello everyone, thanks for helping me. I have a TXT file with content like this:

Time Volume Price Type
27/11 16:26:27 4 318.00 A
27/11 16:16:35 1 330.00 A
27/11 15:46:50 1 350.00
27/11 15:03:39 1 269.00
27/11 14:57:49 1 225.00 B
B:335(707)(49.7%) A:364(634)(44.6%) S:53(82)(5.8%)

I want to extract those rows without letters to form 2 lists as follows:
list1
27/11 15:46:50 1
27/11 15:03:39 1
list2
350.00
269.00

The code I have so far is:
with open('content.txt') as f:
    contents = f.readlines()

I just don’t know how to add a condition to the code that drops the rows ending in a letter.
Thank you for the help.

aivarpaalberg · November 28, 2024, 2:47pm

Is it a homework?

onePythonUser · November 28, 2024, 3:31pm

Hello,

you’re a few for loops and conditional statements away from the solution to this homework problem.

The only code you have thus far is reading the contents of this text file. You still need to process the information that you have read. I would suggest reading up on the concepts of iterables, for loops, and conditional statements.

sam3 · November 28, 2024, 3:38pm

No, I’m just doing it to speed up my daily work.

sam3 · November 28, 2024, 4:10pm

Thank you for the advice

eric.fahlgren · November 28, 2024, 5:52pm

Simplest is to just filter them on read:

    contents = list(filter(lambda s: s and not s[-1].isalpha(), f.readlines()))

Be sure to read about the filter function (Built-in Functions — Python 3.13.0 documentation) and the str.isalpha method (Built-in Types — Python 3.13.0 documentation) to enrich your understanding.

onePythonUser · November 28, 2024, 6:39pm

This doesn’t work quite the way you think it does, because index “-1” refers to the newline character “\n” that readlines() leaves at the end of each line. Even if you work around the newline by using “-2” instead, the filter still would not drop the last line, since its second-to-last character isn’t a letter.
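
To illustrate the point, here is a small sketch that uses an in-memory list in place of f.readlines(). The sample lines are copied from the question; the names lines, naive, keep and rows are only for illustration. It shows that the unstripped filter keeps every line, and one possible fix that strips each line first and also checks the first character:

```python
# Lines as f.readlines() would return them: each keeps its trailing "\n".
lines = [
    "Time Volume Price Type\n",
    "27/11 16:26:27 4 318.00 A\n",
    "27/11 16:16:35 1 330.00 A\n",
    "27/11 15:46:50 1 350.00\n",
    "27/11 15:03:39 1 269.00\n",
    "27/11 14:57:49 1 225.00 B\n",
    "B:335(707)(49.7%) A:364(634)(44.6%) S:53(82)(5.8%)\n",
]

# The unstripped filter: s[-1] is "\n", which is never a letter,
# so nothing is dropped.
naive = list(filter(lambda s: s and not s[-1].isalpha(), lines))
print(len(naive))  # 7


def keep(line):
    """Return True for non-empty lines that neither start nor end with a letter."""
    text = line.strip()
    return bool(text) and not text[0].isalpha() and not text[-1].isalpha()


rows = [line.strip() for line in lines if keep(line)]
print(rows)  # ['27/11 15:46:50 1 350.00', '27/11 15:03:39 1 269.00']
```

Checking the first character as well is what removes the summary line at the bottom, which ends in “)” rather than a letter.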

aivarpaalberg · November 29, 2024, 8:47am

One way to solve the problem is to have a plan in spoken language and then just translate it to Python. In broad terms:

get the rows we want from the file
process these rows

To make the plan more specific, we must inspect the inventory we have, in this case the file content. Based on our objective of getting the rows we want:

skip rows which end with a letter (the header row and rows 2, 3 and 6)
skip rows which start with a letter (the last row)

So it appears that we can use str.isalpha() to check whether the first character (at index 0) and the last character (at index -1) are letters. However, we should keep in mind that rows read from a file have a newline at the end. It can be stripped using str.rstrip() before checking.

To process the rows and split them, we can use str.rsplit() with the maxsplit argument. We can then simply append the values to two lists.
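
As a quick illustration of that splitting step, here is a minimal sketch on one sample row from the question (the variable names are only for illustration):

```python
row = "27/11 15:46:50 1 350.00"

# Split once, starting from the right: everything before the last
# whitespace-separated field stays together.
first, last = row.rsplit(maxsplit=1)
print(first)  # 27/11 15:46:50 1
print(last)   # 350.00

# For comparison, a plain split() breaks the row into four fields.
print(row.split())  # ['27/11', '15:46:50', '1', '350.00']
```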

Putting it all together:


with open("content.txt", "r") as f:
    start = []
    end = []
    for row in f:
        text = row.rstrip()
        # Skip blank rows and rows that start or end with a letter
        if not text or text[-1].isalpha() or text[0].isalpha():
            continue
        # Split off the last field (the price) from the rest of the row
        first, last = text.rsplit(maxsplit=1)
        start.append(first)
        end.append(last)

print(start)
print(end)

# will produce:
# ['27/11 15:46:50 1', '27/11 15:03:39 1']
# ['350.00', '269.00']

sam3 · November 29, 2024, 5:44pm

Thank you very much for detailed explanations.

sam3 · November 29, 2024, 5:45pm

Thank you for the advice