Removing sublist if matching a criteria

cheesebird · June 18, 2022, 3:27pm

Here’s one that is driving me mad. I’m trying to remove a sublist if the length is one and it only contains a substring ‘X1:’…

s = [['X1:', '99'], ['X1:'], ['99'],['X1:'],['X1:'],['X1:'],['X1:','99','98']]


for ss in s:
    if len(ss) == 1:
       # print(ss)
        if 'X1:' in ss:
           # print (ss)
            
            s.remove(ss)
            
print(s)

Output…

[['X1:', '99'], ['99'], ['X1:'], ['X1:', '99', '98']]

You can see it removes all but one of the required sublists. I’ve tried getting the index, del, and pop, with no success. Can anyone see what I’m doing wrong?

mlgtechuser · June 18, 2022, 3:31pm

Here’s one way:

for i in range(len(s)-1,-1,-1):     #go backwards so pops don't mess up indexing
    if s[i][-1] == 'X1:': s.pop(i)  #checks last element in item.
                                    #If only one member...
                                    #...first and last are the same.
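
For anyone wondering why the loop in the question skipped items, here is a minimal sketch using the same list. The names `data`, `forward` and `backward` are made up for the illustration, and the backward loop uses a stop value of -1 so that index 0 is visited as well.

```python
# Illustration only: `data`, `forward` and `backward` are hypothetical names.
data = [['X1:', '99'], ['X1:'], ['99'], ['X1:'], ['X1:'], ['X1:'], ['X1:', '99', '98']]

# Forward: removing an item shifts everything after it one slot to the left,
# so the loop's internal index steps over the item that moved into its place.
forward = [list(item) for item in data]
for item in forward:
    if item == ['X1:']:
        forward.remove(item)
print(forward)   # [['X1:', '99'], ['99'], ['X1:'], ['X1:', '99', '98']]

# Backward by index: a pop only shifts items the loop has already visited.
backward = [list(item) for item in data]
for i in range(len(backward) - 1, -1, -1):   # stop of -1 so index 0 is checked too
    if backward[i] == ['X1:']:
        backward.pop(i)
print(backward)  # [['X1:', '99'], ['99'], ['X1:', '99', '98']]
```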

Now all that remains is to see if you come back and say “That won’t work because of <this information that I didn’t mention>.”

This works, too:

for i in range(len(s)-1,-1,-1):     #stop at -1 so index 0 is checked too
    if s[i] == ['X1:']: s.pop(i)

It also works as a single-line list comprehension:

[s.remove(ss) for ss in s[::-1] if ss == ['X1:']]

cheesebird · June 18, 2022, 4:13pm

@mlgtechuser

Thanks, I tried it, but it only seems to drop one of the required sublists.

s = [['X1:'],['99','98'],['X1:'],['X1'],['X1']]

for i in range(len(s)-1,0,-1):
    if s[i] == ['X1:']: s.pop(i)
print(s)

Output…


[['X1:'], ['99', '98'], ['X1'], ['X1']]

I’m hoping to drop all.

mlgtechuser · June 18, 2022, 4:14pm

I get this every time:

[['X1:', '99'], ['99'], ['X1:', '99', '98']]

Try the list comprehension. That’s the version I have in VS Code right now.

[s.remove(ss) for ss in s[::-1] if ss == ['X1:']]

cheesebird · June 18, 2022, 4:16pm

@mlgtechuser

You’re right, I’m on my phone and can’t see properly :-(

Many thanks for the solution, it was driving me mad.

cheesebird · June 18, 2022, 6:37pm

Here’s the “information I forgot to mention” part…

I’ve just realized my initial example has some flaws in the list provided. The sublists I’m trying to remove have only one element, which can contain X1: followed by some text. X1: probably won’t appear on its own. See updated example.

So how would I check that the sublist has only one element, and that this element is an X1: Blah Blah string?

mlgtechuser · June 18, 2022, 7:18pm

Haha!

I called that one, didn’t I?

We just need a variation on the same solution. We can still identify the singles by finding where the “last” element ~.startswith('X1'). This might put the list comprehension solution out of commission, though.

mlgtechuser · June 18, 2022, 7:24pm

This one works:

s = [['X1:', '99'], ['X1: Cheese'], ['99'],['X1: Cheese2'],['X1: Cheese3'],
     ['X1: Cheese4'],['X1: blah blah','99','98']]
for i in range(len(s)-1,-1,-1):     #stop at -1 so index 0 is checked too
    if s[i][-1].startswith('X1:'): s.pop(i)

-> [['X1:', '99'], ['99'], ['X1: blah blah', '99', '98']]

Ditto:

for i in range(len(s)-1,-1,-1):     #stop at -1 so index 0 is checked too
    if s[i][-1].startswith('X1:'): s.pop(i)

The list comprehension still works:

[s.remove(ss) for ss in s[::-1] if ss[-1].startswith('X1:')]

Finishing the set:

for i in range(len(s)-1,-1,-1):     #go backwards so pops don't mess up indexing
    if s[i][-1].startswith('X1:'): s.pop(i)

The fifth one is a list comprehension using pop(). I hadn’t gotten it working at all yet. Not sure it will work.

[EDIT] Got it:

n = len(s)  #original length; len(s) itself shrinks with every pop
[s.pop(n-1-i) for i,ss in enumerate(s[::-1]) if ss[-1].startswith('X1:')]

cheesebird · June 19, 2022, 6:06am

Thanks so much. So many choices, but I pick this one as the winner.

mlgtechuser · June 19, 2022, 8:57am

You’re most welcome, Ross.

Out of curiosity, what appeals to you about the longer list comprehension?

I would have picked:

[s.remove(ss) for ss in s[::-1] if ss[-1].startswith('X1:')]

cheesebird · June 19, 2022, 9:08am

I wasn’t 100% sure where you were checking that there was only one element in sublists marked for deletion?

mlgtechuser · June 19, 2022, 9:21am

Yeah, that’s the thing with complex list comprehensions; they tend to have a lot of implicit actions. One should definitely go with whichever one is most understandable to them.

It’s like the question “What’s the best life jacket to have if you fall overboard?”

Answer: The one you are wearing!

P.S. You’ll find that they both contain ss[-1].startswith('X1:'), which is the “check that list has a single element” piece.

The enumerate(), on the other hand, is more explicit in the long one.

vbrozik · June 19, 2022, 3:19pm

Please, nooo
This goes against good manners in programming style; it is a way to create obscure code.

The comprehensions / generator expressions are functional style. You are supposed to read data, and make new data from them. Here you are modifying the original data using the methods list.pop() and list.remove(). Please do not misuse comprehensions as another way to write a for loop. If you want to perform side-effects, use a for loop.

No, this code checks that a string in the last element of iterable ss starts with X1:

Implementation in the functional style:

Input data

s = [['X1:', '99'], ['X1: Cheese'], ['99'],['X1: Cheese2'],['X1: Cheese3'],
     ['X1: Cheese4'],['X1: blah blah','99','98']]

List comprehension (split to multiple lines for readability):

[
    item for item in s
    if not(len(item) == 1 and item[0].startswith('X1:'))]

You can replace the square brackets with round ones to get a generator.
The old-school functional style:

filter(lambda item: not(len(item) == 1 and item[0].startswith('X1:')), s)

Result (after storing to a list)

[['X1:', '99'], ['99'], ['X1: blah blah', '99', '98']]
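
As a rough sketch of how the three forms relate, the condition can be pulled into a small named function. `is_lone_x1` and the `from_*` variable names are invented here for illustration; they are not part of the code above.

```python
def is_lone_x1(item):
    """True for a one-element sublist whose only string starts with 'X1:'."""
    return len(item) == 1 and item[0].startswith('X1:')

s = [['X1:', '99'], ['X1: Cheese'], ['99'], ['X1: Cheese2'], ['X1: Cheese3'],
     ['X1: Cheese4'], ['X1: blah blah', '99', '98']]

# List comprehension: builds the whole result at once.
from_list = [item for item in s if not is_lone_x1(item)]

# Generator expression: lazy, so it is materialised with list() here.
from_gen = list(item for item in s if not is_lone_x1(item))

# filter(): also lazy in Python 3, hence the list() call.
from_filter = list(filter(lambda item: not is_lone_x1(item), s))

print(from_list)                                        # [['X1:', '99'], ['99'], ['X1: blah blah', '99', '98']]
print(from_gen == from_list, from_filter == from_list)  # True True
print(len(s))                                           # 7, the input list is untouched
```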

mlgtechuser · June 20, 2022, 2:10am

Exactly. And if X1: is last, it’s because X1 is the only element in that sublist. (X1 is always first.)

Agreed. I did this one as an experiment really just to see if it was possible. It’s probably my least favored solution for the reasons you pointed out.

I enthusiastically agree that list comprehension is often used (mis-used) for convoluted parsing, which is why I tend not to use it much. I’d like to know more about their intended proper use. Do you know of a fundamental reference on its use intent that includes a good summary?
(I’ll search myself when I get back to my computer; am on my phone right now.) [EDIT: ] This appears to be a good one: Functional Programming HOWTO.

vbrozik · June 20, 2022, 7:49am

Now I understand, it was with this assumption! I almost always try to write robust code which does not depend too much on a strict format of the input data. For example, the shorter code will fail silently [1] with the input ['99', 'X1: Hello'].
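
A quick sketch of that failure mode, using the input mentioned above. The helper names `short_check` and `strict_check`, and the list `sample`, are invented for this illustration.

```python
def short_check(ss):
    # Looks only at the last element; says nothing about the length.
    return ss[-1].startswith('X1:')

def strict_check(ss):
    # Requires exactly one element, and that element must start with 'X1:'.
    return len(ss) == 1 and ss[0].startswith('X1:')

sample = [['99', 'X1: Hello'], ['X1: Cheese'], ['99']]

kept_short = [ss for ss in sample if not short_check(ss)]
kept_strict = [ss for ss in sample if not strict_check(ss)]

print(kept_short)   # [['99']]  (the two-element sublist was dropped, no error raised)
print(kept_strict)  # [['99', 'X1: Hello'], ['99']]
```

The strict version also copes with an empty sublist, because the length test short-circuits before any indexing happens, whereas `ss[-1]` on an empty list raises an IndexError.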

I think Tutorial Library – Real Python has good tutorials. Also some talks on PyCon have interesting insight into the functional style. From the latest PyCon US: https://www.youtube.com/watch?v=2gPdodp6i3Y

[1] incorrect behaviour, no error message

mlgtechuser · June 20, 2022, 2:08pm

Thank you, Václav. Now I have some homework to do! Do those references cover what you said about not changing data with list comprehension? (Generators, for sure, but I’m looking to increase my understanding of the theory and philosophy of list comprehensions and correct any misconceptions that I might have.)

ROSS: apologies for hijacking your topic a little bit. At least this is relevant to the list parsing marathon we’ve been on.

vbrozik · June 20, 2022, 2:41pm

Unfortunately I do not remember. I think that the description on Wikipedia is good:

mlgtechuser · June 20, 2022, 3:02pm

Vunderbar. It’s also well-summarized in the Functional Programming HOWTO at docs.python:

Functional programming can be considered the opposite of object-oriented programming. Objects are little capsules containing some internal state along with a collection of method calls that let you modify this state, and programs consist of making the right set of state changes. Functional programming wants to avoid state changes as much as possible and works with data flowing between functions. [emphasis added]

Mutating the iterated data is obviously a state change.
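
To make the distinction concrete, here is a small sketch. `without_lone_x1` is a hypothetical helper name; it builds a new list and leaves its argument alone. If the caller really does need the original list object updated, slice assignment in a plain statement does that explicitly, outside any comprehension.

```python
def without_lone_x1(rows):
    """Return a new list; `rows` itself is not modified."""
    return [row for row in rows
            if not (len(row) == 1 and row[0].startswith('X1:'))]

s = [['X1:', '99'], ['X1: Cheese'], ['99'], ['X1: blah blah', '99', '98']]
alias = s                      # a second name for the same list object

cleaned = without_lone_x1(s)
print(len(s), len(cleaned))    # 4 3

s[:] = cleaned                 # explicit in-place update, visible through `alias`
print(alias)                   # [['X1:', '99'], ['99'], ['X1: blah blah', '99', '98']]
```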