List comprehension problem

Hey guys, I am trying to remove specific elements from a list that has 200+ links in it. I am appending the links to a .txt file, then opening the .txt and using split() with the separator "," to turn it back into a list, because each entry is a link with a lot of text.

the method I am using is
plinkssplit = plinksprint.split(",")
plinkssort = [xl for xl in plinkssplit if "item" in xl]
print(plinkssort)

The problem is, the links I'm hunting for are aliexpress/item links, and even though I specified "item", I am still getting exactly 18 or 13 more links than I need, depending on what appears when the page loads.
For example, the AliExpress home page, www.aliexpress.us/, contains no "item", yet it still comes through the code above.

How can I fix this??

Hello,

I hope I understood your problem correctly. Is it that you want to pick out the links from a list that have the word "item" in them?

If so, you may try something like this:

# Three links have the word 'item' from the six listed
a = 'www.hello.com/sdf/ghi44/item'    # item here
b = 'www.anyone.com/sdfjkg/344/dsf'
c = 'www.there.com/item/44/456sd'     # item here
d = 'www.waiting.com/fjlk/654kl/zklj'
e = 'www.for.com/lkjt/did89'
f = 'www.response/fju/item/kfjk/di'   # item here

link_list = [a, b, c, d, e, f]

links_with_item = [link for link in link_list if 'item' in link]

print(links_with_item)

The output was:

['www.hello.com/sdf/ghi44/item', 'www.there.com/item/44/456sd', 'www.response/fju/item/kfjk/di']

Note that there was no need for the split function. So long as the word "item" exists in the link, the comprehension will detect it.


That's what I was looking for, you are correct. It's also good to note that a lot of examples show

[link for link in link_list if 'item' not in link]

with not in, when, to keep the links that contain the word, the condition needs to be in.

Basically the pattern is variable = [item1 for item1 in item_list if "word" in item1], where the if part is evaluated as True or False for each item1 and only the items where it is True are kept.
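
For example, with some made-up links, the difference looks like this:

links = ["www.aliexpress.us/item/123", "www.aliexpress.us/", "www.aliexpress.us/item/456"]

# "in" keeps the links that contain the word
with_item = [link for link in links if "item" in link]

# "not in" keeps the links that do NOT contain the word
without_item = [link for link in links if "item" not in link]

print(with_item)       # ['www.aliexpress.us/item/123', 'www.aliexpress.us/item/456']
print(without_item)    # ['www.aliexpress.us/']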

I figured this out earlier. I made another post about how I'm trying to clean up the .txt files that are being produced by converting webdriver data to lists and appending them to a .txt.

I then have to open and close the file again to do this search on it, because Python won't save the data correctly into a list, so it must be saved into a .txt.

Finally I have to open it a third time to get the data out, which means the .txt has to be converted back into a list a third time, and I'm having trouble converting it into a list and getting the separator right.
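
Roughly, the round trip I'm attempting looks like this (the file name and links are just placeholders):

links = ["www.aliexpress.us/item/123", "www.aliexpress.us/item/456"]   # placeholder data

# append the links to the .txt as one comma-separated line
with open("links.txt", "a") as f:
    f.write(",".join(links))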

basically:

var = file.strip() won't work because file is a TextIOWrapper
var = list(file) will work
then I do
var = str(file).strip()

because strip() won't work inside the file() call, it won't work if I convert to a list at the same time, it won't work on the webdata or the TextIOWrapper itself, and it won't seem to work in a separate variable after converting to a str.
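
For context, this is the shape of what I'm attempting (file name is a placeholder); strip() and split() are string methods, so they only seem to apply once the file has been read() into a string:

with open("links.txt") as f:     # f is a TextIOWrapper, not a string
    contents = f.read()          # read() returns the whole file as one str

plinkssplit = contents.strip().split(",")   # string methods work on contents, not on f
print(plinkssplit)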

lotsa problems.

Can you give an example of how Python “won’t save the data correctly into a list” but does save it correctly into a .txt file?

So, if I assign the data from selenium driver.find_element(By.XPATH) to a list, it returns a web element object. If I convert that into a list and print it, it won't print. I have to make a variable = list(webdata), then I have to convert that list into a string, then split it again to finally get a proper list. Because I am iterating through the webdata and appending it to the list, when I try to access the list it won't access anything but empty. It basically iterates one element at a time, adds that to memory to print, then removes it from the list.

If I add this data to a list as I iterate, then print it to a .txt, it saves everything from the backend memory into the .txt where I can access it (not one element at a time).

Simplest way I can put it.
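
Roughly what I'm doing, using find_elements instead of find_element so it returns a list, and with a made-up URL and XPath:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.aliexpress.us/")            # placeholder URL

# find_elements returns a plain Python list of WebElement objects
elements = driver.find_elements(By.XPATH, "//a")    # placeholder XPath

# pull the href out of each element so the list holds strings, not web objects
links = [el.get_attribute("href") for el in elements if el.get_attribute("href")]

plinkssort = [link for link in links if "item" in link]
print(plinkssort)

driver.quit()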

example code:

for n in x:
    list.append(n)

In this, depending on the situation, n is unsupported data and must be converted to string/text first, but it can't be converted easily until it's all saved into a .txt (when I should just be able to save it into a list).
It also may iterate n one at a time, and then if I print the list, it will only give me one index, like there's only one element. But if I add the entire iteration to another variable, i.e.:

newlist = list.append(n)

then newlist may print.
it’s very janky.
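
I suspect part of it is that list.append() changes the list in place and returns None, so assigning its result to newlist doesn't give back the list. A minimal made-up example:

webdata = ["item/1", "home", "item/2"]   # stand-in for the iterated data

links = []
for n in webdata:
    links.append(str(n))          # str() turns each element into text

result = links.append("item/3")   # append() mutates links and returns None
print(result)                     # None
print(links)                      # ['item/1', 'home', 'item/2', 'item/3']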

there's also no tostring(), and that makes it hard to do this data conversion easily.
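
Python's equivalent seems to be the built-in str(); a tiny example:

n = 404
text = str(n)                 # str() is Python's general "to string" conversion
print(text + " links found")

# for a Selenium element, element.text or element.get_attribute("href") usually
# gives the useful string, rather than str(element)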