Hey guys, I am trying to remove specific elements from a list that has 200+ links in it. I append the links to a .txt file, then open the .txt and use split() with the separator "," to turn it back into a list, because each entry is a link and has a lot of text.
The method I am using is:
plinkssplit = plinksprint.split(",")
plinkssort = [xl for xl in plinkssplit if "item" in xl]
print(plinkssort)
The problem is, the link I’m hunting for is aliexpress/item, and even though I specified "item", I am still getting exactly 18 or 13 more links than I need, depending on what appears when the page loads.
One of the extras is, for example, the AliExpress home page, www.aliexpress.us/, which does not contain the "item" that the code above filters for.
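One possible explanation for the extra links, offered as a guess since the actual page output isn't shown: "item" in xl matches the substring anywhere in the string, including query parameters, or a neighbouring link that ended up in the same chunk because the text was not split where expected. Below is a minimal sketch that only accepts links whose path has a segment that is exactly item. The sample URLs and the helper name is_item_link are made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical sample data; the real text would come from the .txt file.
plinksprint = (
    "https://www.aliexpress.us/item/1005001.html, "
    "https://www.aliexpress.us/, "
    "https://www.aliexpress.us/category/5.html?from=item_page, "
    "www.aliexpress.us/item/1005002.html"
)


def is_item_link(link):
    """Return True when one of the URL's path segments is exactly 'item'."""
    link = link.strip()
    if "//" not in link:
        # urlparse only recognises the host when the URL has '//' in front
        link = "//" + link
    return "item" in urlparse(link).path.split("/")


# Strip each piece so stray spaces around the commas do not end up in the links
plinkssplit = [part.strip() for part in plinksprint.split(",")]
plinkssort = [xl for xl in plinkssplit if is_item_link(xl)]
print(plinkssort)
```

Printing repr(xl) for each unexpected link is also a quick way to see what is really in the string.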
I hope I understood your problem correctly. Is it that you want to filter out links from a list that have the word item within it?
If so, you may try something like this:
# Three of the six links below contain the word 'item'
a = 'www.hello.com/sdf/ghi44/item' # item here
b = 'www.anyone.com/sdfjkg/344/dsf'
c = 'www.there.com/item/44/456sd' # item here
d = 'www.waiting.com/fjlk/654kl/zklj'
e = 'www.for.com/lkjt/did89'
f = 'www.response/fju/item/kfjk/di' # item here
link_list = [a, b, c, d, e, f]
links_with_item = [link for link in link_list if 'item' in link]
print(links_with_item)
That’s what I was looking for, you are correct. It’s also good to note that a lot of people say
[link for link in link_list if 'item' not in link]
with not in,
when for a link (the x variable) to be kept, the condition needs to be in.
Basically it’s variable = [item1 for item1 in item_list if "variable" in item1], where the if part is the true/false test and item1 is kept only when it is true.
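To make the difference between in and not in concrete, here is a small sketch with made-up links. The names kept, dropped and flags are only for illustration.

```python
# Hypothetical sample links
item_list = [
    "www.aliexpress.us/item/111.html",
    "www.aliexpress.us/",
    "www.aliexpress.us/item/222.html",
]

# 'in' keeps the links that contain the text
kept = [item1 for item1 in item_list if "item" in item1]

# 'not in' keeps everything else
dropped = [item1 for item1 in item_list if "item" not in item1]

# Putting the test in front instead produces the True/False values themselves
flags = ["item" in item1 for item1 in item_list]

print(kept)
print(dropped)
print(flags)
```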
I figured this out earlier. I made another post about how I’m trying to clean up the .txt files that are produced by converting webdriver data to lists and appending them to a .txt.
I then have to open and close them again to do this search on them, because Python won’t save the data correctly into a list, so it must be saved into a .txt.
Finally, I have to open it a third time to get the data out, which means the .txt has to be converted back into a list a third time, and I’m having trouble converting it into a list and getting the separator right.
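If the .txt round trip is kept, one way to avoid separator trouble is to write one link per line and read the file back with splitlines(), since commas can legally appear inside a URL. This is a sketch under that assumption. The file name links.txt and the sample links are invented, and a temporary folder is used so nothing is left behind.

```python
import os
import tempfile

# Hypothetical sample links
links = [
    "https://www.aliexpress.us/item/111.html",
    "https://www.aliexpress.us/",
    "https://www.aliexpress.us/item/222.html?a=1,2",
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "links.txt")

    # Write one link per line
    with open(path, "w", encoding="utf-8") as file:
        file.write("\n".join(links))

    # Read the whole file as one string, then split it on line breaks
    with open(path, encoding="utf-8") as file:
        loaded = file.read().splitlines()

print(loaded == links)
```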
basically:
var = file.strip() won’t work because file is a TextIOWrapper
var = list(file) will work
then I do
var = str(file).strip()
because strip() won’t work inside the file(), it won’t work if I convert to a list at the same time, it won’t work on the web data or the TextIOWrapper itself, and it doesn’t seem to work if I do it in a separate variable after converting to a str.
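For what it's worth, strip() is a method of str, so it has to be called on each line rather than on the file object, and str(file) only returns a description of the object, not its contents. Here is a small sketch where io.StringIO stands in for the opened .txt file (an assumption so the example runs without a real file) and the sample text is made up.

```python
import io

# io.StringIO stands in for an opened text file; both are iterated line by line.
file = io.StringIO("  www.aliexpress.us/item/111.html \nwww.aliexpress.us/\n\n")

# str() on the file object gives its description, not the text inside it
as_text = str(file)
print(as_text)

# Strip every line individually and skip the ones that end up empty
lines = [line.strip() for line in file if line.strip()]
print(lines)
```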
So, if I assign the data from Selenium’s driver.find_element(By.XPATH) to a list, it returns a web object. If I convert that into a list and print it, it won’t print. I have to make a variable = list(webdata),
then I have to convert that list into a string, then split it into a list again to finally get a proper list. Because I am iterating through the web data and appending it to the list, when I try to access the list it comes back empty. It basically iterates one at a time, adds that to memory to print, then removes it from the list.
If I add this data to a list as I iterate and then print it to a .txt, it saves everything from the backend memory into the .txt, where I can access it (not one at a time).
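In Selenium, find_element returns a single WebElement, while find_elements (plural) returns an ordinary Python list of them, and the href has to be pulled out of each element before it is usable as a string. Here is a sketch of that step, with a FakeElement class standing in for WebElement so it runs without a browser. The class and the sample URLs are invented. In real code the list would come from something like driver.find_elements(By.XPATH, "//a").

```python
class FakeElement:
    """Stand-in for Selenium's WebElement, only for this illustration."""

    def __init__(self, href):
        self._href = href

    def get_attribute(self, name):
        # WebElement.get_attribute("href") returns the link target as a string
        return self._href if name == "href" else None


# Assumed to be what driver.find_elements(...) handed back
elements = [
    FakeElement("https://www.aliexpress.us/item/111.html"),
    FakeElement("https://www.aliexpress.us/"),
    FakeElement(None),  # an <a> tag without an href
]

links = []
for element in elements:
    href = element.get_attribute("href")
    if href:  # skip elements that have no link
        links.append(href)

item_links = [link for link in links if "/item/" in link]
print(item_links)
```

Collecting plain strings this way keeps everything in one list in memory, without going through a .txt first.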
Simplest way I can put it.
example code:
for n in x: list.append(n)
In this, depending on the situation, n is either unsupported data that must be converted to string/text first, but can’t be converted easily until it’s all saved into a .txt (when I should just be able to save it into a list),
or it iterates n one at a time, and then if I print list, it only gives me one index, like there’s only one element. But if I add the entire iteration to another variable, i.e.:
newlist = list.append(n)
then newlist may print.
It’s very janky.
There’s also no tostring(), and that makes it hard to do this data conversion easily.
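Two Python details may help here. The built-in str() plays the role of tostring(), and append() changes the list in place and returns None, so assigning its result to another variable does not copy the list. Also, if the variable really is named list, it shadows the built-in list(), which can break later list(...) calls. A short sketch with made-up values, using links as the variable name instead:

```python
links = []
for n in [101, "www.aliexpress.us/item/111.html", 3.5]:
    # str() converts each value to text before it is stored
    links.append(str(n))

# append() modifies links itself and hands back None
newlist = links.append("last")

print(links)
print(newlist)
```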