Help with replace()

lst = ['company1_hat', 'company1_glasses', 'company2_hat']
nlst = [el.replace('_', '", "') for el in lst]
print(nlst)
the output:

['company1", "hat', 'company1", "glasses', 'company2", "hat']

the desired output:

['company1', 'hat', 'company1', 'glasses', 'company2', 'hat']

could use split here,

nlst = []
for i in lst:
  nlst += i.split('_')
nlst

['company1', 'hat', 'company1', 'glasses', 'company2', 'hat']

@vainaixr thank you dude

one more way to do this (Python 3.8+, starting from nlst = []) is,

[nlst := nlst + i.split('_') for i in lst]
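For reference, here is a rough sketch of what that comprehension actually does. It assumes Python 3.8 or later (for the := operator) and that nlst starts out as an empty list; the names flatten_with_walrus and steps are only for illustration and are not from the post above.

```python
def flatten_with_walrus(lst):
    nlst = []  # must exist before the comprehension runs
    # Each pass rebinds nlst to a new, longer list; the comprehension
    # itself collects every intermediate list as a by-product.
    steps = [nlst := nlst + i.split('_') for i in lst]
    return nlst, steps


lst = ['company1_hat', 'company1_glasses', 'company2_hat']
nlst, steps = flatten_with_walrus(lst)
print(nlst)        # the flattened list
print(len(steps))  # one intermediate list per input element
```

So the result you want ends up in nlst, while the list built by the comprehension is just a record of the intermediate states and can be thrown away.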

…and one more - pure functional approach:

import itertools

lst = ['company1_hat', 'company1_glasses', 'company2_hat']

split_pairs = (item.split('_') for item in lst)
split_items = itertools.chain.from_iterable(split_pairs)

list(split_items)

I split it into two expressions for better readability.

A note on the original code: you want to end up with a different number of strings. str.replace() cannot do that, because it only turns one string into another string. It cannot split or join strings, which is what changing their number would require.
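A tiny illustration of that difference, using one element from the list (the variable names here are just placeholders):

```python
word = 'company1_hat'

replaced = word.replace('_', ', ')  # still a single string
parts = word.split('_')             # a list holding two strings

print(repr(replaced))  # 'company1, hat'
print(parts)           # ['company1', 'hat']
```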

ok, one more question:
originally I got the idea to split the list elements on the '_' character because I needed to convert the list to a dict and then compare it to another list, like so:

first_list = ["company1_hat", "company1_glasses", "company2_hat"]
second_list = ["hat", "glasses"]

so the question is: how do I determine which company is missing an item from the second list?
In this example the result would be company2.

This will create a dictionary of sets, which is suitable for the task:

lst = ['company1_hat', 'company1_glasses', 'company2_hat']

split_pairs = (item.split('_') for item in lst)
companies_items = {}
for company, item in split_pairs:
    company_items = companies_items.setdefault(company, set())
    company_items.add(item)

print(companies_items)
{'company1': {'hat', 'glasses'}, 'company2': {'hat'}}

The rest is up to you. Show us the result please.

This is sort of a quickly hammered out solution, but you’ll earn your stripes by parsing through the loops, @Moez.

Compared to Václav’s split generator and dictionary-based approach, the benefits of choosing the right data type are obvious. And I obviously ought to learn more about Python sets and dictionaries.

compositeList = ['company1_hat', 'company1_glasses', 'company2_hat']
itemList = ['hat', 'glasses']
companyList = []
nameList = []
inventory = []

for member in compositeList:
    nam = member[:member.find('_')]
    if nam not in nameList:
        #build a list of companies
        nameList.append(nam)

for company in nameList:
    for composite in compositeList:
        if not composite.startswith(company + '_'): continue
        #build a list of this company's items
        inventory.append(composite[composite.find('_')+1:])
    for item in itemList:
        #does this company have this item?
        if item not in inventory:
            print(f"{company} is missing {item}")
    inventory = []

OUTPUT:

company2 is missing glasses

Václav, would you mind explaining how adding a member to the company_items set appends to the set inside of the companies_items dictionary?

The dictionary companies_items has companies as keys, and sets of items as values.

This gets the set of items for the given company. If the company does not exist in the dictionary yet, it adds it and initializes the value to an empty set: set().

    company_items = companies_items.setdefault(company, set())

Now company_items always contains either an existing set or a freshly created empty set and we can simply add the item:

    company_items.add(item)

Note that the standard library also has collections.defaultdict, with which you could squeeze the two statements into one: companies_items[company].add(item).
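A minimal sketch of the defaultdict variant, assuming the same lst as before and that every entry contains exactly one underscore:

```python
from collections import defaultdict

lst = ['company1_hat', 'company1_glasses', 'company2_hat']

companies_items = defaultdict(set)  # a missing key starts as an empty set
for entry in lst:
    company, item = entry.split('_')
    companies_items[company].add(item)

print(dict(companies_items))
```

The dict() call is only there to print it like a plain dictionary; the defaultdict itself can be used directly.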

How did you know this? When I looked up <dict>.setdefault(), in docs.python it only has a very short statement about what setdefault() returns if the key does or does not exist. It said nothing about creating an object reference to the value in the dictionary.

This is the entire entry:

setdefault(key[, default])
If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.

I guess what’s not clear from this entry is that it returns a reference to the value. I imagined that it was returning the value as a value (speaking in terms of passing by reference or passing by value).

In Python every variable is a reference to an object. You cannot pass by value in Python.

I imagine it like this: an object must exist first, and then you assign a variable to it. A variable is like a label through which you can work with the object, which stays at the same place in memory. You do not pass it by value (by copying it).
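A short sketch of the "label" idea (the names items and alias are just for illustration):

```python
items = set()
alias = items          # a second label for the same set object

alias.add('hat')       # mutate the object through one label...
print(items)           # ...and the change is visible through the other
print(alias is items)  # True: both names refer to one object
```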

If you use an immutable type - for example string:

    company_items = companies_items.setdefault(company, '')

Then the following code is pointless because it just changes the variable company_items to point to a newly created string and the dictionary value would still point to the empty string.

    company_items += item

Through the object returned by get() or setdefault() you can change a dictionary value only by mutating it in place, so the value must be of a mutable type.
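To make the contrast concrete, here is a small side-by-side sketch. The dictionary names with_sets and with_strings are made up for this example.

```python
with_sets = {}
held = with_sets.setdefault('company1', set())
held.add('hat')        # in-place mutation of the set stored in the dict
print(with_sets)       # {'company1': {'hat'}}

with_strings = {}
text = with_strings.setdefault('company1', '')
text += 'hat'          # rebinds text to a brand-new string object
print(with_strings)    # {'company1': ''}
print(text)            # hat
```

With the string you would have to assign back explicitly, e.g. with_strings['company1'] = text.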

Yes, I realize that now and know intellectually that “in Python, (almost) everything is an object.”

‘key : value’ is an unfortunate choice of terms. I have only used dictionaries for string and integer lookups for information purposes so far, not for object manipulation. Because of this, I was still applying the common definition of “value”.

‘key : item’ would be better, but ‘key : value’ is very embedded now.

You cannot pass by reference in Python either.

Pass by reference has a precise meaning, going back to Pascal in the 1970s and possibly as far back as Fortran in the 1950s.

The fundamental test for pass by reference semantics is to write a swap procedure which accepts any two variables, and swaps their values:

a = 1
b = 2
swap(a, b)
assert a == 2 and b == 1

Note the conditions: the names of the variables are not hard-coded, they can be any variable. The swap procedure does not return anything, it operates purely by side-effect.

You cannot write this swap() procedure in Python.
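For illustration, this is what happens with a naive attempt. The swap() below is only a sketch of the attempt, and after_call is a helper name added to record the state right after the call.

```python
def swap(x, y):
    # Rebinds only the local names x and y; the caller's variables
    # are untouched when the function returns.
    x, y = y, x


a = 1
b = 2
swap(a, b)
after_call = (a, b)
print(after_call)  # (1, 2): nothing was swapped

# The usual Python idiom rebinds the names in the caller's own scope:
a, b = b, a
print(a, b)        # 2 1
```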

Python does not use “pass by reference” (or “call by reference”) semantics. Like many other languages, it uses parameter passing semantics which unfortunately goes by many names:

  • pass by object
  • pass by object reference
  • pass by object sharing
  • pass by sharing

Java, in particular, abuses the terminology by calling it “pass by value”, using the “logic” that given some variable:

x = 2

(say), the actual value of the variable, which the compiler copies to pass to a function, is not 2, but some invisible, untouchable machine address like 395632077210.

As the Python luminary the late Fredrik Lundh (the Effbot) wrote:

Hi @mlgtechuser, I tested the code and it works fine, but I changed it a little so that it returns either the company name or ‘N/A’. I wrote:

for company in nameList:
        for composite in uniformPieces:
            if company not in composite:
                continue
            #build a list of this company's items
            inventory.append(composite[composite.find('_')+1:])
        for item in uniformSet:
            #does this company have this item?
            if item not in inventory:
                a = f"{company}"
                return a
        inventory = ['N/A']
        inventory[0]

the output is still fine, but only for the first case, which is the company name. How do I return N/A?

The last line in the nameList loop clears out the inventory for the next company, so it needs to be inventory = [] or inventory.clear().

What conditions apply to “N/A”?

‘N/A’ applies if no company is missing a component

Ah. I didn’t see that in the Software Specification. :wink:

How you capture the “company has all items” condition depends on what you’re planning to do with the data later, of course. What are you planning to do with “company X has all items”?

BTW,  a = f"{company}"  is equivalent to  a = company  since company is already a string.

If ALL companies have ALL items?

just return ‘N/A’ or ‘OK’ if all companies have all items; otherwise return the name of the company that is missing one or more components

Okay. You can use a conditional output message for that:

Set  msgOut = "All companies have all items."  before the  for company in nameList:  loop.

…and in the  if item not in inventory:  section, set the message to empty string (''). You can print the msgOut unconditionally since it will be '' if a company is missing an item.
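If you want a return value rather than a printed message, the same "set the default before the loop" idea could look roughly like this. It is only a sketch: the function name first_incomplete_company is made up, it borrows the dictionary-of-sets idea from earlier in the thread, it assumes every entry contains an underscore, and it reports only the first company that is missing something.

```python
def first_incomplete_company(compositeList, itemList):
    """Return the first company missing an item, or 'N/A' if none is."""
    result = 'N/A'  # default, set before the loops (like msgOut)
    inventories = {}
    for composite in compositeList:
        company, item = composite.split('_', 1)
        inventories.setdefault(company, set()).add(item)
    required = set(itemList)
    for company, inventory in inventories.items():
        if not required <= inventory:  # some required item is absent
            return company
    return result


compositeList = ['company1_hat', 'company1_glasses', 'company2_hat']
itemList = ['hat', 'glasses']
print(first_incomplete_company(compositeList, itemList))
```

If compositeList also contained 'company2_glasses', every company would have every item and the function would fall through to return 'N/A'.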