Splitting a string on multiple delimiters

My issue is the same as in this topic:

However, the suggested solutions won’t work for me. Here is my code:

import re
import string

def func(line: str):
    # TODO: one-shot split fails
    delimiters = string.punctuation[:10]
    line_words = re.split(delimiters, line)
    # but splitting in steps does
    split_line = line.split(" ")
    for delimiter in string.punctuation:
        for unsplit_word in split_line:
            unsplit_word.split(delimiter)
    return split_line


print(func('x="y z"'))

Why does re.split fail in this case?

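Your code does not work because re.split takes a regular expression as its first argument. string.punctuation[:10] is !"#$%&'()*, which is a valid regex but does not match anything in your input: outside a character class, $ anchors the end of the string and ()* is a repeated empty group, so the characters are not treated as a set of alternatives. Hence, re.split returns the input unsplit, as a one-element list.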
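To see why nothing matches, it may help to inspect the pattern directly. This is only a minimal sketch; the variable name pattern is chosen for illustration and is not from your code:

```python
import re
import string

pattern = string.punctuation[:10]
print(pattern)  # !"#$%&'()*

# Read as a regex, this asks for the literal text !"# followed by the end
# of the string ($) followed by %&' and a repeated empty group. Nothing can
# come after the end of the string, so the pattern can never match.
print(re.search(pattern, 'x="y z"'))  # None

# With no match, re.split hands back the whole input in a one-element list.
print(re.split(pattern, 'x="y z"'))  # ['x="y z"']
```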

To create a regex that matches any one of the characters in string.punctuation[:10], place them inside a character class, that is, between [ and ], like so:

>>> re.split(f"[{string.punctuation[:10]}]", 'x="y z"')
['x=', 'y z', '']

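It looks like you also want to split at white space. In that case, just add \s to the regex. Make the f-string raw (rf'...') so that Python does not warn about \s being an invalid string escape:

>>> re.split(rf'[{string.punctuation[:10]}\s]', 'x="y z"')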
['x=', 'y', 'z', '']
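If you eventually want to split on all of string.punctuation rather than the first ten characters, the class needs escaping, because characters such as ], \, ^ and - are special inside [...]. The sketch below is one possible way to do that, assuming you also want to drop the empty strings; the function name split_on_punct_and_space is hypothetical:

```python
import re
import string


def split_on_punct_and_space(line: str) -> list[str]:
    # Hypothetical helper, not part of the original code.
    # re.escape protects characters such as ], \, ^ and - that would
    # otherwise change the meaning of the character class.
    pattern = rf"[{re.escape(string.punctuation)}\s]+"
    # The trailing + treats a run of adjacent delimiters as one split point;
    # the filter drops empty strings left at the start or end of the line.
    return [word for word in re.split(pattern, line) if word]


print(split_on_punct_and_space('x="y z"'))  # ['x', 'y', 'z']
```

Note that this version also splits on =, since = is part of the full string.punctuation.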

As an aside, note that in your code, the following lines do nothing:

    for delimiter in string.punctuation:
        for unsplit_word in split_line:
            unsplit_word.split(delimiter)

because you don’t save the output of unsplit_word.split(delimiter).
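If you prefer the step-by-step approach over a regex, the loop has to keep what split returns. A minimal sketch of one way to do that, assuming empty strings should be discarded at the end; the name split_in_steps is hypothetical:

```python
import string


def split_in_steps(line: str) -> list[str]:
    # Hypothetical name; a corrected variant of the loop in the question.
    words = line.split(" ")
    for delimiter in string.punctuation:
        # Rebuild the list on every pass so the result of split() is kept.
        words = [piece for word in words for piece in word.split(delimiter)]
    # Drop the empty strings produced by leading, trailing, or adjacent delimiters.
    return [word for word in words if word]


print(split_in_steps('x="y z"'))  # ['x', 'y', 'z']
```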
