How do I remove __XXX__ from dir printing

woodturner550 · August 5, 2022, 11:59pm

So you know your time was not wasted!
“”"
import re
import math
my_str = dir(math)
pattern = re.compile(r"[a-z]*[a-z]") # catch all dunders won’t show in forum right
result = re.findall(pattern, str(my_str,))
print(result)
if result:
    print("Yes, there is a match!")
else:
    print("No match!")
length_of_result = len(result)
final_output = my_str[5:]
print(final_output) # THIS WORKS GREAT

Thanks again for the time and help
woodturner550
P.S. I’m still not copying and pasting to the forum correctly, sorry.

should be

vbrozik · August 6, 2022, 10:36am

It is great that you got the program working!

It looks like you are using apostrophes or quotes instead of backticks. If you are unable to find the backtick on your keyboard, copy the three backticks from my post:

`` ``` ``

…or it could be easier to use the “Preformatted text” button </> in the forum’s editor toolbar. You should see the formatted result in the post preview on the right side. Please try to edit your post to fix it.

woodturner550 · August 6, 2022, 5:02pm

sorted_my_string = sorted(my_string)
pattern = re.compile(r"__[a-z]*[a-z]__")  # catch dunders 
result = re.findall(pattern, str(sorted_my_string,))
length_of_result = len(result)
final_output = my_string[length_of_result:]
print(final_output)  # THIS WORKS GREAT for math only

sorry I'm slow.

woodturner550 · August 6, 2022, 5:12pm

the whole program

import re
import math

my_string = dir(math)
sorted_my_string = sorted(my_string)
pattern = re.compile(r"__[a-z]*[a-z]__")  # catch dunders
result = re.findall(pattern, str(sorted_my_string,))
length_of_result = len(result)
final_output = my_string[length_of_result:]
print(final_output)  # THIS WORKS GREAT for math only

vbrozik · August 6, 2022, 9:55pm

It is good to see the code not mangled.

Some suggestions:

Try to use descriptive names. For example math_dir, math_identifiers or module_identifiers instead of my_string.
Python’s regex engine in the standard library supports the extended regex constructs and much more. Instead of r"__[a-z]*[a-z]__" you can use the equivalent and more readable r"__[a-z]+__". Note that dunder names are not limited to [a-z]; in the standard library you will certainly find ones containing underscores, such as __init_subclass__.
Your way of filtering out the dunder names is complicated and fragile. It works only if all dunder names are at the beginning of the list my_string. Now I see that you noticed the problem: “WORKS … for math only”
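
To make the fragility concrete, here is a minimal sketch using a small hand-made list (the names in `names` are chosen for illustration, not taken from a real `dir()` call). Uppercase letters sort before the underscore, so after sorting the dunder names are no longer at the front, and slicing by their count removes the wrong items:

```python
import re

# Hypothetical list of names; real modules such as re mix uppercase names with dunders.
names = ["ASCII", "Match", "__doc__", "__name__", "compile", "search"]

sorted_names = sorted(names)
dunder_pattern = re.compile(r"__[a-z]+__")

# Count the dunder names by searching the list's string form, as in the program above.
dunder_count = len(dunder_pattern.findall(str(sorted_names)))

# Slicing assumes the dunder names occupy the first positions of the list.
sliced = sorted_names[dunder_count:]
print(sliced)  # ['__doc__', '__name__', 'compile', 'search']
```

The two uppercase names were dropped and both dunder names survived, which is the opposite of what was wanted.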

The straightforward approach would be to iterate over the identifiers and filter them one by one:

module_identifiers = dir(math)
filtered_module_identifiers = []
for identifier in module_identifiers:
    ... # here do the filtering of the identifiers one by one

Later, when you learn list comprehensions (and generator expressions), you will see that they can be used for this kind of task.

woodturner550 · August 6, 2022, 10:43pm

This is very fragile! I started with just the ‘math’ module to remove dunders, not understanding all the other possible patterns that are used. A very good lesson: know your complete data.
On naming my variables better, I am just reading about that in ‘Beyond the Basic Stuff with Python’, a very informative book. Lots to learn. I’m going to drop this line of inquiry as it is too fragile and would have taken several pattern matches to do the job.

I want to thank all who have helped. Knowledge is valuable, even if it’s for an old man after a stroke, as a hobby.

I know I need to understand list compression.
woodturner550

vbrozik · August 7, 2022, 3:51pm

You can program without them using ordinary for loops. List comprehensions are just “syntactic sugar.” They allow you to write shorter code and better express your intent.

For demonstration here is the same task, first implemented using a for loop, then using a list comprehension:

powers_of_two = []
for exponent in range(5):
    powers_of_two.append(2 ** exponent)

powers_of_two = [2 ** exponent for exponent in range(5)]
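
Since the task in this thread is filtering, it may help to see that a loop containing an `if` converts the same way. This is only an illustrative sketch; the variable names are made up for the example:

```python
# Loop version: keep only the even numbers.
even_numbers = []
for number in range(10):
    if number % 2 == 0:
        even_numbers.append(number)

# Comprehension version: the condition moves to the end.
even_numbers_comprehension = [number for number in range(10) if number % 2 == 0]

print(even_numbers)                # [0, 2, 4, 6, 8]
print(even_numbers_comprehension)  # [0, 2, 4, 6, 8]
```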

woodturner550 · August 7, 2022, 7:04pm

Do I understand this? Basically we are flattening the for loop.

What if you think that the for loop is clearer in one’s mind? Maybe it is because this (list compression) is just new that it seems not natural to me. It could also be a ghost of my past: I was CEO and engineer of an Internet Service Provider before my stroke (1994).

Since we are not sure who will be reading our code, we should use ‘list compression’ as a standard. Am I correct? It is something the books don’t make clear: which way to go as a standard.

rob42 · August 7, 2022, 7:17pm

If I may interject:

I would not get too hung up on it, if I were you, but as @vbrozik rightly says, list comprehensions allow you to write shorter code.

There is a school of thought that says “readability counts” and I’m sure that someone could come up with something that is way less readable, using list comprehension, than it would otherwise be with a loop routine over four or more lines.

This link may help:

cameron · August 7, 2022, 11:15pm

By Leonard Dye via Discussions on Python.org at 07Aug2022 19:14:

Do I understand this? Basically we are flattening the for loop.

Yes. But it only works when there’s no special state between iterations.

A list comprehension essentially maps one sequence of values into
another. So squares from small ints:

[ x**2 for x in [1, 2, 3, 4] ]

The [1,2,3,4] can be anything iterable (thus the for-loop parallel).
But the expression on the left (x**2) only gets to look at x, one
of the values from the right.

A for-loop lets you write code where the values can use information from
previous loop iterations, eg accruing a total etc, or changing how you
treat one item based on previous items.

You can do funny stuff to get that in a list comprehension, but you’re
off into code which is hard to understand.
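
As a small illustration of state carried between iterations (a sketch with made-up values), a running total is natural in a for-loop. For this particular case the standard library's `itertools.accumulate` gives the same result without an explicit loop:

```python
from itertools import accumulate

values = [1, 2, 3, 4]

# For-loop: each step uses the total built up by earlier iterations.
running_totals = []
total = 0
for value in values:
    total += value
    running_totals.append(total)

print(running_totals)            # [1, 3, 6, 10]
print(list(accumulate(values)))  # [1, 3, 6, 10]
```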

What if you think that the for loop is clearer in one’s mind?

If the for-loop is clearer, use the for-loop. That is the main rule of
thumb: code should be easy to read and understand.

Maybe it is because this (list compression) is just new that it seems not
natural to me. It could also be a ghost of my past: I was CEO and engineer
of an Internet Service Provider before my stroke (1994).

Since we are not sure who will be reading our code, we should use ‘list compression’ as a standard. Am I correct?

Use it when it clearly and concisely solves the problem. If a list
comprehension becomes hard to understand, maybe it is time for a
for-loop (or a map() or … as appropriate).

Note that indentation goes a long way to making things more readable:

[ x**2 + f(x) + y for x, y in [ (z*2, z+1) for z in [1,2,3] ] ]

versus:

[ x**2 + f(x) + y
  for x, y
  in [
      (z*2, z+1)
      for z in [1,2,3]
  ]
]

which makes the pieces easier to identify.

Note that once you’re inside some brackets you can put in newlines and
spaces as you see fit. Not everything needs to be on one line. Sometimes
I use brackets entirely for this:

if test1(x) and something_else(y) and yet_another_thing():

versus:

if (
    test1(x)
    and something_else(y)
    and yet_another_thing()
):

It is something the books don’t make clear, which way to go as a standard.

From the Zen (run `import this` at the prompt): “Readability counts.”

Cheers,
Cameron Simpson cs@cskk.id.au

woodturner550 · August 9, 2022, 6:51pm

I am working on using list comprehensions.
To make the point about the regex, I think I should show how the regex works. Learning from one’s wrong direction teaches both “know your data” and to look at other ways to program the solution.

```
import re
import math

working_string = dir(math)  # names from the module, dunders included
sorted_working_string = sorted(working_string)

'''Match the regex below and capture its match into backreference number 1 «(\b[a-z][^_A-Z]+)»
   Assert position at a word boundary (position preceded or followed, but not both, by a Unicode letter, digit, or underscore) «\b»
   Match a single character in the range between “a” and “z” (case sensitive) «[a-z]»
   Match any single character NOT present in the list below «[^_A-Z]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      The literal character “_” «_»
      A character in the range between “A” and “Z” (case sensitive) «A-Z»'''

pattern_non_dunder = re.compile(r"(\b[a-z][^_A-Z]+)")  # catch non dunders
```

This is an excellent forum! Thanks! Some things are coming back to me and I'm learning!
I use a commercial program for the exact description.
Leonard Dye

vbrozik · August 9, 2022, 9:07pm

Markdown formatting

The three backticks must be at the beginning of a separate line to work. Like this:

```
x = 2       # your code
y = x + 3   # another line of your code
```

Back to the `dir()` output filtering

Now I regret that I mentioned list comprehensions. I think it is too early to start with them. You first need to successfully use a normal for loop to iterate over a list before you continue to comprehensions.

If I may suggest how to approach the problem gradually, you can follow these steps:

1. Get the list of the names in a module. You already have this step mastered: module_names = dir(math)
2. Iterate over the list of names using a normal for loop. Try to print every name from the list in the loop.
3. Add filtering to the loop: print names according to a certain condition, for example names which start with an underscore "_".
4. Improve the filtering condition for the dunder names.
5. Change the loop body to append the filtered names to a new list instead of printing them.
6. Now you have code prepared that can be converted to a list comprehension. Keep in mind that the conversion is not necessary; the normal for loop will do the same job and may be easier to read and understand.
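
The steps above can be sketched as follows. This is one possible way to write them, not the only one: the helper name `is_dunder` is made up for this example, and its condition (a name that starts and ends with two underscores) is an assumption about what counts as a dunder name.

```python
import math


def is_dunder(name):
    """Return True for names like __doc__ (hypothetical helper for this sketch)."""
    return len(name) > 4 and name.startswith("__") and name.endswith("__")


module_names = dir(math)

# Iterate, test each name, and collect the ones to keep.
public_names = []
for name in module_names:
    if not is_dunder(name):
        public_names.append(name)

# Optional last step: the same loop written as a list comprehension.
public_names_comprehension = [name for name in module_names if not is_dunder(name)]

print(public_names)
```

Because each name is tested on its own, the result does not depend on where the dunder names sit in the list, so the same code can be tried with other modules.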

woodturner550 · August 9, 2022, 11:01pm

I know I’m still on the regex. Part of my stubbornness is that the recovery from a stroke never ends. I was starting to see patterns I had used before the stroke. Anything that connects past knowledge with the current is good for a stroke survivor. That said, I am finished with the original project in regex. It works and is not fragile. More testing is needed. The following is the code that works. Mr. Vaclav Brozik, thanks for your help. I have to work on loops next, thanks for the direction.

import re
import math
import cmath
import array
import collections

working_string = dir(re)  # names from the module, dunders included
sorted_working_string = sorted(working_string)
print(sorted_working_string)
'''Match the regex below and capture its match into backreference number 1 «(\b[a-z][^_A-Z]+)»
   Assert position at a word boundary (position preceded or followed—but not both—by a Unicode letter, digit, or underscore) «\b»
   Match a single character in the range between “a” and “z” (case sensitive) «[a-z]»
   Match any single character NOT present in the list below «[^_A-Z]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      The literal character “_” «_»
      A character in the range between “A” and “Z” (case sensitive) «A-Z»'''
pattern_non_dunder = re.compile(r"(\b[a-z][^_A-Z]+)")  # catch dunders and leave those with only a-z letters in word
result_list = re.findall(pattern_non_dunder, str(sorted_working_string,))
length_of_result_list = len(result_list) -1
print('list less dunders',  result_list)  # This is what we want so far, many modules to check

I'm sure there is an even better way with regex to do this, and we know that list comprehension is a better way when I get there. :)
 

Leonard Dye
woodturner550

woodturner550 · August 11, 2022, 12:47am

Now down to five lines of code.

import math

MODULE_NAME = math
import re
working_string = dir(MODULE_NAME)  # from module as __dunder__, non_dunder, capitals and common commands
'''
Match the regex below and capture its match into backreference number 1 «(\b[a-z][^_A-Z0-9]+\b[^_a-z])»
   Assert position at a word boundary (position preceded or followed—but not both—by a Unicode letter, digit, or underscore) «\b»
   Match a single character in the range between “a” and “z” (case sensitive) «[a-z]»
   Match any single character NOT present in the list below «[^_A-Z0-9]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      The literal character “_” «_»
      A character in the range between “A” and “Z” (case sensitive) «A-Z»
      A character in the range between “0” and “9” «0-9»
   Assert position at a word boundary (position preceded or followed—but not both—by a Unicode letter, digit, or underscore) «\b»
   Match any single character NOT present in the list below «[^_a-z]»
      The literal character “_” «_»
      A character in the range between “a” and “z” (case sensitive) «a-z»
      '''
pattern_non_dunder = re.compile(r"(\b[a-z][^_A-Z0-9]+\b[^_a-z])")  # catch dunders etc., and leave common commands
result_list = re.findall(pattern_non_dunder, str(working_string,))
print('List of common commands in ' + str(MODULE_NAME) + ' module', result_list)  # What we want so far, common commands \
                                                                                  # many modules to check

I believe this is the end of this, except for making it a function, which I will learn later.
Thanks again. I will be working on loops and will post the final version in a new post.
Leonard Dye
woodturner550

vbrozik · August 11, 2022, 7:34am

Great that you continue learning.

I would just like to remind you that converting a list of names (as strings) to a single string and trying to filter individual names in the combined string is certainly not an optimal approach. We can see that as a kind of playful exercise with regexes. The for loop will allow you to use a much better approach.

Closer examination of the output of your last program shows that in the output some quotes are replaced by double quotes and there are extra quotes and double quotes. To better demonstrate what is going on, let’s print individual items of the result of re.findall() on separate lines.

Your program with just a different way of printing the result:

import math

MODULE_NAME = math
import re
working_string = dir(MODULE_NAME)
pattern_non_dunder = re.compile(r"(\b[a-z][^_A-Z0-9]+\b[^_a-z])")
result_list = re.findall(pattern_non_dunder, str(working_string,))

for item in result_list:
    print(item)

The printed result (one item per line):

acos', 'acosh', 'asin', 'asinh', 'atan'
atanh', 'ceil', 'comb', 'copysign', 'cos', 'cosh', 'degrees', 'dist', 'e', 'erf', 'erfc', 'exp'
fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'isqrt', 'lcm', 'ldexp', 'lgamma', 'log'
modf', 'nan', 'nextafter', 'perm', 'pi', 'pow', 'prod', 'radians', 'remainder', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc', 'ulp'

When you print that list directly, Python adds quotes around the individual strings (lines in our output) and separates them by commas.
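
A small sketch may make this clearer. With a hand-made list (the names are picked for illustration), `str()` turns the whole list into one string, quotes and commas included, so a pattern that allows those characters can run across several names. Testing each name on its own with `re.fullmatch` avoids that. The pattern `[a-z][a-z0-9]*` is an assumption chosen for this example, not a rule for all modules.

```python
import re

names = ["__doc__", "acos", "atan2", "pi"]

# str() produces one string that includes the brackets, quotes and commas.
combined = str(names)
print(combined)  # ['__doc__', 'acos', 'atan2', 'pi']

# Checking names one by one keeps every result a whole, single name.
lowercase_name = re.compile(r"[a-z][a-z0-9]*")
plain_names = []
for name in names:
    if lowercase_name.fullmatch(name):
        plain_names.append(name)

print(plain_names)  # ['acos', 'atan2', 'pi']
```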

woodturner550 · August 11, 2022, 4:44pm

I understand that using regex for this is the wrong way to go. When is it best to use regex in a program over for loops? Or is regex something that will be deprecated soon?

Just so you know I am learning PyCharm and Python at the same time. Lots to learn, and I’m slow now.

But once I have it, I HAVE IT. Sometimes it’s hard for me to get to where it is mine! Self-directed learning is OK, but maybe a formal course on Python would be better. There are none where I live; I looked.

Question: can you use regex in a for loop? I would think yes, but is it best practice?

rob42 · August 11, 2022, 5:37pm

One of the disadvantages of PyCharm (or so I read) is that it’s not suitable for Python beginners, but perhaps that’s a matter of opinion.

As for how one goes about learning Python, I’m sure there are as many answers to that question as there are members of this Forum.

My advice (fwiw) is to combine your Python learning with some other passion; maybe you like football, or wine, or books, or music, or flowers, or any number of other things. So, set yourself a project based on something else that you’ve a passion for. You’ll be surprised at how quickly a simple project can grow into a very useful app.

woodturner550 · August 11, 2022, 6:32pm

One thing I found very helpful is a Google search in this form: python3, question. Example: python3, for loops
I was a programmer before the stroke took that away. I am learning all over again and still have ghosts of programming coming back. All this is good. One thing a stroke survivor learns is to never be a quitter. Keep working if you can see a way to the finish once you start something.
I like PyCharm, but it is a handful for the beginner.

rob42 · August 11, 2022, 6:43pm

If you didn’t like the last site I suggested (and, it has to be said, it is a little dry), then try this one:

Not everyone likes it, but I think it’s one of the best ‘learn python’ sites out there, but there are others (many of them).

(the “Est. reading time” is a little off; LOL – it’ll keep you busy for days!)

woodturner550 · August 11, 2022, 7:32pm

A good book that helped with PyCharm: Hands-On Application Development with PyCharm.
