How to handle 'global' variables?

Hi,

Newbie question here.

I’m creating a program that traverses a directory tree using os.walk.
Each directory in that tree is checked for a distinct text file, the text file is opened, and the contents searched for image filenames & urls.

I want to keep track of the total number of text files that were found, and the number of image filenames & URLs found in those text files.

For this I created variables such as counter_urls & counter_images at the ‘root’ of the module, that is, outside the functions that perform the tasks described above and update these counters.

However, Pylint & Pylance don’t like that: “Constant name “counter_urls” doesn’t conform to UPPER_CASE naming style”.
and (when updating a counter from a function): “Redefining name ‘counter_images’ from outer scope (line 22)”

While I do use Constants for setting Debug values, these counter variables are not Constants.

Earlier I defined those counter variables as Global, but that also seems to be frowned upon.

What is the best method for defining these counter variables?

Thanks,
Fred

Hi,

Do you want to change a variable in one module and have access to it in another module?
Or do you want global variables defined in one module that can be changed from another
module? Is this a fair understanding of your query?

This is not simply enforcing a style guide. It’s warning you that, assuming I’ve correctly guessed what the code looks like, it doesn’t mean what you appear to think it means.

If we have something like

counter_urls = 0

def some_function():
    counter_urls += 1

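this will fail at runtime with an UnboundLocalError as soon as some_function() is called, because counter_urls inside the function is a local variable that has no value yet.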

From the linting tool’s perspective, counter_urls inside the function is a “redefinition” - because it is. Since the code has an assignment to counter_urls, and it does not have a global declaration, counter_urls must be a separate local variable. The tool warns you because people usually don’t want to make a local with the same name.
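To see the failure for yourself, here is a minimal sketch of the snippet above with the call wrapped in try/except so that the script keeps running. The name error_name is just something I made up for the demonstration.

```python
counter_urls = 0


def some_function():
    # The assignment makes counter_urls a local name in this function,
    # so Python tries to read a local that has no value yet.
    counter_urls += 1


try:
    some_function()
except UnboundLocalError as exc:
    error_name = type(exc).__name__
    print(error_name, "-", exc)

# The module-level variable was never touched.
print(counter_urls)
```

The exact wording of the message differs between Python versions, but the exception type is the same.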

Then, from the tool’s perspective, the counter_urls outside the function is supposed to be a “constant” because there isn’t any code in the file that could change it.

You need to use the global declaration to make a global variable work the way you want it to.
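For completeness, this is roughly what the global version looks like. It is only a sketch; count_url is a hypothetical function name, not something from your code.

```python
counter_urls = 0


def count_url():
    # Tell Python that assignments to counter_urls in this function
    # rebind the module-level name instead of creating a local.
    global counter_urls
    counter_urls += 1


count_url()
count_url()
print(counter_urls)
```

Pylint will typically still flag the global statement itself, which is a hint that it works but is not the recommended design.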

The complaint is not about the syntax you use, or simply the fact that you have a global. (After all, in Python, the function is also a global variable.)

The problem is that you have code that wants to re-assign a name that wasn’t given to nor created by it. Doing things like that makes it harder to reason about the code - which is part of why Python has this design that requires the global statement. (Although the main reason for the design is that, this way, you don’t need any kind of declaration for local variables, which are much more common.)

There are many ways to manage scope. If you insist on re-assigning a global variable, then the global statement is the only correct way to make that work. But in almost every case, the right way to solve the problem is to give the function what it needs, instead of having the function take those things. That is: use arguments (in calls) and parameters (in the definition) to send information into the function, and use the return value to get information out.
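As a rough illustration of "information in via parameters, information out via the return value": the function below never touches anything outside itself, and the caller does the adding up. The function name count_urls and the very simple "starts with http:// or https://" test are my assumptions, not your actual logic.

```python
def count_urls(text):
    """Return how many whitespace-separated words in text look like URLs."""
    return sum(
        1 for word in text.split()
        if word.startswith(("http://", "https://"))
    )


# The caller owns the running total; the function only reports a number.
total_urls = 0
for text in [
    "see https://example.com/a.png",
    "no links here",
    "http://example.org and https://example.net",
]:
    total_urls += count_urls(text)

print(total_urls)
```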

Caveat: Python does allow you to modify the object in a global without this global declaration. (Assuming the object offers any ways to modify it. Many built-in types don’t.) The global name is a way to get at that object, and then whatever you do with it is not the name’s concern. But this is still a disorganized, complex way to program that will bite you in the long run.
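A small sketch of that caveat, using a module-level list (found_images and record_image are hypothetical names). The function mutates the list object but never re-assigns the name, so no global declaration is needed:

```python
found_images = []  # module-level list object


def record_image(name):
    # append() modifies the existing list; the name found_images is
    # only looked up here, never assigned to.
    found_images.append(name)


record_image("a.png")
record_image("b.jpg")
print(len(found_images))
```

It runs, but the function still depends on hidden state outside itself, which is the part that tends to hurt later.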

Pylint is making 2 separate complaints here:

“Constant name “counter_urls” doesn’t conform to UPPER_CASE naming
style” is about the convention we have that in Python we treat defaults
and other fixed-but-written-as-a-name values like constants in other
languages: we give them loud UPPERCASE names, eg:

 MAX_THREADS = 16

Then we talk about MAX_THREADS elsewhere in the programme as needed,
and it isn’t a variable we expect to be modified.

“Redefining name ‘counter_images’ from outer scope (line 22)” is a
complaint that you’ve defined a variable so that it hides a variable
from some outer scope (in your case, the global scope).

When you write a programme like this:

 x = 1

 def f():
     for x in range(5):
         print(x)

the x in the function is a local variable, and unrelated to the x
in the outer/global scope. Pylint complains about this because you might
then be confused about which x you’re talking about inside the
function. With a function as short as the one above that is unlikely,
but in something more complex it’s quite plausible. The fix here is to
use a different name inside the function.
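A quick way to convince yourself that the two x variables are unrelated is to run something like the sketch below. I have replaced the print with a list (seen, a made-up name) so the result can be inspected afterwards.

```python
x = 1


def f():
    seen = []
    for x in range(5):  # this x is local and shadows the global x
        seen.append(x)
    return seen


seen_values = f()
print(seen_values)
print(x)  # the global x is untouched by the loop inside f
```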

When you write a function in Python, Python deduces which variables are
local variables and which are not by performing a static analysis of the
function: if a name is assigned to in the function, it is a local
variable. In the function f above, x is local because the for-loop
assigns to it.
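If you are curious, CPython lets you look at the outcome of that analysis through the function's code object. This is an implementation detail rather than something to rely on in real code; g is a second, hypothetical function added for contrast.

```python
x = 1


def f():
    for x in range(5):  # assignment by the for-loop: x is local
        pass


def g():
    return x  # x is only read, so it is looked up in the global scope


local_names_f = f.__code__.co_varnames
local_names_g = g.__code__.co_varnames

print(local_names_f)
print(local_names_g)
```

The name x shows up among the local names of f but not among those of g, even though neither function has been called.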

Generally this is a good thing: ideally most functions are what are
called “pure functions”: they change nothing outside them - all the
communication is done by passing values to their parameters and receiving
values back from a return statement. This makes it easy to think about
functions on their own, and easy to think about using them elsewhere
because you know they are not going to reach out and change something
external (such as a global variable).

Using global variables breaks this ideal scenario, and is the core
reason they are, broadly, discouraged. To make sense of a programme
where functions modify global variables, you need to think about the
entire programme as a whole. That gets progressively harder the bigger
the programme.

I’d encourage you to structure your programme like this:

 def main():
     # traverse the directory tree using os.walk,
     # calling the other functions as needed
     ...

 def whatever_other_functions():
     ...

 main()

This will make every formerly global variable a local variable of the
main() function, and inaccessible to the other functions. From that
point on you need to pass values to the functions as parameters, and
receive results back via the functions’ return statements.
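Here is one way that structure could look for your task. Treat it as a sketch under assumptions: the file name info.txt, the two regular expressions, and the function name scan_text are all invented for illustration, and your real matching rules will differ. The temporary directory at the bottom exists only so the example can run on its own.

```python
import os
import re
import tempfile

TEXT_FILE_NAME = "info.txt"  # hypothetical name of the text file to look for
URL_PATTERN = re.compile(r"https?://\S+")
IMAGE_PATTERN = re.compile(r"\b[\w-]+\.(?:png|jpe?g|gif)\b", re.IGNORECASE)


def scan_text(text):
    """Return (url_count, image_count) for the contents of one file."""
    urls = URL_PATTERN.findall(text)
    # Blank out the URLs first so a URL ending in .png is not counted twice.
    remaining = URL_PATTERN.sub(" ", text)
    images = IMAGE_PATTERN.findall(remaining)
    return len(urls), len(images)


def main(root):
    # All counters are locals of main(); helpers report back via return.
    files_found = urls_found = images_found = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        if TEXT_FILE_NAME in filenames:
            files_found += 1
            path = os.path.join(dirpath, TEXT_FILE_NAME)
            with open(path, encoding="utf-8") as handle:
                url_count, image_count = scan_text(handle.read())
            urls_found += url_count
            images_found += image_count
    return files_found, urls_found, images_found


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    sub = os.path.join(root, "album")
    os.mkdir(sub)
    with open(os.path.join(sub, TEXT_FILE_NAME), "w", encoding="utf-8") as handle:
        handle.write("cover.jpg https://example.com/a.png notes\n")
    totals = main(root)

print(totals)
```

Nothing here is global except the constants, which are genuinely constant, so Pylint's UPPER_CASE rule fits them.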

WRT your desire to count URLs and images, maybe you should try using a
Counter object from the collections module:

You’re already importing things; you will need to import the name Counter in order to use it:

from collections import Counter

In main() construct a Counter with counts for urls and images:

 counts = Counter(urls=0, images=0)

Then you can pass the counts object to whatever functions will be
modifying the counts, and they can update the relevant counts:

 def f(counts):
     if this is an image:
         counts['images'] += 1

Because Python passes references to objects rather than copies, the
counts inside the function f above is a local variable, but it refers
to the same Counter object as the one in main() (assuming main calls f
like f(counts)).

The line:

 counts['images'] += 1

is not an assignment to the name counts; it modifies the Counter
object that counts refers to.
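Putting the Counter pieces together, a runnable sketch might look like this. The function name tally, the IMAGE_EXTENSIONS tuple, and the sample items are assumptions of mine; only Counter itself comes from the standard library.

```python
from collections import Counter

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")  # assumed set of extensions


def tally(counts, item):
    """Update counts in place depending on what item looks like."""
    if item.startswith(("http://", "https://")):
        counts["urls"] += 1
    elif item.lower().endswith(IMAGE_EXTENSIONS):
        counts["images"] += 1


def main():
    counts = Counter(urls=0, images=0)
    for item in ["a.png", "https://example.com", "notes.txt", "b.JPG"]:
        # counts is passed in, so tally() works on the same Counter object.
        tally(counts, item)
    return counts


final_counts = main()
print(final_counts["urls"], final_counts["images"])
```

There is no global statement and no module-level counter, yet every function that receives counts can update the shared totals.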