Private, protected modifier and __ notation

smer44 · January 2, 2023, 1:36am

This suggestion may seem rather radical to Python users who are used to the language’s existing features, but I will post it anyway.
The idea is to add a “traditional” private (and maybe protected) modifier for class and instance fields and methods. I know the usual way is to use a __ prefix in the variable name; the issue is that such a variable is still accessible from outside the class, for example through the vars() function, and a user can work with it like any ordinary class field, for example change it. So fields like that are not really private, which violates software design principles.
The idea instead is to have a common private (and maybe protected) modifier. Fields defined that way could, for example, live in a separate dict that is not accessible from outside the class. This feature is basic and easy to implement even at the IDE level, so there should be no problem implementing it in the interpreter.
For brevity, the privacy need not necessarily be declared with the word “private”.
For example, if the variable inside the class is defined as:
“- p = 10” the minus sign indicates that this is a private variable
“+ p = 20” the plus sign indicates that this is public
and the default would be protected.

As a bonus, you can get rid of the ugly " __ " notation. I know there is a lot of bad-looking code that uses all kinds of horrible-looking statements with " __ ", so it may remain deprecated for a long time.

ajoino · January 2, 2023, 3:14pm

Your final sentence looks malformed. Could you try to reformat it so that it displays correctly?

Regarding the suggestion, what value would this add to Python beyond fixing some “ugly” notation?

smer44 · January 2, 2023, 5:07pm

As said, this would introduce “real” private class fields, which would be truly invisible from outside the class. Currently, fields marked with " __ " are accessible from outside the class and can even be changed, which violates the encapsulation principle and is a potential source of misuse.

ajoino · January 2, 2023, 5:18pm

Yes, I see that you want to add some notion of data encapsulation, but you haven’t told us how data encapsulation would improve Python. Data encapsulation in and of itself is not necessarily a good thing. I really like how you can mess around with Python objects.

This has been brought up before, though I can’t find those discussions right now, and I think the conclusion of those discussions was that, since you cannot access the “raw” data (i.e. individual pointers and bytes in memory) of a class member, the data is already encapsulated. At least in the sense that you can’t mess around and get memory errors. Please correct me if I’m wrong.

Cupprum · January 2, 2023, 5:23pm

What the __ notation does is prefix the name of the variable with an underscore and the name of the class. I hope the following example explains it.

class Test:
    __a = "test"

test = Test()
print(test._Test__a)  # prints test
print(test.__a)  # raises AttributeError

As seen above, you can still access these variables from outside the class. They are, however, left out of the __dict__ representation of the object.

barry · January 2, 2023, 7:45pm

Quite right, and I think the purpose of double-leading-underscore is often misunderstood, or maybe just kind of lost to history. It was added to solve a problem with inheritance. If your class uses single-leading underscore as a hint to its privateness, a subclass of your class could shadow that variable unknowingly, possibly even breaking your class’s behavior. This sometimes happens when your base class isn’t designed for inheritance so doesn’t expect its semi-private-ish variables to be overridden. By prepending the class name to double-leading-underscored variables, you can prevent this accident.
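The shadowing problem described above can be sketched as follows. The class and attribute names (`Base`, `Child`, `_state`, `__token`) are hypothetical and chosen only for illustration: the single-underscore attribute is silently overwritten by the subclass, while the double-underscore one is kept apart by name mangling.

```python
class Base:
    def __init__(self):
        self._state = "base"    # single underscore: a convention only
        self.__token = "base"   # mangled, stored as _Base__token

    def describe(self):
        # Inside Base, self.__token always refers to _Base__token.
        return (self._state, self.__token)


class Child(Base):
    def __init__(self):
        super().__init__()
        self._state = "child"   # overwrites the attribute Base relies on
        self.__token = "child"  # stored as _Child__token, so no clash


child = Child()
print(child.describe())     # ('child', 'base')
print(sorted(vars(child)))  # ['_Base__token', '_Child__token', '_state']
```

Note that mangling only prevents the accidental clash; both mangled names remain ordinary entries in the instance dict.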

In practice, I don’t think the idea panned out very well, so I suspect the meme of it being for private variables took over.

smer44 · January 2, 2023, 9:05pm

Yes, that is correct: if the variable is named _Test__a, it will not be accidentally replaced in a subclass, which would have _TestSubclass__a instead. That is an issue in Python because, unlike other OOP languages, it has one naming scope for all variables and methods. If Python had different scopes for class fields, the replacement of a private variable would not be an issue.
However, you can still access this variable via vars(test) and modify it, which should definitely not be allowed.
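A minimal sketch of the kind of modification meant here, assuming an instance attribute (a class attribute such as the one in the earlier `Test` example would not appear in `vars(test)`). The `Account` class is hypothetical and used only for illustration.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance  # stored as _Account__balance

    def balance(self):
        return self.__balance


acct = Account(100)

# The mangled name is an ordinary key in the instance dict...
vars(acct)["_Account__balance"] = -1
print(acct.balance())  # -1

# ...and an ordinary attribute as well.
acct._Account__balance = 5
print(acct.balance())  # 5
```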

eryksun · January 2, 2023, 9:09pm

As with any other class attribute, __a is stored in the class dict, i.e. vars(Test), except it’s mangled as "_Test__a". However, an instance attribute named __a will of course be stored as a mangled name in the instance dict. For example:

class Test:
    def m(self):
        self.__a = 42

>>> t = Test()
>>> t.m()
>>> vars(t)
{'_Test__a': 42}

Functions compiled in a class definition are implemented to mangle a name that begins with two underscores if it does not also end with two underscores, even for the names of local variables. For example:

class Base:
    def m(self):
        __a = 'apple'
        __spam__ = 'spam'
        self.__b = 'banana'
        self.__eggs__ = 'eggs'

class Derived(Base):
    def m(self):
        __a = 'apple'
        __spam__ = 'spam'
        self.__b = 'banana'
        self.__eggs__ = 'eggs'

>>> Base.m.__code__.co_varnames
('self', '_Base__a', '__spam__')
>>> Derived.m.__code__.co_varnames
('self', '_Derived__a', '__spam__')

>>> Base.m.__code__.co_names
('_Base__b', '__eggs__')
>>> Derived.m.__code__.co_names
('_Derived__b', '__eggs__')

Rosuav · January 2, 2023, 10:05pm

Different scopes would cause many MANY other problems. An object is a coherent whole, regardless of which parts “came from” which levels in inheritance. That’s true even in C++, where there’s a lot more meaning to ‘which parts came from where’ (since they’re declared in the class block), and definitely true in Python, where an object is an object and it just has attributes.

Can you demonstrate some actual benefit from attempting to stop people from doing things? In my experience, attempting to stop programmers from writing software just means that they find ways around your barriers. Sometimes appalling ways. Life - and coding - finds a way.

stoneleaf · January 2, 2023, 10:07pm

Python has a “consenting adults” policy – we use various idioms, such as leading underscores, to inform other programmers what methods, attributes, etc., shouldn’t be messed with, and then leave it up to them. If they do use or modify these private and/or internal objects, then any consequences of it not working correctly are their responsibility.

However, this also means that people can extend functionality, or more easily work around bugs, because they have access to those private/internal data and functions.

stoneleaf · January 2, 2023, 10:25pm

Raymond Hettinger gave a talk at some point (PyCon 2013?) that illustrated this – here are the slides.

smer44 · January 2, 2023, 11:28pm

Okay, so private fields are intentionally made accessible without any real restrictions. Now I remember why I got the idea about the privacy issue. It was a conversation with a front-end developer about the possibility of using Python instead of JS, so that the same Python code could run both as a standalone app and in the browser (this is NOT a suggestion, I know it is too radical). It is technically possible, however, since the two languages are on a similar level, and it is possible to have a JIT Python compiler, etc…
So back to the topic.
This guy pointed out that one of the issues in organizing .js libraries is encapsulation for security reasons. JS also lacks a way to declare variables as private, which opens real security holes in the structure of web services if someone imports a .js file and then starts to mess around with its internal content.
Thinking about why this is such an issue in JS, as this guy described, yet is allowed in Python and handled properly in common OOP languages like Java, C# etc…, maybe the “consenting adults” policy is disputable, and the language would benefit from blocking users from accessing what they are not allowed to.

Rosuav · January 2, 2023, 11:46pm

Wait, hold on. In what way is this a security hole? If there are JavaScript systems with a security boundary that is just a simple function/method call, then you have way WAY worse problems than people able to import your .js files.

In web services, the boundary between server and client should be a security boundary, and should be fully checked by the server - not the client library, which is part of the client. Anything that makes this more obvious - like the fact that JavaScript doesn’t stop people from importing libraries and messing with them - is a good thing, because it will mean that problems get found more easily. The solution is NOT to hide “internal” members; it is to have the server do the proper checking.
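A minimal sketch of what “the server does the proper checking” might look like, written in Python for consistency with the rest of the thread. The handler name, the request shape, and the `ACCOUNTS` store are all assumptions made for illustration; the point is only that the server validates every field itself and trusts nothing the client library claims to have checked.

```python
# Hypothetical in-memory store standing in for a real database.
ACCOUNTS = {"alice": 100}


def handle_withdraw(request: dict) -> dict:
    """Hypothetical server-side handler: validates the request itself."""
    user = request.get("user")
    amount = request.get("amount")

    if user not in ACCOUNTS:
        return {"ok": False, "error": "unknown user"}
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
        return {"ok": False, "error": "invalid amount"}
    if amount > ACCOUNTS[user]:
        return {"ok": False, "error": "insufficient funds"}

    ACCOUNTS[user] -= amount
    return {"ok": True, "balance": ACCOUNTS[user]}


# A tampered client can send anything; the server still rejects it.
print(handle_withdraw({"user": "alice", "amount": -50}))
print(handle_withdraw({"user": "alice", "amount": 30}))
```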

fancidev · January 3, 2023, 1:20am

Believe it or not, being able to access, and in fact change, private member variables is one of the key features that make Python so handy for a lot of tasks. Please don’t take that magic away!

barry · January 3, 2023, 4:21pm

I think even calling them “private” in Python is kind of misleading. It leads to incorrect assumptions. Sometimes we use that term as a shorthand for single or double leading underscore named attributes, but it isn’t really accurate to do so IMHO.

steven.daprano · January 3, 2023, 10:24pm

I’m sure Barry knows this, but for the benefit of @Smer44 they are private by convention. The interpreter intentionally makes no attempt to enforce the rule.

There are languages which attempt to enforce strong access rules, and all that happens is that developers end up spending enormous amounts of time trying to find fragile ways of defeating the compiler.

So it is Python’s philosophy to just leave it up to the developer in the first place. We have a strong convention that you should not touch leading underscore “private” attributes, and that works well enough.
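To illustrate why enforcement tends to be worked around, here is a sketch of state hidden in a closure, which is about as close as pure Python gets to an inaccessible variable, and how introspection still reaches it. The `make_counter` function is hypothetical, and writing to `cell_contents` assumes Python 3.7 or later.

```python
def make_counter():
    count = 0  # not an attribute of any object the caller receives

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
counter()

# The "hidden" variable lives in a closure cell that can be inspected...
cell = counter.__closure__[0]
print(cell.cell_contents)  # 1

# ...and, on Python 3.7+, overwritten.
cell.cell_contents = 100
print(counter())  # 101
```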

For times where you really need to protect something you can write your class in C and just not expose that attribute to the Python layer at all.

ajoino · January 3, 2023, 10:30pm

Until someone takes that personally and uses ctypes

RobFoster · April 22, 2023, 11:20pm

Let me ask a question here.
Do you honestly believe that cyber criminals, script kiddies, or people with malicious intent give a crap about the Python Foundation’s or the Python community’s “consenting adults” policy?

If you do, then we have much larger problems than you think.
Cyber criminals depend on the language not having any protection
so that they can carry out their malicious activities, which is one of many reasons why corporations get their data stolen.
Having access control modifiers as standard in Python would prevent injections into code,
which would also prevent many of the cyber crimes that are currently plaguing the world.

Just saying

Rosuav · April 23, 2023, 12:08am

That’s not what the policy means at all. If malicious people are able to get code into your codebase, it doesn’t make any difference what sort of private and public qualifiers there are - it’s malicious code in your codebase.

da-woods · April 23, 2023, 5:47am

Control modifiers don’t provide any real protection in C++. If you know how the memory is laid out, it’s always possible to modify the memory. They aren’t a security feature - just an OOP design tool.