New type: `symbol`

TL;DR

  • I propose a new type: symbol. In this article, I use a $ prefix to declare a symbol (just for demo).
  • Functions accept symbols as surrogate parameters and transform them into appropriate values before running the implementation; this simplifies the calling code in common cases.
  • Symbols might be useful for type checking (mypy or an IDE). Keyword strings passed as parameters are hard to analyze statically.
  • With decorators, any existing function could easily support symbols.

Using Symbols to simplify the code

Consider this code:

from random import random
xs = [(i, random(), random()) for i in range(10)]
print(xs)

from operator import itemgetter
sorted_xs = sorted(xs, key=itemgetter(1))
print(sorted_xs)

sorted_xs = sorted(xs, key=itemgetter(2))
print(sorted_xs)
# xs
[(0, 0.6334513519440723, 0.7831829839523177),
 (1, 0.9329805013131537, 0.19264078683267083),
 (2, 0.2302850861994883, 0.5910166579153023),
 (3, 0.31236313706487373, 0.4142487896581839),
 (4, 0.11192161822024849, 0.510212695052502),
 (5, 0.6956792890111658, 0.7125546018800465),
 (6, 0.3869272875534193, 0.6990655249844869),
 (7, 0.8601507141635526, 0.4229804838296001),
 (8, 0.9867627627329273, 0.38384098978775394),
 (9, 0.29156429346050816, 0.11962291780001866)]
# sorted by item[1]
[(4, 0.11192161822024849, 0.510212695052502),
 (2, 0.2302850861994883, 0.5910166579153023),
 (9, 0.29156429346050816, 0.11962291780001866),
 (3, 0.31236313706487373, 0.4142487896581839),
 (6, 0.3869272875534193, 0.6990655249844869),
 (0, 0.6334513519440723, 0.7831829839523177),
 (5, 0.6956792890111658, 0.7125546018800465),
 (7, 0.8601507141635526, 0.4229804838296001),
 (1, 0.9329805013131537, 0.19264078683267083),
 (8, 0.9867627627329273, 0.38384098978775394)]
# sorted by item[2]
[(9, 0.29156429346050816, 0.11962291780001866),
 (1, 0.9329805013131537, 0.19264078683267083),
 (8, 0.9867627627329273, 0.38384098978775394),
 (3, 0.31236313706487373, 0.4142487896581839),
 (7, 0.8601507141635526, 0.4229804838296001),
 (4, 0.11192161822024849, 0.510212695052502),
 (2, 0.2302850861994883, 0.5910166579153023),
 (6, 0.3869272875534193, 0.6990655249844869),
 (5, 0.6956792890111658, 0.7125546018800465),
 (0, 0.6334513519440723, 0.7831829839523177)]

What if we can pass a Symbol to sorted()?

With Symbol syntax (assume using $), we don’t need to import the operator module; the sorted() function could handle that for us. This would improve readability and make the code quicker to write.

from random import random
xs = [(i, random(), random()) for i in range(10)]
print(xs)

sorted_xs = sorted(xs, key=$1)
print(sorted_xs)

sorted_xs = sorted(xs, key=$2)
print(sorted_xs)

The pseudo implementation of sorted():

# pseudo code, just for demo, don't take this too seriously.
from operator import attrgetter, itemgetter
def sorted(..., key=..., ...):
    if isinstance(key, symbol):
        if isinstance(key.value, int):
            action_key = itemgetter(key.value)
        elif isinstance(key.value, str):
            action_key = attrgetter(key.value)
        ...
    else:
        action_key = key
    # from now, use action_key instead of key in the original implementation
    ...
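
The pseudo code above can be approximated in today's Python without new syntax. The following is only a rough sketch under stated assumptions: `Symbol` is a hypothetical stand-in class for the proposed type (so `Symbol(1)` plays the role of `$1`), and `symbol_sorted` is a hypothetical wrapper around the built-in sorted(), not a change to it.

```python
from dataclasses import dataclass
from operator import attrgetter, itemgetter


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for the proposed `symbol` type."""
    value: object


def symbol_sorted(iterable, *, key=None, reverse=False):
    """Like sorted(), but also accepts a Symbol as `key` (assumed behaviour)."""
    if isinstance(key, Symbol):
        if isinstance(key.value, int):
            # An integer symbol selects an item by index.
            key = itemgetter(key.value)
        elif isinstance(key.value, str):
            # A string symbol selects an attribute by name.
            key = attrgetter(key.value)
        else:
            raise TypeError(f"unsupported symbol value: {key.value!r}")
    # Anything that is not a Symbol is passed through unchanged.
    return sorted(iterable, key=key, reverse=reverse)


xs = [(0, 0.63, 0.78), (1, 0.93, 0.19), (2, 0.23, 0.59)]
by_first = symbol_sorted(xs, key=Symbol(1))   # stands in for key=$1
by_second = symbol_sorted(xs, key=Symbol(2))  # stands in for key=$2
```
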

Using symbols for existing functions with decorators

Transforming symbols to values is just a “before” job in aspect-oriented programming (AOP). We can use decorators to support symbols without modifying the original implementation.

# pseudo code, just for demo, don't take this too seriously.

# If the keyword parameter `key` is a symbol,
# this decorator calls the given transform function (here `transform_to_getter`)
# with the symbol to get a valid value for calling the original function.
@desymbolize_on($key, transform_to_getter)
def sorted(..., key=..., ...):
    # keep the original implementation
    ...

# Also support positional parameters using symbols with the decorator.
@desymbolize_on($2, transform_to_getter)
def sorted(..., key=..., ...):
    ...

def transform_to_getter(s: symbol):
    if isinstance(s, symbol):
        if isinstance(s.value, int):
            return itemgetter(s.value)
        elif isinstance(s.value, str):
            return attrgetter(s.value)
    # Not a symbol, we don't handle this parameter
    return s

The decorator @desymbolize_on in this example also takes a symbol, which identifies the positional or keyword parameter that should be passed through transform_to_getter().
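
For illustration, here is a runnable approximation of that decorator in current Python. It is a sketch, not the proposed implementation: `Symbol` is again a hypothetical stand-in class, the decorator's target is given as a plain string (keyword name) or a 0-based integer (positional index) because `$key` and `$2` are not valid syntax today, and `sort_records` is an invented example function.

```python
import functools
from dataclasses import dataclass
from operator import attrgetter, itemgetter


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for the proposed `symbol` type."""
    value: object


def transform_to_getter(s):
    """Turn a Symbol into an item/attribute getter; pass anything else through."""
    if isinstance(s, Symbol):
        if isinstance(s.value, int):
            return itemgetter(s.value)
        if isinstance(s.value, str):
            return attrgetter(s.value)
    return s


def desymbolize_on(target, transform):
    """Apply `transform` to one argument before the wrapped function runs.

    `target` is a keyword name (str) or a 0-based positional index (int).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if isinstance(target, str) and target in kwargs:
                kwargs[target] = transform(kwargs[target])
            elif isinstance(target, int) and target < len(args):
                args = args[:target] + (transform(args[target]),) + args[target + 1:]
            return func(*args, **kwargs)
        return wrapper
    return decorator


@desymbolize_on("key", transform_to_getter)
def sort_records(records, key=None):
    # The original implementation stays untouched.
    return sorted(records, key=key)


@desymbolize_on(1, transform_to_getter)
def sort_records_positional(records, key=None):
    return sorted(records, key=key)


records = [(0, 3), (1, 1), (2, 2)]
by_keyword = sort_records(records, key=Symbol(1))
by_position = sort_records_positional(records, Symbol(1))
```

Note that this sketch only transforms the argument when it is passed in the form the decorator was configured for (by keyword or by position); handling both at once would need signature inspection.
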

With decorators and symbols, we can do AOP jobs to make existing libs/modules easier to use together. Just write one weaver file that makes functions from one module support objects from other modules through symbols. AOP separates business logic from implementation detail. You don’t need to specify a concrete object/function and import specific modules in every .py file.

This makes code more maintainable. If you want to replace an existing module with another one, you just need to change code in the weaver file (read the symbols, transform them to concrete objects, and apply them to functions). Without AOP, you need to change every file that is coupled to the module.

Other benefits

Using symbols instead of strings (or other values) may have other benefits:

  • Editors can show symbols in a different color, making them easy to distinguish from keyword strings.
  • IDEs and type checkers (mypy) may support static analysis, code autocompletion, etc.

I’m certainly no expert like others here, but as a general word of caution, at this point in Python’s development, introducing new syntax (particularly a brand new operator symbol) into the language generally requires it to be useful to a broad cross-section of Python users, and provide a substantial compelling benefit that cannot easily be achieved with the existing language/standard library or implemented in a third-party package.

While the above outlined the proposed implementation, both the broad applicability and the compelling benefit over existing or alternative approaches were unclear, at least to me as an “average” Python developer. As such, it would be helpful to provide some specific examples of a variety of common use cases beyond just replacing operator.itemgetter when calling sorted. Furthermore, aside from avoiding an import, a modestly more concise representation, and the possibility of more specific syntax highlighter/type checker behavior, are there things you can do with the new syntax that simply aren’t possible with what we can do now, or could implement in a third-party library?

Unfortunately, without compelling answers to both of those, it’s hard to see a proposal that requires new syntax, new semantics, a new operator and a whole new symbol being given serious consideration.

One would not need a special symbol object to allow that.

If this behaviour was desirable, sorted could simply accept key=1 as a short-cut for using index 1 as the key.
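
To make that concrete, here is a minimal sketch of the shortcut as a wrapper function. `sorted_by` is a hypothetical name; the built-in sorted() does not accept an integer key today, so this only illustrates what the behaviour could look like.

```python
from operator import itemgetter


def sorted_by(iterable, key=None, reverse=False):
    """Like sorted(), but an integer `key` is treated as an index (assumed shortcut)."""
    if isinstance(key, int):
        key = itemgetter(key)
    return sorted(iterable, key=key, reverse=reverse)


xs = [(0, 0.6), (1, 0.9), (2, 0.2)]
sorted_xs = sorted_by(xs, key=1)
```
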

Alternatively, we can use the enum module to create named symbols like this.
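
As a rough illustration of the enum approach, the names below (`Column`, `sorted_by_column`) are invented for this example. Each enum member acts as a named symbol that editors and type checkers already understand.

```python
from enum import Enum
from operator import itemgetter


class Column(Enum):
    """Named symbols for the positions in each tuple (hypothetical layout)."""
    INDEX = 0
    FIRST = 1
    SECOND = 2


def sorted_by_column(rows, column: Column):
    # The enum member carries the index, so callers never pass a bare number.
    return sorted(rows, key=itemgetter(column.value))


xs = [(0, 0.63, 0.78), (1, 0.93, 0.19), (2, 0.23, 0.59)]
by_first = sorted_by_column(xs, Column.FIRST)
by_second = sorted_by_column(xs, Column.SECOND)
```
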

TypeScript has literal types and makes keyword strings and numbers analyzable without special $ symbol syntax.

And more relevantly, so does Python…
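
For example, typing.Literal lets a type checker such as mypy reject an unknown keyword string at analysis time. This is a small sketch with made-up names (`SortField`, `sort_people`); the runtime check is an extra I added, since Literal annotations are not enforced when the code runs. It assumes Python 3.9 or later for the `list[dict]` annotation.

```python
from typing import Literal, get_args

# Only these two strings are valid; a type checker flags anything else.
SortField = Literal["name", "age"]


def sort_people(people: list[dict], field: SortField) -> list[dict]:
    # Optional runtime guard, because annotations alone do nothing at runtime.
    if field not in get_args(SortField):
        raise ValueError(f"unknown field: {field!r}")
    return sorted(people, key=lambda person: person[field])


people = [{"name": "Bo", "age": 30}, {"name": "Al", "age": 41}]
by_name = sort_people(people, "name")
by_age = sort_people(people, "age")
```
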

I think I agree here. I prefer reading the initial code and I’d be very happy writing it. I find it makes more sense.

I wonder if it would be useful for sorted to call getitem when passed a non-callable in general then.