Probability of syntax objects being next to each other

Are there any statistics on how often particular Python syntax elements appear next to each other in code? For example, if I want to create a syntax highlighting scheme, I would work with the different Python syntax elements, and each should have a different color. If I base my design on the premise that colors should be easy to tell apart, then I cannot put hard-to-distinguish colors next to each other. Yet, looking at existing color palettes, I can see there are too many syntax elements for a small number of clearly different colors. That’s why I came to the idea described above. If I know which parts of the syntax tend to appear next to each other most often, I can give those parts colors that are easy to tell apart.

“Python objects”? Do you mean keywords like “for”, “in”, “def”, etc.? Or literals for text, ints, floats, etc.? If the former, you can look into the Python grammar. As for the latter…err…I don’t think anyone has ever thought of doing something like that O_o

Well, I don’t know what I mean because I don’t know how code editors recognize different parts of the syntax to colorize it.

It mostly depends on the language, but a syntax highlighter starts with tokenization (converting a string of text into tokens). You can think of tokens as words with a tag that says which part of Python’s grammar each word belongs to. Some highlighters also do a bit of parsing on the tokens to tag words more precisely (for example, to treat a soft keyword as a normal name depending on the context).
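To make that concrete, here is a rough sketch using the standard library's `tokenize` module. It tags each token with a coarse category and then counts which categories end up next to each other, which is roughly the statistic asked about above. The category names and the helper functions `classify`, `tag_tokens` and `adjacent_pair_counts` are made up for this example; real editors use their own, usually finer, categories.

```python
import io
import keyword
import token
import tokenize
from collections import Counter

# Token types with no visible text worth coloring (layout and bookkeeping).
SKIP = {tokenize.NL, tokenize.NEWLINE, tokenize.INDENT,
        tokenize.DEDENT, tokenize.ENDMARKER, tokenize.ENCODING}


def classify(tok):
    """Map a token to a coarse highlighting category (hypothetical names)."""
    if tok.type == tokenize.NAME:
        # The tokenizer reports keywords as plain NAME tokens,
        # so they have to be separated out by hand.
        return "keyword" if keyword.iskeyword(tok.string) else "name"
    return token.tok_name[tok.type].lower()


def tag_tokens(source):
    """Return (category, text) pairs for the visible tokens in source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(classify(t), t.string) for t in tokens if t.type not in SKIP]


def adjacent_pair_counts(source):
    """Count how often one category is directly followed by another.

    Line breaks are skipped, so a pair can span two lines.
    """
    categories = [category for category, _ in tag_tokens(source)]
    return Counter(zip(categories, categories[1:]))


if __name__ == "__main__":
    sample = "for x in range(3):\n    print(x)  # show\n"
    for pair, count in adjacent_pair_counts(sample).most_common():
        print(pair, count)
```

Running this over a large body of real code instead of the one-line sample would give an estimate of which categories most need contrasting colors. Note that soft keywords such as `match` come out as "name" here, because telling them apart needs the extra context-dependent parsing mentioned above.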

I think if you really want to write your own syntax highlighting, then you need to understand what I’ve written above. Without it you won’t get anywhere ¯\_(ツ)_/¯
