I’m working on a package called Tortuga, which is a turtle.py implementation that adds non-English function names. The idea is to eventually merge it into the standard library. For example, a color_fondo('rojo') call would do the same thing as bgcolor('red').
My question is: Do programmers writing code in non-English languages typically use accented letters in their function names? For example, would you more commonly see adiós() or adios()? Or should I just have both names?
THESE THREE THINGS ARE IMPORTANT:
First, to help the signal-to-noise ratio, I’d like to hear from people with actual experience rather than “I’ve never seen non-English code but here’s my guess.”
Second, I know that this is not best practice for I18N but that’s not relevant here. This is a project to help beginners (many of whom won’t become software engineers) learn to code by getting rid of the English barrier. Comments about gettext aren’t going to be helpful here.
Third, I know that most people in non-English-speaking countries still write code in English anyway. That’s not a relevant detail here and “they should write code in English” comments aren’t going to be helpful.
(native Spanish speaker with experience in education)
In my experience almost nobody uses accented letters in function or other object names, because it’s common for people to have English keyboards even if they speak Spanish (especially outside Spain).
So, even if some variables or functions are Spanish-named, like lista or empleados, I almost never saw something like mañana or marrón.
I also think unaccented letters are more natural in computer contexts. They also sidestep any encoding and normalization issues that may crop up (in Python, or in other tools).
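To illustrate the normalization issue mentioned above: the same accented word can be stored as different code point sequences that look identical on screen. Here is a minimal sketch using only the standard-library unicodedata module (the variable names are just for illustration):

```python
import unicodedata

composed = "ma\u00f1ana"      # 'ñ' stored as a single code point (NFC form)
decomposed = "man\u0303ana"   # 'n' followed by a combining tilde (NFD form)

# Both strings render as "mañana", but they are different Python strings.
same_raw = composed == decomposed

# After normalizing both to the same form, they compare equal.
same_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

print(len(composed), len(decomposed), same_raw, same_nfc)
```

Python normalizes identifiers in source code to NFKC when parsing, but strings passed to getattr() or used as dictionary keys are not normalized, and other tools (editors, shells, file systems) may behave differently.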
Perhaps you could design it so that accented and unaccented versions work the same way, and can be mixed within the same program.
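One possible way to do that, as a rough sketch rather than a recommendation for how Tortuga should be built: derive the unaccented name from the accented one and register both for the same function. The helper names strip_accents and add_ascii_aliases are made up for this example, and the namespace dict stands in for whatever the package really uses.

```python
import unicodedata

def strip_accents(name: str) -> str:
    """Drop combining marks, e.g. 'adiós' -> 'adios'."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def add_ascii_aliases(namespace: dict) -> None:
    """For every callable with an accented name, add an unaccented alias."""
    for name, obj in list(namespace.items()):
        if not callable(obj):
            continue
        alias = strip_accents(name)
        # Never overwrite a name that already exists.
        if alias != name and alias not in namespace:
            namespace[alias] = obj

def adiós():
    return "hasta luego"

# Hypothetical namespace; a real package might pass its module __dict__.
namespace = {"adiós": adiós}
add_ascii_aliases(namespace)
```

After this, namespace["adios"] and namespace["adiós"] refer to the same function, so both spellings can be mixed in one program. Note that this only handles characters that decompose into a base letter plus a combining mark; letters such as ß or the Turkish ı have no such decomposition and pass through unchanged.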
Another thing to note is that some languages expect something other than just removing the accent. In German, for example, the grammatically correct way to write grün in ASCII is gruen.
Indeed, in German you don’t replace ä, ö, ü with a, o, u; writing ae, oe, ue instead is acceptable and used to be very common. For example, LibreOffice had thousands of German comments in C++ source code, and they used ae, oe, ue. Example: // naechste Loesung uebernehmen instead of // nächste Lösung übernehmen. Checking old code I wrote as a child, I find a variable bloecke (blöcke, to count Tetris blocks) and a function zaehler (zähler, to display the score counter).
The German 2008 (Python 2.5.2) book “Python für Kids” uses def fuelle_dreieck(): ... (not fülle, and definitely not fulle) and def caesarcode() ... in a chapter “Cäsar-Code” without explanation, but it has a small info box on the topic near a variable laenge (länge). However, all these examples are from a time (and from programming languages) when ä/ö/ü were not even an option. I would not be surprised if newer Python 3 books for kids use ä/ö/ü.
A 2023 book “Programmieren lernen mit Python - So einfach!: Spielend lernen anhand von anschaulichen Bildern. Für Kinder und Erwachsene - ab 10 Jahre” uses function names like berechne_fallhöhe, zeige_lösung, öffnen, schließen, füllen, and variables like fläche, gruß, MENÜ, etc. (all mixed with English keywords and standard library names) and even mixed English/German variables like label_höhe, but filenames like glueckwunsch.py.
I have seen both. Especially for languages that are further removed from the Latin script, it becomes difficult to “just drop accents”. IMO if the goal is for this project to extend to other languages as well, then trying to use pure ASCII is a lost cause and you should bite the bullet early.
(I did write a large paragraph on a more nuanced statement than “they should write English instead” based on personal experience. But based on your sweeping dismissal I won’t post it here.)
In Finland, English is usually used for coding, but in general accented letters aren’t used for this sort of thing, such as in domain names. For example, Mehiläinen is at www.mehilainen.fi.
Unlike in Germany, accented letters are usually replaced with the non-accented “equivalent”: ä , ö , å → a , o , a.
Translating adiós, it would be nakemiin() and not näkemiin().
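The German and Finnish conventions described in this thread differ, so an ASCII name probably cannot be derived by one universal rule. Below is a small sketch of a per-language approach. The OVERRIDES table and the to_ascii function are invented for this example, and the entries only reflect what was said above (ae/oe/ue and ss for German, plain stripping for Finnish); real tables would need to come from native speakers.

```python
import unicodedata

# Hypothetical per-language replacements, applied before generic stripping.
OVERRIDES = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"},
    "fi": {},  # Finnish: ä, ö, å simply lose their marks
}

def to_ascii(name: str, lang: str) -> str:
    """Return an ASCII-friendly spelling of name for the given language."""
    table = OVERRIDES.get(lang, {})
    # Compose first so that 'u' + combining diaeresis matches 'ü' in the table.
    composed = unicodedata.normalize("NFC", name)
    replaced = "".join(table.get(ch, ch) for ch in composed)
    # Generic fallback: decompose and drop any remaining combining marks.
    decomposed = unicodedata.normalize("NFD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(to_ascii("grün", "de"))       # German convention
print(to_ascii("näkemiin", "fi"))   # Finnish convention
```

Languages without an entry in the table fall back to plain accent stripping, which matches the Spanish examples earlier in the thread.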
The Turkish keyboard layout requires modifier keys for commonly used characters like []. Because of that, it is common for developers to use the US English keyboard layout. However, that is not the case for beginners. I taught introductory Python to business students and they were quite enthusiastic about the possibility of using the characters that are not available in English (ç, ö, ü, ı, ğ, ş). It is also quite common for non-tech-savvy people to mistakenly use those characters while typing URLs, so they generally expect to be able to use them.
In my experience doing contract work with companies in other countries, it’s exceedingly rare to have non-English code even for non-international/small organizations. I saw that exactly once with a Portuguese company and even in that case their code was partly in English (and their database schemas were 100% in English).
I’ve seen government-adjacent software show tracebacks/error messages which revealed methods/database tables/other entities named in Polish (without accented characters).
I too struggled with the question of whether or not to use accents when creating Reeborg’s World many years ago. At the time, I made the choice of not using accents, and I no longer think that this was the right choice.
While this is a good observation and I think is representative of most other responses given here, I think that this is missing the point of the question which asks what would be suitable for teaching beginners who might never become professional programmers (my words).
Based on actual research in education instead of relying on anecdotes obtained from professional programmers
Currently supports 47 different languages
Does make use of accents on letters. For example (third tab of lesson 1), in English one can use echo, whereas the corresponding French keyword is réponds, with the accent.
Thanks to everyone for their comments. I think the most effective thing to do for my particular case is to have both functions with accents and functions without (I’ll call these the “ASCII version”). The benefit of having both is moderate and the cost is very low (for my particular case). Thank you!
More specific responses:
I have heard of Hedy! It’s the first programming education tool since Scratch I’ve seen where I thought, “Wow, this isn’t terrible.”
Also, I want to assure people that I know each language may have a different answer to my question. I’m working with native speakers for each language (and not just using machine translation). I also know, for example, that the German ß symbol translates to “ss” for an ASCII version of the word, and this is the kind of thing I’ll rely on native speakers to point out for each language.