Historically, the German eszett ‘ß’ existed only as a lowercase letter. However, an uppercase counterpart was “included in Unicode version 5.1.0 in April 2008 (U+1E9E ẞ LATIN CAPITAL LETTER SHARP S). The international standard associated with Unicode (UCS), [ISO/IEC 10646] was updated to reflect the addition on 24 June 2008. The capital letter was finally adopted as an option in standard German orthography in 2017. Since 2024, ⟨ẞ⟩ has been the preferred option for depicting the character in capital letters, with ⟨SS⟩ as a second option.” (ß - Wikipedia, last accessed on 4 May 2026). Given this, ‘ß’.upper() should in my opinion return ‘ẞ’ instead of ‘SS’.
Talk to the Unicode consortium about that; they define the rules for how characters get uppercased and lowercased. Have fun trying to convince them.
Doesn’t the Wikipedia article state that the international standard associated with Unicode was updated?
My reading of that paragraph is that it was added as a way of representing it, not that they changed what case conversion should do. But notably, Wikipedia isn’t the definition here, and it can certainly be wrong; what matters is the Unicode files themselves.
The change has been proposed in the document L2/25-223 and has already received feedback. I have learned that there is also a negative case pair stability policy that prevents the creation of new case pairs. (“ß” and “SS” do not form a case pair, since the lowercase form of the latter is obviously “ss”.)
Well, if it’s accepted, Python will eventually get the update with some version of unicodedata. But based on the feedback in that link, it looks like the result might be some other change, but will definitely NOT change the effect of upper(), lower(), or casefold().
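For anyone who wants to check what their own interpreter does today, here is a minimal sketch using only the standard library. It assumes nothing beyond `str` methods and the `unicodedata` module; the reported database version depends on the Python build you run it on.

```python
import unicodedata

# Which Unicode Character Database this interpreter was built against.
print("Unicode data version:", unicodedata.unidata_version)

lower_eszett = "\u00df"  # ß LATIN SMALL LETTER SHARP S
upper_eszett = "\u1e9e"  # ẞ LATIN CAPITAL LETTER SHARP S

# Full uppercasing of ß expands to two letters.
print(repr(lower_eszett.upper()))

# The capital form lowercases to ß, but ß does not uppercase back to it.
print(repr(upper_eszett.lower()))

# Both forms fold to the same string for caseless matching.
print(repr(lower_eszett.casefold()), repr(upper_eszett.casefold()))
```

So the capital letter already round-trips downwards, and caseless comparison treats both forms alike; only the upward mapping from ß is the point of contention.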
Yes. Creating new case pairs could have serious repercussions. Imagine a case insensitive file system [1] and you create two files that differ by just one character. Then a future Unicode version makes those two characters a case pair. Now those two files are, in fact, the SAME file, only they’re not. Have fun sorting that out! Technically that could happen any time a new case pair is added, but if the file system explicitly rejects any unassigned characters (either erroring out or replacing them with U+FFFD), it would be able to guard against that.
[1] Granted, that’s like saying to a Magic: The Gathering judge “Imagine I have Humility and Opalescence in play” - of course it’s going to result in wonkiness.
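To make the file system scenario concrete, here is a rough sketch of a toy case-insensitive directory. Everything in it is hypothetical (`CaseInsensitiveDir`, `fold_key`), and using `str.casefold()` as the comparison key is my assumption; real file systems use their own folding tables. The guard against unassigned code points (general category `Cn`) is the defence described above.

```python
import unicodedata


def fold_key(name: str) -> str:
    """Return the key used to compare names case-insensitively.

    Names containing unassigned code points (category "Cn") are rejected,
    so a later Unicode version cannot retroactively change their folding.
    """
    for ch in name:
        if unicodedata.category(ch) == "Cn":
            raise ValueError(f"unassigned code point U+{ord(ch):04X} in name")
    return name.casefold()


class CaseInsensitiveDir:
    """Toy directory: maps folded names to the spelling first used."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}

    def create(self, name: str) -> None:
        key = fold_key(name)
        if key in self._entries:
            raise FileExistsError(
                f"{name!r} collides with existing {self._entries[key]!r}"
            )
        self._entries[key] = name

    def names(self) -> list[str]:
        return list(self._entries.values())
```

With this sketch, creating `Straße.txt` and then `STRASSE.txt` fails on the second call because both fold to the same key. If the folding table changed between two runs, two names that were once distinct could start colliding, which is exactly the mess described above.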
I’d expect the German Unicode representative to be aware of the proposal and to block or approve it based on government policy. Maybe you can contact the representative and check what is going on.
Given that most German keyboards don’t have the “capital ß” character readily available, I don’t think usage is widespread yet.
Personally, I find the new character more than awkward when used with other capital letters, so I prefer to stick to the double S version - and yes, that’s still officially allowed.
Switzerland completely eliminated the sharp s from its character set, but Germany did not. Quite the opposite: it also introduced an uppercase version of it. I’m aware of the problems with backward compatibility; however, there will also be present and future cases in which you need to convert the lowercase sharp s to its capital form rather than to SS, and you will have trouble doing so. People will therefore resort to workarounds, which won’t be Pythonic.
For those cases, I think applications should make a conscious decision and then call '...ß...'.replace('ß', 'ẞ').upper() where necessary.
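Wrapped up as a helper, that suggestion could look like the sketch below. The function name `upper_with_capital_eszett` is made up for illustration; it only relies on `str.replace()` and `str.upper()`, and on U+1E9E already being uppercase so that `upper()` leaves it alone.

```python
def upper_with_capital_eszett(text: str) -> str:
    """Uppercase text, mapping ß to the capital ẞ instead of SS.

    The replacement has to happen before upper(), because upper()
    would otherwise expand every ß to the two letters SS.
    """
    return text.replace("\u00df", "\u1e9e").upper()


print(upper_with_capital_eszett("Straße"))  # capital eszett variant
print("Straße".upper())                     # default behaviour, for comparison
```

Keeping this as an explicit opt-in function means code that expects the default `SS` expansion is unaffected.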
The 3.14 str.upper doc says “The uppercasing algorithm used is described in section 3.13 ‘Default Case Folding’ of the Unicode Standard.” For 3.15, the link is updated from Unicode 16 to Unicode 17, as per our normal Unicode update policy.