Description
By design, str::to_lowercase and str::to_uppercase do not depend on the language of the text (which shouldn’t be assumed to be the same as the locale of the machine running the program).
Mostly, this means ignoring the conditional mappings in Unicode’s SpecialCasing.txt, with one exception: the Greek letter sigma is Σ in upper case and σ in lower case, except in word-final position, where the lower-case form is ς. The corresponding mapping in SpecialCasing.txt is:
# <code>; <lower>; <title>; <upper>; (<condition_list>;)? # <comment>
03A3; 03C2; 03A3; 03A3; Final_Sigma; # GREEK CAPITAL LETTER SIGMA
With Final_Sigma defined in the Unicode standard:
C is preceded by a sequence consisting of a cased letter and then zero or more case-ignorable characters, and C is not followed by a sequence consisting of zero or more case-ignorable characters and then a cased letter.
(cased letter and other terms have a precise definition given beforehand.)
Since char::to_lowercase doesn’t know context, I think it should just return σ for Σ. But str::to_lowercase does have context and could implement this conditional mapping.
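To make the proposal concrete, here is a rough sketch of what a context-aware lowercasing could look like. It is not the standard library's implementation: `lowercase_with_final_sigma`, `is_cased_approx`, and `is_case_ignorable_approx` are hypothetical names, and the two predicates are only crude stand-ins for Unicode's Cased and Case_Ignorable properties, which a real implementation would take from the Unicode data tables.

```rust
/// Rough stand-in for Unicode's `Cased` property.
/// Assumption: titlecase letters are ignored.
fn is_cased_approx(c: char) -> bool {
    c.is_lowercase() || c.is_uppercase()
}

/// Rough stand-in for `Case_Ignorable`.
/// Assumption: only a handful of punctuation characters, the soft hyphen,
/// and the basic combining diacritics block are treated as ignorable.
fn is_case_ignorable_approx(c: char) -> bool {
    matches!(
        c,
        '\'' | '.' | ':' | '\u{00AD}' | '\u{2019}' | '\u{0300}'..='\u{036F}'
    )
}

/// Hypothetical helper: lowercases `s`, mapping Σ to ς where the
/// Final_Sigma condition (as approximated above) holds, and to σ otherwise.
fn lowercase_with_final_sigma(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for (i, c) in s.char_indices() {
        if c == 'Σ' {
            // Nearest preceding character that is not case-ignorable must be cased.
            let cased_before = s[..i]
                .chars()
                .rev()
                .find(|&p| !is_case_ignorable_approx(p))
                .map_or(false, is_cased_approx);
            // Nearest following character that is not case-ignorable must NOT be cased.
            let cased_after = s[i + c.len_utf8()..]
                .chars()
                .find(|&n| !is_case_ignorable_approx(n))
                .map_or(false, is_cased_approx);
            out.push(if cased_before && !cased_after { 'ς' } else { 'σ' });
        } else {
            // Every other character uses the context-free mapping.
            out.extend(c.to_lowercase());
        }
    }
    out
}
```

With this sketch, `lowercase_with_final_sigma("ΟΔΥΣΣΕΥΣ")` gives `"οδυσσευς"`, while a lone `"Σ"` becomes `"σ"` because no cased letter precedes it. The per-character path stays context-free, which matches the suggestion that `char::to_lowercase` should keep returning σ.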