Unicode CLDR 23.0 contains data for 215 languages and 227 territories, 654 locales in all. This release focused primarily on improvements to the LDML structure and tools, and on consistency of data. It includes substantially improved support for non-Gregorian calendars (such as the Japanese Imperial calendar, which is used extensively in Japan). The data and structure have also been modified to make it easy to switch between 12-hour and 24-hour formats, and between 2-digit and 4-digit years. The new Unicode character for the Turkish Lira (U+20BA ₺) is now used, and information is provided for currencies that round to 5 cents (or other subunits) in cash transactions. For most languages that use non-Latin scripts, characters in the language’s script now collate before those in other scripts (including A-Z). Language-specific letter-casing changes (Lower, Upper, Title) have been added for Azerbaijani, Greek, Lithuanian, and Turkish. Keyboard data has also been updated for Android. As of this release, the LDML specification is split into multiple parts, each focusing on a particular area.
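As a rough illustration of the cash-rounding information mentioned above, the following minimal Python sketch rounds an amount to the nearest cash increment. The `CASH_INCREMENT` table, the `round_cash` function, and the choice of half-up rounding are assumptions made for this example only; they are not the CLDR data format or a CLDR API, and real increments should be read from the CLDR supplemental data.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical lookup table for illustration only: the smallest cash
# increment per currency. Real values should come from CLDR data.
CASH_INCREMENT = {
    "CHF": Decimal("0.05"),  # assumed: cash amounts round to 5 centimes
    "USD": Decimal("0.01"),  # assumed: no special cash rounding
}


def round_cash(amount: Decimal, currency: str) -> Decimal:
    """Round amount to the nearest cash increment for the currency.

    Half-up rounding is an assumption of this sketch; an application
    may require a different rounding mode.
    """
    increment = CASH_INCREMENT.get(currency, Decimal("0.01"))
    # Count how many whole increments fit, rounding to the nearest one.
    steps = (amount / increment).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    # Convert back to a currency amount with two fraction digits.
    return (steps * increment).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for value in ("12.37", "12.38"):
        print(value, "->", round_cash(Decimal(value), "CHF"))
```

Working in `Decimal` rather than binary floating point keeps the division by the increment exact, which matters when an amount sits exactly halfway between two increments.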
The release had a short cycle so that we could move to the new regular semi-annual schedule. It therefore included only a limited data submission phase, covering 4 languages: Armenian (hy), Georgian (ka), Mongolian (mn), and Welsh (cy). For those languages, the amount of data more than doubled.