The Unicode Blog: Unicode CLDR v40 Beta available for testing

Wednesday, October 6, 2021

Unicode CLDR v40 Beta available for testing

The Unicode CLDR v40 Beta is now available for testing. The beta has already been integrated into the development version of ICU. We would especially appreciate feedback from non-ICU consumers of CLDR data. Feedback can be filed at CLDR Tickets.

Beta means that the main data, charts, and specification are available for review, but the JSON data is not yet ready for review. Some data may change if showstopper bugs are found. The planned schedule is:

Oct 27 — Release

In CLDR v40, the focus is on:

Grammatical features (gender and case) for units of measurement in additional locales

In many languages, forming grammatical phrases requires dealing with grammatical gender and case. Without that, a phrase can sound as bad as "on top of 3 hours" instead of "in 3 hours".
Phase 1 (v39) of grammatical features included just 12 locales (da, de, es, fr, hi, it, nl, no, pl, pt, ru, sv).
Phase 2 (v40) has expanded the number of locales by 29 (am, ar, bn, ca, cs, el, fi, gu, he, hr, hu, hy, is, kn, lt, lv, ml, mr, nb, pa, ro, si, sk, sl, sr, ta, te, uk, ur), but for a more restricted number of units.
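To make the role of grammatical case concrete, here is a minimal Python sketch of selecting a unit pattern by plural category and case. The table `DAY_PATTERNS` and the helpers `plural_category`, `format_days`, and `in_days` are hypothetical names written by hand for this illustration; they are not the CLDR data format or any library's API, and the plural rule is deliberately simplified.

```python
# Hypothetical, hand-written patterns for the German unit "day".
# Real CLDR data covers many more units, cases, and locales.
DAY_PATTERNS = {
    ("one", "nominative"): "{0} Tag",
    ("other", "nominative"): "{0} Tage",
    ("one", "dative"): "{0} Tag",
    ("other", "dative"): "{0} Tagen",
}


def plural_category(n: int) -> str:
    """Simplified plural rule (assumption): 'one' for 1, otherwise 'other'."""
    return "one" if n == 1 else "other"


def format_days(n: int, case: str = "nominative") -> str:
    """Pick the pattern matching both the plural category and the case."""
    pattern = DAY_PATTERNS[(plural_category(n), case)]
    return pattern.format(n)


def in_days(n: int) -> str:
    """The German preposition 'in' (for time spans) takes the dative case."""
    return "in " + format_days(n, "dative")


print(format_days(3))  # standalone measurement, nominative
print(in_days(3))      # inside a phrase, dative
```

A formatter that only knew the nominative form would produce "in 3 Tage", which is the kind of ungrammatical output the case data is meant to prevent.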

Emoji v14 names and search keywords

These supply short names and search keywords for the new emoji, so that implementations can build on them to provide, for example, type-ahead in keyboards.
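As a rough illustration of how an implementation might build type-ahead on top of such names and keywords, the following Python sketch indexes a small sample and looks up emoji by prefix. The sample entries, `SAMPLE_ANNOTATIONS`, `build_index`, and `suggest` are assumptions made for this example: the keywords are placeholders rather than values copied from CLDR, and a real keyboard would load the locale's annotation data instead.

```python
from collections import defaultdict

# Illustrative sample only: keywords here are placeholders, not CLDR data.
SAMPLE_ANNOTATIONS = {
    "\U0001FAE0": {"name": "melting face", "keywords": ["melt", "liquid", "dissolve"]},
    "\U0001FAB8": {"name": "coral", "keywords": ["reef", "ocean"]},
    "\U0001FAAB": {"name": "low battery", "keywords": ["energy", "empty"]},
}


def build_index(annotations):
    """Map each lowercase search term (name word or keyword) to its emoji."""
    index = defaultdict(set)
    for emoji, entry in annotations.items():
        terms = entry["name"].split() + entry["keywords"]
        for term in terms:
            index[term.lower()].add(emoji)
    return index


def suggest(index, prefix):
    """Return emoji having at least one term that starts with the typed prefix."""
    prefix = prefix.lower()
    matches = set()
    for term, emojis in index.items():
        if term.startswith(prefix):
            matches |= emojis
    return sorted(matches)


index = build_index(SAMPLE_ANNOTATIONS)
print(suggest(index, "mel"))   # terms "melting" and "melt" both match
print(suggest(index, "reef"))
```

The linear scan over terms keeps the sketch short; a production keyboard would more likely use a trie or a sorted term list so that lookups stay fast as the vocabulary grows.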

Modernized Survey Tool front end

The Survey Tool is used to gather all the data for locales. Its outmoded JavaScript infrastructure, which was very difficult to enhance or even to fix bugs in, has been modernized.

Specification Improvements

Notably in the areas of Locale Identifiers, Dates, and Units of Measurement

There are many other changes: to find out more, see the draft CLDR v40 release page, which has information on accessing the data, reviewing charts of the changes, and necessary migration changes.

Unicode CLDR provides key building blocks for software supporting the world's languages. CLDR data is used by all major software systems (including all mobile phones) for their software internationalization and localization, adapting software to the conventions of different languages.

Over 144,000 characters are available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages
