Thursday, March 13, 2025

ICU 77 Now Available!

Unicode® ICU 77 has just been released. ICU is the premier library for software internationalization, used by a wide array of companies and organizations to support the world's languages. It implements the latest versions of both the Unicode Standard and the Unicode locale data (CLDR).

ICU 77 updates to CLDR 47 locale data (see the beta blog post), with new locales and various additions and corrections.

ICU 77 is mostly focused on bug fixes, segmentation conformance, and other refinements.

The Java technology preview implementation of the CLDR MessageFormat 2.0 specification has been updated to incorporate the CLDR 46.1 spec plus most but not all of the CLDR 47 changes.

The C++ technology preview implementation of MessageFormat 2.0 is not yet quite up to date with CLDR 46.1.

Please note that for ICU 78 (October 2025) we are planning to (a) upgrade from Java 8 to Java 11, and (b) remove the ICU4J Locale Service Provider. See the ICU 77 page for details.

Unicode CLDR 47 Release: MessageFormat 2.0 Stable

CLDR 47 is now available and has been integrated into version 77 of ICU. The CLDR 47 release page has information on accessing the data, reviewing charts of the changes, and — importantly — Migration issues including upcoming changes planned in CLDR 48.

The Unicode CLDR project provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). For example, all major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

Key changes in CLDR 47

CLDR 47 did not have a Survey Tool submission phase, and focused on tooling and just a few functional areas. The biggest change is that the MessageFormat 2.0 specification has advanced from Final Candidate to Stable. This means that the stability guarantees are in place and implementations can finalize their APIs.

MessageFormat 2.0 Stable

Software needs to construct messages that incorporate various pieces of information. The complexities of the world's languages make this challenging. MessageFormat 2.0 enables developers and translators to create natural-sounding user interfaces that can appear in any language and support the needs of various cultures.

The new MessageFormat defines the data model, syntax, processing, and conformance requirements for the next generation of dynamic messages. It is intended for adoption by programming languages, software libraries, and software localization tooling. It enables the integration of internationalization APIs (such as date or number formats) and grammatical matching (such as plurals or genders). It is extensible, allowing software developers to create formatting or message selection logic that adds on to the core capabilities. Its data model provides the means of representing existing syntaxes, thus enabling gradual adoption by users of older formatting systems.
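To make the idea of "message selection" concrete, here is a minimal sketch in plain Python. It is not the MessageFormat 2.0 syntax or any ICU API: the function names, the selector registry, and the use of "*" as a catch-all key are illustrative assumptions. It only shows the general shape of choosing a message variant by plural category and then filling in placeholders, with a registry that lets developers plug in their own selection logic.

```python
from typing import Callable, Dict

# Hypothetical selector: maps a count to an English plural category.
# Real plural rules are locale-specific and come from CLDR data.
def english_plural(n: int) -> str:
    return "one" if n == 1 else "other"

# Hypothetical registry so that callers can add their own selectors.
SELECTORS: Dict[str, Callable[[int], str]] = {"plural": english_plural}

def format_message(variants: Dict[str, str], selector: str, **args) -> str:
    """Pick a variant using the selector's result for args['count'], then fill placeholders."""
    key = SELECTORS[selector](args["count"])
    # Fall back to the catch-all variant when no exact key matches.
    pattern = variants.get(key, variants["*"])
    return pattern.format(**args)

variants = {
    "one": "You have {count} new message.",
    "*": "You have {count} new messages.",
}

print(format_message(variants, "plural", count=1))  # You have 1 new message.
print(format_message(variants, "plural", count=5))  # You have 5 new messages.
```

A translator working in a language with more plural categories would supply more variants, and the selector for that language would return the matching category. The calling code would stay the same.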

Tech Preview implementations are available in C++, Java, and JavaScript:

ICU4J, Java: com.ibm.icu.message2, part of ICU 76, is a tech preview implementation of MessageFormat 2.0, together with a formatting API. See the ICU User Guide for examples and a quickstart guide, and Trying MF 2.0 Final Candidate to try a “Hello World”.
ICU4C, C++: icu::message2::MessageFormatter, part of ICU 76, is a tech preview implementation of MessageFormat 2.0, together with a formatting API. See the ICU User Guide for examples and a quickstart guide, and Trying MF 2.0 Final Candidate to try a “Hello World”.
JavaScript: messageformat 4.0 provides a formatter and conversion tools for the MessageFormat 2 syntax, together with a polyfill of the runtime API proposed for ECMA-402.

(Because of the timing, these implement a slightly earlier version of the spec, but can be used for initial evaluation, testing, and experimentation.)

Tooling changes

Many tooling changes are difficult to accommodate in a data-submission release, including performance work and UI improvements. The changes in CLDR 47 provide faster turn-around for linguists and higher data quality. They are targeted at the CLDR 48 submission period, starting in April 2025.

For more information

See the CLDR 47 release page, which has information on accessing the data, reviewing charts of the changes, and — importantly — Migration issues.

Support for the New Saudi Riyal Currency Symbol

In February of this year, the Saudi Central Bank (SAMA) announced the creation of a new symbol to represent the Saudi riyal currency. This was widely noticed by users, font developers and other vendors, and many are wondering how it should be supported. The Unicode Consortium has received a number of inquiries regarding this. In this blog post, we want to let you know about our plans for supporting the Saudi riyal sign, and provide other information to help vendors plan to support the new symbol.

When SAMA announced the new symbol, they also provided related pages with usage guidelines and FAQ information. The FAQ page gives information about the expected timeline for implementation:

It can be put into use immediately, but “reflection in financial and commercial transactions and various applications will be done gradually and in coordination with relevant entities.”

Allowance for gradual implementation is important since vendors need time to implement and deploy changes in their products and services.

Implementation Guidance

Vendor support for a new currency symbol can involve many different things, such as the following:

Updates to fonts
Updates to software keyboard layouts or new designs for physical keyboards
APIs for formatting currency values
Generation of financial statements and reports
Updates to applications, online services or devices for commercial transactions

However, all of these depend on first establishing how the new currency symbol will be represented in Unicode. This starts with receiving a proposal to encode the symbol in the Unicode Standard.

After consulting with representatives from the Unicode Technical Committee (UTC), SAMA has now submitted a proposal to UTC for encoding a new character, SAUDI RIYAL SIGN. UTC will be taking up this proposal at its next meeting, to be held April 22 – 24, 2025.

Next steps

It is anticipated that UTC will approve the new character for encoding in Unicode Version 17.0, which will be released in September 2025. The Unicode 17.0 Beta will be released for public review in early May, and we expect the Saudi riyal sign will be included there. Details related to the encoding (code point, name, property data) are unlikely to change after the Beta is released.

There is a small possibility that some changes could be made at the following UTC meeting in July, when technical details for Unicode Version 17.0 are finalized. Some vendors may choose to start working on implementations once the Beta is available, but vendors should not distribute product updates until after Unicode Version 17.0 is finalized.

Extending support with CLDR

Many implementations use Unicode CLDR data for currency formatting, so incorporating the new symbol is an important step for widespread support. CLDR 48 is slated to be released in October 2025, and would contain the new currency symbol character as an “alternative” currency symbol for the Saudi riyal.

The reason for making it an alternative rather than the default is to avoid displaying the symbol in contexts where fonts do not yet support it, which would cause users to see a missing-glyph placeholder instead of their currency symbol.

Later, when there is confidence that the symbol is more widely supported in fonts, a future CLDR version will change currency formatting to make the format with the Saudi riyal symbol the default, rather than an alternative.
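The fallback reasoning above can be sketched in a few lines of Python. This is an illustration only, not CLDR data or an ICU API: the function names are hypothetical, the "SAR" default and the simplified number pattern are assumptions, and the new sign is represented by a private-use placeholder character rather than its real code point, since the encoding details are not final.

```python
from typing import Optional, Set

# Stand-in for the new sign: a private-use character, NOT the real code point.
PLACEHOLDER_RIYAL_SIGN = "\uE000"

def choose_symbol(default: str, alternative: Optional[str],
                  font_coverage: Set[str], want_alternative: bool) -> str:
    """Return the alternative symbol only when it is requested and fully renderable."""
    if want_alternative and alternative and all(ch in font_coverage for ch in alternative):
        return alternative
    return default

def format_amount(amount: float, symbol: str) -> str:
    # Simplified English-style pattern; real locale patterns differ.
    return f"{symbol} {amount:,.2f}"

# Hypothetical font coverage sets: one without the new sign, one with it.
old_font = set("SAR0123456789., ")
new_font = old_font | {PLACEHOLDER_RIYAL_SIGN}

with_old = format_amount(1234.5, choose_symbol("SAR", PLACEHOLDER_RIYAL_SIGN, old_font, True))
with_new = format_amount(1234.5, choose_symbol("SAR", PLACEHOLDER_RIYAL_SIGN, new_font, True))

print(with_old)         # SAR 1,234.50
print(ascii(with_new))  # escaped so the output does not depend on terminal fonts
```

Keeping the existing symbol as the default means that applications which never opt in keep rendering correctly, while applications that know their fonts are up to date can request the alternative.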

People wishing to start using the new symbol in applications and services should anticipate that it could take several months or, in some cases, even years for vendors to implement and distribute product updates.

Working together to support the new Saudi riyal symbol

The introduction of the new Saudi riyal currency symbol marks a significant milestone for financial and commercial sectors in Saudi Arabia, and the Unicode Consortium is honored to help SAMA on this journey. We encourage stakeholders to participate in the public review for Unicode 17.0 Beta and to plan their implementations and adoption accordingly.

Thursday, March 6, 2025

Save the Date! Unicode Technology Workshop [November 11-13, 2025]

We are excited to announce that Microsoft will be hosting the 3rd annual Unicode Technology Workshop at its Silicon Valley campus!

📅 Dates: November 11-13

✈️ Nearest Airports: San Francisco International (SFO) or San Jose International (SJC)

Join us in person for three days of community building around the Unicode technology that makes software work for billions of people. Expect workshops, seminars, free-form discussions, and lightning talks centered around i18n libraries, locale data frameworks, globalization tooling, localization pipelines, input methods, and text rendering. Network with the developers and users to help shape the future of Unicode technology.

Expect to come away with deeper knowledge on how to solve tough problems in the i18n and l10n space and how to engineer products that work better for global users. To encourage maximum collaboration amongst the attendees, this is an in-person-only event.

New for 2025!

November 11th will be a pre-conference day with tutorials and training for those who want to learn new skills or refresh existing ones.

Become a UTW Sponsor

Sponsorship opportunities are also available, with discounts for Unicode organizational members.

What’s Next?

The call for submissions along with registration information will be available later this month. Space is limited, so please watch your inbox for further information.

If you have any questions in the meantime, please contact us at events@unicode.org.

Thursday, February 27, 2025

Unicode CLDR 47 Beta available for specification review: MessageFormat now Stable!

The Unicode CLDR 47 Beta is now available for specification review and integration testing. The release is planned for mid-March, but any feedback on the specification needs to be submitted well in advance of that date. The changes in the specification are available at Draft LDML Modifications.

The biggest change is that MessageFormat has advanced from Final Candidate to Stable. This means that the stability guarantees are in place and implementations can finalize their APIs. There are many planned changes for CLDR 48 — see the Migration section for a list of upcoming changes that will affect implementations.

The beta has already been integrated into the development version of ICU. We would especially appreciate feedback from ICU users and non-ICU consumers of CLDR data, and on Migration issues. Feedback can be filed at CLDR Tickets.

CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). For example, all major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

Via the online Survey Tool, contributors supply data for their languages — data that is widely used to support much of the world’s software. This data is also a factor in determining which languages are supported on mobile phones and computer operating systems. CLDR 47 did not have a Survey Tool submission phase, and instead focused on tooling and a few functional areas.

MessageFormat 2.0 Stable

Tech Preview implementations are available in C++, Java, and JavaScript:

ICU4J, Java: com.ibm.icu.message2, part of ICU 76, is a tech preview implementation of MessageFormat 2.0, together with a formatting API. See the ICU User Guide for examples and a quickstart guide, and Trying MF 2.0 Final Candidate to try a “Hello World”.
ICU4C, C++: icu::message2::MessageFormatter, part of ICU 76, is a tech preview implementation of MessageFormat 2.0, together with a formatting API. See the ICU User Guide for examples and a quickstart guide, and Trying MF 2.0 Final Candidate to try a “Hello World”.
JavaScript: messageformat 4.0 provides a formatter and conversion tools for the MessageFormat 2 syntax, together with a polyfill of the runtime API proposed for ECMA-402.

(Because of the timing, these implement a slightly earlier version of the spec, but can be used for initial evaluation, testing, and experimentation.)

Tooling changes

For more information

See the draft CLDR 47 release page, which has information on accessing the data, reviewing charts of the changes, and — importantly — Migration issues.

Friday, February 7, 2025

Unicode CLDR 47 Alpha Now Available for Testing

The Unicode CLDR 47 Alpha is now available for integration testing.

CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). For example, all major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

The alpha has already been integrated into the development version of ICU. We would especially appreciate feedback from non-ICU consumers of CLDR data and on Migration issues. Feedback can be filed at CLDR Tickets.

CLDR 47 focused on MessageFormat 2.0 and tooling for an expansion of DDL support. It was a closed cycle: locale data changes were limited to bug fixes and the addition of new locales, mostly regional variants.

RBNF improvements and Transforms

CLDR added Gujarati RBNF support, which provides number spell-out functionality, and made improvements to many other languages.
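For readers unfamiliar with number spell-out, the following is a toy Python sketch of the general idea: find the largest base value that fits, then recurse on the quotient and remainder. It does not use CLDR's RBNF rule syntax or data, it covers English only, and the function and table names are assumptions made for illustration.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
# Larger bases are tried first so the biggest applicable rule wins.
BASES = [(1_000_000, "million"), (1_000, "thousand"), (100, "hundred")]

def spell_out(n: int) -> str:
    """Spell out an integer in English; this toy version supports 0 to 999,999,999."""
    if not 0 <= n < 1_000_000_000:
        raise ValueError("supported range is 0 to 999,999,999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] if rest == 0 else f"{TENS[tens]}-{ONES[rest]}"
    for base, name in BASES:
        if n >= base:
            quotient, remainder = divmod(n, base)
            head = f"{spell_out(quotient)} {name}"
            return head if remainder == 0 else f"{head} {spell_out(remainder)}"
    raise AssertionError("unreachable: every n >= 100 matches a base")

print(spell_out(21))    # twenty-one
print(spell_out(342))   # three hundred forty-two
print(spell_out(1005))  # one thousand five
```

Actual rule sets are written per language and handle details this sketch ignores, such as ordinals, grammatical variants, and language-specific grouping.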

Transforms were also improved in both the CLDR 46.1 and 47 releases. The improvements included:

Adding a Hant-Latn transliterator
Aliasing Hans-Latn to Hani-Latn
Improvements to several other transliterators

More regional variants

Over the past few years there has been an increasing number of requests to add regional locales for languages, such as English, that are commonly used in a region as a lingua franca.

CLDR has been adding child locales to support these requests and has begun restructuring inheritance to allow for better maintenance of shared regional data, such as currency symbols and metazone names.

46.1 Improvements

CLDR 46.1 was a special interim release of CLDR that focused on MessageFormat 2.0. It included a few additional changes:

More explicit well-formedness and validity constraints for unit of measurement identifiers
Addition of derived emoji annotations that were missing: emoji with skin tones facing right
Fixes to make the ja, ko, yue, zh datetimeSkeletons useful for generating the standard patterns
Improved date/time test data

For more information, see 46.1 Changes

Tooling changes

Many tooling changes are difficult to accommodate in a data-submission release, including performance work and UI improvements. The changes in CLDR 47 provide faster turn-around for linguists and higher data quality. They are targeted at the CLDR 48 submission period, starting in April 2025.

For more information

See the draft CLDR v47 Release Note, which has information on accessing the data, reviewing charts of the changes, and — importantly — Migration issues.
