Paul Hastings
The process of making an application ready for global usage is globalization, or G11N (for the 11 letters between the "g" and the "n" in globalization). Globalization consists of two steps: internationalization, or I18N (for the 18 letters between the "i" and "n" in internationalization), and localization or L10N (for the 10 letters between "l" and "n" in localizationâif you're sensing a pattern here, yes there is, people working in this field are particularly fond of numeronyms). The atomic units for globalization are locales. Locales are the most important piece of G11N.
Locales are languages and calendars; date, number, and currency formatting; spelling; writing system direction; etc., that are specific to a geographic region. For instance, the English (color) and date formats (month/day/year) used in Brooklyn are not exactly the same as the English (colour) and date formats (day/month/year) used in Perth.
Since version 7, ColdFusion locales are based on core Java locales, which means the locale information concerning formatting, calendars, etc. are an integral part of the underlying Java platform and vetted by the Unicode Consortiumâyou do not have to research these yourself. Locale resources are accessed via locale identifiers in the form of language_Country, such as en_US for the English language used in the United States, or fr_CA for the French language as used in Canada. Take note of the case, as in all things related to core Java, it matters.
The process of I18N is the first step in making your application globalized. It consists of stripping away any hard-coded text and date, time and number formatting that tie it to a specific locale and replacing these with variables or functions that can dynamically return locale appropriate content or formatting. This allows for the application to be adapted to various locales without major engineering changes. Considering the amount of work this can involve, it is perhaps best done from the outset rather after the fact (though its quite common for applications to be undergo G11N well after development has completed).
Once an application has undergone I18N, the next step is making it ready for use in a given locale. This is done by translating all of the application's text into the various languages that your application will support as well as using appropriate locale-based functions to format dates, numbers, currency, etc. These function are usually prefixed by "LS" such as LSdatetimeFormat. You can find a reference for all ColdFusion functions and tags related to globalization here: Developing Globalized Applications
Note that during the translation process, locales need to be considered as well. It helps fine tune the content to a specific region.
Lets say you have an original web application dealing with appointments and had something like the following block of code:
writeoutput( "You have the following appointments for #dateFormat(now())#:" );
<ul>
for (i=1; i<=todaysAppointments.recordCount; i++) {
writeoutput( "<li>#appointmentWith#: #timeFormat(appointment)#</li>" );
}
</ul>
After I18N that would look like the following:
// note since around CF9, ColdFusion should automatically determine page encoding, therefore <cfprocessingdirective > is
// no longer required. however if you find ColdFusion is not picking up the correct page encoding you can force
// the issue using the following line of code at the tp of your ColdFusion page:
// pageencoding "UTF-8";
setLocale( session.locale );
writeoutput("#appointmentsRB[session.locale].appointmentsText #LsdateFormat(now(),'FULL')#: ');
<ul>
for (i=1; i<=todaysAppointments.recordCount; i++) {
writeoutput("<li>#appointmentWith#: #LstimeFormat(appointment,"MEDIUM")#</li>");
}
</ul>
where
Since your entire globalized application flows from a user's locale it is critical that this be captured and stored for use throughout the application. The easiest way of capturing a user's locale is simply to ask for it. For example, if your application requires registration, ask for it then. As a tip, it's usually a good idea to display locale choices in that language: French in French, Thai in Thai, etc.
If asking isn't possible, a stealthier approach is to examine the user's browser-provided data such as http_accept_language which is accessible via the CGI scope. You can also supplement this by examining the user's IP address and looking up their country from one of the many online geolocation services. Note that it's a good practice, however, to also provide a way for a user to manually change their locale.
It's important that the user's locale be persisted and used throughout the application. A session-scoped variable is often the best choice for this. The relevant ColdFusion function for handling locales are:
A character encoding is a map for each character in a language to a numeric code that can be represented in a computer. For a variety of reasons, there are often more than one encoding for a language. Japanese, for example, is represented by Shift-JIS, EUC-JP, and ISO-2022-JP encodings. Since each encoding is basically the same set of numbers, data and application content couldn't be dynamic. For instance, in a web application, each web page had to be encoded in that language's code page--developers therefore had to develop and maintain one page per encoding the application needed to support. Data for each encoded web page had to be stored in its own column in a table or its own table in a database. Prior to the creation of Unicode in 1987, developers always had to climb this encoding Tower of Babel, making globalized applications extremely costly and very limited in scope. Unicode allows for a single, standard encoding scheme for much of the world's languages. Since all modern databases now support it, it's the logical choice for character encoding for globalized applications. In short, just use Unicode.
A language's writing system dictates the way a language is written. There are basically three flavors:
Writing systems have an impact on an application's layout and display. It's especially important for RTL languages. Not only is the text RTL, but the whole application layout needs to be RTLâvisual attention is no longer in the upper-left corner but instead in the upper-right.
As mentioned above, one common way to handle localized text at run time is via resource bundles. All static text in the application is replaced by variables that hold that specific text translated into the different languages that need to be supported. A resource bundle can be a simple ColdFusion structure:
<cfscript>
appointments={};
appointments.en_US.greeting="Hello"; (American English)
appointments.fr_FR.greeting="Bonjour"; (French)
appointments.de_DE.greeting="Hallo"; (German)
appointments.sv_SE.greeting="Hallå" (Swedish)
appointments.th_TH.greeting="สวัสà¸à¸µ"; (Thai)
appointments.ja_JP.greeting="ãããã"; (Japanese)
</cfscript>
where the various locales in this structure could be accessed at run time by supplying a locale key (en_US, fr_FR, etc.). This approach, however, is very cumbersome to maintain, especially as the application grows in complexity or the number of locales it supports.
Date formats are particularly vexing to many developers. For instance, a date string of 1/2/2012 in en_US locale is January 2, 2012, while in en_AU it's 1 February 2012; that's quite a difference if you're making an appointment. To ensure that date and time formats are correct for a user's locale, use the following functions:
to format dates and times.
Note that it's usually a good idea to format dates and times using one of the regular masks, FULL, LONG, MEDIUM, or SHORT to ensure your application isn't breaking some cultural rule as well as to make sure the dates and times the application displays to the user can also be parsed back to a ColdFusion date-time object via LSParsedatetime function.
ColdFusion dates are based on the Gregorian calendar which will suffice for many locales. While this is an advanced topic, it's important to note that there are several other calendars in common use globally:
Time zones are another issue to many developers, especially if users are in one time zone while the ColdFusion server is in another. Again, this is an advanced topic but developers should take note of the following points:
Similar to date and time formats, it's important to display numbers and currencies in the locale's correct format. For example, 123456789.10 in en_US locale would be understandably displayed as 123,456,789 to Americans. The same number would be displayed as 123 456 789 in fr_FR locale and unfamiliar to most Americans. The relevant ColdFusion functions for formatting numbers and currencies are:
Note that the masks used in these function will map the dollar sign ($), decimal (.) and comma (,) to the appropriate locale symbols. Also note that the LScurrencyFormat function will not convert between currencies; it's not a function for handling exchange rates.
As mentioned above, modern databases are all Unicode-capable and shouldn't be an issue when considering which one to deploy. It's only important that any database-specific setup, data types, collations, etc. be followed to ensure your application's database is able to fully use Unicode and perform sorting in a locale specific way.
Way back in 2011 (somewhere between CF9 and CF10) Adobe steathily introduced the powerful International Components for Unicode for Java libary AKA ICU4J into Coldfusion's standard installation. Despite the lack of fanfare, this is a significant update to ColdFusion's G11N capabilities. Among other things it provides:
For further information see: