Internationalization (i18n) in .NET MVC (resx)

How i18n works in .NET/MVC: If you are creating a .NET/MVC application with multilingual users, you will certainly want to localize your content. In this post we give an overview of how that can be achieved while developing in .NET.

.NET provides us with components to support i18n out of the box with the following features:

  • It can display contents in different languages.
  • It autodetects the language from the user’s browser.
  • It allows the user to override the language of their browser.

Internationalization consists of globalization, which means supporting multiple cultures, and localization, which means customizing your application for a given culture.


i18n Resource File Formats: iOS .strings files

Welcome to a new entry in our blog series covering localization (i18n) resource file formats. This time we look at an extremely popular resource file type.

.strings files are used in the Apple world (e.g. iPhone, iPad, …).

When handling .strings files (for your iOS projects, for example), lingohub expects that:

  • key-value pairs are delimited by the equals sign (=) and terminated by a semicolon (;)
  • keys and values are surrounded by double quotes (")
  • placeholders can look like: %.2f, %d, %1$s (regular expression for placeholders: /%[\d|\.]*\$*\d*[dsf]{1}\b+/)
  • comments start at the beginning of the line and span the whole line or multiple lines
  • single-line comments start with double slashes (//)
  • multi-line comments are enclosed in /* */
  • a comment is assigned to the next key-value pair unless there are any blank lines in between
  • on export, single-line comments are exported using single-line syntax, and multi-line comments using multi-line syntax
  • UTF-16LE encoding is expected by default, but users can set a different encoding
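
The expectations above can be sketched as a small Ruby reader. This is only an illustration under a few assumptions, not lingohub's actual parser: the content is assumed to be decoded already, the names Entry, parse_strings and placeholders are made up for this example, and the placeholder expression is a simplified one of our own rather than the expression quoted in the list.

```ruby
# Minimal sketch of a .strings reader, not lingohub's actual parser.
# Assumes the content has already been decoded (for example from UTF-16LE).
Entry = Struct.new(:key, :value, :comment)

BLANK = /\n[ \t]*\n/        # blank line: detaches comments collected so far
LINE  = %r{//([^\n]*)}      # single-line comment
BLOCK = %r{/\*(.*?)\*/}m    # multi-line comment
PAIR  = /"((?:\\.|[^"\\])*)"\s*=\s*"((?:\\.|[^"\\])*)"\s*;/m # "key" = "value";
TOKEN = Regexp.union(BLANK, LINE, BLOCK, PAIR)

# Simplified placeholder pattern (an assumption for this sketch):
# covers forms such as %d, %.2f and %1$s.
PLACEHOLDER = /%(?:\d+\$)?(?:\.\d+)?[dsf]/

def parse_strings(text)
  entries = []
  pending = [] # comments waiting for the next key-value pair
  text.scan(TOKEN) do |line, block, key, value|
    if key
      comment = pending.empty? ? nil : pending.join("\n")
      entries << Entry.new(key, value, comment)
      pending = []
    elsif line || block
      pending << (line || block).strip
    else
      pending = [] # blank line: earlier comments belong to nothing
    end
  end
  entries
end

def placeholders(value)
  value.scan(PLACEHOLDER)
end
```

Calling parse_strings on the decoded file content returns one Entry per key-value pair, with the comment set only when the comment lines directly precede the pair.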

.strings file example:

// this comment is ignored because it is not directly followed by a key-value pair

/* comments in .strings files 
can be multi line, 
single line */
// or combination of the two
"hello_user" = "Hello %1$s";

// keys and values can be spread to multiple lines
"welcome_message" = "Welcome back, 
we have missed you";

// multi line comments belong to the next key value pair
// as long as they are not interrupted by a white line
"visit_count" = "this is your %1$d visit to our site";


Integrating internationalization in a GIT workflow

Several of our customers asked us to support branches (from version control systems like Git or SVN) in lingohub. The reasoning is that i18n resources and their associations often belong to one or more branches, and you should be able to merge them (Git-like) with others. One of my beliefs is that a good product owner must be very skeptical of “feature requests” and instead look for the problems and/or the desired benefits behind such a request. In this case, a faster release cycle including translated text was of the essence, and it is in fact also very dear to lingohub’s mission. Once we narrowed that down, we focused on some of the circumstances and common practices that determine the agility of your release cycle:

  • Developers usually write the texts themselves (in English, their native language, or both).
  • Text changes during the development of a feature, hence a final version often exists only at the very end of a development cycle.
  • Different language versions are seldom tested during development.
  • Very often, no review process for the texts and other copywriting (in source and target languages) is in place.

Based on this feedback, we started working on tool support to allow managing a faster release cycle for your translation. Basically, our main goal is to minimize the time it takes to translate text *without* sacrificing the quality. We already released a couple of features for that purpose: our lingohub CLI client for scripted automation, and lingoChecks for verification – but this is just the start, there will be more announcements in the upcoming weeks. However, this blog post is about a working solution for today. What’s a best practice for integrating internationalization? Let me outline for you how we translate lingohub:

For every feature, we create a feature branch named after the issue ID (we use JIRA, so in our case lwe-xx). Finished features are merged into our master branch (all specs green); from there it takes another merge into the production branch to actually get released. Some teams leave out the production branch, but this is a typical Git workflow. Our i18n branch, however, is special: it is the branch that gets synchronized with lingohub. Every time someone pushes something to the i18n branch, lingohub checks for new or updated texts. Having a dedicated branch for internationalization allows us to decide when texts are considered finished. This can be at any stage of feature development (specs can be red, or the feature may be just a mock-up). It decouples the development cycle from the translation cycle. For every release (production branch), we then synchronize our resource files by script via lingohub’s REST API: deployment and retrieval of the texts is one step with one shell command.

Since a picture is worth a thousand words, here is a diagram describing the workflow (without the production branch, since it is not necessary):

[Diagram: Git workflow]

  1. Development happens in a feature branch, which is merged into the i18n branch once the texts are ready.
  2. Automatic synchronization with lingohub (we use lingohub’s GitHub integration, but you can also use git hooks).
  3. Feature development is finished and merged into master (or any other ‘production’ branch).
  4. Before the release of the feature, the texts are synchronized with lingohub.
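
For the git hook variant mentioned in step 2, a server-side post-receive hook could look roughly like the sketch below. Git passes one line per updated ref to such a hook on standard input, in the form "old-sha new-sha ref-name". The branch name i18n comes from the workflow above; the method names and the sync_translations placeholder are hypothetical, and no real lingohub call is shown.

```ruby
#!/usr/bin/env ruby
# Sketch of a post-receive hook that reacts to pushes on the i18n branch.

I18N_REF = "refs/heads/i18n" # branch name taken from the workflow above

# True when a stdin line ("<old-sha> <new-sha> <ref-name>") targets the i18n branch.
def i18n_push?(line)
  _old_sha, _new_sha, ref = line.split
  ref == I18N_REF
end

# Hypothetical placeholder: trigger your own synchronization here
# (for example a CLI call or an HTTP request).
def sync_translations
  puts "i18n branch updated, starting synchronization"
end

# Reads the hook input and synchronizes at most once per push.
def run_hook(input)
  pushed = input.each_line.any? { |line| i18n_push?(line) }
  sync_translations if pushed
  pushed
end

# In the hook file itself, finish with:
#   run_hook($stdin)
```

Keeping the branch check in its own method makes it easy to test the hook without a repository.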

This is the workflow we use for translating our texts. Since ‘eating your own dogfood’ is the best way to actually see what works and what does not, we know the strengths and weaknesses of this approach. What we really like about this workflow is how easily it can be integrated into any software development process. Nevertheless, there are still some steps and issues we think can be improved to help you: better feedback on translation texts, faster translations, communication integration and more checks.

This is just one way to use lingohub. With our API you can use the service as you like. Please check out our developer and API documentation. Don’t hesitate to contact me (helmut@lingohub.com) if you are using other approaches or if you have questions. We welcome your feedback and work especially closely with our customers on identifying other areas where we can make localization an even better experience for you.

 

Ruby: ensure_encoding to ensure your encoding

Consider the following situation: you receive a file, either from your drive or via HTTP, and you have no clue what encoding was used for its content. Nothing is stored with the content that indicates the encoding. Even scanning the file’s content doesn’t help to derive the encoding. It is just a series of bits and bytes. Every byte maps to a different character depending on the charset used. An ISO-8859-1 file may have the same byte sequence as a UTF-8 file. Only applying the correct encoding as the mapping leads to a meaningful sequence of characters.
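
A tiny illustration of that point, using only Ruby's standard String methods: the same two bytes are valid in both encodings, but they decode to different text. The byte values are simply an example we picked.

```ruby
bytes = "\xC3\xA9".b # two raw bytes, no meaning attached yet

as_utf8   = bytes.dup.force_encoding(Encoding::UTF_8)
as_latin1 = bytes.dup.force_encoding(Encoding::ISO_8859_1)

as_utf8.valid_encoding?   # true
as_latin1.valid_encoding? # true, so validity alone cannot tell them apart

as_utf8.length            # 1 character: "é"
as_latin1.length          # 2 characters: "Ã©"

# Converting the ISO-8859-1 reading to UTF-8 makes the difference visible.
latin1_as_text = as_latin1.encode(Encoding::UTF_8)
```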

You can consider yourself lucky if:

  • you know the file type, which can indicate the encoding
  • the actual encoding is stored inside the file’s content, for example in the encoding declaration of XML files (which of course can be wrong as well)

At lingohub we use file types to parse language resource files:

Unfortunately, this is the theory; in practice the file’s encoding can be anything. Even developers touching these resource files might not know which encoding is set in their editor. They never had to think about it. The code they edit just works fine, since it hardly includes any localized characters or is executed in an environment similar to the one it was written in.
But now imagine that you send your language resource files to your translator. The resource file then gets edited on a Windows system and is automatically saved in CP-1257 encoding. You don’t expect a translator with no technical background to be aware of the fact that your resource file parser actually expects UTF-16, do you?

At lingohub we had to find a solution for such a situation. Our import must handle *all* resource files regardless of the encoding used. As mentioned above, nothing indicates the encoding with 100% certainty; however, we found an approach that works quite well:

  1. Import the file in binary format.
  2. Apply an encoding to this byte sequence.
  3. If that fails, try the next encoding.
  4. Repeat until the conversion succeeds.
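
The steps above can be sketched with plain Ruby, without any gem. This is a minimal illustration and not the code of ensure_encoding or of our importer: the method name convert_with_fallback and the candidate list are assumptions made for the example. The order of the candidates matters, because a single-byte encoding such as ISO-8859-1 accepts any byte sequence and therefore has to come last.

```ruby
# Candidate encodings for this sketch, tried in the given order.
CANDIDATES = [Encoding::UTF_8, Encoding::ISO_8859_1].freeze

# raw is expected to be binary data, e.g. from File.binread(path).
def convert_with_fallback(raw, candidates = CANDIDATES, target = Encoding::UTF_8)
  candidates.each do |candidate|
    attempt = raw.dup.force_encoding(candidate)
    next unless attempt.valid_encoding? # bytes are impossible in this encoding

    begin
      return attempt.encode(target)
    rescue EncodingError
      next # conversion failed, try the next candidate
    end
  end
  raise ArgumentError, "none of the candidate encodings fit"
end
```

A call such as convert_with_fallback(File.binread(path)) then returns a UTF-8 String, or raises if no candidate fits.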

Because this is a common approach, there are several implementations out there. We have chosen ensure_encoding. It does a great job (even though it is labeled experimental).

After requiring this gem through

  • gem 'ensure-encoding', '0.1'

it adds the ensure_encoding() method to String. This gives you the power to convert a given String to your preferred encoding without actually knowing the input encoding.

Called with the ‘:external_encoding => :sniff’ option, it will try to convert a String to UTF-8 while ‘sniffing’ the actual input encoding.

Since it is not always possible to convert every character to the chosen target encoding, the :invalid_characters option controls how that situation is handled:

  • :transcode – will always try to convert characters
  • :raise – will raise an exception
  • :drop – will drop all non-convertible characters

The readme covers some scenarios, so you can choose the option that fits best.
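
To get a feeling for what such choices mean in practice, here is a sketch that uses only Ruby's built-in String#encode. It is not the gem's implementation and does not call the gem at all; it merely shows comparable behaviors (raising, dropping, substituting) for a character that the target encoding cannot represent. The sample text is made up.

```ruby
text = "price: 5 \u20AC" # the euro sign has no ISO-8859-1 equivalent

# Comparable to raising: encode fails on the first unconvertible character.
error = begin
  text.encode(Encoding::ISO_8859_1)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Comparable to dropping: unconvertible characters are replaced by nothing.
dropped = text.encode(Encoding::ISO_8859_1,
                      invalid: :replace, undef: :replace, replace: "")

# Substituting: for non-Unicode targets the default replacement is "?".
substituted = text.encode(Encoding::ISO_8859_1, undef: :replace)
```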

While the ‘:external_encoding => :sniff’ option works great for UTF-16 and UTF-8, it is not able to handle all the encodings we have to support for importing/exporting i18n resource files, so we have chosen to give ensure_encoding a hint about which encodings we expect.

Your customers prefer Native Language

When translating software, it is particularly important to pay attention to native language usage. Many linguistic services call this type of service localization or “L10n”. Mobile and Web-based programs represent some of the newest and most innovative software. However, in order to compete globally, software vendors need to internationalize their products. After all, writing a beautiful piece of code that is optimized only for native English-speaking users automatically removes over 50 percent of the software market from your potential customer base. Those for whom English is a second language have difficulty using software without a native language option. Localization is a necessary part of optimizing the User Experience. The “side effect” of doing so is maximizing profits.

Why is Localization Important vs. Basic Text Translation?

Localization takes colloquial terminology into account. Translation is often very literal. In fact, with “machine translation”, the results are so literal as to be unusable in many instances. In one example, when a company tried to use machine translation for a restaurant sign, the result was an error message (“Translate Server Error”). Because nobody at the company spoke the target language, the text of the error became the company’s translation. Even traditional translation returns results that confuse locals. For example, when translating “Like father, like son” to Chinese, a translator might go literal; however, the more correct phrase would re-translate from Chinese as “Tigers do not breed dogs.” An expert translator who specializes in software localization will use the most correct translation and not a literal word-for-word rendering from language to language. A good example of how important that can be, even within the same language, is using ‘flat’ rather than ‘apartment’ when marketing real estate management software in the UK. That one-word change can make the difference between a first page result on a search engine and a tenth page result.

Offer Software in Customers’ Native Languages for Optimum Sales

Maximizing sales potential is all about providing a user-friendly software package that offers value to the customer. If developers do not make the product available in the customers’ native language, those customers are unlikely to buy. Even skilled English speakers have trouble reading the language fluently, particularly if their native language uses a different alphabet. Customers want easy-to-use and intuitive products. Software localization is a necessary step to achieve that. If developers don’t rise to the challenge, they stand to lose out on a huge percentage of the market. English may make up 70 percent of the software market today, but with “developing nations” increasing their Internet penetration every day, that is likely to change in the coming years. Go native and deliver that great product your customers deserve.