Monday, December 7, 2009

Localization support, part 1. Theory

DataObjects.Net 3.x, the successful predecessor of DataObjects.Net 4.x, contained tons of useful features, and “Multilingual database support” was among them. Like every other feature in the 3.x branch, it was implemented at the very core of the ORM for a single reason: the product was architecturally monolithic, and all modules were tightly coupled to each other.

The design of 4.x promotes low coupling: the ORM consists of a set of separate modules that are mostly independent of each other. We are going to apply the same approach to the localization feature.

Requirements

Let’s list the requirements for the feature. What do we want from it?

  1. Simplicity and power in one place. The less a developer needs to do to add localization support to an application, the better. A declarative approach with minimal coding at the Domain modeling stage is the right one.
  2. Adequate performance. Using localization should not degrade performance.
  3. Automatic integration with the Thread.CurrentCulture and/or Thread.CurrentUICulture infrastructure.
  4. Possibility to add new culture(s) at runtime (optional). The fewer database schema changes are required to add a new culture, the better.
  5. Transparent LINQ support.
  6. The less the ORM knows about localization, the better. The ideal option is a standalone add-on built on top of the ORM.
  7. There should be a way to get not only the localization for the current culture of a particular entity, but the whole set of its localizations. This might be required for application administration (translation, adding new cultures and so on).

Database schema level implementation

While there are numerous ways to implement localization support at the Domain modeling level, there are only a few ways to do it at the database schema level. Personally, I see the following options:

1. Localized columns

[Diagram: LocalizableColumns]

Every localizable persistent field is mapped to a set of columns, one per culture. This approach was used in DataObjects.Net 3.x.

Pros:

  • No performance drawbacks. No additional queries, joins, or subqueries are required.
  • Simplicity and obviousness. It is easy to edit culture-dependent columns right in the database.
  • Data integrity out of the box, because all culture-dependent columns are stored in the same row as the entity itself.
  • Possibility to configure parameters of each localizable column (length, nullability, type).

Cons:

  • The ORM must know about localization in order to fetch from or persist to the required set of columns.
  • Database schema alteration is required in order to add a new culture to the application.
  • No way to retrieve column values for cultures other than the current one.
  • It is not clear how to cache localized data in the Session-level and Domain-level caches.
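
A minimal sketch of this layout, using Python's built-in sqlite3 module as a stand-in for the real database. It is not DataObjects.Net code: the Page table, the Title field, the Title_<culture> column naming and the helper functions are all assumptions made for illustration. It shows the single-row read (no joins) and the schema change needed to add a culture.

```python
import sqlite3

# Hypothetical culture list; table and column names are illustrative assumptions.
CULTURES = ["en", "ru"]

def column_for(field, culture):
    """Map a localizable field to its per-culture column, e.g. Title_en."""
    return f"{field}_{culture.replace('-', '_')}"

def create_page_table(conn, cultures):
    # One physical column per (field, culture) pair.
    cols = ", ".join(f"{column_for('Title', c)} TEXT" for c in cultures)
    conn.execute(f"CREATE TABLE Page (Id INTEGER PRIMARY KEY, {cols})")

def add_culture(conn, culture):
    # Adding a culture means altering the schema.
    conn.execute(f"ALTER TABLE Page ADD COLUMN {column_for('Title', culture)} TEXT")

def get_title(conn, page_id, culture):
    # Culture names come from a trusted list here, so building the column
    # name into the SQL text is acceptable for this sketch.
    row = conn.execute(
        f"SELECT {column_for('Title', culture)} FROM Page WHERE Id = ?",
        (page_id,),
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
create_page_table(conn, CULTURES)
conn.execute("INSERT INTO Page (Id, Title_en, Title_ru) VALUES (1, 'Home', 'Главная')")

# A new culture appears at runtime: schema change first, then data.
add_culture(conn, "de")
conn.execute("UPDATE Page SET Title_de = 'Startseite' WHERE Id = 1")
```

Note that the code reading the data has to choose the column by culture, which is the schema-level reason the ORM cannot stay unaware of localization in this option.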

2. Localized tables

[Diagram: LocalizableTables]

Localized columns are moved to separate tables, one per culture. This is analogous to the way .NET applications are localized, where localized strings and other resources are located in separate assemblies and are loaded automatically.

Pros:

  • Data integrity is provided by foreign key constraints with the ON DELETE CASCADE option.
  • Simplicity. All culture-dependent values for a particular entity are located in the corresponding table.
  • Possibility to configure parameters of each localizable column (length, nullability, type).

Cons:

  • A join operation is required in order to fetch the columns for the appropriate culture.
  • All the cons of the “Localized columns” approach.
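
A rough sketch of the per-culture tables, again with Python's sqlite3 standing in for the real database. The table names (Page, Page_en, Page_ru), the Url and Title columns and the load_page helper are assumptions for illustration only. It shows the join needed to read a localized value and the cascade delete that keeps the data consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Common, culture-independent part of the entity.
conn.execute("CREATE TABLE Page (Id INTEGER PRIMARY KEY, Url TEXT)")

# One table per culture (hypothetical naming: Page_<culture>).
for culture in ("en", "ru"):
    conn.execute(
        f"""CREATE TABLE Page_{culture} (
                Id INTEGER PRIMARY KEY REFERENCES Page(Id) ON DELETE CASCADE,
                Title TEXT NOT NULL)"""
    )

conn.execute("INSERT INTO Page VALUES (1, '/home')")
conn.execute("INSERT INTO Page_en VALUES (1, 'Home')")
conn.execute("INSERT INTO Page_ru VALUES (1, 'Главная')")

def load_page(conn, page_id, culture):
    # The culture picks the table to join with; a join is always needed.
    return conn.execute(
        f"""SELECT p.Url, l.Title
            FROM Page p JOIN Page_{culture} l ON l.Id = p.Id
            WHERE p.Id = ?""",
        (page_id,),
    ).fetchone()

ru_page = load_page(conn, 1, "ru")

# Deleting the entity removes its localized rows through the cascade.
conn.execute("DELETE FROM Page WHERE Id = 1")
remaining_en = conn.execute("SELECT COUNT(*) FROM Page_en").fetchone()[0]
```

Adding a culture in this layout means creating a new table, so the schema still has to change.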

3. Localized entities

[Diagram: LocalizedEntities]

A localizable entity is split into two parts: the common part, “Page”, and the localized part, “PageLocalization”. The second entity contains the localizable fields, and its primary key consists of two fields: a reference to the localizable entity and a string representation of CultureInfo (generally, CultureInfo.Name).

Pros:

  • The ORM doesn’t know anything about localization at all.
  • Database schema alteration is not required in order to add a new culture.
  • It is rather easy to fetch all translations for a particular entity with one query.
  • Standard ORM-level caching out of the box.
  • Data integrity is provided by foreign key constraints and the ORM.
  • Possibility to configure parameters of each localizable column (length, nullability, type).

Cons:

  • A join operation is required to fetch the set of columns for the corresponding culture.
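
The Page / PageLocalization split can be sketched as follows, with Python's sqlite3 standing in for the real database. The column names (PageId, CultureName, Title, Url) and the two helper functions are assumptions for illustration, not the actual DataObjects.Net model. The composite primary key follows the description above: a reference to the entity plus the culture name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Page (Id INTEGER PRIMARY KEY, Url TEXT);
CREATE TABLE PageLocalization (
    PageId      INTEGER NOT NULL REFERENCES Page(Id) ON DELETE CASCADE,
    CultureName TEXT    NOT NULL,          -- e.g. the value of CultureInfo.Name
    Title       TEXT    NOT NULL,
    PRIMARY KEY (PageId, CultureName)
);
""")

conn.execute("INSERT INTO Page VALUES (1, '/home')")
conn.executemany(
    "INSERT INTO PageLocalization VALUES (?, ?, ?)",
    [(1, "en", "Home"), (1, "ru", "Главная")],
)

def title_for(conn, page_id, culture):
    # Reading a localized field costs one join.
    row = conn.execute(
        """SELECT l.Title
           FROM Page p JOIN PageLocalization l ON l.PageId = p.Id
           WHERE p.Id = ? AND l.CultureName = ?""",
        (page_id, culture),
    ).fetchone()
    return row[0] if row else None

def all_titles(conn, page_id):
    # All translations of one entity come back from a single query.
    rows = conn.execute(
        "SELECT CultureName, Title FROM PageLocalization WHERE PageId = ?",
        (page_id,),
    )
    return dict(rows)

# Adding a culture is just an INSERT; the schema stays untouched.
conn.execute("INSERT INTO PageLocalization VALUES (1, 'de', 'Startseite')")
```

Because PageLocalization is an ordinary table with an ordinary key, nothing in the data access code needs to be localization-aware; the culture is just another filter value.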

4. Localized strings

[Diagram: LocalizedStrings]

The weirdest one. It could only have been invented by some geek for experimental purposes. Anyway, let’s investigate it.

Pros:

  • Database schema alteration is not required in order to add a new culture.
  • It is rather easy to fetch all translations for a particular entity with one query.
  • Data integrity is provided by foreign key constraints.

Cons:

  • The ORM must know about localization in order to fetch from or persist to the required set of columns.
  • It is not clear how to cache localized data in the Session-level and Domain-level caches.
  • A separate query is required to fetch localized data for a particular entity.
  • Parameters of each localizable column (length, nullability, type) can’t be configured separately. The “Strings.Value” column is used for all localizable fields (strings, integer types, dates and so on), hence it has to use the maximal value size with no constraints.
  • It is not obvious how to handle localizable field renaming.
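
Only the “Strings.Value” column is named in the text, so the sketch below guesses at the rest of the layout. The PageId, FieldName and CultureName columns, the composite key and the helper functions are all assumptions, and Python's sqlite3 stands in for the real database. It illustrates why every value ends up as untyped text, why a separate query is needed, and why renaming a field becomes a data update.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Page (Id INTEGER PRIMARY KEY, Url TEXT);
-- Assumed layout: one shared table for every localizable field.
CREATE TABLE Strings (
    PageId      INTEGER NOT NULL REFERENCES Page(Id) ON DELETE CASCADE,
    FieldName   TEXT    NOT NULL,
    CultureName TEXT    NOT NULL,
    Value       TEXT,                      -- one untyped column for all values
    PRIMARY KEY (PageId, FieldName, CultureName)
);
""")
conn.execute("INSERT INTO Page VALUES (1, '/home')")

def set_value(conn, page_id, field, culture, value):
    # Every value, whatever its type, is flattened to text.
    conn.execute(
        "INSERT OR REPLACE INTO Strings VALUES (?, ?, ?, ?)",
        (page_id, field, culture, str(value)),
    )

def localized_fields(conn, page_id, culture):
    # A separate query, on top of the one that loads the Page row itself.
    rows = conn.execute(
        "SELECT FieldName, Value FROM Strings WHERE PageId = ? AND CultureName = ?",
        (page_id, culture),
    )
    return dict(rows)

set_value(conn, 1, "Title", "en", "Home")
set_value(conn, 1, "PublishedOn", "en", date(2009, 12, 7))
before_rename = localized_fields(conn, 1, "en")

# Renaming a field is a data migration, not a schema change.
conn.execute("UPDATE Strings SET FieldName = 'Caption' WHERE FieldName = 'Title'")
after_rename = localized_fields(conn, 1, "en")
```

In this sketch the date comes back as the string "2009-12-07", so converting values back to their real types is left to the caller.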

Evaluating these options should definitely help us choose the most appropriate one for adding localization support to DataObjects.Net 4.1.

Which option would you choose for the ORM if it were up to you?

11 comments:

  1. Option 3 (Localized entities). We already have a similar system in our framework (currently without DO), and it seems to us the best option we could select.

  2. Option 3 seems the best option! And an UpgradeHint helper would be handy (to convert a string column into a localized string column and from localized to single column)

  3. Hello, guys!
    Thanks for your choice and comments. We consider the 3rd option the most valuable, although there might be some problems with implementing database-based full-text search.

    And special thanks to Marco for the hint about Upgrade hint. ;)

  4. It seems that the first case (Localized columns) can also be implemented without modifying DataObjects.Net code. We can introduce a new attribute like [LocalizedField] and add all localized columns at the domain model definition step. It has one major advantage: such a database structure can be used with a native full-text search engine.

  5. This comment has been removed by the author.

  6. I would try to implement the first option as a separate module. If you manage it, then your architecture is truly modular and loosely coupled. It would be like a test.

    Option 3 can be implemented by users without great effort, so I would not spend time on it while there are important features which users cannot implement themselves (like caching or offline support).

    By the way, if truth be told, it is surprising that you decided to implement this feature now.

  7. Hello Alex,

    Both options 1 and 3 can be implemented as standalone modules (or samples, as you like). I'm going to publish a solution for the 3rd option, and Alex Kofman might do the same for the first one, as he strongly prefers that approach.
    Actually, the DO team is probing various patterns for how localization can be done at all. The final decision depends on the upcoming implementation of the full-text feature.

  8. I'd like to see and use option 3. When can it come to the nightly build? ;-)

  9. Hello Peter,

    I'm going to publish the solution tomorrow. It is implemented as a standalone module and will be included in the DataObjects.Net 4 Samples shortly.

    Stay tuned =)

  10. I just had a look at your sample in the Xtensive.Storages.Samples.. Nice, and it should work well, but.. how about querying? It should be possible to use Where Page.Title == x (using the current culture)

  11. Hello Marco,
    Glad to hear that =)

    I'll discuss the querying shortly, in one of the upcoming posts.
