File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
The moose likes OO, Patterns, UML and Refactoring and the fly likes Modelling translatable entities in a database Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Engineering » OO, Patterns, UML and Refactoring
Bookmark "Modelling translatable entities in a database" Watch "Modelling translatable entities in a database" New topic

Modelling translatable entities in a database

Paul Sturrock

Joined: Apr 14, 2004
Posts: 10336

This is just a fairly general question to canvas people's experience or opinions. We are weighing up the pros and cons of how to model multi-lingual data in a relational database. The "clean" design we reckon is to have a translation table for every entity (so "person" would also have "person_tr"), each translation table would mirror the translatable attributes (i.e. just character data) and be differentiated by locale. So for the entry in "person":

we would have the following in "person_tr":

for example.

This makes the data easy to maintain, however we are concerned that we have tables that might have quite a high number of translatable entities associated with them, so searches by a translated value become significantly more complex. We've got a variety of exotic ways of mitigating this, but I'm curious if anyone has already dealt with this problem or can suggest a better design?

Any input is most welcome.

JavaRanch FAQ HowToAskQuestionsOnJavaRanch
Jan Cumps

Joined: Dec 20, 2006
Posts: 2565

Oracle Apps (Oraccle eCommerce Suite) uses a similar design as the one you are suggesting.
To avoid that each report or form has to recreate the proper links between the core and translation tables, they supply a view that does that work for you.
The view uses a stored function in it's joins to look up the current logged on user's language, and will return the data in that language.

OCUP UML fundamental and ITIL foundation
Scott Selikoff
Saloon Keeper

Joined: Oct 23, 2005
Posts: 3753

Why multi-language in the database at all? Technically, its a facet of the UI. The person is a "secretary", regardless of the language or how its spelled. In other words, it mixes data and presentation layers, to have the database concerned about supporting multiple languages. If possible, I would leave the multi-language aspect in the control of whatever is outputting the data and use common i18n properties files as needed.

Is there an advantage to having the database know about different languages?

[OCA 8 Book] [Blog]
Roel De Nijs

Joined: Jul 19, 2004
Posts: 8393

Hi Paul,

The application of the client I work for has to support 2 languages: Dutch and French (because we live in Belgium ) So we have a similar issue like you, although in our case it's very limited. When we had to decide about the approach we would take we discussed 2 alternatives:
a) the one you mentioned
b) for each translatable attribute create 2 columns: one for dutch and one for french

We opted for the 1st alternative, because it's a lot easier to maintain (certainly if you have to add a 3rd language). So we have a language table with the possible languages and our person_tr table has columns person_id, language_id and a column for each translatable attribute.
Regarding searches it's not that complex, maybe also because we have built a framework ourselves which takes care of these things. Of course you have to join to the appropriate tables, but I would not consider joining as something complex.

Hope it helps!
Kind regards,

SCJA, SCJP (1.4 | 5.0 | 6.0), SCJD
Mark Spritzler

Joined: Feb 05, 2001
Posts: 17276

Yeah, I would rather have resource files for each language and add words to it and let the UI and UI framework use the correct resource file.

For the database, for me, the main point would be what is the easiest to add new words, new tables and/or new languages.


Perfect World Programming, LLC - iOS Apps
How to Ask Questions the Smart Way FAQ
Paul Sturrock

Joined: Apr 14, 2004
Posts: 10336

Thanks folks!

Jan - views are a distinct possibility, unfortunately we have quite a significant support headache for that. We support multiple versions of multiple databases, so pushing the logic into the database is not a popular choice.

Scott/Mark - the reasons is we need end users to be able to set this stuff up on the fly. Though I'd agree with your logic (and its how we deal with all static text data), we can't really push this maintenance on to our customers. My secretary is maybe a bad example, a more common example of what we need is a business description associated with an entity - so several hundred lines of changing text explaining a thing. We need to support customers adding these as new languages are supported by them, and updating their content frequently (in some cases many times in one day).

Roel - its good to hear your experience. Our primary diver was Canadian customers, so a very similar requirement as you probably have in Belgium. Our concern with the complexity of join is that we have some big entities that join to a lot of other entities (we have an awful legacy data model). We already mitigate performance issues by denormalizing some of these; we'd loose that if we used translation tables. So a query currently performing five joins might jump to being a query performing twenty five joins, which we are worried will be a significant determiner of the performance of the application. Needless to say these big entities are by far the most frequently queried things in the database.

Thanks again for you input. Its good to hear we are having the same sort of thoughts as you. Now we just need to make a decision...
I agree. Here's the link:
subject: Modelling translatable entities in a database
It's not a secret anymore!