SitePoint Sponsor

User Tag List

Page 1 of 2 12 LastLast
Results 1 to 25 of 31
  1. #1
    SitePoint Enthusiast haaX's Avatar
    Join Date
    Jun 2004
    Location
    VardÝ, Norway
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    CMS - Multiple languages, DB Design

    Hi guys,

    I'm currently creating a CMS which should support multiple languages. However, I need some input on how to design such a database. What I've done is to design it as three tables (look attached relation diagram).



    The fulltxt in lang_translation is a boolean, either 0 or 1, and chooses what table to look in for the return string (the one table contains a text field for the text, the other contains a varchar(255) field). The tablename in lang_translation is a field I'm not quite sure of. It's there to identify which type of content it is, so I have an sort of identifyer upon creating a search engine. Again, I'm not quite sure of this one.

    Every field which contains text should only be refered to as an integer, based upon the tID from the lang_translation. Here is how a table with references to this table could look like:

    Code:
    CREATE TABLE content (
      conID INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
      catID INTEGER UNSIGNED NOT NULL,
      userID INTEGER UNSIGNED NOT NULL,
      author VARCHAR(64) NOT NULL,
      created DATETIME NOT NULL,
      publish SMALLINT(1) UNSIGNED NOT NULL,
      hide SMALLINT(1) UNSIGNED NOT NULL,
      title INTEGER UNSIGNED NOT NULL, ## Integer referencing to the tID
      shortTitle INTEGER UNSIGNED NOT NULL, ## Integer referencing to the tID
      fulltxt INTEGER UNSIGNED NOT NULL, ## Integer referencing to the tID
      PRIMARY KEY(conID),
      INDEX content_FKIndex1(conID),
      INDEX content_FKIndex2(catID)
    );
    ... maybe shuold the title, shortTitle and fulltxt also be foreign keys (indexes)?

    Here's is the interface of what a Translator-class could look like.

    PHP Code:
    /**
    Class Translator
      Used to translate senteces and fulltext    
    **/

    class Translator
    {
        public function 
    Translator()
        {
        }
        
        
    /* return string */
        
    public function get($lang$id$type)
        {
        }
        
        
    /* return boolean */
        
    public function update($tableName$type$id$lang$newString)
        {
        }
        
        
    /* return boolean */
        
    public function add($tableName$type$id$lang$string)
        {
        }
        
        
    /* return boolean */
        
    public function delete($tableName$type$id)
        {
        }

    Would this be appropiate and functional enough, even when I also tries to create a search engine for this? Any serious pullbacks on this when it comes to efficiency?

    Any comments would be welcome.

    Best regards,
    haaX

    __
    Attached Images Attached Images

  2. #2
    SitePoint Zealot
    Join Date
    Mar 2004
    Location
    Milano
    Posts
    127
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Why don't you drop your lang_fulltext table and you move tour TEX column to lang_sent?
    You can set that both TEX and SENTENCE accept null as value and use one of the two in each record. For retrieve the data you have not to know where it is stored simply doing:
    Code:
    SELECT lang, IF(ISNULL(tex), sentence, tex) AS translation FROM lang_sent
    I don't really understand the tablename column, is that for namespace conflicts?
    Last edited by ponticelli; Aug 31, 2004 at 08:26. Reason: Small typo in the code.

  3. #3
    SitePoint Enthusiast haaX's Avatar
    Join Date
    Jun 2004
    Location
    VardÝ, Norway
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ah cool. Tried to look up isnull in the mysql manual but I didnt figure out what multiple statements within it meant.. and what that if does.. But I understand how the result will be.. and will do so. Thanks!

    Tablename is.. uhm.. kinda reference to what section the text belongs to. I think it's useless. But now, when I do a search in this table, how do I know how to output the content? Or is that a basic problem in searching which is easily solved (haven't looked into searching.. yet).

  4. #4
    SitePoint Zealot
    Join Date
    Mar 2004
    Location
    Milano
    Posts
    127
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    There was a typo in the code... no multiple parameters in the ISNULL() function.

  5. #5
    SitePoint Zealot
    Join Date
    Mar 2004
    Location
    Milano
    Posts
    127
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    About the content I think you are mixing too much 2 different aspects. One thing is translating, the other is showing the result. I think that storing how to show the result in the translation row is not a good combination.
    One thing I should you have to consider if there are possible name conflicts. For example the word "pear" should have many meanings (the fruit or the php repository?) and namespaces may be of any help in those sistuations.

  6. #6
    SitePoint Addict
    Join Date
    Mar 2003
    Location
    Germany
    Posts
    216
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'm also pondering on i18n, and I decided to use a "context" or "namespace" column to avoid conflicts. For every translatable string I assign a namespace like "NAV_MENU" or "CONTACT_FORM". I think this can also help to quickly look up all strings for a certain area or page (although I don't know if this is relevant to your particular situation).

  7. #7
    SitePoint Enthusiast haaX's Avatar
    Join Date
    Jun 2004
    Location
    VardÝ, Norway
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ahh great ponicelli. The if-statment all makes sense now So I wouldn't need to consider searching for now? Hmm good, then table_name is gone.

    Concerning the possible conflicts; a context or a namespace column. I'm not quite sure if I get what you're trying to say.. but you define where the current translation can be found? I'm trying to put it all in context (in my mind).

    I think it's getting clearer. Although when will these namespaces be of use?

    Would probably have been good to have a relation between that table and another table; in this table we could store all possible namespaces. Then we refer to this table by an id (integer) in the namespace coulmn in the translation table.

    ..or?

  8. #8
    SitePoint Zealot
    Join Date
    Mar 2004
    Location
    Milano
    Posts
    127
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If you want to be more correct, maybe namespace should be called vocabulary. So you end up with a vocabulary table and a translation table.
    Attached Images Attached Images

  9. #9
    Non-Member
    Join Date
    Jan 2004
    Location
    Planet Earth
    Posts
    1,764
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Good topic, not yet had to ponder this task of variable languages myself, though I have put some thought into it, and was wondering if anyone has used an XSL stylesheet(s) for doing the actual translation, from English to Spanish for example ?

    This I think though would be more suitable for static texts, rather than database pulled data ?

    Any thoughts on this ?

  10. #10
    SitePoint Enthusiast haaX's Avatar
    Join Date
    Jun 2004
    Location
    VardÝ, Norway
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I used php and a switch statement for my static page :> Which was really messy and crappy, and not very suited. Thus, when I figured out I wanted to write an CMS, a mulit-lingual application was very natural. However, I don't know about XSL-stylesheets.

    However, I'm not quite sure if I figued out about this vocabulary thingie. Like, what different vocabulary should you have? As simple as each for each placement?

    Eg. all translations which are "standard" content should be 'CONTENT', page titles as 'TITLE', all information from the contact form as 'CONTACT_FORM', all menu translation as 'MENU' .... or should I be more specific? If not, I think I got it

  11. #11
    Non-Member
    Join Date
    Jan 2004
    Location
    Planet Earth
    Posts
    1,764
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Didn't quite understand the use of PHPs extension in reference to language internationalisation myself either, looking at one article it seamed to me a lot of extra work

    Though I'm looking over at www.xml.com for something on XSL-T and maybe if I find nothing there, then www.topxml.com should have something I'm hoping

    As for Vocabularies, you've lost me

  12. #12
    SitePoint Zealot
    Join Date
    Mar 2004
    Location
    Milano
    Posts
    127
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I think we are loosing a little the objective of this topic. Can you Haax make the point of what are your needings related to translations?
    Make it for points please...

  13. #13
    SitePoint Enthusiast haaX's Avatar
    Join Date
    Jun 2004
    Location
    VardÝ, Norway
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    The CMS is a normal CMS, with support for multiple languages. Therefore, absolut all text outputted on the site will be created in the db so all text can be edited from within the CMS (absolutly no static-text for the site).

    The CMS will have different sections (or modules if you'd like) where each section relates to different tables (eg. main content - all "normal" content on a site, links, calendar, news, files and images. Both files and images will have eg. a description field. These sections' tables will not have any text saved in it, only references to a specific row in the language tables.

    So, as far as I've come I don't actually need any more fields than those I have (having gone through it in my head, I don't think I need atleast). As far as I'm concerned (based upon experience, not theory) I doesn't need a 'namespace' field. However, I have never worked with making searches so I doesn't know what I need when making them..

    Therefore, I'm curious what I actually will need the namespace field for..? :-)

  14. #14
    Currently Occupied; Till Sunda Andrew-J2000's Avatar
    Join Date
    Aug 2001
    Location
    London
    Posts
    2,475
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Widow Maker
    Good topic, not yet had to ponder this task of variable languages myself, though I have put some thought into it, and was wondering if anyone has used an XSL stylesheet(s) for doing the actual translation, from English to Spanish for example ?

    This I think though would be more suitable for static texts, rather than database pulled data ?

    Any thoughts on this ?
    I would strongly suggest otherwise, I looked into the topic of i18n & l10n a while back. In order to effectively acheive g11n you need to use a combination of methods there are some standardized schemas in XML, such as XLIFF, TBX, DXLT etc. If your keen on that route, however it soon gets convoluted. I suggest you have a read over the Internationalized Components for Unicode (ICU).

    Depending on your approach to this, the simplest method is to simply use Gettext(), despite its overhead.

    [I18n & L10n]
    [Internationalization and Localization]
    [ICU]
    [Multi-Lingual with XSLT]

  15. #15
    Non-Member
    Join Date
    Jan 2004
    Location
    Planet Earth
    Posts
    1,764
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for your comments, will make me think some more about XSL before I consider using it

    Going to look over your links today.

  16. #16
    SitePoint Zealot
    Join Date
    Mar 2004
    Location
    Milano
    Posts
    127
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Haax: You don't need namespaces right now. Every translation is already clearly identified by an ID and there is no chance of name conflict.

  17. #17
    SitePoint Addict
    Join Date
    Mar 2003
    Location
    Germany
    Posts
    216
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You don't need namespaces right now.
    I agree.

    The problem is that i18n has many facets and not every situation is the same. I also think that in a real life situation, you will usually combine several approaches.

    @Andrew: Thanks for the links. Btw, looks like I will write my diploma thesis on XLIFF and other XML translation-related formats.

    Concerning the XSL approach, I think that's best suited for UI-like elements, like in a nav menu. It's not suited for full content pages. I think that's basically the same as having a db table with translateable strings (but the db solution might be faster?)

    If you have static pages, that's also fine, because with the right tools you can easily translate them. If the page changes, you just put it through your translation tool again. If only the structure of the page has changed (not the language content), this works even automatically (because you can store all your translations in a Translation Memory system).

  18. #18
    SitePoint Evangelist
    Join Date
    Dec 2003
    Location
    Arizona
    Posts
    411
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

  19. #19
    get into it! bigduke's Avatar
    Join Date
    May 2004
    Location
    Australia
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hmmm, I'm thinking in terms of using some translation web service but i think that'd be slow.
    I didn't know XSL could do translations as well, will have to look into this.

  20. #20
    SitePoint Enthusiast haaX's Avatar
    Join Date
    Jun 2004
    Location
    VardÝ, Norway
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Great thanks for all the answers guys... Just to be sure then ... I would have no need to even think about search issues when creating this table?

  21. #21
    Currently Occupied; Till Sunda Andrew-J2000's Avatar
    Join Date
    Aug 2001
    Location
    London
    Posts
    2,475
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Widow Maker
    Thanks for your comments, will make me think some more about XSL before I consider using it
    Hello,

    I'm not trying to put you off using XSL, I've always stated that you should be apprehensive towards the use of XML/XSL, as you end up digging yourself in a deeper hole than when you started unnecessarily. As Austin Hastings mentioned to me "I'd recommend you take a long look (it's impossible to take a short one :-( ) at the ICU library from IBM/Taligent.".

    Quote Originally Posted by mkrz
    The problem is that i18n has many facets and not every situation is the same. I also think that in a real life situation, you will usually combine several approaches.

    @Andrew: Thanks for the links. Btw, looks like I will write my diploma thesis on XLIFF and other XML translation-related formats.
    Exactly, once you read through the ICU documents, you'll realise how apparent that first statement will be and how convulted the topic of i18n gets. I've got a few more references for you to toy with, however I dont have all the links that I accumulated on this topic, anymore

    Use a similar method to this, an example would be docbooks-XSL.

    Quote Originally Posted by bigduke
    Hmmm, I'm thinking in terms of using some translation web service but i think that'd be slow.
    This could still be implemented, and effectvely; would come in handy for mediation.

    [W3C Internationalization (I18N) Activity]
    [ICU for Java / C]
    [i18nguy]
    [Unicode Technical Reports | LDML | CharMapML ]

  22. #22
    SitePoint Enthusiast haaX's Avatar
    Join Date
    Jun 2004
    Location
    VardÝ, Norway
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    And another question...

    Whenever I create a reference to this table within other tables -- should I make the fields with the reference to the translation a foregin key (index)?

  23. #23
    get into it! bigduke's Avatar
    Join Date
    May 2004
    Location
    Australia
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    well since I try to keep away from anything with 'X' in it (thus XML and XSL included), I was wondering if babelfish has an APi that could be exploited, Andrew. Havent seen it or heard about it so far.

  24. #24
    SitePoint Zealot
    Join Date
    Mar 2004
    Location
    Milano
    Posts
    127
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Haax: I think so, but your primary key on the translation table should be unique together with the language. For example:
    Code:
    id      lang   translation
    1       ita    "il mio nome Ť franco"
    1       eng    "my name is franco"
    1       ptg    "o meu nome ť franco"
    ...
    If you have a unique id for each row, you don't really know how to associate with other tables. Of course you can have both id and translation_id but I don't see a real convenience.

  25. #25
    SitePoint Enthusiast haaX's Avatar
    Join Date
    Jun 2004
    Location
    VardÝ, Norway
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thumbnail rule, always use a unique ID going for both ID and tID. The convenience might not be very obvious, but I think it's a question of structure to have a unique id for each row..


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •