SitePoint Sponsor

User Tag List

Results 1 to 3 of 3
  1. #1
    SitePoint Member
    Join Date
    Dec 2009
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    'Simply SQL' and proper element naming

    I have a question regarding the book "Simply SQL".

    On page xxv in the 3rd paragraph of the Acknowledgments section, Mr. Limeback tells us that his main reviewer (Joe Celko) was dissatisfied with Mr. Limeback's willful disregard of ISO-11179 in Mr. Limeback's choice of data element names, especially his choice of using element name "id" for many of his tables.

    I would like to know what Mr. Celko's view was on proper element naming. I know that when creating a single table, alone, calling a field "id" makes sense, but when you throw a bunch of tables together, I've seen the table "user" having an ID field called "user_id". But this gets annoyingly repetitive (though always straightforward) when it get's referenced as user.user_id but makes things obvious when it's referenced in another table (say contacts) when it's consistent, user.user_id, contracts.user_id as compared to user.id, contracts.user_id. If the "id" field in the "user" table is always called "user_id" there's no confusion, just tired fingers.

    Is this what Mr. Celko professes or something entirely different? I'm also curious about "correct" naming scheme beyond that of the field Mr. Limeback names "id". I'd hate to get into bad habits and have my current and future employers wasting time breaking me of these badly taught habits.

    Regarding Mr. Celko's use of having elements separated by a comma at the beginning of a line, it makes sense to me. You can add an element or remove an element anywhere and you won't have to jump to the end of a line to make sure the comma is (or isn't) there. It also is a visual cue that this line is a continuation of the previous. Though I will admit that it looks strange upon the first few viewings.

    -John

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,021
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    john, first of all, thanks for posting your question

    i'd like to go into some detail, so let's start with the "willful disregard of ISO-11179"

    you're right that the book's technical reviewer, joe celko, objected to my use of id versus entity_id as a column name

    specifically, he said that
    Vague generic things like "id" or "name" that are used on dozens of totally different data elements is really bad SQL and DB design
    first of all, it isn't "really bad" and i'm not sure i agree that it's even slightly bad

    but i felt i needed to acknowledge the difference in the preface, at least to joe, for whom i have great respect, because i wanted him to know that i did it on purpose, and not through negligence

    i'll explain my reasoning in a minute

    your desire not to develop bad habits is admirable, but i can assure you that this particular data naming issue isn't as serious as all that

    more likely is that a future employer will have an "in-house" naming convention, and you had better follow that instead

    as for whether my data element names are actually in violation of the ISO-11179 standard, feel free to read it yourself (there are resources here and here) but forgive me if i cannot bring myself to do so



    when i create a table with an autonumbering primary key (this would be an AUTO_INCREMENT in mysql, an IDENTITY in sql server, a SEQUENCE in oracle, and so on) then i quite happily just call it id

    this is first of all a signal to me that it is actually a surrogate key

    now, the foreign key which references an autonumbering primary key is then called entity_id

    for example, the customers table has an id, and the orders table has a customer_id

    i think this is logical and consistent

    some people say that both keys should be named identically, but even my simple book contains a good counter-example

    in the teams table, my primary key is called id, although some say it should be team_id, and if this were the end of it then i would say it's a wash

    but in the games table, there are two foreign keys to the teams table -- and they cannot both be given the same team_id name

    i suppose they might need to be called home_team_id and away_team_id but i have chosen to use just hometeam and awayteam as column names

    the point is that since they both refer to the id column in the teams table, they can't be called team_id, some adjustment to the names must be made, and i simply choose to make another adjustment in the name of the original column itself, because this is consistent with my naming convention for other columns, such as name and descr

    it seems joe and the standard think that having a column called name or descr in several different tables is wrong because a customer name is different from a team name which is different from a product name, and so on

    in one sense they aren't the same (because you don't see people called XL50 and you don't see products called Smith) but in another sense they are, because they both contain variable length strings of text

    so instead of name, i might have called it customer_name or product_name, in keeping with the pattern of using entity_id instead of id

    but in my opinion, this can be carried to ridiculous extremes

    compare the following table definitions --
    Code:
    CREATE TABLE entries
    ( id         INTEGER      NOT NULL  PRIMARY KEY
    , title      VARCHAR(99)  NOT NULL
    , created    TIMESTAMP    NOT NULL
    , updated    TIMESTAMP    NULL
    , category   VARCHAR(37)  NULL
    , content    TEXT         NULL
    );
    Code:
    CREATE TABLE entries
    ( entry_id         INTEGER      NOT NULL  PRIMARY KEY
    , entry_title      VARCHAR(99)  NOT NULL
    , entry_created    TIMESTAMP    NOT NULL
    , entry_updated    TIMESTAMP    NULL
    , entry_category   VARCHAR(37)  NULL
    , entry_content    TEXT         NULL
    );
    one of these is, to my way of thinking, silly

    finally, let's discuss the "leading commas" convention

    i developed this convention myself, independently of anyone else who may have done so, and in fact joe celko does not like it, saying
    commas at the front of a line is an old punch trick and it does not work
    well, it actually does work, and in my opinion it is better than the trailing comma convention, for some of the same reasons it was a good technique in punch card days

    for more on the leading commas convention, see this thread starting at post #25

    there is also an explanation of advantages of the leading commas convention in the book on page 51

    many people who learn of this convention see the benefits, and i'm glad that it makes sense to you too

    once again, thanks for posting your questions, and if anything is unclear or you want to ask for further information, please post again

    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    From space with love silver trophy
    SpacePhoenix's Avatar
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    4,909
    Mentioned
    96 Post(s)
    Tagged
    0 Thread(s)
    I find that the leading commas convention is easier to work, especially when playing about with a query adding and removing fields from SELECT clauses. Also when using it with queries copy and pasted into PHPMyAdmin for testing, PHPMyAdmin gives the line where the error is at and it does tally up with a line of the query, take the example of that post that Rudy links to:

    Code SQL:
    UPDATE
         tbl_board
    SET
        FIRSTNAME = '#form.FIRSTNAME#'
      , LASTNAME = '#form.LASTNAME#'
      , PROFILE = '#form.PROFILE#'
      , CONTACTEMAIL = '#form.CONTACTEMAIL#'
      , LIVE = '#form.LIVE#'
      <cfif PICFILE IS NOT "">, PICFILE = '#file.Serverfile#'</cfif>
    WHERE
         UID = #UID#

    Now lets say that you had missed a comma in the SET clause:

    Code SQL:
    UPDATE
         tbl_board
    SET
        FIRSTNAME = '#form.FIRSTNAME#'
      , LASTNAME = '#form.LASTNAME#'
      , PROFILE = '#form.PROFILE#'
        CONTACTEMAIL = '#form.CONTACTEMAIL#'
      , LIVE = '#form.LIVE#'
      <cfif PICFILE IS NOT "">, PICFILE = '#file.Serverfile#'</cfif>
    WHERE
         UID = #UID#

    Can you easily spot where the comma has been missed out from?
    Community Team Advisor
    Forum Guidelines: Posting FAQ Signatures FAQ Self Promotion FAQ
    Help the Mods: What's Fluff? Report Fluff/Spam to a Moderator


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •