SitePoint Sponsor

User Tag List

Results 1 to 11 of 11
  1. #1
    cd /pub; more beer diacono's Avatar
    Join Date
    Dec 2000
    Location
    Horsham, West Sussex, UK
    Posts
    277
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    well i was speaking to a friend the other day about wether i should chose mysql or postgresql (i've just started with php)
    he told me that mysql wasnt really a database because it didnt support transactions...! what also worries him is tha fact that data can get curropted easily in mysql.

    so what do you think about this? is it an issue? gimme your thoughts...
    DIEGO MOORE

  2. #2
    Database Jedi MattR's Avatar
    Join Date
    Jan 2001
    Location
    buried in the database shell (Washington, DC)
    Posts
    1,107
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    That is certainly a legitimate concern, although I think transactional support is added into the latest version of MySQL. But Transactional support isn't a 'requirement' for a DBMS. The part that keeps MySQL from being a 'real' relational database management system is that it does NOT support foreign key constraints which is an integral part of a relational database.

    So, when they say "We're a true RDBMS" that's a lie. They're not.

    I think the second link in my profile has my annoyances with MySQL if you want to follow that.

  3. #3
    Irritability Defined
    Join Date
    Jul 2000
    Location
    80,000 feet below the surface
    Posts
    1,442
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This article from PHPBuilder addresses many of the issues raised. Very interesting reading, this.
    My 2 Cents (or is that 2.2 Cents including GST?)

  4. #4
    ********* Callithumpian silver trophy freakysid's Avatar
    Join Date
    Jun 2000
    Location
    Sydney, Australia
    Posts
    3,798
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by diacono
    well i was speaking to a friend the other day about wether i should chose mysql or postgresql.
    Choose for what purpose? To store your favourite recipes? Organise a BeeGees discography? Are you setting up an online saphire and ruby trading system for the Burmese government?

    The fact is that many (most?) sites using php/mysql are using it for database-driven content management. Transactions don't come into it.

  5. #5
    chown linux:users\ /world Hartmann's Avatar
    Join Date
    Aug 2000
    Location
    Houston, TX, USA
    Posts
    6,455
    Mentioned
    10 Post(s)
    Tagged
    0 Thread(s)
    You do not need transactions unless you plan to have your program make changes to the whole database at one time because a certain criteria was met. Transactions add overhead to PostgreSQL that slow down updates and inserts. If you are doing something simple, use MySQL. If you know you are going to need transactions, use PostgreSQL.

  6. #6
    Database Jedi MattR's Avatar
    Join Date
    Jan 2001
    Location
    buried in the database shell (Washington, DC)
    Posts
    1,107
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You really need to define what it is your application will do any how many people will be connected to it at one time. After you get a solid list of features and the like then can you really decide what database management system you want to use.

  7. #7
    SitePoint Member frankie54's Avatar
    Join Date
    Feb 2001
    Location
    NYC
    Posts
    19
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by MattR
    You really need to define what it is your application will do any how many people will be connected to it at one time. After you get a solid list of features and the like then can you really decide what database management system you want to use.
    It would be great for some of the more experienced heads on the board to share with us (newbeez) some of their real-life
    experiences with MySQL data corruption, in specific examples such as:

    - bulletin board / community page where all member's
    comments are stored in the database.

    - authentication / password storage and entry for login
    to site.

    - surveys/ polls and the likes.

    Obviously, it is probably reasonable to assume that someone who maintains a website would have a good working backup of the stuff
    that gets served dynamically but is not updated by users, which will probably be fairly easy to restore.
    On the other hand, what to do about the part of the data which might only be backed up once a day? The kind of data which
    is for the most part entered by daily users and changes all the time?

    What do webmasters do to prevent this kind of disaster scenario to take place?

    And isn't that what row-locking and COMMIT/ROLLBACK
    on higher-end products achieve?

    Would love to hear some real-life stories....

    Frankie
    stay groovy !

  8. #8
    Database Jedi MattR's Avatar
    Join Date
    Jan 2001
    Location
    buried in the database shell (Washington, DC)
    Posts
    1,107
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by frankie54


    It would be great for some of the more experienced heads on the board to share with us (newbeez) some of their real-life
    experiences with MySQL data corruption, in specific examples such as:

    - bulletin board / community page where all member's
    comments are stored in the database.

    - authentication / password storage and entry for login
    to site.

    - surveys/ polls and the likes.
    http://sitepointforums.com/showthrea...5&pagenumber=2

    Is where I discuss exactly why MySQL failed us and how Sybase overcame MySQL's limitations (or my implementation? Dunno but Sybase has 'fixed' them) in our message board type application.

    Originally posted by frankie54
    Obviously, it is probably reasonable to assume that someone who maintains a website would have a good working backup of the stuff
    that gets served dynamically but is not updated by users, which will probably be fairly easy to restore.
    On the other hand, what to do about the part of the data which might only be backed up once a day? The kind of data which
    is for the most part entered by daily users and changes all the time?
    Sybase does this incredibly easily.
    "1> DUMP DATABASE fanhomecom TO database_dump_dev
    2> GO"

    will make a binary dump of your database. Loading it in is as easy as LOAD DATABASE fanhomecom etc.

    The second half of your question, the incremental stage, is done by what is called a 'transaction log' in Sybase (I don’t think MySQL has this). It keeps track of who does what when. Every instruction that changes data (except DELETE / DROP TABLE, etc): INSERT and UPDATE primarily is written to the log. You can have that also flushed to a file / tape drive / network drive / etc. at specified intervals -- ether as data is written to the log, when the log reaches a specified size, or on specific time intervals.

    So, if your database totally dies you run two queries from the command line and you're back to your latest incremental backup. If you lose some information that was added since your last backup you only have to load the transaction log and you're good to go.

    Originally posted by frankie54

    What do webmasters do to prevent this kind of disaster scenario to take place?
    Not sure what you mean – if you require ‘high availability’ then you can get several hard disks in a RAID array so that if one goes down you can dynamically rebuild it or use a mirrored copy. If you don’t have RAID then if a disk dies you need to replace it and use your backups to recover.

    Originally posted by frankie54
    And isn't that what row-locking and COMMIT/ROLLBACK
    on higher-end products achieve?

    Would love to hear some real-life stories....

    Frankie
    We had MySQL die quite a bit with corrupted tables. I can’t quite put my finger on ‘why’ but I can only say that it happened when we had large numbers of users online. It seems to be fairly common with vB users (which says HORRIBLE things about either vB itself or MySQL, I tend to think MySQL is to blame) so it wasn’t an isolated case.

    Row-level locking isn’t the magic trick that we can all use to make our lives easier.

    Most database support three levels of locking from ‘most rows locked’ to least:
    Table
    Page
    Row

    Table level locking:
    I’d be hard pressed to recommend this to anyone who would expect more than one person to be writing to or reading from the table at any given moment.

    Pros:
    Simpler for the DBMS lock manager to figure out who is doing what (all or nothing basically)
    Probably a bit faster for reads (if no one is locking) because of the lock manager simplicity

    Cons:
    Table locking sucks if you have more than one person trying to write to the table at the same time. Basically you either have a deadlock situation (where the RDBMS picks one of them to fail) or a time-out situation where user B gets stuck waiting for user A’s transaction to complete.
    Depending on the implementation people who are reading the table could be stuck in a time out or deadlock situation as well.

    Page level locking:
    Pages are how you physically store data on the disk. Usually they are in 2K ‘chunks’ of information. This is a MAJOR step up from table level locking and usually is sufficient for most uses. What happens is that for exclusive page locks (table inserts / updates) you lock the page of information you’re writing to.

    Pros:
    Avoids almost all problems with contention that table level locking brings

    Cons:
    Could lock adjacent rows due to the fact that you could have several rows per page (if you have small row sizes, you specify many rows per page, or one row ends on a page that another one begins on)
    A little more complex than table level locking (busy tables could have hundreds of locks on them). However it’s like adding indexes to a table – write performance is degraded a tiny bit but the huge read performance gains make it well worth it to add indexes. Same with page and row level locking. - it makes the lock manager work more but allows for WAY more transactions per second.

    Row level locking:
    Row level locking works as you would expect: when you try and get an exclusive lock on the row the RDBMS will take a look at it and see if anyone else is using it. If not, it will lock that row and not permit anyone else to touch it while you’re working.

    Pros:
    Good for when you have tiny rows so that page locking would hit several unnecessarily.
    Have many concurrent writes / reads to the table in one place (known as a hotspot) – this would ‘fix’ the page level locking issue of grabbing several simultaneously.
    Highest (or is it lowest? Too tired to think that hard! level of granularity current available. I suppose someone could invent Row/Column locking where you can lock the particular COLUMN in a row that is being updated to so you can ‘UPDATE users SET password = 24 WHERE userid = 223’ and execute ‘SELECT username FROM users WHERE userid = 223’ at the same time. Actually I think most row-level locking schemes try and determine that – if you have a nonkey attribute that you’re updating and you have SELECT’s that are looking for it, you know that the SELECT values won’t be bothered by it as long as the SELECT statements do not fetch the columns modified nor do they sort / quantify (WHERE) using it. But that seems like quite a bit of work for it to do and would slow it down significantly more than the (supposed) performance gains.

    Cons:
    Yet again more locks to deal with from the lock manager’s perspective.

    Whew. That’s a lot of typing.. I’ll try and keep the COMMIT / ROLLBACK transact stuff a bit shorter.

    “What do transactions do for me?”
    Well, transactions help you keep your database from becoming inconsistent – that is if you’re executing several queries in one pass you want to make sure that they all go through – or none of them do. For example, in a message board system such as this one you have several interrelated tables… forum <- thread <- post <- user That is to say a user owns a post which lives in a thread which resides in a forum. Say you delete something in the post table. You now have to edit the thread table and the forum table. You even need to update the user table. Some simple queries would be something like:
    PHP Code:
    <?php
    // some pseudocode for php

    connect to the database( );

    // Let's say we want to delete a thread.

    // we need to:
    $query1 "DELETE post
                WHERE threadid = 
    $some_id";

    $query2 "DELETE thread
                WHERE threadid = 
    $some_id";

    // In vB's case we denormalize some things to get
    // forum and user post totals
    $query3 "UPDATE user
                  SET postcount = ( SELECT count( * )
                                      FROM post
                                     WHERE userid = 
    $some_userid )
                WHERE userid = 
    $some_userid";

    $query4 "UPDATE forum
                  SET threadcount = ( SELECT count( * )
                                        FROM thread
                                       WHERE forumid = 
    $some_forumid )
                WHERE forumid = 
    $some_forumid";
                
    // Now we're done!
    ?>
    I admit that’s a contrived and stupid example but you wanted some message board type examples.

    Anyway, what happens if query #2 fails? The thread still exists but the posts are gone. We have an orphan thread. If they were the other way around, that is delete the thread and then the posts, then we have many orphan posts which no longer have a home.

    With a transaction you can place them all in a ‘batch’ that is executed. If one query fails, the rest are ‘rolled back’ (using the transaction log to determine what happened) and you have your database just fine like it was before.

    If you were in a financial situation and you were transferring money from one account to another you could do something like this:
    Code:
    // I want to transfer $400 from my checking to my savings account.
    UPDATE savings_account
       SET cash = cash - 400
     WHERE userid = '234-55-2345' -- (social security number in this case)
     
    UPDATE checking_account
       SET cash = cash + 400
     WHERE userid = '234-55'2345'
    What happens if the second query fails? $400 of your money vanishes into the ether. Without a transaction log (as in MySQL) it would be your word against the database that you ‘really had $400 more in there!’

    If you slap that in a transaction if it dies the query is automagically reverted.

  9. #9
    cd /pub; more beer diacono's Avatar
    Join Date
    Dec 2000
    Location
    Horsham, West Sussex, UK
    Posts
    277
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by freakysid


    Choose for what purpose? To store your favourite recipes? Organise a BeeGees discography? Are you setting up an online saphire and ruby trading system for the Burmese government?

    The fact is that many (most?) sites using php/mysql are using it for database-driven content management. Transactions don't come into it.
    well i've started with php and my host (my mate who does it for free for me) developes with postgre. at the moment both are setup and running but they will eventually stick to postgre...
    so i think i might just aswell start everything with postgre (i could sitll play around with mysql in mu linux box though )
    DIEGO MOORE

  10. #10
    Database Jedi MattR's Avatar
    Join Date
    Jan 2001
    Location
    buried in the database shell (Washington, DC)
    Posts
    1,107
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by diacono
    well i've started with php and my host (my mate who does it for free for me) developes with postgre. at the moment both are setup and running but they will eventually stick to postgre...
    so i think i might just aswell start everything with postgre (i could sitll play around with mysql in mu linux box though )
    I'd say in this situation it would be wise to stick with Postgres because your friend / developer is fluent in it. Granted most SQL is transferrable but there's a learning curve to each DBMS and you may want to (ab) use your friend from time to time with PostGRES.

  11. #11
    SitePoint Member frankie54's Avatar
    Join Date
    Feb 2001
    Location
    NYC
    Posts
    19
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by MattR
    [B]

    http://sitepointforums.com/showthrea...5&pagenumber=2

    Is where I discuss exactly why MySQL failed us and how Sybase overcame MySQL's limitations (or my implementation? Dunno but Sybase has 'fixed' them) in our message board type application.
    Those were exactly some of the points I was talking about!!!
    A very large thanks to MattR for typing all this valuable info.

    If we could get down to some specific MySQL stories this would be great!

    For example:

    How many concurrent users updating the database at the same time can start messing things up?

    Also, if the ISP has one database server for several sites, how can we analyze how much traffic becomes critical to data integrity? (good luck ...)

    When does a simple MySQL website backend need to migrate to a
    more robust (but slower) app?

    Monday morning in the NYC snow questions.....

    Frankie
    Last edited by frankie54; Mar 5, 2001 at 08:38.
    stay groovy !


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •