SitePoint Sponsor

User Tag List

Results 1 to 5 of 5
  1. #1
    SitePoint Guru
    Join Date
    Feb 2007
    Posts
    874
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Mysql data partioning and trimming

    Our main table, tblLink, will have its primary key as foreign in a few other tables too and one of the tables will be tblMainData which over time will be millions of rows.

    We would like to maintain only 3 months of data based on the insert date in tblLink.

    The rest we would like to purge and keep to generate history reports. What is the best mechanism to go about in this scenario? Below is how our table structures looks like.

    Code:
    CREATE TABLE IF NOT EXISTS `tblLink` (
      `linkID` int(5) NOT NULL,
      `compID` int(5) NOT NULL,
      `vehicleID` int(5) NOT NULL,  
      `deviceID` int(5) NOT NULL,
      `locationFromID` int(5) NOT NULL,
      `locationToID` int(5) NOT NULL,
      `employeeIDInsert` int(5) NOT NULL,
      `dateTimeInsert` datetime NOT NULL,
      `dateTimeStartJourney` datetime NOT NULL,
      `dateTimeEnd` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
      `dateTimeEndJourney` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',   
      `linkStatus` enum('a','d','e','m') NOT NULL,
      PRIMARY KEY (`linkID`)  
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
    
    CREATE TABLE IF NOT EXISTS `tblEmailLog` (
      `emailLogID` int(11) NOT NULL AUTO_INCREMENT,
      `compID` smallint(6) NOT NULL,
      `linkID` int(11) NOT NULL DEFAULT '0',
      `userID` smallint(6) NOT NULL,
      `alertCodeID` tinyint(4) NOT NULL,
      `eventAlertID` int(11) NOT NULL,
      `userEmail` varchar(100) NOT NULL,
      `alertDateTime` datetime NOT NULL,
      `alertInsertDateTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      PRIMARY KEY (`emailLogID`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
    
    
    CREATE TABLE IF NOT EXISTS `tblEventAlert` (
      `eventAlertID` int(11) NOT NULL AUTO_INCREMENT,
      `compID` int(5) NOT NULL,
      `mainDataID` int(5) NOT NULL,
      `vehicleID` int(5) NOT NULL,
      `eventAlertSentEmail` varchar(50) DEFAULT NULL,
      `eventAlertMessage` varchar(255) NOT NULL,
      PRIMARY KEY (`eventAlertID`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
    
    
    
    CREATE TABLE IF NOT EXISTS `tblMainData` (
      `mainDataID` int(11) NOT NULL AUTO_INCREMENT,
      `linkID` int(5) NOT NULL,
      `header` varchar(3) NOT NULL,
      `deviceSerialNumber` varchar(20) NOT NULL,
      `latitude` double NOT NULL,
      `longitude` double NOT NULL,
      `speed` float NOT NULL,
      `course` int(3) NOT NULL,
      `dateTimer` datetime NOT NULL,
      `gpsDateTime` datetime NOT NULL,
      `insertDateTime` datetime NOT NULL,
      `odoMeter` float NOT NULL DEFAULT '0',
      `driverID` int(5) NOT NULL, 
      `eventAlertID` int(11) NOT NULL DEFAULT '0',
      `mainDataInsertDateTime` datetime NOT NULL,
      `gpsString` varchar(450) NOT NULL,
      PRIMARY KEY (`mainDataID`),
      KEY `dateTime` (`dateTimer`),
      KEY `linkID` (`linkID`),  
      KEY `eventAlertID` (`eventAlertID`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
    
    
    CREATE TABLE IF NOT EXISTS `tblSubData` (
      `subDataID` int(11) NOT NULL AUTO_INCREMENT,
      `mainDataID` int(11) NOT NULL,
      `linkID` int(11) NOT NULL,
      `eventAlertID` int(11) NOT NULL,
      `deviceSerialNumber` varchar(20) NOT NULL,
      `subdeviceSerialNumber` varchar(20) NOT NULL,
      `dateTimer` datetime NOT NULL,
      `eventType` varchar(2) NOT NULL
      PRIMARY KEY (`subDataID`),
      KEY `mainDataID` (`mainDataID`),
      KEY `linkID` (`linkID`),
      KEY `eventAlertID` (`eventAlertID`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;

  2. #2
    SitePoint Member dennis.g's Avatar
    Join Date
    Jun 2012
    Location
    Athens, Greece
    Posts
    13
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    One simple way is to write a small script that deletes the records in order and trigger it on a regular basis. By using cron jobs for example.

  3. #3
    SitePoint Enthusiast
    Join Date
    Jul 2007
    Location
    San Sebastian, Spain
    Posts
    93
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Using partitions allows large volumes of data to be removed from a table almost instantaneously. A script deleting entries will cause locked tables as the data is being removed.

    The main issue that I see is the creation of the partition which requires some component of the primary key. In the case of the table tblLink the primary key linkID is insufficient. If the primary key was changed to include both the linkID and the dateTimeInsert field then partitions could be created using the dateTimeInsert field.

    You could also try creating a primary key of an auto_incremented field by using both the auto_incremented field and a date field and use the date field for partitioning.

  4. #4
    SitePoint Guru
    Join Date
    Feb 2007
    Posts
    874
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Dear Rcashell,
    I have read yes some sited says that deleting huge data and at the same huge inserts are going in will cause the tables to be lock and breakdown the whole system. So I have also read a bit on partition but the problem now I insert the ID manually into the tblLink. If I just change it to autonumber will start working automatically from the last number I inserted manually is it. So next then I will make both the linkID and dataTimeInsert as primary key. So what should be my next step. The issue here is that this table is link to tblMainData and this table also linked to other tables. So the date will be the partitioning factor then we will have partition on daily basis rite ?

  5. #5
    SitePoint Enthusiast
    Join Date
    Jul 2007
    Location
    San Sebastian, Spain
    Posts
    93
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    You can continue manually adding the linkID as you currently are doing but equally this could be converted to an auto_increment value.

    When dropping a partition that has referential integrity constraints the most appropriate way is to disable the referential integrity constraints, drop the partition and the re-enable constraints.

    Your partitioning strategy should be based on some date to keep all dated information together. This makes it easier to remove a partition later on.

    Hope this helps?


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •