Mysql data partioning and trimming

Our main table, tblLink, will have its primary key as foreign in a few other tables too and one of the tables will be tblMainData which over time will be millions of rows.

We would like to maintain only 3 months of data based on the insert date in tblLink.

The rest we would like to purge and keep to generate history reports. What is the best mechanism to go about in this scenario? Below is how our table structures looks like.

CREATE TABLE IF NOT EXISTS `tblLink` (
  `linkID` int(5) NOT NULL,
  `compID` int(5) NOT NULL,
  `vehicleID` int(5) NOT NULL,
  `deviceID` int(5) NOT NULL,
  `locationFromID` int(5) NOT NULL,
  `locationToID` int(5) NOT NULL,
  `employeeIDInsert` int(5) NOT NULL,
  `dateTimeInsert` datetime NOT NULL,
  `dateTimeStartJourney` datetime NOT NULL,
  `dateTimeEnd` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `dateTimeEndJourney` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `linkStatus` enum('a','d','e','m') NOT NULL,
  PRIMARY KEY (`linkID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;


CREATE TABLE IF NOT EXISTS `tblEmailLog` (
  `emailLogID` int(11) NOT NULL AUTO_INCREMENT,
  `compID` smallint(6) NOT NULL,
  `linkID` int(11) NOT NULL DEFAULT '0',
  `userID` smallint(6) NOT NULL,
  `alertCodeID` tinyint(4) NOT NULL,
  `eventAlertID` int(11) NOT NULL,
  `userEmail` varchar(100) NOT NULL,
  `alertDateTime` datetime NOT NULL,
  `alertInsertDateTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`emailLogID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;


CREATE TABLE IF NOT EXISTS `tblEventAlert` (
  `eventAlertID` int(11) NOT NULL AUTO_INCREMENT,
  `compID` int(5) NOT NULL,
  `mainDataID` int(5) NOT NULL,
  `vehicleID` int(5) NOT NULL,
  `eventAlertSentEmail` varchar(50) DEFAULT NULL,
  `eventAlertMessage` varchar(255) NOT NULL,
  PRIMARY KEY (`eventAlertID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;



CREATE TABLE IF NOT EXISTS `tblMainData` (
  `mainDataID` int(11) NOT NULL AUTO_INCREMENT,
  `linkID` int(5) NOT NULL,
  `header` varchar(3) NOT NULL,
  `deviceSerialNumber` varchar(20) NOT NULL,
  `latitude` double NOT NULL,
  `longitude` double NOT NULL,
  `speed` float NOT NULL,
  `course` int(3) NOT NULL,
  `dateTimer` datetime NOT NULL,
  `gpsDateTime` datetime NOT NULL,
  `insertDateTime` datetime NOT NULL,
  `odoMeter` float NOT NULL DEFAULT '0',
  `driverID` int(5) NOT NULL,
  `eventAlertID` int(11) NOT NULL DEFAULT '0',
  `mainDataInsertDateTime` datetime NOT NULL,
  `gpsString` varchar(450) NOT NULL,
  PRIMARY KEY (`mainDataID`),
  KEY `dateTime` (`dateTimer`),
  KEY `linkID` (`linkID`),
  KEY `eventAlertID` (`eventAlertID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;


CREATE TABLE IF NOT EXISTS `tblSubData` (
  `subDataID` int(11) NOT NULL AUTO_INCREMENT,
  `mainDataID` int(11) NOT NULL,
  `linkID` int(11) NOT NULL,
  `eventAlertID` int(11) NOT NULL,
  `deviceSerialNumber` varchar(20) NOT NULL,
  `subdeviceSerialNumber` varchar(20) NOT NULL,
  `dateTimer` datetime NOT NULL,
  `eventType` varchar(2) NOT NULL
  PRIMARY KEY (`subDataID`),
  KEY `mainDataID` (`mainDataID`),
  KEY `linkID` (`linkID`),
  KEY `eventAlertID` (`eventAlertID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;

One simple way is to write a small script that deletes the records in order and trigger it on a regular basis. By using cron jobs for example.

Using partitions allows large volumes of data to be removed from a table almost instantaneously. A script deleting entries will cause locked tables as the data is being removed.

The main issue that I see is the creation of the partition which requires some component of the primary key. In the case of the table tblLink the primary key linkID is insufficient. If the primary key was changed to include both the linkID and the dateTimeInsert field then partitions could be created using the dateTimeInsert field.

You could also try creating a primary key of an auto_incremented field by using both the auto_incremented field and a date field and use the date field for partitioning.

Dear Rcashell,
I have read yes some sited says that deleting huge data and at the same huge inserts are going in will cause the tables to be lock and breakdown the whole system. So I have also read a bit on partition but the problem now I insert the ID manually into the tblLink. If I just change it to autonumber will start working automatically from the last number I inserted manually is it. So next then I will make both the linkID and dataTimeInsert as primary key. So what should be my next step. The issue here is that this table is link to tblMainData and this table also linked to other tables. So the date will be the partitioning factor then we will have partition on daily basis rite ?

You can continue manually adding the linkID as you currently are doing but equally this could be converted to an auto_increment value.

When dropping a partition that has referential integrity constraints the most appropriate way is to disable the referential integrity constraints, drop the partition and the re-enable constraints.

Your partitioning strategy should be based on some date to keep all dated information together. This makes it easier to remove a partition later on.

Hope this helps?