Database Backup Script - Can This Be Improved?

I’ve been using this script as the basis for a database backup script:

<?php

backup_tables('localhost','user','password','database');

/* backup the db OR just a table */
function backup_tables($host,$user,$pass,$name,$tables = '*')
{
    $return='';

    $link = mysql_connect($host,$user,$pass);
    mysql_select_db($name,$link);

    //get all of the tables
    if($tables == '*')
    {
        $tables = array();
        $result = mysql_query('SHOW TABLES');
        while($row = mysql_fetch_row($result))
        {
            $tables[] = $row[0];
        }
    }
    else
    {
        $tables = is_array($tables) ? $tables : explode(',',$tables);
    }

    //cycle through
    foreach($tables as $table)
    {
        $result = mysql_query('SELECT * FROM '.$table);
        $num_fields = mysql_num_fields($result);

        $return.= 'DROP TABLE '.$table.';';
        $row2 = mysql_fetch_row(mysql_query('SHOW CREATE TABLE '.$table));
        $return.= "\n\n".$row2[1].";\n\n";

        for ($i = 0; $i < $num_fields; $i++) 
        {
            while($row = mysql_fetch_row($result))
            {
                $return.= 'INSERT INTO '.$table.' VALUES(';
                for($j=0; $j<$num_fields; $j++) 
                {
                    $row[$j] = addslashes($row[$j]);
                    $row[$j] = preg_replace("\n","\\n",$row[$j]);
                    if (isset($row[$j])) { $return.= '"'.$row[$j].'"' ; } else { $return.= '""'; }
                    if ($j<($num_fields-1)) { $return.= ','; }
                }
                $return.= ");\n";
            }
        }
        $return.="\n\n\n";
    }

    //save file
    $handle = fopen('db-backup-'.time().'-'.(md5(implode(',',$tables))).'.sql','w+');
    fwrite($handle,$return);
    fclose($handle);
}

I’ve converted it to the following (the database calls are separate functions in a data mapper class):

public function backup_tables() {
    $return = '';

    foreach ($_POST as $table_name) {
        $table_fields = $this->data_mapper->get_table_fields($table_name);

        $return .= "DROP TABLE $table_name";
        $row2 = $this->data_mapper->get_show_create_table($table_name);

        $return .= "\n\n".$row2['Create Table'].";\n\n";

        $table_data = $this->data_mapper->get_table_data($table_name);

        $field_names = array_column($table_fields, 'Field');

        $field_list = empty($field_names) ? "" : "`".implode('`, `', $field_names)."`";
        $return .= "INSERT INTO $table_name ($field_list) VALUES\n";

        foreach ($table_data as $table_row) {
            $return .= '('.implode(',', $table_row).')\n';
        }
        $return .= "\n\n\n";

        $handle = fopen('db-backup-'.time().'-'.(md5(implode(',', $_POST))).'.sql', 'w+');
        fwrite($handle, $return);
        fclose($handle);
    }
}

    public function get_show_create_table($table_name) {
        try {
            $sql = "
                SHOW CREATE TABLE
                    $table_name
            ";
            $stmt = $this->db->prepare($sql);
            $stmt->execute();
            $create_table = $stmt->fetch(PDO::FETCH_ASSOC);
            return $create_table;
        }
        catch (PDOException $e) {
            error_log('Error when getting the list of projects!');
            error_log(' Query with error: '.$sql);
            error_log(' Reason given: '.$e->getMessage()."\n");
            return false;
        }
    }

    public function get_table_data($table_name) {
        try {
            $sql = "
                SELECT
                    *
                FROM
                    $table_name
            ";
            $stmt = $this->db->prepare($sql);
            $stmt->execute();
            $table_data = $stmt->fetchAll(PDO::FETCH_ASSOC);
            return $table_data;
        }
        catch (PDOException $e) {
            error_log('Error when getting the list of projects!');
            error_log(' Query with error: '.$sql);
            error_log(' Reason given: '.$e->getMessage()."\n");
            return false;
        }
    }

    public function get_table_fields($table_name) {
        dump($table_name);
        try {
            $sql = "
                SHOW COLUMNS FROM
                    $table_name
            ";
            $stmt = $this->db->prepare($sql);
            //$stmt->bindParam(':table_name', $table_name);
            $stmt->execute();
            $table_fields = $stmt->fetchAll(PDO::FETCH_ASSOC);
            return $table_fields;
        }
        catch (PDOException $e) {
            error_log('Error when getting the list of projects!');
            error_log(' Query with error: '.$sql);
            error_log(' Reason given: '.$e->getMessage()."\n");
            return false;
        }
    }

I know that I need to change the text of what gets logged to the error log if there are any MySQL errors (I’m going to have a custom error handler eventually). Is there any way I could improve on it?


A few things come to my mind:

1. The mysql extension should no longer be used - but I suppose you know this :smile:

2. addslashes/preg_replace… this is not a good way to escape data for SQL; there’s a function for that: mysqli_real_escape_string (or mysql_real_escape_string for the legacy extension)

3. $return.= '"'.$row[$j].'"' - data should be single-quoted; double quotes are non-standard

4. if (isset($row[$j])) doesn’t make sense in this context because it will always evaluate to true: on the line above you assign it the return value of preg_replace, which is a string, and a string, even an empty one, always passes isset. I suppose you wanted to handle NULLs, but then the logic should be more like this:

if ($row[$j] === null) {
  $return .= "NULL";
} else {
  $return .= "'" . mysql_real_escape_string($row[$j]) . "'";
}

5. Importing large amounts of data will be very slow with this method; it’s a good idea to use multi-row inserts, for example allowing every insert to grow up to a few hundred kilobytes in size. MySQL’s default maximum for a single query (max_allowed_packet) is 1 megabyte, I think.
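Roughly like this (an untested sketch; `$link` is assumed to be an open mysqli connection, `$table` the current table name, and `$handle` an open file handle from the surrounding script):

```php
// Sketch: accumulate rows into multi-row INSERT statements, flushing
// each statement before it grows past a chosen byte limit.
$maxBytes = 500 * 1024;   // stay well under max_allowed_packet
$values   = [];
$length   = 0;

$result = $link->query('SELECT * FROM `' . $table . '`');
while ($row = $result->fetch_row()) {
    // NULL stays NULL; everything else is escaped and single-quoted
    $quoted = array_map(function ($v) use ($link) {
        return $v === null ? 'NULL' : "'" . $link->real_escape_string($v) . "'";
    }, $row);
    $tuple = '(' . implode(',', $quoted) . ')';

    if ($values && $length + strlen($tuple) > $maxBytes) {
        fwrite($handle, "INSERT INTO `$table` VALUES\n" . implode(",\n", $values) . ";\n");
        $values = [];
        $length = 0;
    }
    $values[] = $tuple;
    $length  += strlen($tuple) + 2;
}
if ($values) {
    fwrite($handle, "INSERT INTO `$table` VALUES\n" . implode(",\n", $values) . ";\n");
}
```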

6. If this code is not meant to be distributed and compatibility with every web host is not important, then it will be much more efficient to simply launch mysqldump via a cron job or via exec() in PHP.
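If exec() is available, something along these lines hands the whole job to mysqldump (a sketch only; the host, credentials, table names and file name are all placeholders):

```php
// Sketch: shell out to mysqldump instead of rebuilding it in PHP.
// Listing table names after the database restricts the dump to them.
$tables = ['players', 'scores'];
$file   = 'db-backup-' . time() . '.sql';

$cmd = sprintf(
    'mysqldump --host=%s --user=%s --password=%s %s %s > %s',
    escapeshellarg('localhost'),
    escapeshellarg('user'),
    escapeshellarg('password'),
    escapeshellarg('database'),
    implode(' ', array_map('escapeshellarg', $tables)),
    escapeshellarg($file)
);

exec($cmd, $output, $status);
if ($status !== 0) {
    error_log('mysqldump failed with exit code ' . $status);
}
```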


Is there any reason why you’re not just using mysqldump in a bash script run from the crontab?


@Lemon_Juice did you read further down the post, to my first attempt at bringing the script in the first code block up to date?

@oddz there will be times when not all of the tables will need backing up. For example, during an end-of-round reset of a game, only certain tables will need backing up; some tables will get truncated.

I don’t follow. Isn’t the point of a backup script to back up the data source so it can be restored from nothing? What you’re talking about is more like a data export than a backup.

You would still need to include the truncated tables in the backup - otherwise the restore would leave inconsistencies between the tables.

The backup script also generates the create table queries needed to create the tables

So possibly, after the section that generates the CREATE TABLE query for a table, compare the list of tables on the database with the tables to be backed up; if a table isn’t in the list to be backed up, skip that iteration of the loop and move on to the next table?
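That filtering step might look roughly like this (hypothetical names; `$db` is an assumed PDO connection, and the whitelist comes from the caller rather than straight from user input):

```php
// Sketch: only dump tables that appear in a whitelist.
// $wanted is validated server-side so users can't request
// arbitrary tables to be dumped.
$wanted = ['players', 'scores'];

$all = $db->query('SHOW TABLES')->fetchAll(PDO::FETCH_COLUMN);

foreach ($all as $table) {
    if (!in_array($table, $wanted, true)) {
        continue;   // not in the backup list: move on to the next table
    }
    // ... generate the CREATE TABLE and INSERT statements for $table ...
}
```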

I’m going on the assumption that exec() will be unavailable or disabled by the host. The script will be run once a day as part of a cron job that will be the “ticker” and housekeeping scripts. I also want the site admins to be able to manually create a backup

You can specify which tables mysqldump should create a dump for.

You are looking at this from the wrong angle: instead of assuming that your potential users/clients won’t have exec() available, you should list it as a requirement for the backup system.

When you make a custom solution for the backup, it adds another potential point of failure. For example, your code does not handle multibyte characters; could that be an issue?

Nor NULL values, data types, or the foreign key dependency chain, just to name a few.

You mean the backup_tables() method in a class? That doesn’t change anything: you are imploding table row values without any escaping or quoting, so it’s even worse. My comments about forming proper inserts still hold.

I agree. mysqldump is a proven solution and should be used whenever possible. Of course, if an application is intended for shared hosts, then there’s no other way than to implement a custom solution, which needs to be tested extensively for reliability. However, some shared hosts do provide access to exec() and mysqldump.

I think in this case it is not a problem at all: provided the proper connection character set is configured, the PHP code will have no problem with UTF-8.
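With PDO that usually just means putting the charset in the DSN (credentials here are placeholders):

```php
// Sketch: set the connection character set up front so multibyte
// data round-trips correctly through the backup.
$db = new PDO(
    'mysql:host=localhost;dbname=database;charset=utf8mb4',
    'user',
    'password',
    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
);
```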

Unless you convince the shared hosting provider to allow you to use mysqldump - the hosting provider will often prefer that to having you stick a complex custom solution on the server that might cause problems.

The site (whenever it’s eventually ready) will almost certainly be on a shared server. I probably wouldn’t use mysqldump anyway, as that might tie me down to using MySQL.

The current code could have issues with multibyte charsets, since the regex is not set up to handle them (though we all know preg_ has flaky UTF-8 support at best, even with the u modifier).

While I still recommend against it, if you go down this road you should change the script to either write to a memory stream and then dump the stream into the new file, or write line by line into the file as you progress through the code. In addition, consider how you will handle the different column types correctly.
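A rough illustration of the memory-stream idea (the dump lines themselves are placeholders and would be produced by the rest of the script):

```php
// Sketch: build the dump in a memory/temp stream, then copy it to the
// target file in one go instead of growing one huge string in memory.
// php://temp transparently spills to disk once it passes ~2 MB.
$buffer = fopen('php://temp', 'w+');

fwrite($buffer, "DROP TABLE `players`;\n");                       // placeholder
fwrite($buffer, "-- ... CREATE TABLE and INSERT statements ...\n"); // placeholder

rewind($buffer);
$out = fopen('db-backup-' . time() . '.sql', 'w');
stream_copy_to_stream($buffer, $out);
fclose($out);
fclose($buffer);
```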

I prefer using mysqldump on my localhost, but a few years ago I put together a script to create a .sql file server-side that is similar (but not OOP).

One difference is that it distinguishes “string” vs. “int” values:

	$dump_str = '';
	$get_insert_data_query = "SELECT * FROM `" . $table . "`";
	$get_insert_data_result = $mysqli->query($get_insert_data_query);
	while ($get_insert_data_row = $get_insert_data_result->fetch_row())
	{
		$dump_str .= "INSERT INTO `" . $table . "` VALUES (";
		$value_str = '';
		foreach ( $get_insert_data_row as $get_insert_data_row_value )
		{
			if ( ctype_digit($get_insert_data_row_value) )
			{
				$value_str .= $get_insert_data_row_value . ", "; 
			}
			else
			{
				$value_str .= "'" . $get_insert_data_row_value . "', "; 
			}
		}
		$value_str = rtrim($value_str, ", ");
		$dump_str .= $value_str . ");\r\n";
	}
	$dump_str .= "\r\n";

Where is it saving the db to?

Me?

I was using it to output a downloadable file.

$filename_date = date("Ymd-Hi", time() );
$filename = 'my-database-name-' . $filename_date . '.sql';
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Length: ' . strlen($dump_str) );
header('Content-Disposition: attachment; filename="' . $filename . '"');
echo $dump_str;
?>

Okay, thanks. I thought you were backing up an sql db to somewhere?

Not surprisingly, it also doesn’t handle NULL properly.

If you can’t even handle NULL properly, I question the accuracy of the solution provided. That goes for you and the OP. Although, if you’re just creating an export, then casting NULL to an empty string seems appropriate; it is not appropriate for a true dump.

I understand what SpacePhoenix is doing and why, but it has the potential for a lot of problems/anomalies compared to mysqldump.

Agreed. I haven’t used or looked at that file’s code for 6 years, but you are correct: I didn’t have anything to take possible NULLs into account (the database didn’t have any).

For my custom, limited use it didn’t need to. If it were to be used for other databases, it would definitely need some improvement.

Or, as you say, not used at all, in favour of the tried and true mysqldump.