How to modify MySQL/MariaDB .sql export file not breaking character/set?

Hello, please how to properly modify MySQL database export file .sql so the character set is not changed/content is not broken after saving/importing the modified file?

I have opened the file in Linux editor Kate and saved a modified copy of the file with default UTF-8 charset, then uploaded the file to the new database using PHPMyAdmin (while left selected default utf-8 charset of the file) or using BigDump php script. In both cases the charset was likely wrong, since CMS errored, saying that I was likely “importing your database tables using the wrong character set or collation”. Importing original/unmodified .sql file (using BidDump which had “$db_connection_charset = ‘utf8’;” set as in previous case where i have imported a modified .sql file) made the site working. So the problem is likely in a wrong charset used in saving the file or less likely I did some error in editing the file?

Original .sql export file header:

/*M!999999- enable the sandbox mode */
– MariaDB dump 10.19 Distrib 10.6.20-MariaDB, for Linux (x86_64)

– Host: localhost Database: ***


– Server version 10.6.20-MariaDB-cll-lve

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT /;
/
!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS /;
/
!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION /;
/
!40101 SET NAMES utf8mb4 /;
/
!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE /;
/
!40103 SET TIME_ZONE=‘+00:00’ /;
/
!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 /;
/
!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 /;
/
!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE=‘NO_AUTO_VALUE_ON_ZERO’ /;
/
!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

Source DB and imported DBs shows collation latin1_swedish_ci, type MyISAM. Both DBs are on same physical server with 10.6.20-MariaDB-cll-lve, Server charset: cp1252 West European (latin1), Server connection collation: utf8mb4_unicode_ci

My question is how to modify and save the copy of the file properly in this case, please?

So… uhm… what change are you actually trying to make?

If you’re trying to ask why you got that error from your CMS, we’d need to know what CMS was generating the error, in order to know what sort of charset it was expecting…

It was InvisionPB(IPB) CMS and change inside the .sql file was in domain address.

So why not import the file and then use SQL commands to change it, rather than try and modify the file? (or change it before export)

I would prefer if you tell me how instead of you asking me why. To answer you. Because it is much easier for me to just double-click and search/replace, period.

If you know a SQL command to search and replace string in all tens of SQL tables or a simple bash script (Yes it is more complicated), please let me know. Anyway, I think that my question is legitimate and I would like to hear what may be the problem in charset encoding.

And i’d prefer people take advice rather than get snippy, but we dont always get what we want.

At no point in your question did you mention you were trying to replace a value in “tens of SQL tables”, so i’m not sure how i would be expected to know that.

What my post was doing was pointing out that you’re chosing the most complicated place to try and make modfications to the data rather than the actual tools that are designed to work in the charsets that you’re trying to manipulate with tools not designed for it.

Yes, ctrl-H and type in string works great… if your editor is designed to work with the file encoding. If it’s not, you’re just corrupting the data.

For example, if I take a u with an umlaut, in the Hexidecimal it would be FC. In UTF-8 it would be C3BC and in UTF-16 it would be FEFF00FC. and if your editor writes the wrong one in, it will be misinterpreted (or cause the string to be the wrong length) when something else reads it.

I could get out a hex editor, investigate known characters, figure out the encoding for the various letters i want, and hex them in… or I could say “Import as is” and then issue 10 or 20 commands of the form UPDATE table SET field1 = REPLACE(field1,"olddomain","newdomain") and let the database engine do all the hard work for me…

I am unable to figure out any helpful information which I can use to solve this issue, I would welcome others to chime in.

When you just change the character set value in the sql file, you might end up in misinterpreted special chars in your application. This is no good solution

You can change the character set and collation of the database by using some queries.

For example to change all tables to utf8mb4 collation utf8mb4_unicode_ci

First you create some commands


SELECT CONCAT('ALTER TABLE `', table_name, '` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;')
FROM information_schema.tables
WHERE table_schema = 'your database name here'
  AND table_type = 'BASE TABLE';

The output of this query you can run again to change the tables

I have not changed it, maybe unintentionally (I do not know how and how to avoid that and I would like to know it). Avoidance would be better than learning/remembering(impossible for a layman like me) yours mentioned SQL command?

Interesting is that original working database and modified non working database has same charset/collation per my layman understanding (“Source DB and imported DBs shows collation latin1_swedish_ci, type MyISAM.”).

But yours mentioned utf8mb4 makes some sense to a layman like me, because header of the working .sql export file contains this (as quoted in 1st post), though as mentioned, restored .sql file results in tables with latin1_swedish_ci collation, same as non working modified export.

Since you have not pointed that I did anything wrong, maybe it is also problem of the CMS which can not work with my modifications. Thank you