How to Use JSON Data Fields in MySQL Databases

By Craig Buckler

JSON in MySQLIn my article SQL vs NoSQL: The Differences, I mentioned the line between SQL and NoSQL databases has become increasingly blurred with each camp adopting features from the other. MySQL 5.7 InnoDB and PostgreSQL 9.4 databases both directly support JSON document types in a single field. In this article, we’ll examine MySQL’s JSON implementation in more detail.

(PostgreSQL supported JSON before version 9.4 and any database will accept JSON documents as a single string blob. However, MySQL and PostgreSQL now directly support validated JSON data in real key/value pairs rather than a basic string.)

Just Because You Can Store JSON …

… it doesn’t follow you should.

Normalization is a technique used to optimize the database structure. The First Normal Form (1NF) rule governs that every column should hold a single value — which is broken by storing multi-value JSON documents.

If you have clear relational data requirements, use appropriate single-value fields. JSON should be used sparingly as a last resort. JSON value fields cannot be indexed, so avoid using it on columns which are updated or searched regularly. In addition, fewer client applications support JSON, and the technology is newer and possibly less stable than other types.

That said, there are good JSON use-cases for sparsely-populated data or custom attributes.

Create a Table with a JSON Field

Consider a shop selling books. A book table will have an ID, ISBN, title, publisher, number of pages and other relational data which applies to all books. Presume we want to add any number of category tags to any book. We could achieve this in SQL using:

  1. a tag table which stored each tag name against a unique ID, and
  2. a tagmap table with many-to-many records mapping book IDs to tag IDs

It’ll work, but it’s cumbersome and considerable effort for a minor feature. Therefore, we’ll define a tags JSON field in our MySQL database’s book table:

CREATE TABLE `book` (
  `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `title` varchar(200) NOT NULL,
  `tags` json DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

Note that JSON columns cannot have a default value, be used as a primary key, be referenced as a foreign key or have an index. You can create secondary indexes on generated virtual columns, but it’s possibly easier to retain an indexed value in a separate field.

Adding JSON Data

Whole JSON documents can be passed in INSERT or UPDATE statements. For example, our book tags can be passed as an array:

INSERT INTO `book` (`title`, `tags`)
VALUES (
  'ECMAScript 2015: A SitePoint Anthology',
  '["JavaScript", "ES2015", "JSON"]'
);

JSON can also be created with the:

  • JSON_ARRAY() function which creates arrays, e.g.
    -- returns [1, 2, "abc"]:
    SELECT JSON_ARRAY(1, 2, 'abc');
  • JSON_OBJECT() function which creates objects, e.g.
    -- returns {"a": 1, "b": 2}:
    SELECT JSON_OBJECT('a', 1, 'b', 2);
  • JSON_MERGE() function to merge documents, e.g.
    -- returns ["a", 1, {"key": "value"}]:
    SELECT JSON_MERGE('["a", 1]', '{"key": "value"}');
  • or you can (CAST anyValue AS JSON).

The JSON_TYPE() function allows you to check JSON value types. It should return OBJECT, ARRAY or an error, e.g.

-- returns ARRAY:
SELECT JSON_TYPE('[1, 2, "abc"]');

-- returns OBJECT:
SELECT JSON_TYPE('{"a": 1, "b": 2}');

-- returns an error:
SELECT JSON_TYPE('{"a": 1, "b": 2');

Similarly, the JSON_VALID() function returns 1 when the JSON is valid:

-- returns 1:
SELECT JSON_TYPE('[1, 2, "abc"]');

-- returns 1:
SELECT JSON_TYPE('{"a": 1, "b": 2}');

-- returns 0:
SELECT JSON_TYPE('{"a": 1, "b": 2');

Attempting to insert an invalid JSON document will raise an error, and the whole record will not be inserted/updated.

Searching JSON Data

The JSON_CONTAINS() function accepts the JSON document being searched and another to compare against. It returns 1 when a match is found, e.g.

-- all books with the 'JavaScript' tag:
SELECT * FROM `book` 
WHERE JSON_CONTAINS(tags, '["JavaScript"]');

The similar JSON_SEARCH() function returns the path to the given match or NULL when there’s no match. It is passed the JSON document being searched, 'one' to find the first match or 'all' to find all matches, and a search string, e.g.

-- all books with tags starting 'Java':
SELECT * FROM `book` 
WHERE JSON_SEARCH(tags, 'one', 'Java%') IS NOT NULL;

JSON Paths

A JSON path targets values and can be used to extract or modify parts of a document. The JSON_EXTRACT() function demonstrates this by extracting one or more values:

-- returns "SitePoint":
SELECT JSON_EXTRACT(
  '{"id": 1, "website": "SitePoint"}', 
  '$.website'
);

All JSON path definitions start with a $ followed by other selectors:

  • a period followed by a name, e.g. $.website
  • [N] where N is the position in a zero-indexed array
  • a .[*] wildcard which evaluates all members of an object
  • a [*] wildcard which evaluates all members of an array
  • a prefix**suffix wildcard which evaluates to all paths that begin with the named prefix and end with the named suffix.

Referring to this JSON document:

{
  "a": 1,
  "b": 2,
  "c": [3, 4],
  "d": {
    "e": 5,
    "f": 6
  }
}

the following paths:

  • $.a returns 1
  • $.c returns [3, 4]
  • $.c[1] returns 4
  • $.d.e returns 5
  • $**.e returns [5]

Extracting JSON Paths in Queries

We can extract the first tag from our book table using the query:

SELECT
  name,
  tags->"$[0]" AS `tag1`
FROM `book`;

For a more complex example, presume we have a user table with JSON profile data, e.g.

id name profile
1 Craig {
“twitter”: “@craigbuckler”,
“facebook”: “craigbuckler”,
“googleplus”: “craigbuckler”
}
2 SitePoint {
“twitter”: “@sitepointdotcom”
}

We can extract the Twitter name using a JSON path, e.g.

SELECT
  name, profile->"$.twitter" AS `twitter`
FROM `user`;

We can also reference a JSON path in the WHERE clause to only return users with a Twitter account:

SELECT
  name, profile->"$.twitter" AS `twitter`
FROM `user`
WHERE
  profile->"$.twitter" IS NOT NULL;

Modifying Part of a JSON Document

There are several MySQL functions to modify parts of a JSON document using path notation. These include:

  • JSON_SET(doc, path, val[, path, val]...)
    inserts or updates data in the document
  • JSON_INSERT(doc, path, val[, path, val]...)
    inserts data into the document
  • JSON_REPLACE(doc, path, val[, path, val]...)
    replaces data in the document
  • JSON_MERGE(doc, doc[, doc]...)
    merges two or more documents
  • JSON_ARRAY_APPEND(doc, path, val[, path, val]...)
    appends values to the end of an array
  • JSON_ARRAY_INSERT(doc, path, val[, path, val]...)
    inserts an array within the document
  • JSON_REMOVE(doc, path[, path]...)
    removes data from the document.

We can therefore add a ‘programming’ tag to any book which already has a ‘JavaScript’ tag:

UPDATE `book`
  SET tags = JSON_MERGE(tags, '["technical"]')
WHERE
  JSON_SEARCH(tags, 'one', 'technical') IS NULL AND
  JSON_SEARCH(tags, 'one', 'JavaScript') IS NOT NULL;

Further Information

The MySQL manual provides further information about the JSON data type and the associated JSON functions.

Again, I urge you not to use JSON unless it’s absolutely necessary. You could emulate an entire document-oriented NoSQL database in MySQL, but it would negate many benefits of SQL, and you might as well switch to a real NoSQL system. That said, JSON data types could save some effort for more obscure data requirements within a SQL-based application.

  • stoker

    You can use generated columns to create SQL search indexes on JSON documents in a JSON column in a MYSQL table.

    • Craig Buckler

      You can but, as mentioned above, it’s possibly more practical to put the indexed value into its own field.

  • Todd Zmijewski

    I didn’t know any of this but like you said in most cases a noSQL database would be more appropriate. Interesting none the less.

    • andythebouncer

      I’ve got a good use case for this. It’s a terabytes-large database of small structured documents that are organized by complex, heavily relational metadata. The documents themselves need only be interpreted by a client. Normalizing them into different tables and fields only to convert them back to JSON to send to the client would be absurd. Definitely seems like an edge case, but very useful to me.

  • Oscar Blank

    It’s like an evil temptation, and likely to be abused by many newbs.

    • Craig Buckler

      Possibly. It’s very tempting to use this as a quick fix instead of a more robust schema update. Before you know it, every field is in a JSON document!

    • Chris

      I receive API results in json format. I’d like to leave them that way since I can display json docs quickly in HTML but converting them to columns and back again only to format into HTML is really wasteful.
      How is it evil temptation to store docs when you’re not really building a query from the doc (or rarely)?

      • Craig Buckler

        If your database only stores JSON documents, would you be better using a NoSQL database such as MongoDB?

        If it contains other mostly-relational data, then storing JSON documents in a SQL DB is absolutely fine. However, I would recommend extracting key values if you intend using them in queries.

  • Would this apply to SQLite too? Just wondering..

  • Clark Winkelmann

    In the JSON_MERGE() section, you probably intended to use it in the code example instead of JSON_OBJECT() I think ? Looks like copy-paste :D

    • Craig Buckler

      Well spotted Clark – any SP editors around to fix it?!

    • Nilson Jacques

      Thanks Clark, I’ve fixed the example.

  • Would this be a good way to store attachments (file names) associated with an entry?

    • Craig Buckler

      I hate to say “it depends” but it depends. If it’s a fairly infrequent occurrence and you’d never need to look for a specific filename, then go for it. A more robust solution would be a separate table of filenames which linked to the original entry.

      • OK, then in my case separate tables is the better choice. Thanks.

  • Luis Aguilar

    In the JSON_VALID example, it seems to be wrongly typed JSON_TYPE instead. Good tutorial though.

  • superman

    How to use json data type with hibernate?

  • Suba Nagarasa

    can we do the above in h2 database?? Could you please give some guides to implement the same in H2 database?

    • Craig Buckler

      H2 doesn’t have a JSON data type but it does have ARRAY. As mentioned, any database can store data as a string so you can convert to/from JSON if necessary. Just use it sparingly.

  • In real-life, it’s quite likely that you’ll deal with structures like: `[{“name”: “field1″,”value”: “Value 1”},{“name”: “field 2″,”value”:”Value 2″},…]`, not with `{“field1″:”Value 1″,”field2”:”Value 2,…}`. Is there any way to extract values, knowing the name, directly from a first-type structure?

  • gihrig

    CAST syntax seems to be all wrong “(CAST anyValue AS JSON)”. MySQL manual has it as “CAST(expr AS type)”.

    SELECT CAST(99 AS JSON) returns 99
    SELECT CAST(“99” AS JSON) returns 99
    SELECT CAST(‘a’ AS JSON) returns “Invalid value.” at position 0
    SELECT CAST(‘a:1’ AS JSON) returns “Invalid value.” at position 0

    And yet:
    SELECT CAST(NOW() AS JSON) returns “2017-03-22 17:47:48.000000”

    Would be god to have some explanation on how this works and where it might be used.

Recommended
Sponsors
Get the latest in Front-end, once a week, for free.