Date: 2025-09-17
What is structured data? And what type of data is unstructured? Can you give any examples?
Structured data has a structure that it complies with.
A CSV is structured.
A word document is not.
good examples
although i’ve seen CSVs used as powerpoint pages (unstructured)
but the idea is solid
You can use structured data in place of unstructured data (it just happens to have a structure). You can’t use unstructured data as structured.
JSON functions?
I don’t… understand your implication?
JSON data is unstructured – to SQL it’s just a big VARCHAR
until JSON functions came along, and turned it into structured data
JSON data is not unstructured - but until you parse it into JSON data (arguably, you’re doing this in Javascript, so it ceases to be “JSON” data, and just… Javascript Data in the form of an Object), it’s just text.
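To make the "it’s just text until you parse it" point concrete, here is a minimal Python sketch. The payload and variable names are made up for illustration; the only real API used is the standard library’s `json.loads`.

```python
import json

# Hypothetical payload. To a system that does not parse it, this is only a string.
raw = '{"name": "Ada", "age": 36}'

# As plain text, the only operations available are string operations.
is_text = isinstance(raw, str)
text_length = len(raw)

# After parsing, the same characters become a dict with addressable fields.
parsed = json.loads(raw)
age = parsed["age"]
```

Before the `json.loads` call there is no way to ask for "age" short of string searching; after it, the field is addressable by name. That is roughly the difference between a VARCHAR column and the same column queried through JSON functions.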
Context is, obviously, important - to one system, a structure may be opaque, and to others, not.
In the right system, the word document can be parsed into structured data as well - if there existed a system to which the word document must fit a rigid structure. (And to anyone reading this who thinks this is a suggestion: It is not. Do not do this. It is foolish.)
I cannot find a clear definition of unstructured data. I find multiple sources saying that XML and JSON are semi-structured data.
Google AI says much the same thing. If you are copying and pasting from AI then an administrator might have something to say about that. I admit I sometimes use AI for guidance and confirmation about answers.
Google AI says much the same thing.
In short: structured = neatly organized (like spreadsheets), unstructured = messy/varied formats (like social posts or images).
Perhaps I’m a bit out of touch with current terminology, but I’m surprised to see this thread get to 13 replies without using the word “schema”. I was under the impression (perhaps past tense is appropriate?) that structured data required a schema (loosely, not a specific technology) that defined every data element (type, bounds, etc.), the main purpose being to allow an application to access and interpret the data correctly without making assumptions.
Anything less would be “not very well structured data” until you end up with a flat file or blob of “stuff” that would be completely unstructured.
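As a rough illustration of "a schema that defines every data element (type, bounds)", here is a hand-rolled Python sketch. The field names, bounds, and the `conforms` helper are all hypothetical and not taken from any particular schema technology.

```python
# Hypothetical schema: each field maps to (type, minimum, maximum).
SCHEMA = {
    "id": (int, 1, 10**9),
    "age": (int, 0, 150),
}


def conforms(record, schema=SCHEMA):
    """Return True if record has exactly the schema's fields, each with the right type and bounds."""
    # Missing or extra fields mean the record does not fit the schema.
    if set(record) != set(schema):
        return False
    for field, (ftype, lo, hi) in schema.items():
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, ftype):
            return False
        if not lo <= value <= hi:
            return False
    return True
```

With a check like this, an application can read `record["age"]` and rely on it being an integer in range, which is the "without making assumptions" part.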
Delimited files such as CSV have no defined schema but are considered structured. Unless you consider the rigid two dimensions to be the schema. The consistent two dimensions make delimited files structured.
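If the "consistent two dimensions" are taken as the structure, that property can be checked mechanically. A small sketch using Python’s standard `csv` module; the `is_rectangular` helper name is made up for this example.

```python
import csv
import io


def is_rectangular(text):
    """Return True if every non-empty row has as many fields as the first row."""
    # csv.reader yields an empty list for a blank line; skip those.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return False
    width = len(rows[0])
    return all(len(row) == width for row in rows)
```

A file like "a,b" / "1,2" / "3,4" passes, while one with a stray third field on some row does not. Note that this says nothing about the types or lengths of the values, only about the shape.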
Interesting. I guess “structured” means far less in this context than I thought.
This is an article from IBM about structured and unstructured data: where each is used, and the pros and cons.
Interesting that they put CSV, JSON, and XML under semi-structured data. They called it a “bridge”.
I guess you both are correct.
https://www.ibm.com/think/topics/structured-vs-unstructured-data
I think the structural part of CSV files might be the comma. It certainly is a schema. TMK there is no length constraint on elements in a CSV file (but that might be my ignorance).
You gave a perfect example.