Handling topic tags in an SQL DB

Just asking out of general curiosity.
In an SQL/MySQL DB, how could a DB/table be structured ( and what would be the search string) so that a specific row( record) can be found using TAGs ( as in a blog)?

To clarify my question, here are some FAILED attempt at a solution:

STRUCTURE: the record has a column for TAGS, where a comma separated string of tags is stored.
SEARCH: SELECT * FROM theTable WHERE tags LIKE %some,string,of,tags%

syntax aside, this method fails simply because it would impossible to place the tagnames in the same order fro all records so as to match the %…% condition

I was thinking that another solution would have been SEARCH: SELECT * FROM theTable WHERE tags IN (‘some’,‘string’,‘of’,‘tags’)

but that fails because it would mean the tag could could only contain one tag name at a time, thus defeating the purpose of tags.

This might work, but I wondered if there was a more graceful solution

  1. store the list of tags with a starting and trailing comma: ,this,is,a,list,of,tags,
  2. SELECT * FROM theTable WHERE tags LIKE %,this% AND tags LIKE %,is,% AND tags LIKE %,list,% …etc…
    it just seems that the sequence of LIKE %whatever% statement could get pretty convoluted

As always thanks to everyone, in advance, for sharing your wisdom :slight_smile:

[quote=“dresden_phoenix, post:1, topic:200880, full:true”]I was thinking that another solution would have been SEARCH: SELECT * FROM theTable WHERE tags IN (‘some’,‘string’,‘of’,‘tags’)

but that fails because it would mean the tag could could only contain one tag name at a time, thus defeating the purpose of tags.[/quote]you were absolutely correct right up untill you got to the “defeating the purpose” part – that’s wrong

the “more graceful solution” as you put it, is to have a table where each entity gets as many rows as tags

for example –

foods
23 steamed broccoli
24 boiled carrots
25 ice cream
26 gala apples
27 cilantro

food_tags
23 vegetable
23 side dish
24 vegetable
24 side dish
25 dessert
26 fruit
26 dessert
26 snack
27 herb

A graceful solution would be to take the delimited string and put it into a separate joined table.

Other than that, you’ll have to some trickery to split the string into a temp table and query from the temp table…

I don’t typically like to link to other sites, but this is the basic idea.

Thanks for your responses.

r937 (and Dave, I assume you meant the same solution)
would that not be considered storage intensive… I mean the TAG table would be orders of magnitude bigger than the DATA table? Again just curious if that’s a normal ‘cost of operation’ , if you will.

In terms of rows, sure, but in terms of storage, not so much. The tags table would be carrying a handful of characters plus a numeric field for each tag. In your way, you’d have a varchar field (assuming you’re smart enough to use that) which would have a larger varchar field which would hold the length PLUS the commas to separate them, so the larger you go, the less the difference would be.

Like in Rudy’s case, gala apples has three rows with no more than 7 characters plus an integer in a row. In your example, you’d have one field which would have 20 characters in it (the seventeen combined characters for the tags, plus the commas to separate them). So the difference would be negligible in the long run.

i suspect you might be suggesting that the food_tags table in my previous post be replaced by these two tables –

tags
71 vegetable
72 side dish
73 dessert
74 fruit
75 snack
76 herb

food_tags
23 71
23 72
24 71
24 72
25 73
26 74
26 73
26 75
27 76

i strongly urge you not to do this

leave the actual tags themselves in the food_tags table

you will thank me later

:slight_smile:

No…I didn’t say that. I was supporting your layout (in fact I even used your example to explain it).

Only way I would even consider using a lookup table would be if he wanted to have a finite set of tags.

Thanks Dave. I never thought of over all character count.

Rudy,
I can imagine the readably issue if the tag name were removed… just in general. but what if tag names were 100% moderator controlled?

you would do that with a tags table

[code]CREATE TABLE tags
( tag VARCHAR(25) NOT NULL PRIMARY KEY
);

CREATE TABLE food_tags
( food_id INTEGER NOT NULL REFERENCES foods ( id )
, tag VARCHAR(25) NOT NULL REFERENCES tags ( tag )
, PRIMARY KEY ( food_id , tag )
);[/code]

tags
vegetable
side dish
dessert
fruit
snack
herb

food_tags
23 vegetable
23 side dish
24 vegetable
24 side dish
25 dessert
26 fruit
26 dessert
26 snack
27 herb

you need the FOREIGN KEY reference in order to implement 100% moderator control

but in queries, you never join to the tags table

ah I, get it. :slight_smile: and I can also see that I can get a list of comprehensive tags using:

SELECT DISTINCT tag FROM food_tags";

which is what I would have needed next.

Thanks, again!

um, no… not if you implement the tags table for referential integrity

then all you’d need is

SELECT tag FROM tags

duh… sorry brain f-rt. I realize now that there are 3 tables involved:

an article table a food_tag table with reference for each tag applied to each row( the first column in the example), and the tags table. I thought in the example that the tag table was were the articles would be stored…

This topic was automatically closed 91 days after the last reply. New replies are no longer allowed.