Searching "difficult" data?

So I have a database of products.

These often may contain things people are looking for, such as:

$100 (price)
3.5" (3.5" hard drive)
6’ (6 foot cable)

And it goes on with many other potential cases. My current implementation is a fulltext which is very problematic since all 3 examples contain stopwords breaking up your search. If I did a LIKE-type search, it would be more inefficient and difficult to sort the results.

For the most parts, I am dealing with these types of searches rather than english language text searches.

Any ideas how to better approach this problem with searching the data?

do some analysis as you insert the data, and populate one or more searchable/indexable columns, e.g. one column for the number portion, so that you can search for 100 or 3.5 or 6

Even 3.5 won’t work - isn’t that a stopword too?

An idea I had was to populate a keywords table and taking those examples, translate them to (something like):

$100 = CHRDOLLAR100
3.5" 3CHRPERIOD5CHRQUOTE
6’ 6CHRQUOTESINGLE

Is something like that a viable approach that would work? I did a quick test and it does in fact work. Wondering if there are some drawbacks I’m unaware of

yeah, but you’re overcomplicating it with an extra table

try just creating the numeric column i suggested (which you wouldn’t use MATCH on, obviously)

I think I’m having some difficulty understanding your suggestion.

Say I want to search for $5, or $10

if I just have a 5 or 10 in some other field, it wouldn’t work too well? Could be 5% or 10%? Or some number as part of something else…?

actually, no, you understood my suggestion quite well

:slight_smile:

I’ve never had to deal with a problem like this but it is interesting to think about. So instead of having a field contain $3.67 or 3.5" I could use 2 fields, say “unit” and “sign” (or something much better I’m sure) and have
3.67 … dollars
3.5 … inches
etc. then my query could be something like
SELECT unit, sign FROM table WHERE sign = ‘dollars’

Sadly our data isn’t quite… that’s simplistic

We don’t have no text data - in fact we actually have a decent amount… and it gets searched. It’s just that we are also heavy on searches of that type, which causes some issues.

There could be multiple instances of provided examples and others within a single row item and so it couldn’t just go into another column - it would mix the data.

Yes I end up with an extra table but I think this translated lookup table might actually work out pretty well…