I have a table of songs and I want to be able to quickly search for songs by artist, title or lyrics fragment. The title and the lyrics fragment must keep their original punctuation, diacritics and capitalization, but the user will ignore them during the search.
I want to use the same search field because the user may not know the title or lyrics. Furthermore, the user may make mistakes. The pluralization may be wrong, a particle may be forgotten (show [an] example), a letter may be different (sing/song), ignore a missing letter (ma’am/madam), useless distinction (it’s/it is), orthographic difference (color/colour).
So far, I have created the column search_index and an index for it. It is automatically filled out by a trigger that concatenates the title and the lyrics fragment fragment, convert to lower case and remove diacritics and punctuation. The app was very slow before the index column.
The search term that the user inputs is converted to lower case without diacritics and punctuation. The comparison in the query is:
search_index REGEXP CONCAT('(.+\\s)?', :search_term, '(\\s.+)?')
The next steps is to ignore pluralization, missing articles and alternate orthography, but it may be very complex and slow.
I need to improve my search engine. Then I have to know what the search engines of most web sites do.