Sql Server Full-Text Search Protips Part 3: Getting RANKed

In Part 1 of this series we explored the methodology for enabling a Sql Server database for Full-Text Indexing and enabling Full-Text indices on specific tables. And in Part 2 of this series we explored the methodology for accessing these Full-Text indices using the CONTAINS and FREETEXT functions. In this final part of the series we will examine the usage of the CONTAINSTABLE and FREETEXTTABLE functions to provide ranked search results to the calling application.

As previously mentioned, the CONTAINSTABLE and FREETEXTTABLE functions query the selected Full-Text index for the provided search terms. The difference between these functions and their non-TABLE cousins—that is CONTAINS and FREETEXT—is that these functions take slightly different parameters and, most important, they return a table of values rather than a simple true/false answer.

These table functions take a slightly different set of parameters than their cousins. These parameters are:

Free-Text indexed table to query.
Columns within the table to query.
Search term. For FREETEXTTABLE, this would generally just be the term(s) you wish to find. For CONTAINSTABLE, this term could be anything like the ones described in Part 2.
Top_N_Results. An optional integer to set a limit to the number of results to get. This field is very important—limiting to the amount of results you actually wish to work with greatly lessens load on the database server, especially when dealing with large result sets.

The table these functions return contains two columns:

KEY: this contains the primary key of the row in the indexed table. Obviously, but importantly, its data type is the same as the primary key in that table.
RANK: this is a positive integer representing the relative strength of the match to your search term. This number is not an absolute value—it does not start at some fixed value and work down to near zero. The specific value is only relative compared to other results from that Free-Text query. Comparing the RANK value across different Full-Text queries does not necessarily get meaningful results.

Now, that two-field table might sound very simple, but with it you can do a lot of very fancy tricks using a very simple technique—just INNER JOIN the table returned from your Free-Text function on the KEY field to your desired results, then sort by the RANK column for sorting effects. Or even get those neat percentage matches you see on some site search engines.

For the following examples, we will be using the Microsoft Pubs database, like we setup in Part 2 of this series.

First, let’s examine getting ranked search results from using FREETEXTTABLE:


SELECT 
	ftt.RANK,
	titles.title_id, 
	titles.title
FROM Titles
INNER JOIN 
FREETEXTTABLE(titles, notes, 'recipe cuisine') as ftt
ON
ftt.[KEY]=titles.title_id
ORDER BY ftt.RANK DESC

Nothing too tricky there, as you can see—just a simple INNER JOIN to the FREETEXTTABLE and ORDER BY the FREETEXTTABLE’s rank.

Now, to get percentages, you need to get a little fancier. For a match percentage, you need to divide the RANK by the top RANK value. You cannot, unfortunately, do this with just one Free-Text query; it requires selecting the TOP value then selecting the desired results and creating a calculated field for the percentages. Confused yet? Well, let’s see it in SQL.


DECLARE @topRank int

set @topRank=(SELECT MAX(RANK) FROM
FREETEXTTABLE(titles, notes, 'recipe cuisine', 1))

SELECT 
	ftt.RANK, 
	(CAST(ftt.RANK as DECIMAL)/@topRank) as matchpercent, 
	titles.title_id, 
	titles.title
FROM Titles
INNER JOIN 
FREETEXTTABLE(titles, notes, 'recipe cuisine') as ftt
ON
ftt.[KEY]=titles.title_id
ORDER BY ftt.RANK DESC

One note—I very intentionally used the optional Top_N_Results parameter on the first FREETEXTTABLE call to limit the number of results returned. The reason for this is to make the query a lot less expensive on the database server.

One final key trick: as I mentioned in Part 2, one can get very fancy with CONTAINS and search terms. This really, really applies to CONTAINSTABLE. For example, using weighted search terms plus the RANK column lets one get the right results for end users. For example, let’s say we wanted to find books on recipies or cuisies, but especially recipies:


SELECT 
	ftt.RANK,
	titles.title_id, 
	titles.title,
	notes
FROM Titles
INNER JOIN 
CONTAINSTABLE
	(
	titles, 
	notes, 
	'ISABOUT("recipe*" weight (.7), "cuisine*" weight (.2))'
	) 
as ftt
ON
ftt.[KEY]=titles.title_id
ORDER BY ftt.RANK DESC

One thing to note in the above example is that, because we are using a CONTAINS-type query, the search term (ISABOUT . . .) needed to include other forms to catch plurals and such. In this cases I used the “SearchTerm*” construct to tell the Full-Text engine to find all terms starting with ‘SearchTerm’.

Finally, to review:

In Part 1 we examined how to create Full-Text indices and considerations for maintaining those indexes.
In Part 2 we examined how to query your Full-Text indices using the CONTAINS and FREETEXT predicates, and the difference between those functions.
In this part, we looked at using the CONTAINSTABLE and FREETEXTTABLE functions to get ranked search results from your Free-Text indices. In addition, we examined some limitations of and workarounds for using Free-Text queries.

I hope you all have enjoyed, and please do post any comments or questions. And finally, feel free to kick it if you like it.

Frequently Asked Questions (FAQs) on SQL Server Full-Text Search and Ranking

What is the significance of ranking in SQL Server Full-Text Search?

Ranking in SQL Server Full-Text Search is a crucial feature that helps in organizing the search results based on their relevance. It assigns a rank value to each row in the result set. The rank value indicates the relevance of a row to the search condition in the WHERE clause. The higher the rank, the more relevant the row is. This feature is particularly useful when dealing with large databases where the search results can be overwhelming.

How does the CONTAINSTABLE function work in SQL Server Full-Text Search?

The CONTAINSTABLE function is a rowset function in SQL Server Full-Text Search that returns a table of zero, one, or more rows for those columns containing precise or fuzzy (less precise) matches to single words and phrases, the proximity of words within a certain distance of one another, or weighted matches. The returned table includes columns: KEY and RANK. The KEY column value is the unique key of the row and the RANK column value indicates how well a row matched the selection criteria.

How can I improve the performance of Full-Text Search in SQL Server?

There are several ways to improve the performance of Full-Text Search in SQL Server. One way is by creating a full-text index on the columns that you frequently search. This can significantly speed up the search process. Another way is by optimizing the query. Avoid using the LIKE operator in the WHERE clause as it can slow down the search. Instead, use the CONTAINS or FREETEXT predicates. Also, consider using the CONTAINSTABLE or FREETEXTTABLE functions for more complex queries.

What is the difference between RANK and DENSE_RANK functions in SQL Server?

Both RANK and DENSE_RANK functions are used to provide a rank to each unique record present in a partition. The difference lies in how they handle the ranking of duplicate values. The RANK function gives the same rank to the duplicate values and leaves a gap for the next rank. On the other hand, the DENSE_RANK function gives the same rank to the duplicate values but does not leave a gap for the next rank.

How can I use the RANK function in SQL Server?

The RANK function in SQL Server is used to provide a unique rank for each distinct row within a partition of a result set. The syntax for the RANK function is: RANK ( ) OVER ( [ partition_by_clause ] order_by_clause ). The partition_by_clause divides the result set into partitions to which the RANK function is applied. The order_by_clause determines the order of the data rows in each partition.

What is the role of the FREETEXT function in SQL Server Full-Text Search?

The FREETEXT function in SQL Server Full-Text Search is used to perform a full-text query that returns any rows that match the free-text expression. It is less precise than the CONTAINS function as it uses the OR operator by default. It also includes inflectional forms of the words in the search condition.

How can I use the FREETEXTTABLE function in SQL Server Full-Text Search?

The FREETEXTTABLE function in SQL Server Full-Text Search is used to return a table of zero, one, or more rows for those columns containing words or phrases that match the meaning, but not necessarily the exact wording, of the search condition. The returned table includes columns: KEY and RANK. The KEY column value is the unique key of the row and the RANK column value indicates how well a row matched the selection criteria.

What is the difference between the CONTAINS and FREETEXT functions in SQL Server Full-Text Search?

The CONTAINS and FREETEXT functions in SQL Server Full-Text Search are both used to perform full-text queries. The main difference between them is the precision of the search. The CONTAINS function is more precise as it uses the AND operator by default and includes only the exact wording of the search condition. On the other hand, the FREETEXT function is less precise as it uses the OR operator by default and includes inflectional forms of the words in the search condition.

How can I use the CONTAINS function in SQL Server Full-Text Search?

The CONTAINS function in SQL Server Full-Text Search is used to perform a full-text query that returns any rows that match the exact wording of the search condition. The syntax for the CONTAINS function is: CONTAINS ( column, ‘search_condition’ ). The column parameter specifies the column to be searched and the search_condition parameter specifies the text to be searched for.

What is the role of the FULLTEXTSERVICE property in SQL Server Full-Text Search?

The FULLTEXTSERVICE property in SQL Server Full-Text Search is used to get or set the properties of the Full-Text Engine for SQL Server. It can be used to configure various aspects of the Full-Text Engine, such as the word breakers and stemmers used for indexing and querying, the noise words (stopwords) to be ignored during indexing and querying, and the thesaurus files to be used for expanding queries.