Sql Server Full-Text Search Protips Part 1: SetupBy Wyatt Barnett
As I have previously mentioned, Sql Server’s Full-Text search can let a developer create some very slick features disturbingly easily. And, unlike many other Full-Text implementations, it is not limited to plain text fields. It can also search within binary fields with the proper setup considerations. Needless to say, there are a few protips to making Full-Text indexes work and taking advantage of them. In this post, I will tell you how to get Full-Tex
First, you are going to need a copy of Sql Server 2000 or 2005 Standard edition. MSDE or Sql Express do not have Full-Text capabilities. In order to enable Full-Text search, you must do a few things.
- Enable Full-Text search on the database.
- Create a Full-Text catalog.
- Enable Full-Text searching on specific columns in your data.
To do this, we can run some system stored procedures. I should note that these are depereciated in 2005 in favor of DDL statements, but they still work and are a bit simpler to digest. Here is an example of the SQL necessary to create a Full-Text index in a database:
exec sp_fulltext_database 'enable' exec sp_fulltext_catalog 'Catalog_Name', 'create'
Table Design Considerations
Once there is a Full-Text index in place, you can start indexing columns. But lets not get ahead of ourselves. First and foremost, using fulltext indexing requires a single unique key on the table. Second, for any table where you are using Full-Text indexes, you really should have a TIMESTAMP column. This is a small binary column that is automatically updated whenever the row is changed. It is necessary as it acts as a flag for the indexing engine to allow it to incrementally index the table.
There is one other considerations for using Full-Text indexing on binary fields, such as word documents stored in a database. And that consideration is you must have a text field with the file extension (‘doc’ for word, for example). This is because the indexing needs to know what kind of file it is dealing with in order to parse the file and find the data. Out of the box, Sql Server can index text files, html files and Word documents. But it can be extended using IFilters. For example, Adobe provides a PDF IFilter for your use.
Enabling Full-Text Indexing on a Table & Columns
Lets say we have a rather simple table we wish to index, document_attachments. It has six fields:
- Id (INT IDENTITY NOT NULL, primary key)
- FileName (varchar(255) NOT NULL)
- FileDescription(varchar(1000) NULL)
- FileExtension (varchar(10) NOT NULL)
- SaveStamp (TIMESTAMP)
- FileData (IMAGE)
In order to allow for fulltext queries on your table, you must first enable the table for fulltext indexing, then enable fulltext indicies on specific columns. If we wished to create a fulltext index on the FileDescription & FileData fields, we would use the following T-SQL:
exec sp_fulltext_table @tabname='document_attachments', @action='create', @ftcat=[CatalogName], @keyname=[Name of Primary Key index] exec sp_fulltext_column @tabname='document_attachments', @colname='FileName', @action='add' exec sp_fulltext_column @tabname='document_attachments', @colname='FileData', @action='add', @type_colname='FileExtension'
Now, before you use any fulltext catalog, it must be indexed. There are two separate scenarios here. For “normal fields”, that is fields that are not IMAGE or TEXT, Sql Server is capable of automatically tracking changes to columns for you. And, if you are working with Sql 2005, you really should be using VARCHAR(MAX) and VARBINARY(MAX), so this will not be a problem for some. But those of us still working on Sql 2000 will be required to setup an indexing schedule on any of the old school long text or binary columns in order to keep things fresh. Exactly how often to index depends on the application and your requirements. Do remember it is a bit of an expensive process, so you do not want to fire it off every 15 seconds to keep things “fresh.”
In any case, to enable change tracking (automated indexing) on a table, use the following T-SQL:
exec sp_fulltext_table 'document_attachments', 'start_change_tracking'
As for setting up the schedule, the best way is to fire up enterprise manager, find your Full-Text catalog, right click on it and choose Schedules. Then setup whatever schedule you feel appropriate.
But Wait Wyatt! Isn’t there more!
Why yes, there is more. Stay tuned for the next installment of Sql Server Full-Text Protips: Using Full-Text Queries for Fun and Profit.
Thanks and kick it if you like it.