Sql Server Full-Text Search Protips Part 1: Setup

As I have previously mentioned, Sql Server’s Full-Text search can let a developer create some very slick features disturbingly easily. And, unlike many other Full-Text implementations, it is not limited to plain text fields. It can also search within binary fields with the proper setup considerations. Needless to say, there are a few protips to making Full-Text indexes work and taking advantage of them. In this post, I will tell you how to get Full-Tex

Getting Started

First, you are going to need a copy of Sql Server 2000 or 2005 Standard edition. MSDE or Sql Express do not have Full-Text capabilities. In order to enable Full-Text search, you must do a few things.

  • Enable Full-Text search on the database.
  • Create a Full-Text catalog.
  • Enable Full-Text searching on specific columns in your data.

To do this, we can run some system stored procedures. I should note that these are depereciated in 2005 in favor of DDL statements, but they still work and are a bit simpler to digest. Here is an example of the SQL necessary to create a Full-Text index in a database:


exec sp_fulltext_database 'enable'
exec sp_fulltext_catalog 'Catalog_Name', 'create'

Table Design Considerations

Once there is a Full-Text index in place, you can start indexing columns. But lets not get ahead of ourselves. First and foremost, using fulltext indexing requires a single unique key on the table. Second, for any table where you are using Full-Text indexes, you really should have a TIMESTAMP column. This is a small binary column that is automatically updated whenever the row is changed. It is necessary as it acts as a flag for the indexing engine to allow it to incrementally index the table.

There is one other considerations for using Full-Text indexing on binary fields, such as word documents stored in a database. And that consideration is you must have a text field with the file extension (‘doc’ for word, for example). This is because the indexing needs to know what kind of file it is dealing with in order to parse the file and find the data. Out of the box, Sql Server can index text files, html files and Word documents. But it can be extended using IFilters. For example, Adobe provides a PDF IFilter for your use.

Enabling Full-Text Indexing on a Table & Columns

Lets say we have a rather simple table we wish to index, document_attachments. It has six fields:

  • Id (INT IDENTITY NOT NULL, primary key)
  • FileName (varchar(255) NOT NULL)
  • FileDescription(varchar(1000) NULL)
  • FileExtension (varchar(10) NOT NULL)
  • SaveStamp (TIMESTAMP)
  • FileData (IMAGE)

In order to allow for fulltext queries on your table, you must first enable the table for fulltext indexing, then enable fulltext indicies on specific columns. If we wished to create a fulltext index on the FileDescription & FileData fields, we would use the following T-SQL:


exec sp_fulltext_table @tabname='document_attachments', @action='create', @ftcat=[CatalogName], @keyname=[Name of Primary Key index]
exec sp_fulltext_column @tabname='document_attachments', @colname='FileName', @action='add'
exec sp_fulltext_column @tabname='document_attachments', @colname='FileData', @action='add', @type_colname='FileExtension'

Indexing Considerations

Now, before you use any fulltext catalog, it must be indexed. There are two separate scenarios here. For “normal fields”, that is fields that are not IMAGE or TEXT, Sql Server is capable of automatically tracking changes to columns for you. And, if you are working with Sql 2005, you really should be using VARCHAR(MAX) and VARBINARY(MAX), so this will not be a problem for some. But those of us still working on Sql 2000 will be required to setup an indexing schedule on any of the old school long text or binary columns in order to keep things fresh. Exactly how often to index depends on the application and your requirements. Do remember it is a bit of an expensive process, so you do not want to fire it off every 15 seconds to keep things “fresh.”

In any case, to enable change tracking (automated indexing) on a table, use the following T-SQL:


exec sp_fulltext_table 'document_attachments', 'start_change_tracking'

As for setting up the schedule, the best way is to fire up enterprise manager, find your Full-Text catalog, right click on it and choose Schedules. Then setup whatever schedule you feel appropriate.

But Wait Wyatt! Isn’t there more!

Why yes, there is more. Stay tuned for the next installment of Sql Server Full-Text Protips: Using Full-Text Queries for Fun and Profit.

Thanks and kick it if you like it.

Free book: Jump Start HTML5 Basics

Grab a free copy of one our latest ebooks! Packed with hints and tips on HTML5's most powerful new features.

  • http://www.errewf.it RaS!

    Nice article, thanks!

  • whyulil

    the second code example has exex instead of exec on the second line.

    Cheers thats a good post. looking forward to part 2

  • VijayaDeepan

    Very Excellent Article. Thanks.

  • wwb_99

    Fixed the syntax error, good catch whyulil.

  • Barbley

    Great Article. I’m setting up Full Text Indexing (SQL 2000) and I had a question about the very end of your article in reference to setting up a Schedule. I’m setting Full Text Indexing basically with the same step you mention above, the one exception being that I do not have a TIMESTAMP column in my table.

    I execute
    sp_fulltext_table @tabname = ‘FT_resumes’ , @action = ‘start_change_tracking’

    just as you had mentioned, to enable change tracking but then instead of creating a schedule I execute the following…

    exec sp_fulltext_table @tabname = ‘FT_resumes’
    , @action = ‘start_background_updateindex’

    So that it can start propagating tracked changes to the full-text index as they occur.

    I thought that this second step covered creating a schedule. Testing has shown that the index is being updated. But when I check EM – no schedule appears. Am I covered with the way I’m doing this or do I also need to create a schedule on top of this. Any insight is appreciated.

    exec sp_fulltext_table @tabname = ‘FT_resumes’
    , @action = ‘start_background_updateindex’

  • wwb_99

    For something like this, you need to create a schedule in enterprise manager. I think you could also schedule a job through T-SQL to do it.

    I would really recommend adding a TIMESTAMP field to your tables. They don’t hurt anything in most cases, and allow the Full-Text engine do incremental populations as it can now tell which rows have been updated.

  • Barbley

    Thanks for the information, but now I think I’m more confused than before.

    I thought that by executing:

    exec sp_fulltext_table @tabname = ‘FT_resumes’
    , @action = ’start_background_updateindex’

    That tracked changes to the full-text index would be propagated as they occured, and it appeared to be doing this in our testing environment.

    Do in need to also schedule incremental updating if I’m already updating as changes occur? And then I’m not sure how the timestamp column would fit in if updates are being propagated as they occur.

    Thank you for your time,

    B

  • wwb_99

    I think I misread the question myself. Anyhow, to clairify:

    Change tracking only works on CHAR, BINARY, VARCHAR and VARBINARY columns. IMAGE and TEXT need not apply. So, if your columns are not of the ‘long’ types, you need not worry about schedules. If they are of the long type, you will need to worry about schedules. The TIMESTAMP column is required should you wish to use incrimental population, which is highly recommended. Well, at least it is better than blowing the whole thing away and doing a full population.

    As for TIMESTAMP itself, it is a very handy item. It is basically an 8-bit timestamp that gets automatically updated whenever the row is updated. I use it for concurrency management at times.

  • Barbley

    Thanks, so just to make sure I’m getting it…

    Since the column my full text index is on is of “TEXT” type then by running

    exec sp_fulltext_table @tabname = ‘FT_resumes’
    , @action = ’start_background_updateindex’

    I should be all set?

    Thanks again!!!

  • wwb_99

    I think that will get an initial population going. But it will not continue to update things. Remember that TEXT and IMAGE columns are funny beasts insofar as Sql 2000 is concerned. The data in the field is actually merely a pointer to another page. In effect, the data is not really part of the row.

    To quote the Sql 2000 books online:

    “Full-text change tracking does not track any WRITETEXT or UPDATETEXT operations performed on full-text indexed columns that are of type image, text, or ntext.”

    What that means is that you need to manually create a schedule to run an incremental population on the table with a full-text index on the TEXT column.

  • Barbley

    Wyatt – thanks so much for taking the tie to responde to all of my posts. The information you’ve provided will come in handy.
    Thanks again,
    B

  • Sanjeev Kumar

    Table design consideration section is really ver much helpful.

  • Damen

    Good stuff, but using those system stored procedures should be avoided if using SQL 2005 as they are deprecated.
    Would be great to see an example using the correct / latest SQL.

  • wwb_99

    I intentionally used the SQL 2000-compatible commands, mainly because more people could use them. They do work in Sql 2005 as well, and will likely work in Sql 2008. Even if they are depreciated.

  • Bas

    SQL Server 2005 Express Edition does support Full-Text Search, contrary to what the article states. Be sure to install the SQL Server 2005 Express Edition with Advanced Services SP2.

  • Trent

    This article and discussion seems a bit misleading…

    From my expirements, it seems that turning on both “Change Tracking” and “Update Index in Background” (as found on the context menu for the table in Enterprise Manager) does result in the catalog getting updated when text/image columns are updated. This works when calling UPDATE/INSERT on the row in question AND when simply typing into the relevant column via Enterprise Manager.

    It is true that WRITETEXT and UPDATETEXT operations don’t cause updates to the catalog, however this is an unusual case. I expect most users would be using the (less performant) UPDATE/INSERT operations and they’d find that their catalogs do update when expected, without the need for a scheduled update.

  • sankar

    Dear sir
    Iam using sqlserver management express edition,when iam creating full text search in a table it shows a message called fulltext search already enabled.how to know for which catalog it is assigned and what fields full text is enabled. how to change catalog for that table. Actually i want to crate a catalog for a table and have to assign all fields in that table. Please help me in this

  • Anonymous

  • Anonymous

    Protip of the Day:

    Use spell check.

    To do this, we can run some system stored procedures. I should note that these are

    depereciated

    in 2005

  • http://www.mabotse.co.za Jakes

    Very nice article. Useful even in 2011.