Searching inside a .doc file stored in a varbinary(max) column

坚强是说给别人听的谎言 submitted on 2019-12-11 06:37:08


I want to write a Full-Text Search query against a varbinary(max) column that stores .doc/.docx (MS Word) files. The query must return the records whose stored file contains a given word.

Is this possible?

If yes, how? (Please give an example.)

If yes, does it also work for other languages (e.g. Arabic, Persian, or other Unicode text)?

Thank you in advance.


What you're looking for is fulltext indexing, which has been greatly improved in SQL Server 2008.

For an introduction, I would recommend these articles:

  • SQL Server 2008 - Creating Full Text Catalog and Search
  • Understanding Full-Text Indexing in SQL Server
  • Fulltext-Indexing Workbench

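Once you understand this and have created your own fulltext catalog and index, you should be able to search with a query like this (note that a multi-word phrase must be wrapped in double quotes inside the search string, otherwise CONTAINS raises a syntax error):

SELECT ID, DocumentColumn  -- plus any other columns you need
FROM dbo.YourTable
WHERE CONTAINS(*, '"Microsoft Word"')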
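
For a varbinary(max) column, the index also has to know what kind of file each row holds, which is done with a type column containing the file extension. A minimal sketch of the setup follows; the catalog name DocumentCatalog, the extension column FileExtension, and the key index PK_YourTable are hypothetical names chosen for illustration, and LANGUAGE 1033 (English) is just an assumed default.

```sql
-- Hypothetical catalog name; one catalog can hold several full-text indexes.
CREATE FULLTEXT CATALOG DocumentCatalog;

-- DocumentColumn is the varbinary(max) column holding the Word file.
-- FileExtension is an assumed char/varchar column holding '.doc' or '.docx',
-- which tells the engine which filter to use for each row.
-- PK_YourTable is an assumed unique, single-column, non-nullable index.
CREATE FULLTEXT INDEX ON dbo.YourTable
(
    DocumentColumn TYPE COLUMN FileExtension LANGUAGE 1033
)
KEY INDEX PK_YourTable
ON DocumentCatalog
WITH CHANGE_TRACKING AUTO;
```

Population runs in the background, so a freshly created index may return no rows for a short while.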

And yes, fulltext indexing and searching support lots of languages. Check out the articles listed above and the SQL Server 2008 Books Online for details!
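
As a rough sketch of the language side: the word breakers available on an instance can be listed from sys.fulltext_languages, and a query can name one explicitly through the LANGUAGE argument of CONTAINS. The example assumes Arabic is registered as LCID 1025 (confirm this in the first query's output on your server) and reuses the placeholder table dbo.YourTable; the search word is only an illustration.

```sql
-- List the languages this instance can word-break for full-text search.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;

-- Search with an explicit language; the N prefix keeps the literal as Unicode.
-- 1025 is assumed to be Arabic here; replace it with the lcid you found above.
SELECT ID, DocumentColumn
FROM dbo.YourTable
WHERE CONTAINS(DocumentColumn, N'كتاب', LANGUAGE 1025);
```

If a language does not appear in the list, the neutral word breaker (LCID 0) may serve as a fallback, though matching will be cruder.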



If you have SQL Server 2005 or later, yes, you just need the right filters (IFilters) installed on the server for the file types you store.
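
To see whether the instance already has a filter for Word files, something like the following should work. It is a sketch, not a full installation procedure: it assumes any extra filter pack has already been installed on the server by other means.

```sql
-- List the registered filters for the Word extensions, if any.
SELECT document_type, path
FROM sys.fulltext_document_types
WHERE document_type IN ('.doc', '.docx');

-- If .docx is missing even though a filter pack is installed on the machine,
-- let the engine load filters registered with the operating system.
EXEC sp_fulltext_service @action = 'load_os_resources', @value = 1;
```

After changing load_os_resources, the full-text service typically has to be restarted before the new filters show up, and existing rows need to be repopulated.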

If you have SQL Server 2000, .doc files can be indexed, but as far as I know not the newer Office 2007 format (.docx). I've heard you may be able to borrow the IFilter by installing Word 2007 on the server.

