I am attempting to use a full text search over a series of names in my database. This is my first attempt at using full text search. Currently I take the search string enter
Full Text is going to work off of the search criteria you provide it. You can remove the noise word from the file, but you really risk bloating your index size by doing that. Robert Cain has a lot of good information on his blog regarding this:
http://arcanecode.com/2008/05/29/creating-and-customizing-noise-words-in-sql-server-2005-full-text-search/
To save some time you can look at how this method removes them and copy the code and words:
public string PrepSearchString(string sOriginalQuery)
{
string strNoiseWords = @" 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | $ | ! | @ | # | $ | % | ^ | & | * | ( | ) | - | _ | + | = | [ | ] | { | } | about | after | all | also | an | and | another | any | are | as | at | be | because | been | before | being | between | both | but | by | came | can | come | could | did | do | does | each | else | for | from | get | got | has | had | he | have | her | here | him | himself | his | how | if | in | into | is | it | its | just | like | make | many | me | might | more | most | much | must | my | never | now | of | on | only | or | other | our | out | over | re | said | same | see | should | since | so | some | still | such | take | than | that | the | their | them | then | there | these | they | this | those | through | to | too | under | up | use | very | want | was | way | we | well | were | what | when | where | which | while | who | will | with | would | you | your | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z ";
string[] arrNoiseWord = strNoiseWords.Split("|".ToCharArray());
foreach (string noiseword in arrNoiseWord)
{
sOriginalQuery = sOriginalQuery.Replace(noiseword, " ");
}
sOriginalQuery = sOriginalQuery.Replace(" ", " ");
return sOriginalQuery.Trim();
}
however, I would probably go with a Regex.Replace for this which should be much faster than looping. I just don't have a quick example to post.
Here's a working function. The file noiseENU.txt
is copied as-is from \Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\FTData
.
Public Function StripNoiseWords(ByVal s As String) As String
Dim NoiseWords As String = ReadFile("/Standard/Core/Config/noiseENU.txt").Trim
Dim NoiseWordsRegex As String = Regex.Replace(NoiseWords, "\s+", "|") ' about|after|all|also etc.
NoiseWordsRegex = String.Format("\s?\b(?:{0})\b\s?", NoiseWordsRegex)
Dim Result As String = Regex.Replace(s, NoiseWordsRegex, " ", RegexOptions.IgnoreCase) ' replace each noise word with a space
Result = Regex.Replace(Result, "\s+", " ") ' eliminate any multiple spaces
Return Result
End Function