Showing posts with label punctuation. Show all posts

Wednesday, March 21, 2012

Punctuation marks

Does anyone have a list of all punctuation marks ignored by the full-text
indexing service by default? Noise files (i.e. noise.dat) only explicitly
list the dollar sign ($) and the underscore (_) as noise "words".
And another observation - the Windows implementation of MS Search (compared
to the MS SQL Server implementation) yields different results - try searching
for files with the "|" character in the file name. Ok, it's an illegal
character, but the result is at least 'interesting'.
As far as the rest of the characters ignored in SQL FTS are concerned, they
don't bother Windows search. Has anyone else come across these (or other)
discrepancies?
ML
I take it you are only talking about SQL FTS. You mention Indexing Services
and MSSearch here, which are two separate products, although SQL FTS uses
the MSSearch engine.
SQL FTS indexes alphanumeric characters. Most other characters are not
indexed but the engine is aware that something existed there. So a search on
AT&T will match with AT&T, AT!T, AT*T, AT$T, and AT T, if A, T, and At are
not in your noise word list.
..,!:; are discarded.
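The matching behaviour described above can be illustrated with a rough sketch. This is not the MSSearch word breaker; `simple_tokens` is a hypothetical helper that assumes every non-alphanumeric character simply acts as a separator, which is enough to show why the AT&T variants look identical to the index.

```python
import re

def simple_tokens(text):
    """Crude stand-in for a word breaker (an assumption, not the real engine):
    split on every run of non-alphanumeric characters and lowercase the pieces."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]

variants = ["AT&T", "AT!T", "AT*T", "AT$T", "AT T"]
token_lists = [simple_tokens(v) for v in variants]

for variant, tokens in zip(variants, token_lists):
    print(variant, tokens)

# True when every variant produced the same token sequence.
all_same = all(tokens == token_lists[0] for tokens in token_lists)
print(all_same)
```

Under this simplification every variant reduces to the same two tokens, so a search for one cannot be told apart from a search for another.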
Thank you very much. Yes, I'm mainly referring to SQL FTS, and I'm aware
that SQL FTS and Windows Indexing Services are two separate products.
I'm just baffled that the two implementations of the MSSearch
engine differ in such a way. Any idea why?
Thanks for the list as well.
ML

punctuation in fulltext searches

I am using Fulltext search in a web application I wrote for our
company that is basically a knowledge base search engine - the app
searches and returns matches against our SQL 2000 database. The
problem I am having is this:
I cannot seem to figure out how to search for phrases, booleans, and
search terms with punctuation, etc.
For example:
searching for the filename history.dbf becomes historydbf
searching for "this exact phrase" becomes the 3 separate keywords
this, exact, and phrase.
I currently have code in the web app that strips out punctuation and
anything other than letters/numbers, because when I was allowing them,
my searches were blowing up with errors.
Can anyone help me to understand how I can allow these kinds of
searches without blowing up the app or crashing the sql search?
The app is located at http://www.resourcesystem.com/kb
Thanks in advance for any help or advice anyone might have for me!
Brad Miller
bmiller@.tdpi.com
I'm a little confused by your question. When I search on history.dbf I get
hits to history.dbf not historydbf.
If I search on "this exact phrase" using contains, I get hits to only "this
exact phrase"
If I search on boolean phrases, e.g. "this and that", the boolean operator is
ignored. It is effective when you do CONTAINS searches, though.
Can you perhaps post your exact query here?
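One way to keep phrases and punctuation instead of stripping them is to turn the user's input into a CONTAINS search condition in which every term is wrapped in double quotes. The sketch below is only an illustration under stated assumptions: `build_contains_condition` is a hypothetical helper, terms are joined with AND by choice, and the resulting string is meant to be passed to the query as a parameter rather than concatenated into the SQL text.

```python
import re

def build_contains_condition(user_input):
    """Build a CONTAINS-style search condition from raw user input.

    Quoted runs are kept together as phrases, bare words become single terms,
    and stray double quotes are dropped so they cannot break the condition.
    """
    parts = re.findall(r'"([^"]*)"|(\S+)', user_input)
    terms = []
    for phrase, word in parts:
        text = (phrase or word).replace('"', "").strip()
        if text:
            terms.append('"' + text + '"')
    # Joining with AND is an assumption; OR would widen the search instead.
    return " AND ".join(terms)

condition = build_contains_condition('history.dbf "this exact phrase"')
print(condition)
```

The condition would then be supplied as the second argument of CONTAINS through a query parameter. Terms made up entirely of noise words may still be rejected by the server, so the calling code should be ready to handle that error.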

Punctuation ?

Hi
I have this simple SQL statement, which works well:

MyCommand = New SqlDataAdapter("SELECT * From Tbl_Table where Type = '" & TestType.tostring & "' ", MyConnection)

----
Then I have this one which I am very proud of because I can retrieve any word or part of word from the Item-field:

Test_Text=(Session("session-Text").tostring)

Dim val1, val2, val3 as string
val1 = Test_Text
val2 = "%"
val3= val1 + val2

MyCommand = New SqlDataAdapter("SELECT * From Tbl_Table where Item Like '%" & val1 & " %' OR Item Like '%" & val3 & " %' ", MyConnection)

----
The problem is:
When I try to merge these two into one big "AND" statement as shown below, it only picks up the Test_Text string and totally ignores the TestType one!
I guess it is a matter of punctuation. Can anyone help me? Thanks.

MyCommand = New SqlDataAdapter("SELECT * From Tbl_Table where Type = '" & TestType.tostring & "' AND Item Like '%" & val1 & " %' OR Item Like '%" & val3 & " %'", MyConnection)

I fixed it like this:

MyCommand = New SqlDataAdapter("SELECT * From tbl_Table where Item Like '" & val1 & " %' OR Item Like '%" & val3 & "' AND Type = '" & TestType.tostring & "' ", MyConnection)

Thanks

When I'm debugging a concatenated SQL string like that, I usually Dim a variable and set it to the string. Then I can see the result of the concatenation before passing it to the command object. It helps me pick out syntax errors more quickly...

i.e.

dim mySQL as string = "SELECT * From tbl_Table where Item Like '" & val1 & " %' OR Item Like '%" & val3 & "' AND Type = '" & TestType.tostring & "' "

MyCommand = New SqlDataAdapter(mySQL, MyConnection)

That's a good idea, thanks!
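A likely cause of the behaviour in this thread is operator precedence rather than punctuation: in SQL, AND binds more tightly than OR, so `Type = x AND a OR b` is read as `(Type = x AND a) OR b`. The sketch below uses Python's built-in sqlite3 module instead of VB.NET and SQL Server purely so it can run on its own; the sample rows are invented for illustration. It also passes the values as parameters, which avoids the quoting problems of string concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tbl_Table (Type TEXT, Item TEXT)")
# Hypothetical sample rows, not data from the original post.
conn.executemany("INSERT INTO Tbl_Table VALUES (?, ?)", [
    ("A", "red apple"),
    ("B", "red apple"),
    ("A", "green pear"),
])

test_type, text = "A", "apple"
params = (test_type, "%" + text + " %", "%" + text + "%")

# Without parentheses: parsed as (Type = ? AND Item LIKE ?) OR Item LIKE ?
unparenthesized = conn.execute(
    "SELECT Type, Item FROM Tbl_Table "
    "WHERE Type = ? AND Item LIKE ? OR Item LIKE ? ORDER BY Type",
    params,
).fetchall()

# With parentheses: the Type filter applies to both LIKE branches.
parenthesized = conn.execute(
    "SELECT Type, Item FROM Tbl_Table "
    "WHERE Type = ? AND (Item LIKE ? OR Item LIKE ?) ORDER BY Type",
    params,
).fetchall()

print(unparenthesized)
print(parenthesized)
```

The first query also returns the type "B" row because the second LIKE is evaluated on its own; the parenthesized form restricts both LIKE conditions to the requested type.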

Punctuation

I am trying to improve searching performance ... but I am storing data that
contains punctuation marks ... such as "E.L.O." and "R.E.M." (names of song
artists/groups).
Does this mean that I cannot use full-text searching at all for searching
for these artist names?
Is there a word-breaker that will allow the punctuation marks (full stops in
particular), or is this a search issue rather than a word-breaker issue (i.e.
CONTAINS clause does not allow punctuation anyway)?
Wozza,
Can you post the full output of -- SELECT @@version -- where you have this
problem?
Have you removed all single letters from the language-specific noise word
files (under \FTDATA\SQLServer\Config where you have SQL Server installed)
and run a Full Population after these modifications? If not, then please do
this. The default wordbreaker behavior for punctuation depends on the
OS-supplied wordbreaker, and the @@version info will show which one applies.
Thanks,
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
Hi John,
SELECT @@version produces ...
Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 1)
John,
I have also cleared the Noise.dat file (my index is set up to use the Neutral
language).
Having done this, how do I search for "r.e.m.", for instance?
Warren
Wozza,
OK, as you're using Win2003 (Windows NT 5.2) and therefore the
langwrbk.dll wordbreaker (vs. Win2K's infosoft.dll), you can search for the
three single letters using CONTAINS. Note the use of double quotes to
contain all the single letters, for example:
SELECT * FROM MyTable where CONTAINS(*,'"R.E.M"')
If you continue to get an error, then add back a single space character in
the noise.dat file under \FTDATA where SQL Server 2000 is installed and run
a Full Population, then re-run the above query.
Thanks,
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
OK, I tried
SELECT * FROM Track where CONTAINS(*,'"R.E.M."')
and
SELECT * FROM Track where CONTAINS(*,'"R.E.M"')
and got the same error each time ...
Server: Msg 7619, Level 16, State 1, Line 1
Execution of a full-text operation failed. A clause of the query contained
only ignored words.
... so I added a space to Noise.dat and am repopulating.
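The "only ignored words" error can be pictured with a small simulation. This is a sketch under loud assumptions: the real word breaker is OS-supplied and may treat an acronym such as "R.E.M." differently, and `tokens_after_noise` is a hypothetical helper that simply splits on punctuation and drops anything found in a noise word list.

```python
import re

def tokens_after_noise(term, noise_words):
    """Split a search term on non-alphanumeric characters, then drop noise words.

    Both steps are simplifications of what the full-text engine does.
    """
    tokens = [t.lower() for t in re.split(r"[^A-Za-z0-9]+", term) if t]
    return [t for t in tokens if t not in noise_words]

# Assumed noise list containing every single letter.
single_letters = set("abcdefghijklmnopqrstuvwxyz")

with_single_letter_noise = tokens_after_noise("R.E.M.", single_letters)
with_empty_noise_list = tokens_after_noise("R.E.M.", set())

print(with_single_letter_noise)
print(with_empty_noise_list)
```

If every token of the search term is filtered out, nothing is left to look up, which is the situation the Msg 7619 text describes. With no single letters in the noise list, the three letters survive and can be matched after a Full Population.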