Stop Words
Stop Words are a list of some of the most commonly used words in the English language. When a communication is indexed, Stop Words are ignored and not included in the index; consequently, these words cannot be searched for on their own.
If a Stop Word is used as part of a phrase in a search or policy lexicon entry, Enterprise Archive substitutes an ANY token as a placeholder for that word. Any word that appears in the placeholder's position satisfies it, so a match still occurs as long as the other non-Stop Words in the phrase are present.
The following table lists the 33 words that Enterprise Archive treats as Stop Words and therefore removes from indexing.
a |
an |
and |
are |
as |
at |
be |
but |
by |
for |
if |
in |
into |
is |
it |
no |
not |
of |
on |
or |
such |
that |
the |
their |
then |
there |
these |
they |
this |
to |
was |
will |
with |
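The effect of dropping Stop Words at index time can be illustrated with a minimal sketch. This is not Enterprise Archive's implementation: the `tokenize` and `build_index` functions, and the choice to count Stop Words when numbering positions, are assumptions made only for illustration.

```python
import re

# The 33 Stop Words listed in the table above.
STOP_WORDS = frozenset("""
a an and are as at be but by for if in into is it no not of on or such
that the their then there these they this to was will with
""".split())


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9]+(?:['’][a-z0-9]+)*", text.lower())


def build_index(text):
    """Map each non-Stop Word to the token positions where it occurs.

    Stop Words are skipped, so they never become searchable terms.
    Positions still count the skipped words (an assumption of this sketch).
    """
    index = {}
    for position, token in enumerate(tokenize(text)):
        if token in STOP_WORDS:
            continue
        index.setdefault(token, []).append(position)
    return index


index = build_index("The report is in the archive")
print(index)           # {'report': [1], 'archive': [5]}
print("the" in index)  # False: a Stop Word cannot be searched on its own
```

Because "the", "is", and "in" never enter the index, a search for any of them alone has nothing to match against.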
If the search phrase "are you sure" is used, a match occurs as long as any character, number, or word appears before “you sure”. Valid matches for "are you sure" include the following:
"aren’t you sure"
"boy you sure"
"but you sure"
"I hear you sure" and so on.
Additional Examples
Search Phrase |
"Cannot accept this" |
Because "this" is a Stop Word, the search returns documents containing "Cannot accept ANY." All of the following would be flagged:
“Cannot accept this”
“Cannot accept money”
“Cannot accept anymore,” and so on.
Including ‘this’ is superfluous here: “Cannot accept” is flagged every time it appears in the document, regardless of which word appears after “accept.”
Search Phrase |
"Totally in disbelief" |
Because "in" is a Stop Word, the search returns documents containing "Totally ANY disbelief." All of the following would be flagged:
“Totally in disbelief”
“Totally without disbelief”
“Totally Absolute Disbelief,” and so on.
“Totally disbelief” is not flagged, because the search still requires some word to fill the “ANY” token placeholder between “Totally” and “disbelief”. Including the Stop Word in this case enforces the structure of the phrase, but not the individual word.
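The matching behavior in the examples above can be reproduced with a short sketch. It is an illustration under stated assumptions, not the product's matching engine: `compile_phrase`, `phrase_matches`, the `ANY` constant, and the tokenizer are hypothetical names. The sketch also assumes that every ANY placeholder must be filled by exactly one token, including a placeholder at the end of a phrase.

```python
import re

STOP_WORDS = frozenset("""
a an and are as at be but by for if in into is it no not of on or such
that the their then there these they this to was will with
""".split())

ANY = None  # placeholder that is satisfied by any single token


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9]+(?:['’][a-z0-9]+)*", text.lower())


def compile_phrase(phrase):
    """Replace each Stop Word in the phrase with the ANY placeholder."""
    return [ANY if token in STOP_WORDS else token for token in tokenize(phrase)]


def phrase_matches(phrase, document):
    """Return True if the document contains the phrase, with ANY as a wildcard."""
    pattern = compile_phrase(phrase)
    tokens = tokenize(document)
    # Slide a window the length of the pattern across the document tokens.
    for start in range(len(tokens) - len(pattern) + 1):
        window = tokens[start:start + len(pattern)]
        if all(p is ANY or p == t for p, t in zip(pattern, window)):
            return True
    return False


print(compile_phrase("Totally in disbelief"))                         # ['totally', None, 'disbelief']
print(phrase_matches("Totally in disbelief", "Totally without disbelief"))  # True
print(phrase_matches("Totally in disbelief", "Totally disbelief"))          # False
print(phrase_matches("are you sure", "I hear you sure"))                    # True
```

The placeholder keeps its position in the pattern, which is why "Totally disbelief" fails: no token is available to fill the slot between the two remaining words.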
See also: Stop Words Hit Highlighting behavior in Enterprise Archive.