Regex match entire words only
Use word boundaries:
/\b($word)\b/i
Or if you're searching for "S.P.E.C.T.R.E." like in Sinan Ünür's example:
/(?:\W|^)(\Q$word\E)(?:\W|$)/i
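To see why the second form is needed for a term that ends in punctuation, here is a minimal Python sketch of both patterns. It assumes Python's re.escape as a stand-in for Perl's \Q...\E, and the helper names and sample text are made up for illustration.

```python
import re

def boundary_pattern(word):
    # \b...\b form: relies on a word/non-word transition on both sides.
    return re.compile(r"\b(" + re.escape(word) + r")\b", re.IGNORECASE)

def delimited_pattern(word):
    # (?:\W|^)...(?:\W|$) form: requires a non-word character or a string
    # edge on each side, so it also works when the term ends in punctuation.
    return re.compile(r"(?:\W|^)(" + re.escape(word) + r")(?:\W|$)", re.IGNORECASE)

text = "Agents of S.P.E.C.T.R.E. are everywhere"

# The trailing "." is followed by a space: two non-word characters in a row,
# so there is no \b between them and the boundary form finds nothing.
bounded = boundary_pattern("S.P.E.C.T.R.E.").search(text)

# The delimited form consumes the surrounding characters, so read group(1).
delimited = delimited_pattern("S.P.E.C.T.R.E.").search(text)

print(bounded)
print(delimited.group(1))
```

For ordinary words made only of word characters, the two forms should agree; the difference only shows up when the term itself starts or ends with a non-word character.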
Match only entire words with LIKE?
How about splitting it into four parts:
[MyColumn] Like '% doc %'
OR [MyColumn] Like '% doc'
OR [MyColumn] Like 'doc %'
OR [MyColumn] = 'doc'
Edit: An alternate approach (only for ASCII characters) could be:
'#'+[MyColumn]+'#' like '%[^a-z0-9]doc[^a-z0-9]%'
(You may want to handle any special characters as well.)
It may not seem necessary here, but Full-Text Search and CONTAINS are worth exploring in case they suit your situation better.
See:
- MSDN: [ ] (Wildcard - Character(s) to Match) (Transact-SQL)
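The four-part LIKE can be tried out quickly with an in-memory database. The sketch below uses Python's sqlite3 module purely as a stand-in for SQL Server (an assumption for illustration); the table name and sample rows are hypothetical. SQLite's LIKE does not support the [^a-z0-9] bracket syntax, so only the four-part version is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (MyColumn TEXT)")
rows = ["doc", "a doc here", "ends with doc", "doc starts", "documents", "mydoc"]
conn.executemany("INSERT INTO docs VALUES (?)", [(r,) for r in rows])

# One branch per position of the word: middle, end, start, or the whole value.
query = """
    SELECT MyColumn FROM docs
    WHERE MyColumn LIKE '% doc %'
       OR MyColumn LIKE '% doc'
       OR MyColumn LIKE 'doc %'
       OR MyColumn = 'doc'
    ORDER BY MyColumn
"""
matches = [row[0] for row in conn.execute(query)]
conn.close()

print(matches)
```

Note that this version only treats spaces as word separators, so a value such as "doc, then" would be missed; that is the gap the bracket-based alternative tries to close.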
Make MySQL LIKE Return Full Word matches Only
You can use RLIKE, the regular expression version of LIKE, to get more flexibility with your matching.
SELECT ID from VideoGames
WHERE Title RLIKE "[[:<:]]GI[[:>:]]" AND Title RLIKE "[[:<:]]JOE[[:>:]]"
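The [[:<:]] and [[:>:]] markers are word boundaries marking the start and end of a word respectively. You could build a single regex rather than the AND, but I have made this match your original question.
Note that these markers are not supported from MySQL 8.0.4 onward, which uses the ICU regex library; use \\b there instead.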
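As a rough illustration of the "single regex" idea, the Python sketch below compares two separate whole-word checks with one combined pattern that accepts the words in either order. Python's \b stands in for the MySQL word-boundary markers, re.IGNORECASE is an assumption meant to mimic a case-insensitive collation, and the sample titles are invented.

```python
import re

titles = ["GI JOE: Rise", "JOE and the GI", "GIJOE", "GI only", "Magi Joey"]

def has_both_separately(title):
    # Mirrors the two RLIKE conditions joined by AND.
    return (re.search(r"\bGI\b", title, re.IGNORECASE) is not None
            and re.search(r"\bJOE\b", title, re.IGNORECASE) is not None)

# One pattern covering both orders, using only alternation (no lookaheads).
SINGLE = re.compile(r"\bGI\b.*\bJOE\b|\bJOE\b.*\bGI\b", re.IGNORECASE)

def has_both_single(title):
    return SINGLE.search(title) is not None

separate_hits = [t for t in titles if has_both_separately(t)]
single_hits = [t for t in titles if has_both_single(t)]

print(separate_hits)
print(single_hits)
```

The combined pattern grows quickly as more words are added (one branch per ordering), which is one reason to keep the separate AND conditions.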
how to search for specific whole words within a string , via SQL, compatible with both HIVE/IMPALA
You can add the word boundary \\b to match only exact words:
rlike '(?i)\\bFECHADO\\b|\\bCIERRE\\b|\\bCLOSED\\b'
(?i) means case-insensitive, so there is no need to use UPPER.
The last alternative in your regex pattern is REVISTO. NORMAL.
If the dots in it should be literal dots, use \\.
Like this: REVISTO\\. NORMAL\\.
A dot in a regexp means any character and must be escaped with two backslashes to match a dot literally.
The above regex works in Hive. Unfortunately, I have no Impala to test it on.
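The effect of escaping the dots can be checked outside Hive. The following is a small Python sketch, not Hive itself: in a Python raw string a single backslash is enough, whereas the Hive string literal above needs two. The sample strings are made up.

```python
import re

# Same alternation as above, with the dots in the last alternative escaped.
escaped = re.compile(r"(?i)\bFECHADO\b|\bCIERRE\b|\bCLOSED\b|\bREVISTO\. NORMAL\.")

# For comparison: unescaped dots match any character in those positions.
unescaped = re.compile(r"(?i)\bREVISTO. NORMAL.")

print(bool(escaped.search("status: closed")))       # whole word, any case
print(bool(escaped.search("DISCLOSED")))            # CLOSED is only part of a word
print(bool(escaped.search("Revisto. Normal.")))     # literal dots present
print(bool(escaped.search("REVISTOX NORMALX")))     # no literal dots
print(bool(unescaped.search("REVISTOX NORMALX")))   # "." matches the X
```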
Regex.Match whole words
You should add the word boundary to your regex:
\b(shoes|shirt|pants)\b
In code:
Regex.Match(content, @"\b(shoes|shirt|pants)\b");
Regular expression for match all words without numbers
You can use this regex:
/\b[^\d\W]+\b/g
to match all words with no digits.
[^\d\W] matches any character that is neither a digit nor a non-word character, i.e. a word character other than a digit (a letter or an underscore).
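A quick way to check what this pattern accepts is to run it over a mixed sample. The sketch below uses Python's re.findall as the counterpart of the /g flag; the sample text is invented for illustration.

```python
import re

# \b on both sides means the whole word must be digit-free:
# "a1b" is skipped entirely rather than yielding "a" and "b".
WORD_WITHOUT_DIGITS = re.compile(r"\b[^\d\W]+\b")

text = "abc 123 a1b foo_bar x"
words = WORD_WITHOUT_DIGITS.findall(text)

print(words)
```

The underscore counts as a word character, so foo_bar is kept as a single word; if that is not wanted, a class such as [A-Za-z] may be a better fit for ASCII-only input.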
Search for “whole word match” with SQL Server LIKE pattern
Full-text indexes are the answer.
The poor cousin alternative is:
'.' + column + '.' LIKE '%[^a-z]pit[^a-z]%'
FYI, unless you are using a _CS (case-sensitive) collation, there is no need for a-zA-Z; a-z is enough.
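The padding trick can be emulated in Python to see why the extra '.' on each side matters: it gives the [^a-z] classes something to match when the word sits at the very start or end of the value. This is only a sketch of the idea, with a hypothetical helper name; lowercasing both sides is an assumption standing in for a case-insensitive collation.

```python
import re

def contains_whole_word(value, word):
    # Pad with a non-letter on both sides, as in '.' + column + '.'.
    padded = ("." + value + ".").lower()
    pattern = r"[^a-z]" + re.escape(word.lower()) + r"[^a-z]"
    return re.search(pattern, padded) is not None

for sample in ["pit", "The pit is deep", "Pit!", "spit", "pitch", "armpit", "pit2"]:
    print(sample, contains_whole_word(sample, "pit"))
```

Because the class only excludes letters, a value like "pit2" still counts as a match; use [^a-z0-9] if digits should be treated as part of a word.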
regex match whole word and punctuation with it using re.search()
There are two issues here.
- In regex, . is special. It means "match one of any character". However, you are trying to use it to match a regular period. (It will indeed match that, but it will also match everything else.) Instead, to match a period, you need to use the pattern \. And to change that to match either a period or a hyphen, you can use a class, like [-.]
- You are using \b at the end of your pattern to match the word boundary, but \b is defined as being the boundary between a word character and a non-word character, and periods and spaces are both non-word characters. This means that Python won't find a match. Instead, you could use a lookahead assertion, which will match whatever character you want, but won't consume the string.
Now, to match a whole word - any word - you can do something like \w+, which matches one or more word characters.
Also, it is quite possible that there won't be a match anyway, so you should check whether a match occurred using an if statement or a try statement. Putting it all together:
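import re

txt = "The indian in. Spain."
pattern = r"\w+[-.]"
# \b anchors the start of the word; (?=\W) requires a non-word character next without consuming it
x = re.search(r"\b" + pattern + r"(?=\W)", txt)
if x:
    print(x.start(), x.end())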
Edit
There is one problem with the lookahead assertion above - it won't match the end of the string. This means that if your text is The rain in Spain. then it won't match Spain., as there is no non-word character following the final period.
To fix this, you can use a negative lookahead assertion, which matches when the following text does not include the pattern, and also does not consume the string.
x = re.search(r"\b" + pattern + r"(?!\w)", txt)
This will match when the character after the word is anything other than a word character, including the end of the string.
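The difference between the two assertions can be seen side by side on the end-of-string case. This is a small sketch with a hypothetical helper name, using the text from the edit above.

```python
import re

def find_punctuated_word(txt, lookahead):
    # A word followed by "-" or ".", then the given lookahead assertion.
    return re.search(r"\b\w+[-.]" + lookahead, txt)

txt = "The rain in Spain."

# Positive lookahead needs an actual non-word character after the period.
positive = find_punctuated_word(txt, r"(?=\W)")

# Negative lookahead only needs "no word character next", which the end
# of the string satisfies.
negative = find_punctuated_word(txt, r"(?!\w)")

print(positive)
if negative:
    print(negative.group(), negative.start(), negative.end())
```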