Regular Expressions Inside SQL Server

Using RegEx in SQL Server

You do not need to interact with managed code, as you can use LIKE:

CREATE TABLE #Sample(Field varchar(50), Result varchar(50))
GO
INSERT INTO #Sample (Field, Result) VALUES ('ABC123 ', 'Do not match')
INSERT INTO #Sample (Field, Result) VALUES ('ABC123.', 'Do not match')
INSERT INTO #Sample (Field, Result) VALUES ('ABC123&', 'Match')
SELECT * FROM #Sample WHERE Field LIKE '%[^a-z0-9 .]%'
GO
DROP TABLE #Sample

As your expression ends with + you can go with '%[^a-z0-9 .][^a-z0-9 .]%'

EDIT:

To make it clear: SQL Server doesn't support regular expressions without managed code. Depending on the situation, the LIKE operator can be an option, but it lacks the flexibility that regular expressions provides.

Regular expressions inside SQL Server

stored value in DB is: 5XXXXXX [where x can be any digit]

You don't mention data types - if numeric, you'll likely have to use CAST/CONVERT to change the data type to [n]varchar.

Use:

WHERE CHARINDEX(column, '5') = 1
AND CHARINDEX(column, '.') = 0 --to stop decimals if needed
AND ISNUMERIC(column) = 1

References:

  • CHARINDEX
  • ISNUMERIC

i have also different cases like XXXX7XX for example, so it has to be generic.

Use:

WHERE PATINDEX('%7%', column) = 5
AND CHARINDEX(column, '.') = 0 --to stop decimals if needed
AND ISNUMERIC(column) = 1

References:

  • PATINDEX

Regex Support

SQL Server 2000+ supports regex, but the catch is you have to create the UDF function in CLR before you have the ability. There are numerous articles providing example code if you google them. Once you have that in place, you can use:

  • 5\d{6} for your first example
  • \d{4}7\d{2} for your second example

For more info on regular expressions, I highly recommend this website.

SQL Regex Expression

select *
from
(
values ('ABC-12345'), ('AB-123'), ('A-12345'), ('A1B-1234'), ('ABC12345'), ('xyz-12'), ('klmno-12345'), ('abc.12345')
) as t(v)
where v like '[A-Z]%-%[0-9]' --values start with letter & contain a hyphen & end with a digit
--left part
and len(substring(v, 1, charindex('-', v)-1)) between 1 and 3 --the left part is 1-3 chars long
and substring(v, 1, charindex('-', v)-1) not like '%[^A-Z]%'--the left part has only letters
--right part
and len(stuff(v, 1, charindex('-', v), '')) between 3 and 5 --right part is 3-5 chars long
and stuff(v, 1, charindex('-', v), '') not like '%[^0-9]%'; --and it contains only digits

MSSQL Regular expression

This is what I have used in the end:

SELECT *, 
CASE WHEN [url] NOT LIKE '%[^-A-Za-z0-9/.+$]%'
THEN 'Valid'
ELSE 'No valid'
END [Validate]
FROM
*table*
ORDER BY [Validate]

Complex RegEx in T-SQL

I would recommend writing a user defined function using SQLCLR. Since .Net supports Regex you can port it to T-SQL. First link in Google gave this implementation, but there may be other (better) implementations.

Caveat - use of SQLCLR requires elevated permissions and may lead to security issues or performance issues or even issues with stability of the SQL Server if not implemented correctly. But if you know what you are doing this may lead to significant enhancements of T-SQL specific for your use cases.

SQL Server 2016 How to use a simple Regular Expression in T-SQL?

First, case sensitivity depends on the collation of the DB, though with LIKE you can specify case comparisons. With that... here is some Boolean logic to take care of the cases you stated. Though, you may need to add additional clauses if you discover some bogus input.

declare @table table (Person varchar(64), is_correct_format varchar(3) default 'NO')
insert into @table (Person)
values
('LowerCase, Here'),
('CORRECTLY, FORMATTED'),
('CORRECTLY,FORMATTEDTWO'),
('ONLY FIRST UPPER, LowerLast'),
('WEGOT, FormaNUMB3RStted'),
('NoComma Formatted'),
('CORRECTLY, TWOCOMMA, A'),
(',COMMA FIRST'),
('COMMA LAST,'),
('SPACE BEFORE COMMA , GOOD'),
(' SPACE AT BEGINNING, GOOD')

update @table
set is_correct_format = 'YES'
where
Person not like '%[^A-Z, ]%' --check for non characters, excluding comma and spaces
and len(replace(Person,' ','')) = len(replace(replace(Person,' ',''),',','')) + 1 --make sure there is only one comma
and charindex(',',Person) <> 1 --make sure the comma isn't at the beginning
and charindex(',',Person) <> len(Person) --make sure the comma isn't at the end
and substring(Person,charindex(',',Person) - 1,1) <> ' ' --make sure there isn't a space before comma
and left(Person,1) <> ' ' --check preceeding spaces
and UPPER(Person) = Person collate Latin1_General_CS_AS --check collation for CI default (only upper cases)

select * from @table

Translating regex into LIKE statements for SQL Server

If you really have to do this with pure T-SQL, you may write something like this:

SELECT
id,
pcode,
CASE
WHEN pcode LIKE '[456]%' THEN CAST(SUBSTRING(pcode, 1, 1) AS INT)
WHEN pcode LIKE REPLICATE('[123ABC]', CHARINDEX('4', pcode) - 1) + '4%' THEN 4
WHEN pcode LIKE REPLICATE('[123ABC]', CHARINDEX('5', pcode) - 1) + '5%' THEN 5
WHEN pcode LIKE REPLICATE('[123ABC]', CHARINDEX('6', pcode) - 1) + '6%' THEN 6
ELSE 0
END AS desired_val
FROM inventory;

The trick is to use CHARINDEX() to get the position of the first occurrence of a [456] char, and then use REPLICATE() to repeat the [123ABC] pattern a number of times equal to the number of characters that precede the desired character. So, in the first example, where the '5' char is found at the fifth position, the constructed pattern will be:

[123ABC][123ABC][123ABC][123ABC]5%


Related Topics



Leave a reply



Submit