Perform Regex (Replace) in an SQL Query

Regex pattern inside SQL Replace function?

You can use PATINDEX
to find the first index of the pattern (string's) occurrence. Then use STUFF to stuff another string into the pattern(string) matched.

Loop through each row. Replace each illegal characters with what you want. In your case replace non numeric with blank. The inner loop is if you have more than one illegal character in a current cell that of the loop.

DECLARE @counter int

SET @counter = 0

WHILE(@counter < (SELECT MAX(ID_COLUMN) FROM Table))
BEGIN

WHILE 1 = 1
BEGIN
DECLARE @RetVal varchar(50)

SET @RetVal = (SELECT Column = STUFF(Column, PATINDEX('%[^0-9.]%', Column),1, '')
FROM Table
WHERE ID_COLUMN = @counter)

IF(@RetVal IS NOT NULL)
UPDATE Table SET
Column = @RetVal
WHERE ID_COLUMN = @counter
ELSE
break
END

SET @counter = @counter + 1
END

Caution: This is slow though! Having a varchar column may impact. So using LTRIM RTRIM may help a bit. Regardless, it is slow.

Credit goes to this StackOverFlow answer.

EDIT
Credit also goes to @srutzky

Edit (by @Tmdean)
Instead of doing one row at a time, this answer can be adapted to a more set-based solution. It still iterates the max of the number of non-numeric characters in a single row, so it's not ideal, but I think it should be acceptable in most situations.

WHILE 1 = 1 BEGIN
WITH q AS
(SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n
FROM Table)
UPDATE Table
SET Column = STUFF(Column, q.n, 1, '')
FROM q
WHERE Table.ID_Column = q.ID_Column AND q.n != 0;

IF @@ROWCOUNT = 0 BREAK;
END;

You can also improve efficiency quite a lot if you maintain a bit column in the table that indicates whether the field has been scrubbed yet. (NULL represents "Unknown" in my example and should be the column default.)

DECLARE @done bit = 0;
WHILE @done = 0 BEGIN
WITH q AS
(SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n
FROM Table
WHERE COALESCE(Scrubbed_Column, 0) = 0)
UPDATE Table
SET Column = STUFF(Column, q.n, 1, ''),
Scrubbed_Column = 0
FROM q
WHERE Table.ID_Column = q.ID_Column AND q.n != 0;

IF @@ROWCOUNT = 0 SET @done = 1;

-- if Scrubbed_Column is still NULL, then the PATINDEX
-- must have given 0
UPDATE table
SET Scrubbed_Column = CASE
WHEN Scrubbed_Column IS NULL THEN 1
ELSE NULLIF(Scrubbed_Column, 0)
END;
END;

If you don't want to change your schema, this is easy to adapt to store intermediate results in a table valued variable which gets applied to the actual table at the end.

Perform regex (replace) in an SQL query

Some hacking required but we can do this with LIKE, PATINDEX, LEFT AND RIGHT and good old string concatenation.

create table test
(
id int identity(1, 1) not null,
val varchar(25) not null
)

insert into test values ('< <- ok, < <- nok')

while 1 = 1
begin
update test
set val = left(val, patindex('%<[^;]%', val) - 1) +
'<' +
right(val, len(val) - patindex('%<[^;]%', val) - 2)
from test
where val like '%<[^;]%'

IF @@ROWCOUNT = 0 BREAK
end

select * from test

Better is that this is SQL Server version agnostic and should work just fine.

RegEx for multiline search and replace in SQL query code

We can make this work using a regex replace like this:

(AND\ +op\.(OP\d0?OPVerfahren)\ *>\ *0\s+AND\ +p\.Testzwecke\ *=\ *0\s+AND\ +NOT\ +EXISTS\ *\(SELECT\ +DISTINCT\ +1\ +FROM\ +ods[0123][0-9]\.dat_optherapie\ +op2\ +WHERE\ +op2\.patID\ *=\ *p\.ID\ +AND\ +op2\.revision\ *>\ *op\.revision\))(\s+UNION\s+ALL)

Demo

It's sticks rather tight to the original string and mostly only introduces variable-length quantifiers for whitespace characters. When there is a \ * an optional space may occur, if the space is mandatory \ + is used. Otherwise the whitespace shorthand character \s is used to allow not only spaces but newlines and alike. To make it work, enable the s|singleline flag (or add (?s) in front of the pattern).

Replace string using RegEx?

It seems from the documentation https://learn.microsoft.com/en-us/sql/t-sql/functions/replace-transact-sql?view=sql-server-ver15 that replace is not to be used with regular expression.

string_pattern

Is the substring to be found. string_pattern can be of a character or binary data type. string_pattern cannot be an empty string (''), and must not exceed the maximum number of bytes that fits on a page.

Alternative solutions here: Regex pattern inside SQL Replace function?

SQL Server 2014 replace with regex

First of all you need this user defined function to search for replacing a pattern with string:

CREATE FUNCTION dbo.PatternReplace
(
@InputString VARCHAR(4000),
@Pattern VARCHAR(100),
@ReplaceText VARCHAR(4000)
)
RETURNS VARCHAR(4000)
AS
BEGIN
DECLARE @Result VARCHAR(4000) SET @Result = ''
-- First character in a match
DECLARE @First INT
-- Next character to start search on
DECLARE @Next INT SET @Next = 1
-- Length of the total string -- 8001 if @InputString is NULL
DECLARE @Len INT SET @Len = COALESCE(LEN(@InputString), 8001)
-- End of a pattern
DECLARE @EndPattern INT

WHILE (@Next <= @Len)
BEGIN
SET @First = PATINDEX('%' + @Pattern + '%', SUBSTRING(@InputString, @Next, @Len))
IF COALESCE(@First, 0) = 0 --no match - return
BEGIN
SET @Result = @Result +
CASE --return NULL, just like REPLACE, if inputs are NULL
WHEN @InputString IS NULL
OR @Pattern IS NULL
OR @ReplaceText IS NULL THEN NULL
ELSE SUBSTRING(@InputString, @Next, @Len)
END
BREAK
END
ELSE
BEGIN
-- Concatenate characters before the match to the result
SET @Result = @Result + SUBSTRING(@InputString, @Next, @First - 1)
SET @Next = @Next + @First - 1

SET @EndPattern = 1
-- Find start of end pattern range
WHILE PATINDEX(@Pattern, SUBSTRING(@InputString, @Next, @EndPattern)) = 0
SET @EndPattern = @EndPattern + 1
-- Find end of pattern range
WHILE PATINDEX(@Pattern, SUBSTRING(@InputString, @Next, @EndPattern)) > 0
AND @Len >= (@Next + @EndPattern - 1)
SET @EndPattern = @EndPattern + 1

--Either at the end of the pattern or @Next + @EndPattern = @Len
SET @Result = @Result + @ReplaceText
SET @Next = @Next + @EndPattern - 1
END
END
RETURN(@Result)
END

Read more here

After creating this function you can try this:

DECLARE @x VARCHAR(max)
SET @x = 'sample text min(my value) continue with sample text with ) and ('

DECLARE @val VARCHAR(max)

SET @val = SUBSTRING(SUBSTRING(@x,CHARINDEX('min(',@x)+4,LEN(@x)-CHARINDEX('min(',@x)),1,CHARINDEX(')',@x)-(CHARINDEX('min(',@x)+4))

SELECT REPLACE(dbo.PatternReplace(@x,'%min(','min(max('),'min(max('+@val+')','min(max('+@val+'))')

And you can see that the output is:

Sample Image

Occurrence of min(xxx) more than once:

Sample Image

Finally you can simply update your table as below:

UPDATE  YourTable
SET YourColumn = REPLACE(dbo.PatternReplace(YourColumn, '%min(', 'min(max('),
'min(max(' + SUBSTRING(SUBSTRING(YourColumn,CHARINDEX('min(', YourColumn)+ 4,LEN(YourColumn)- CHARINDEX('min(',YourColumn)), 1,CHARINDEX(')', YourColumn)- ( CHARINDEX('min(', YourColumn) + 4 ))+ ')',
'min(max(' + SUBSTRING(SUBSTRING(YourColumn,CHARINDEX('min(', YourColumn)+ 4,LEN(YourColumn)- CHARINDEX('min(',YourColumn)), 1,CHARINDEX(')', YourColumn)- ( CHARINDEX('min(', YourColumn) + 4 ))+ '))');


Related Topics



Leave a reply



Submit