Using PATINDEX to Find Varying Length Patterns in T-SQL

I blogged about this a while ago: Extracting numbers with SQL Server

Declare @Temp Table(Data VarChar(100))

Insert Into @Temp Values('some text 456.09 other text')
Insert Into @Temp Values('even more text 98273.453 la la la')
Insert Into @Temp Values('There are no numbers in this one')

Select Left(
SubString(Data, PatIndex('%[0-9.-]%', Data), 8000),
PatIndex('%[^0-9.-]%', SubString(Data, PatIndex('%[0-9.-]%', Data), 8000) + 'X')-1)
From @Temp
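
The query above returns an empty string for the row without numbers. If you would rather get NULL there, the same idea can be split into named steps with CROSS APPLY. This is only a sketch under that assumption; the aliases s, r, StartPos and Rest are made up for illustration. The appended 'X' is the same sentinel as above, so the second PATINDEX always finds a non-numeric character.

```sql
DECLARE @Temp TABLE (Data varchar(100));

INSERT INTO @Temp VALUES
    ('some text 456.09 other text'),
    ('There are no numbers in this one');

SELECT t.Data,
       -- the CASE has no ELSE, so rows without any numeric character yield NULL
       CASE WHEN s.StartPos > 0
            THEN LEFT(r.Rest, PATINDEX('%[^0-9.-]%', r.Rest + 'X') - 1)
       END AS Number
FROM @Temp AS t
-- position of the first digit, dot or minus sign (0 if there is none)
CROSS APPLY (SELECT PATINDEX('%[0-9.-]%', t.Data) AS StartPos) AS s
-- everything from that position to the end of the string
CROSS APPLY (SELECT SUBSTRING(t.Data, s.StartPos, 8000) AS Rest) AS r;
```

Keep in mind that the character class also matches a lone dot or hyphen in ordinary text, so this is only as clean as your data.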

SQL SUBSTRING & PATINDEX of varying lengths

Here is one way:

DECLARE  @test TABLE( fileString varchar(500))

INSERT INTO @test VALUES
('_318_CA_DCA_2020_12_11-01_00_01_VM6.log')
,('_319_CA_DCA_2020_12_12-01_00_01_VM17.log')
,('_333_KF_DCA_2020_12_15-01_00_01_VM232.log')

-- 5 = length of the file extension '.log' (4 characters) + 1; the extension is always the same size
SELECT
REVERSE(SUBSTRING(REVERSE(fileString),5,CHARINDEX('_',REVERSE(fileString))-5))
FROM @test AS t
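
If the extension is not always four characters, a variant that locates the last underscore and the last dot might be safer. A minimal sketch, assuming every file name contains at least one '_' followed later by a '.'; the '.json' row and the aliases p, LastUnderscore and LastDot are hypothetical.

```sql
DECLARE @test TABLE (fileString varchar(500));

INSERT INTO @test VALUES
    ('_318_CA_DCA_2020_12_11-01_00_01_VM6.log'),
    ('_333_KF_DCA_2020_12_15-01_00_01_VM232.json');  -- hypothetical longer extension

SELECT t.fileString,
       -- take everything between the last '_' and the last '.'
       SUBSTRING(t.fileString,
                 p.LastUnderscore + 1,
                 p.LastDot - p.LastUnderscore - 1) AS VmName
FROM @test AS t
-- searching the reversed string finds the LAST occurrence; convert back to a normal position
CROSS APPLY (SELECT LEN(t.fileString) - CHARINDEX('_', REVERSE(t.fileString)) + 1 AS LastUnderscore,
                    LEN(t.fileString) - CHARINDEX('.', REVERSE(t.fileString)) + 1 AS LastDot) AS p;
```

With these two rows the VmName column should come out as VM6 and VM232.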

Extracting a string using SQL PATINDEX, substring of varying sizes

If you're dealing with a pair of numeric values in dirty data and lack the power of regex, here's what you can do in T-SQL.

Essentially, it looks like you want to break the string in half at 'x', then whittle down each half until only the numeric values remain. Using a set of derived tables, this becomes relatively easy (and not too hard to read):

declare @placements table (Placement varchar(10))
insert into @placements values
('720x60'),
('720x600'),
('720 x 60'),
('720_x_60'),
('1x1')

SELECT LEFT(LeftOfX, PATINDEX('%[^0-9]%', LeftOfX) - 1)
       + 'x'
       + RIGHT(RightOfX, LEN(RightOfX) - PATINDEX('%[0-9]%', RightOfX) + 1)
FROM (
    -- trim leading junk from the left half and trailing junk from the right half
    SELECT RIGHT(LeftOfX, LEN(LeftOfX) - PATINDEX('%[0-9]%', LeftOfX) + 1) AS LeftOfX,
           LEFT(RightOfX, LEN(RightOfX) - PATINDEX('%[0-9]%', REVERSE(RightOfX)) + 1) AS RightOfX
    FROM (
        -- split at 'x'; both halves keep the 'x' itself
        SELECT LEFT(p.Placement, x) AS LeftOfX,
               RIGHT(p.Placement, LEN(p.Placement) - x + 1) AS RightOfX
        FROM (
            SELECT p.Placement,
                   CHARINDEX('x', p.Placement) AS x
            FROM @placements p
        ) p
    ) p
) p

Here's the SQLFiddle example.

First, select your placement, the location of the 'x' in Placement, and any other columns you want from the table. Pass the other columns up through the derived tables.

Next, split the string into a left and a right half.

Process the halves in two more queries: the first keeps each half from its numeric portion onward, the second cuts the result off where the non-numeric portion begins.

EDIT: Fixed the outputs, both numbers now selected.
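
As an alternative to the split-and-trim steps described above, the two numbers can also be read in order without looking for the 'x' at all. This is a different technique from the answer's, sketched here with CROSS APPLY; all aliases (t1, n1, r, t2, n2, Tail, Rest, Num) are made up, and it assumes each value contains exactly two runs of digits.

```sql
DECLARE @placements TABLE (Placement varchar(10));

INSERT INTO @placements VALUES
    ('720x60'), ('720 x 60'), ('720_x_60'), ('1x1');

SELECT p.Placement,
       n1.Num + 'x' + n2.Num AS Cleaned
FROM @placements AS p
-- first number: from the first digit up to the next non-digit
CROSS APPLY (SELECT SUBSTRING(p.Placement, PATINDEX('%[0-9]%', p.Placement), 8000) AS Tail) AS t1
CROSS APPLY (SELECT LEFT(t1.Tail, PATINDEX('%[^0-9]%', t1.Tail + 'X') - 1) AS Num) AS n1
-- second number: the same extraction on whatever follows the first number
CROSS APPLY (SELECT SUBSTRING(t1.Tail, LEN(n1.Num) + 1, 8000) AS Rest) AS r
CROSS APPLY (SELECT SUBSTRING(r.Rest, PATINDEX('%[0-9]%', r.Rest), 8000) AS Tail) AS t2
CROSS APPLY (SELECT LEFT(t2.Tail, PATINDEX('%[^0-9]%', t2.Tail + 'X') - 1) AS Num) AS n2;
```

For the four sample rows this should give 720x60 three times and 1x1 once. Whether it is easier to read than the derived tables is a matter of taste.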

PATINDEX to detect letters and six numbers

Simply add a where clause:

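update table a
-- PATINDEX needs the % wildcards too; without them the pattern has to match the whole string
-- PATINDEX returns the position of the 'p', so check that + 6 lands on the first digit in your data
set number = SUBSTRING(name, PATINDEX('%py/u/[0-9]%', name) + 6, 6)
where name like '%py/u/[0-9]%'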
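
To see what the wildcards do to PATINDEX, here is a small check you can run on its own. The value in @name is made up; with this particular value the digits start 5 characters after the match, so adjust the offset to your real data.

```sql
DECLARE @name varchar(50) = 'copy/u/123456.txt';  -- hypothetical sample value

SELECT PATINDEX('py/u/[0-9]', @name)   AS WithoutWildcards,  -- 0: the pattern would have to match the whole string
       PATINDEX('%py/u/[0-9]%', @name) AS WithWildcards,     -- 3: position of the 'p'
       SUBSTRING(@name, PATINDEX('%py/u/[0-9]%', @name) + 5, 6) AS SixDigits;  -- '123456'
```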

Parsing a URL to find a varying length string

Not quite elaborated, but as a hint to get you started:

patindex = PATINDEX('%SO[0-9]%',URL) -> Index of the start of the pattern
charindex = CHARINDEX('.html', URL, patindex ) -> Index of the first '.html' after the start of the pattern.
patternLen = charindex - patindex

So something like the following may work:

SELECT
CHARINDEX('.html', URL,
PATINDEX('%SO[0-9]%',URL)
) -
PATINDEX('%SO[0-9]%',URL)
FROM ...

Not all of the URLs have this string, and if that's the case then I
still need the query to produce 'Null'.

-> Outer (self) join:

SELECT
allUrls.URL,
CHARINDEX('.html', u.URL, PATINDEX('%SO[0-9]%',u.URL) ) - PATINDEX('%SO[0-9]%', u.URL) -- Same as above
FROM MyTable allUrls
LEFT OUTER JOIN MyTable u
ON allUrls.URL = u.URL
AND u.URL LIKE '%SO[0-9]%'
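
The queries above return the length of the pattern. To get the string itself, and NULL where a URL has no such pattern, something like the following might do without the self join. It is a sketch only: the example.com URLs and the aliases p, StartPos and Code are made up. NULLIF turns a "not found" result of 0 into NULL, and the NULL then propagates through SUBSTRING.

```sql
DECLARE @urls TABLE (URL varchar(200));

INSERT INTO @urls VALUES
    ('http://example.com/orders/SO12345.html'),  -- hypothetical URL with the pattern
    ('http://example.com/about.html');           -- hypothetical URL without it

SELECT u.URL,
       -- length = position of '.html' after the pattern minus the start of the pattern
       SUBSTRING(u.URL,
                 p.StartPos,
                 NULLIF(CHARINDEX('.html', u.URL, p.StartPos), 0) - p.StartPos) AS Code
FROM @urls AS u
-- NULL instead of 0 when the URL does not contain 'SO' followed by a digit
CROSS APPLY (SELECT NULLIF(PATINDEX('%SO[0-9]%', u.URL), 0) AS StartPos) AS p;
```

The first row should yield SO12345 and the second NULL.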

PATINDEX pattern to replace character that is *not* first character

After our talk in the comments above I'd suggest this approach:

DECLARE @Word varchar(8) = 'truth';

DECLARE @toBeReplaced VARCHAR(10)='abeilost';
DECLARE @replaceWith VARCHAR(10)='@83!10$+';

DECLARE @position INT=PATINDEX(CONCAT('%[',@toBeReplaced,']%'),SUBSTRING(@word,2,8000))+1;

SELECT STUFF(@word,@position,1,TRANSLATE(SUBSTRING(@word,@position,1),@toBeReplaced,@replaceWith));

The idea in short:

  • We define your translate parameters.
  • We find the position using PATINDEX() behind the first character.
  • Now we can use STUFF() to replace exactly one character at the given position by its translation.

For next time: it would help a lot if you provided some samples with the expected result.

UPDATE

To use this against a table, you can avoid the declared variable and do it inline:

DECLARE @WordTable TABLE(SomeText varchar(8));
INSERT INTO @WordTable VALUES('truth'),('loveable');

DECLARE @toBeReplaced VARCHAR(10)='abeilost';
DECLARE @replaceWith VARCHAR(10)='@83!10$+';

--the new query

SELECT STUFF(wt.SomeText,pos,1,TRANSLATE(SUBSTRING(wt.SomeText,pos,1),@toBeReplaced,@replaceWith))
FROM @WordTable wt
CROSS APPLY(SELECT PATINDEX(CONCAT('%[',@toBeReplaced,']%'),SUBSTRING(wt.SomeText,2,8000))+1) A(pos);
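
One thing to watch: when no replaceable character follows the first one, PATINDEX returns 0, the position becomes 1, and the first character gets translated after all. A guarded variant could look like this; it is a sketch, with 'tuck' as a made-up test word and hit as a made-up alias. Like the query above, it relies on TRANSLATE, which needs SQL Server 2017 or later.

```sql
DECLARE @WordTable TABLE (SomeText varchar(8));
INSERT INTO @WordTable VALUES ('truth'), ('tuck');  -- 'tuck' has nothing to replace after the first character

DECLARE @toBeReplaced varchar(10) = 'abeilost';
DECLARE @replaceWith  varchar(10) = '@83!10$+';

SELECT wt.SomeText,
       CASE WHEN A.hit = 0
            THEN wt.SomeText  -- no match behind the first character: leave the word alone
            ELSE STUFF(wt.SomeText, A.hit + 1, 1,
                       TRANSLATE(SUBSTRING(wt.SomeText, A.hit + 1, 1), @toBeReplaced, @replaceWith))
       END AS Replaced
FROM @WordTable AS wt
-- hit is relative to the string starting at character 2, hence the + 1 above
CROSS APPLY (SELECT PATINDEX(CONCAT('%[', @toBeReplaced, ']%'), SUBSTRING(wt.SomeText, 2, 8000))) AS A(hit);
```

This should return tru+h for 'truth' and leave 'tuck' unchanged.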

Extract up to three strings of varying length with regex pattern on Microsoft SQL Server

Something like this?

CREATE TABLE table1 (diagnosis varchar(100), diagnosis_1 varchar(10), diagnosis_2 varchar(10), diagnosis_3 varchar(10));
INSERT INTO table1 (diagnosis)
VALUES
('T038MFRACTURE'),
('M719BOCHCM531'),
('F900CF334M75');

--The query uses a recursive CTE first to find the positions of a letter followed by a digit (the start of an ICD code)

WITH PosOfNonNumer AS
(
SELECT diagnosis AS Original
,diagnosis
,1 AS PartIndex
,PATINDEX('%[A-Z][0-9]%',diagnosis) AS PosFound
,SUBSTRING(diagnosis,1,5) AS PartFound
,SUBSTRING(diagnosis,PATINDEX('%[A-Z][0-9]%',diagnosis)+2,1000) AS RestString
FROM table1

UNION ALL

SELECT p.Original
,p.RestString
,p.PartIndex+1
,PATINDEX('%[A-Z][0-9]%',p.RestString) AS PosFound
,SUBSTRING(p.RestString,PATINDEX('%[A-Z][0-9]%',p.RestString),5) AS PartFound
,SUBSTRING(p.RestString,PATINDEX('%[A-Z][0-9]%', p.RestString)+2,1000) AS RestString
FROM PosOfNonNumer AS p
WHERE PATINDEX('%[A-Z][0-9]%',p.RestString)>0
)

--The main query uses conditional aggregation to pivot your results

SELECT Original
,MAX(CASE WHEN PartIndex=1 THEN PartFound END) AS diag1
,MAX(CASE WHEN PartIndex=2 THEN PartFound END) AS diag2
,MAX(CASE WHEN PartIndex=3 THEN PartFound END) AS diag3
,MAX(CASE WHEN PartIndex=4 THEN PartFound END) AS diag4
FROM PosOfNonNumer
GROUP BY Original
GO

--clean-up
--DROP TABLE table1;

The result

Original       diag1  diag2  diag3  diag4
F900CF334M75   F900C  F334M  M75    NULL
M719BOCHCM531  M719B  M531   NULL   NULL
T038MFRACTURE  T038M  NULL   NULL   NULL

You will still have to cut away the surplus characters at the end of each part (the trailing C in F900C, for example)... Hope you can manage this yourself...
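
One possible way to do that trimming, assuming a code is always one letter followed by digits only (real ICD codes may not be that simple, so treat this as an assumption). The table @parts and the column Trimmed are made up; the values are the parts from the result above.

```sql
DECLARE @parts TABLE (PartFound varchar(5));
INSERT INTO @parts VALUES ('F900C'), ('F334M'), ('M75'), ('T038M');

SELECT PartFound,
       -- look for the first non-digit AFTER the leading letter; the 'X' sentinel
       -- guarantees a hit, and that hit's index equals the number of characters to keep
       LEFT(PartFound, PATINDEX('%[^0-9]%', SUBSTRING(PartFound, 2, 8000) + 'X')) AS Trimmed
FROM @parts;
```

This should turn F900C, F334M and T038M into F900, F334 and T038 and leave M75 as it is. The same expression could be wrapped around PartFound in the main query.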


