How to Extract Certain Nth Character from a String in Sql

T-SQL - How to search for nth character in a string

Use SUBSTRING ( expression ,start , length )

SELECT mycolumn 
FROM mytable
WHERE SUBSTRING ( mycolumn ,6 , 2 ) = '73'

Extract text up to the Nth character in a string

You can use the pattern matching function in Postgres.

First figure out a pattern to capture everything up to the fourth > character.

To start your pattern you should create a sub-group that captures non > characters, and one > character:

([^>]*>)

Then capture that four times to get to the fourth instance of >

([^>]*>){4}

Then, you will need to wrap that in a group so that the match brings back all four instances:

(([^>]*>){4})

and put a start of string symbol for good measure to make sure it only matches from the beginning of the String (not in the middle):

^(([^>]*>){4})

Here's a working regex101 example of that!

Once you have the pattern that will return what you want in the first group element (which you can tell at the online regex on the right side panel), you need to select it back in the SQL.

In Postgres, the substring function has an option to use a regex pattern to extract text out of the input using a 'from' statement in the substring.

To finish, put it all together!

select substring(filter_type from '^(([^>]*>){4})')
from filter_table

See a working sqlfiddle here


If you want to match the entire string whenever there are less than four instances of >, use this regular expression:

 ^(([^>]*>){4}|.*)

SQL Server - find nth occurrence in a string

One way (2k8);

select 'abc_1_2_3_4.gif  ' as img into #T
insert #T values ('zzz_12_3_3_45.gif')

;with T as (
select 0 as row, charindex('_', img) pos, img from #T
union all
select pos + 1, charindex('_', img, pos + 1), img
from T
where pos > 0
)
select
img, pos
from T
where pos > 0
order by img, pos

>>>>

img pos
abc_1_2_3_4.gif 4
abc_1_2_3_4.gif 6
abc_1_2_3_4.gif 8
abc_1_2_3_4.gif 10
zzz_12_3_3_45.gif 4
zzz_12_3_3_45.gif 7
zzz_12_3_3_45.gif 9
zzz_12_3_3_45.gif 11

Update

;with T(img, starts, pos) as (
select img, 1, charindex('_', img) from #t
union all
select img, pos + 1, charindex('_', img, pos + 1)
from t
where pos > 0
)
select
*, substring(img, starts, case when pos > 0 then pos - starts else len(img) end) token
from T
order by img, starts

>>>

img starts pos token
abc_1_2_3_4.gif   1 4 abc
abc_1_2_3_4.gif   5 6 1
abc_1_2_3_4.gif   7 8 2
abc_1_2_3_4.gif   9 10 3
abc_1_2_3_4.gif   11 0 4.gif  
zzz_12_3_3_45.gif 1 4 zzz
zzz_12_3_3_45.gif 5 7 12
zzz_12_3_3_45.gif 8 9 3
zzz_12_3_3_45.gif 10 11 3
zzz_12_3_3_45.gif 12 0 45.gif

Get text after second occurrence of a specific character

Call the CHARINDEX function twice:

SELECT SUBSTRING(
code,
NULLIF(CHARINDEX('_', code, NULLIF(CHARINDEX('_', code), 0) + 1), 0) + 1,
LEN(code)
)
FROM (VALUES
('a_b_c'),
('a_b')
) x(code)

Extracting nth field of string delimited by : stored in a SQL column

Perhaps a little XML as the parser

Example

Select A.Format
,B.*
From YourTable A
Cross Apply (
Select Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
From (Select Cast('<x>' + replace((Select replace(A.Format,':','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) as xDim) as A
) B

Returns

Format                  Pos2    Pos3    Pos4
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ

Or a Simple version

Select A.Format
,Pos2 = Cast('<x>' + replace(Format,':','</x><x>')+'</x>' as xml).value('/x[2]','varchar(max)')
,Pos3 = Cast('<x>' + replace(Format,':','</x><x>')+'</x>' as xml).value('/x[3]','varchar(max)')
,Pos4 = Cast('<x>' + replace(Format,':','</x><x>')+'</x>' as xml).value('/x[4]','varchar(max)')
From YourTable A

Or if Open to a UDF

Take a peek at TSQL/SQL Server - table function to parse/split delimited string to multiple/separate columns

EDIT - Update for Sample

Select A.Format
,GT = Cast('<x>' + replace(Sample,':','</x><x>')+'</x>' as xml).value('/x[1]','varchar(max)')
,AD = Cast('<x>' + replace(Sample,':','</x><x>')+'</x>' as xml).value('/x[2]','varchar(max)')
,DP = Cast('<x>' + replace(Sample,':','</x><x>')+'</x>' as xml).value('/x[3]','varchar(max)')
,GQ = Cast('<x>' + replace(Sample,':','</x><x>')+'</x>' as xml).value('/x[4]','varchar(max)')
From YourTable A

Extract right of 2nd occurrence of character in string

Answered by Joe J in comments:

If you want it all in one select statement, you could use a horrid combination of CHARINDEX and SUBSTRING. Something like this

select SUBSTRING(addr, charindex(' ', addr, charindex(' ', addr, 1) + 1) + 1,charindex(' ', addr, charindex(' ', addr, charindex(' ', addr, 1) + 1) + 1) - charindex(' ', addr, charindex(' ', addr, 1) + 1) -1) 

will get your numbers between the 2nd and 3rd ' '.

Thanks @JoeJ - how do I modify it to say from 3rd ' ' to first ';' ? – Michael

@Michael,

select SUBSTRING(addr,charindex(' ', addr, charindex(' ', addr, charindex(' ', addr, 1) + 1) + 1) + 1,charindex(';', addr, 1) - charindex(' ', addr, charindex(' ', addr, charindex(' ', addr, 1) + 1) + 1) -1) will do between the third ' ' and the first ';'. 

Like I said, it's pretty horrid though, and is strictly reliant on the spaces etc. being exactly in the same place in every record.

find nth position of a specific character in a string in Postgres

This overall problem is best handled with regexp_replace():

select regexp_replace('Raw:Resin:TPA', '(^.*:.*):', '\1');
regexp_replace
----------------
Raw:ResinTPA
(1 row)

select regexp_replace('Equip:Mold', '(^.*:.*):', '\1');
regexp_replace
----------------
Equip:Mold
(1 row)

select regexp_replace('FG', '(^.*:.*):', '\1');
regexp_replace
----------------
FG
(1 row)

If you want something that finds the nth occurrence of a substring, then something like this could be made into a function:

with invar as (
select 'Raw:Resin:TPA' as a, ':' as d
)
select case
when length(array_to_string((string_to_array(a, d))[1:2], d)) = length(a) then -1
else length(array_to_string((string_to_array(a, d))[1:2], d)) + 1
end
from invar;


Related Topics



Leave a reply



Submit