How to Extract Group from Regular Expression in Oracle

How to extract group from regular expression in Oracle?

The third parameter of the REGEXP_SUBSTR function indicates the position in the target string (de_desc in your example) where you want to start searching. Assuming a match is found in the given portion of the string, it doesn't affect what is returned.

In Oracle 11g, there is a sixth parameter to the function, that I think is what you are trying to use, which indicates the capture group that you want returned. An example of proper use would be:

SELECT regexp_substr('abc[def]ghi', '\[(.+)\]', 1,1,NULL,1) from dual;

Where the last parameter 1 indicate the number of the capture group you want returned. Here is a link to the documentation that describes the parameter.

10g does not appear to have this option, but in your case you can achieve the same result with:

select substr( match, 2, length(match)-2 ) from (
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]') match FROM dual
);

since you know that a match will have exactly one excess character at the beginning and end. (Alternatively, you could use RTRIM and LTRIM to remove brackets from both ends of the result.)

REGEXP_SUBSTR find groups

Modified pattern:

^(\$E\[([A-Z|a-z|0-9|_]+)\]#)(\$D\[(.*)\]#)?$

Use REGEXP_REPLACE with back reference (works on Oracle 10g):

SELECT 
REGEXP_REPLACE('$E[some_key]#$D[some value]#',pattern, '\2') AS MATCH1,
REGEXP_REPLACE('$E[some_key]#$D[some value]#',pattern, '\4') AS MATCH2
FROM DUAL;

On Oracle 11g you can do:

SELECT 
REGEXP_SUBSTR('$E[some_key]#$D[some value]#',pattern, 1,1,null,2) AS MATCH1,
REGEXP_SUBSTR('$E[some_key]#$D[some value]#',pattern, 1,1,null,4) AS MATCH2
FROM DUAL;

Documentation: http://docs.oracle.com/cd/B28359_01/server.111/b28286/functions138.htm

regex oracle sql return all capturing groups

Query 1:

-- Sample data
WITH your_table ( value ) AS (
SELECT 'some stuff TOTAL_SCORE<518>some stuff OTHER_VALUE<456> foo <after>' FROM DUAL
)
-- Query
SELECT REGEXP_REPLACE(
value,
'.*TOTAL_SCORE<(\d{3})>.*OTHER_VALUE<(\d{3})>.*',
'\1,\2'
) As scores
FROM your_table

Output:

SCORES
-------
518,456

Query 2:

-- Sample data
WITH your_table ( value ) AS (
SELECT 'some stuff TOTAL_SCORE<518> TOTAL_SCORE<123> some stuff OTHER_VALUE<456> foo <after>' FROM DUAL
)
-- Query
SELECT l.column_value As scores
FROM your_table t,
TABLE(
CAST(
MULTISET(
SELECT TO_NUMBER(
REGEXP_SUBSTR(
t.value,
'TOTAL_SCORE<(\d{3})>',
1,
LEVEL,
NULL,
1
)
)
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT( t.value, 'TOTAL_SCORE<(\d{3})>' )
) AS SYS.ODCINUMBERLIST
)
) l;

Output:

SCORES
-------
518
123

Oracle SQL regexp_substr non-capturing/optional group

This is a little bit too much to put into a comment so I'll write it here. If it doesn't make sense, I'll just remove it.

If you're always looking for digits in those strings (not related to what surrounds them), then it could be simplified to

SQL> with
2 input_string as
3 (
4 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
5 union all
6 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
7 )
8 select regexp_substr(test_string, '\d+', 1, 1) grp1,
9 regexp_substr(test_string, '\d+', 1, 2) grp2,
10 regexp_substr(test_string, '\d+', 1, 3) grp3,
11 regexp_substr(test_string, '\d+', 1, 4) grp4
12 from input_string;

GRP1 GRP2 GRP3 GRP4
---------- ---------- ---------- ----------
494801 8280955 8336297 494767
494801 8280955 8336297

SQL>

Or, option which doesn't have fixed number of groups (though, layout is different from what you wanted):

SQL> with
2 input_string as
3 (
4 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
5 union all
6 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
7 )
8 select column_value grp_rn,
9 regexp_substr(test_string, '\d+', 1, column_value) grp
10 from input_String cross join
11 table(cast(multiset(select level from dual
12 connect by level <= regexp_count(test_string, '\d+')
13 ) as sys.odcinumberlist));

GRP_RN GRP
------- ----------
1 494801
2 8280955
3 8336297
4 494767
1 494801
2 8280955
3 8336297

7 rows selected.

How to apply `REGEXP_SUBSTR` to extract specific substring from string?

You can use a regex capture group () to get only the part you need from a pattern.

SELECT REGEXP_SUBSTR(sample, '^(.*)-\d+$',1,1,'',1) AS sample2
FROM sample_table


















SAMPLE2
011/2020-PL00-70
011/2020-PL00-70
011/202-PL00-70

Oracle REGEX_REPLACE function and capture groups

The regex is fine, you are confused with the replacement pattern.

Oracle regex engine is POSIX based and replacement patterns only support backreferences from 1 through 9. What can't be parsed as a backreference is parsed as literal text.

Hence, \10-\11-\12 is parsed as Group 1 value, 0-, Group 1 value, 1-, Group 1 value, 2.

Also, see regexp_replace documentation:

The replace_string can contain up to 500 backreferences to subexpressions in the form \n, where n is a number from 1 to 9.

Oracle REGEXP_SUBSTR - extract only matched string

If it doesn't have to be a regular expression solution, use substr + instr combination (result_1).

Or, if it has to be regexp, and if string really looks as you posted it, fetch 4th word from it (result_2).

select trim(substr(col, instr(col, '>', 1, 3) + 1,
instr(col, '>', 1, 4) - instr(col, '>', 1, 3) - 1
)) result_1,
--
regexp_substr(col, '\w+', 1, 4) result_2
from your_table;

Oracle SQL : Extract all matched string in each row

You may use (\$\{.+?\}) - using ? makes it a non-greedy match of one or more characters between {}

() captures the group.

A connect by loop (with PRIOR and SYS_GUID()) with level is used to extract all possible matches in a line.

LISTAGG does the concatenation.

WITH message AS (
SELECT
'ITEM1' AS item,
'Dear ${username}, you have changed your address to ${address}' AS content
FROM
dual
UNION ALL
SELECT
'ITEM2',
'Hi ${username}, thank you for attending this event.'
FROM
dual
)
SELECT item,LISTAGG(content,',') WITHIN GROUP ( ORDER BY lvl )
FROM
(
SELECT
item,
regexp_substr(content,'(\$\{.+?\})',1,level,NULL,1) as content,
level as lvl
FROM
message
CONNECT BY level <= regexp_count(content,'\$\{.+?\}')
AND PRIOR item = item
AND PRIOR sys_guid() IS NOT NULL
) GROUP BY item;

Demo



Related Topics



Leave a reply



Submit