How to extract group from regular expression in Oracle?
The third parameter of the REGEXP_SUBSTR function indicates the position in the target string (de_desc
in your example) where you want to start searching. Assuming a match is found in the given portion of the string, it doesn't affect what is returned.
In Oracle 11g, there is a sixth parameter to the function, that I think is what you are trying to use, which indicates the capture group that you want returned. An example of proper use would be:
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]', 1,1,NULL,1) from dual;
Where the last parameter 1
indicate the number of the capture group you want returned. Here is a link to the documentation that describes the parameter.
10g does not appear to have this option, but in your case you can achieve the same result with:
select substr( match, 2, length(match)-2 ) from (
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]') match FROM dual
);
since you know that a match will have exactly one excess character at the beginning and end. (Alternatively, you could use RTRIM and LTRIM to remove brackets from both ends of the result.)
REGEXP_SUBSTR find groups
Modified pattern:
^(\$E\[([A-Z|a-z|0-9|_]+)\]#)(\$D\[(.*)\]#)?$
Use REGEXP_REPLACE with back reference (works on Oracle 10g):
SELECT
REGEXP_REPLACE('$E[some_key]#$D[some value]#',pattern, '\2') AS MATCH1,
REGEXP_REPLACE('$E[some_key]#$D[some value]#',pattern, '\4') AS MATCH2
FROM DUAL;
On Oracle 11g you can do:
SELECT
REGEXP_SUBSTR('$E[some_key]#$D[some value]#',pattern, 1,1,null,2) AS MATCH1,
REGEXP_SUBSTR('$E[some_key]#$D[some value]#',pattern, 1,1,null,4) AS MATCH2
FROM DUAL;
Documentation: http://docs.oracle.com/cd/B28359_01/server.111/b28286/functions138.htm
regex oracle sql return all capturing groups
Query 1:
-- Sample data
WITH your_table ( value ) AS (
SELECT 'some stuff TOTAL_SCORE<518>some stuff OTHER_VALUE<456> foo <after>' FROM DUAL
)
-- Query
SELECT REGEXP_REPLACE(
value,
'.*TOTAL_SCORE<(\d{3})>.*OTHER_VALUE<(\d{3})>.*',
'\1,\2'
) As scores
FROM your_table
Output:
SCORES
-------
518,456
Query 2:
-- Sample data
WITH your_table ( value ) AS (
SELECT 'some stuff TOTAL_SCORE<518> TOTAL_SCORE<123> some stuff OTHER_VALUE<456> foo <after>' FROM DUAL
)
-- Query
SELECT l.column_value As scores
FROM your_table t,
TABLE(
CAST(
MULTISET(
SELECT TO_NUMBER(
REGEXP_SUBSTR(
t.value,
'TOTAL_SCORE<(\d{3})>',
1,
LEVEL,
NULL,
1
)
)
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT( t.value, 'TOTAL_SCORE<(\d{3})>' )
) AS SYS.ODCINUMBERLIST
)
) l;
Output:
SCORES
-------
518
123
Oracle SQL regexp_substr non-capturing/optional group
This is a little bit too much to put into a comment so I'll write it here. If it doesn't make sense, I'll just remove it.
If you're always looking for digits in those strings (not related to what surrounds them), then it could be simplified to
SQL> with
2 input_string as
3 (
4 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
5 union all
6 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
7 )
8 select regexp_substr(test_string, '\d+', 1, 1) grp1,
9 regexp_substr(test_string, '\d+', 1, 2) grp2,
10 regexp_substr(test_string, '\d+', 1, 3) grp3,
11 regexp_substr(test_string, '\d+', 1, 4) grp4
12 from input_string;
GRP1 GRP2 GRP3 GRP4
---------- ---------- ---------- ----------
494801 8280955 8336297 494767
494801 8280955 8336297
SQL>
Or, option which doesn't have fixed number of groups (though, layout is different from what you wanted):
SQL> with
2 input_string as
3 (
4 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
5 union all
6 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
7 )
8 select column_value grp_rn,
9 regexp_substr(test_string, '\d+', 1, column_value) grp
10 from input_String cross join
11 table(cast(multiset(select level from dual
12 connect by level <= regexp_count(test_string, '\d+')
13 ) as sys.odcinumberlist));
GRP_RN GRP
------- ----------
1 494801
2 8280955
3 8336297
4 494767
1 494801
2 8280955
3 8336297
7 rows selected.
How to apply `REGEXP_SUBSTR` to extract specific substring from string?
You can use a regex capture group ()
to get only the part you need from a pattern.
SELECT REGEXP_SUBSTR(sample, '^(.*)-\d+$',1,1,'',1) AS sample2
FROM sample_table
SAMPLE2 |
---|
011/2020-PL00-70 |
011/2020-PL00-70 |
011/202-PL00-70 |
Oracle REGEX_REPLACE function and capture groups
The regex is fine, you are confused with the replacement pattern.
Oracle regex engine is POSIX based and replacement patterns only support backreferences from 1 through 9. What can't be parsed as a backreference is parsed as literal text.
Hence, \10-\11-\12
is parsed as Group 1 value, 0-
, Group 1 value, 1-
, Group 1 value, 2
.
Also, see regexp_replace
documentation:
The
replace_string
can contain up to 500 backreferences to subexpressions in the form\n
, wheren
is a number from 1 to 9.
Oracle REGEXP_SUBSTR - extract only matched string
If it doesn't have to be a regular expression solution, use substr
+ instr
combination (result_1
).
Or, if it has to be regexp, and if string really looks as you posted it, fetch 4th word from it (result_2
).
select trim(substr(col, instr(col, '>', 1, 3) + 1,
instr(col, '>', 1, 4) - instr(col, '>', 1, 3) - 1
)) result_1,
--
regexp_substr(col, '\w+', 1, 4) result_2
from your_table;
Oracle SQL : Extract all matched string in each row
You may use (\$\{.+?\})
- using ?
makes it a non-greedy match of one or more characters between {}
()
captures the group.
A connect by
loop (with PRIOR
and SYS_GUID()
) with level
is used to extract all possible matches in a line.
LISTAGG
does the concatenation.
WITH message AS (
SELECT
'ITEM1' AS item,
'Dear ${username}, you have changed your address to ${address}' AS content
FROM
dual
UNION ALL
SELECT
'ITEM2',
'Hi ${username}, thank you for attending this event.'
FROM
dual
)
SELECT item,LISTAGG(content,',') WITHIN GROUP ( ORDER BY lvl )
FROM
(
SELECT
item,
regexp_substr(content,'(\$\{.+?\})',1,level,NULL,1) as content,
level as lvl
FROM
message
CONNECT BY level <= regexp_count(content,'\$\{.+?\}')
AND PRIOR item = item
AND PRIOR sys_guid() IS NOT NULL
) GROUP BY item;
Demo
Related Topics
Does Postgres Support Nested or Autonomous Transactions
How to Compare SQLite Timestamp Values
How to Do a Before Updated Trigger with SQL Server
What Can Cause an Oracle Rowid to Change
Separate Comma Separated Values and Store in Table in SQL Server
Convert a String Date into Datetime in Oracle
Case When Statement for Order by Clause
Count Based on Condition in SQL Server
How to Run a Stored Procedure Every Day in SQL Server Express Edition
Oracle SQL - Identify Sequential Value Ranges
Not Null Constraint Over a Set of Columns
Explain Join VS. Left Join and Where Condition Performance Suggestion in More Detail
Disable Rails SQL Logging in Console