ORA-12728: Invalid Range in Regular Expression

ORA-12728: invalid range in regular expression

Oracle regular expressions do not use \ to escape - inside a bracket expression. To match a literal -, put it first, just after the opening bracket (or last, just before the closing bracket):

IF REGEXP_LIKE('--,,::', '[\-,:]*')
...

=> ORA-12728: invalid range in regular expression

If you're curious: when it encounters [\-,:], Oracle reads it as "any character in the range from \ to ,, or the character :". This raises an exception because \ (ASCII 92) comes after , (ASCII 44), and Oracle does not accept a range whose start is greater than its end.
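To see the ordering that triggers the error, you can compare the two character codes directly. This is only a quick check, assuming an ASCII-compatible database character set; the column aliases are made up for illustration.

```sql
-- Compare the codes of the two endpoints of the accidental range in [\-,:].
-- Assumes an ASCII-compatible database character set.
SELECT ASCII('\') AS backslash_code,  -- 92
       ASCII(',') AS comma_code       -- 44
FROM DUAL;
```

Since 92 is greater than 44, the range from \ to , runs backwards, which is what Oracle rejects.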

On the other hand:

IF REGEXP_LIKE('--,,::', '[-,:]*')

Works as expected.
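A small query can confirm this. The sketch below adds ^ and $ anchors, which the original pattern does not have, because [-,:]* on its own also matches the empty string and would therefore succeed for any input. It also tries the hyphen in last position, which POSIX bracket expressions treat as a literal too. The column aliases are illustrative.

```sql
-- Check that a literal hyphen works first or last in a bracket expression.
-- The anchors force the whole string to consist of -, , or : characters.
SELECT CASE WHEN REGEXP_LIKE('--,,::', '^[-,:]*$')
            THEN 'match' ELSE 'no match' END AS hyphen_first,
       CASE WHEN REGEXP_LIKE('--,,::', '^[,:-]*$')
            THEN 'match' ELSE 'no match' END AS hyphen_last
FROM DUAL;
```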


As a side note, [-,:]{0,1} meaning "zero or one occurrence of - or , or :" could be written [-,:]?.

ORA-12728: invalid range in regular expression Inserts

You need to declare the insert_clientes procedure in a package specification that starts with CREATE OR REPLACE PACKAGE INSERTS AS and ends with END INSERTS;. The closing END INSERTS; is also missing from the package body.

SQL> CREATE OR REPLACE PACKAGE INSERTS AS
PROCEDURE insert_clientes(w_DNI_CIF clientes.dni_cif%TYPE,
w_Contrasena clientes.contrasena%TYPE,
w_Telefono clientes.telefono%TYPE,
w_Email clientes.email%TYPE,
w_TipoCliente clientes.tipocliente%TYPE,
w_Nombre clientes.nombre%TYPE,
w_FormaPago clientes.formapago%TYPE,
w_NumeroCuenta clientes.numerocuenta%TYPE,
w_CancelacionesIndebidas clientes.cancelacionesindebidas%TYPE);
END INSERTS;
/
SQL> CREATE OR REPLACE PACKAGE BODY INSERTS AS
PROCEDURE insert_clientes(w_DNI_CIF clientes.dni_cif%TYPE,
w_Contrasena clientes.contrasena%TYPE,
w_Telefono clientes.telefono%TYPE,
w_Email clientes.email%TYPE,
w_TipoCliente clientes.tipocliente%TYPE,
w_Nombre clientes.nombre%TYPE,
w_FormaPago clientes.formapago%TYPE,
w_NumeroCuenta clientes.numerocuenta%TYPE,
w_CancelacionesIndebidas clientes.cancelacionesindebidas%TYPE) IS

BEGIN

INSERT INTO Clientes
(DNI_CIF,
Contrasena,
Telefono,
Email,
TipoCliente,
Nombre,
FormaPago,
NumeroCuenta,
CancelacionesIndebidas)
VALUES
(w_DNI_CIF,
w_Contrasena,
w_Telefono,
w_Email,
w_TipoCliente,
w_Nombre,
w_FormaPago,
w_NumeroCuenta,
w_CancelacionesIndebidas);

END insert_clientes;
END inserts;
/

Then execute it as follows (so there is no problem with the string you provided for w_DNI_CIF):

SQL> exec inserts.insert_clientes('12312389P','12345678',666666666,'una@muno.com','Particular','Miguel de Unamuno','Transferencia','ES7119225879874039280971',0);
PL/SQL procedure successfully completed

SQL> commit;

Commit complete

Find out if a string contains only ASCII characters

I think I will go for one of these two:

IF CONVERT(str, 'US7ASCII') = str THEN
DBMS_OUTPUT.PUT_LINE('Pure ASCII');
END IF;

IF ASCIISTR(REPLACE(str, '\', '/')) = REPLACE(str, '\', '/') THEN
DBMS_OUTPUT.PUT_LINE('Pure ASCII');
END IF;
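The second variant replaces the backslash first because ASCIISTR rewrites a backslash as \005C, so a pure-ASCII string containing \ would otherwise compare as different. A minimal sketch of that check wrapped in a local function follows; is_pure_ascii is a hypothetical name, and the second test assumes the database character set can store é.

```sql
SET SERVEROUTPUT ON
DECLARE
  -- Hypothetical helper, not a built-in: TRUE when str holds only ASCII.
  FUNCTION is_pure_ascii(str IN VARCHAR2) RETURN BOOLEAN IS
    -- ASCIISTR escapes the backslash itself, so swap it for another
    -- ASCII character before comparing.
    safe_str VARCHAR2(32767) := REPLACE(str, '\', '/');
  BEGIN
    RETURN ASCIISTR(safe_str) = safe_str;
  END is_pure_ascii;
BEGIN
  IF is_pure_ascii('C:\temp\file.txt') THEN
    DBMS_OUTPUT.PUT_LINE('Pure ASCII');
  END IF;

  -- UNISTR('\00E9') is e-acute; assumes the database character set
  -- can represent it once converted to VARCHAR2.
  IF NOT is_pure_ascii(UNISTR('caf\00E9')) THEN
    DBMS_OUTPUT.PUT_LINE('Contains non-ASCII characters');
  END IF;
END;
/
```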

Finding and removing Non-ASCII characters from an Oracle Varchar2

In a single-byte ASCII-compatible encoding (e.g. Latin-1), ASCII characters are simply bytes in the range 0 to 127. So you can use something like [\x80-\xFF] to detect non-ASCII characters.
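As far as I know, Oracle's regular expression functions do not accept \x hex escapes, so in Oracle the same range is usually built with CHR. The query below is a sketch under the same single-byte assumption (for example WE8ISO8859P1) and assumes binary range comparison; the sample string is made up.

```sql
-- Strip every byte in the range 128..255 from a string.
-- Assumes a single-byte, ASCII-compatible database character set
-- and binary comparison for bracket ranges (NLS_SORT = BINARY).
SELECT REGEXP_REPLACE(
         'caf' || CHR(233) || ' au lait',          -- CHR(233) is e-acute in Latin-1
         '[' || CHR(128) || '-' || CHR(255) || ']',
         ''
       ) AS ascii_only
FROM DUAL;
```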

Oracle - Extract multiple substrings from a string

Each address in your string appears to be preceded by { and followed by };, so you can use:

SELECT REGEXP_REPLACE(
'"User_1" {user_1@domain.com};"User_2" {user_2@domain.com};"User_3" {user_3@domain.com};"User_4" {user_4@domain.com};',
'.*?\{(.*?)\};',
'\1;'
) AS emails
FROM DUAL;

Output:

EMAILS                                                                
------------------------------------------------------------------------
user_1@domain.com;user_2@domain.com;user_3@domain.com;user_4@domain.com;
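If you would rather have one address per row than a single delimited string, a variation on the same pattern is possible. This is a sketch assuming Oracle 11g or later (for REGEXP_COUNT and the sub-expression argument of REGEXP_SUBSTR); the inline view and alias names are illustrative only.

```sql
-- Return each address between { and } as its own row.
SELECT REGEXP_SUBSTR(str, '\{(.*?)\}', 1, LEVEL, NULL, 1) AS email
FROM (
  SELECT '"User_1" {user_1@domain.com};"User_2" {user_2@domain.com};' AS str
  FROM DUAL
)
-- One row per {...} group found in the string.
CONNECT BY LEVEL <= REGEXP_COUNT(str, '\{.*?\}');
```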

How to find the exact match of a string and replace in Oracle?

Do it in two steps:

  1. First replace the strings with the ids in some wrapper that is not going to appear in your text (e.g. testing maps to ${2})
  2. Then, once all the replacements have been done, replace the wrapped ids with the urls (e.g. ${2} maps to http://localhost/2/<u>testing</u>)
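The two steps can be sketched on a single literal before applying them to the whole table. The word, id and sentence are taken from the sample data; everything else is only an illustration of the idea.

```sql
-- Step 1 (inner call): wrap the matched word's id as ${2}.
-- Step 2 (outer call): expand the wrapped id into the final url.
SELECT REPLACE(
         REGEXP_REPLACE('manual testing', '(^|\W)testing($|\W)', '\1${2}\2'),
         '${2}',
         'http://localhost/2/<u>testing</u>'
       ) AS sentence
FROM DUAL;
```

This returns manual http://localhost/2/<u>testing</u>. Using the placeholder first means that a later word cannot match inside the text of a url that has already been inserted.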

Oracle Setup:

Create table temp(
id NUMBER,
word VARCHAR2(1000),
Sentence VARCHAR2(2000)
);

insert into temp
SELECT 1,'automation testing', 'automtestingation testing is popular kind of testing' FROM DUAL UNION ALL
SELECT 2,'testing','manual testing' FROM DUAL UNION ALL
SELECT 3,'manual testing','this is an old method of testing' FROM DUAL UNION ALL
SELECT 4,'punctuation','automation testing,manual testing,punctuation,automanual testing-testing' FROM DUAL UNION ALL
SELECT 5,'B-number analysis','B-number analysis table' FROM DUAL UNION ALL
SELECT 6,'B-number analysis table','testing B-number analysis' FROM DUAL UNION ALL
SELECT 7,'Not Matched','testing testing testing' FROM DUAL;

Merge:

MERGE INTO temp dst
USING (
WITH ordered_words ( rn, id, word ) AS (
SELECT ROW_NUMBER() OVER ( ORDER BY LENGTH( word ) ASC, word DESC ),
id,
word
FROM temp
),
sentences_with_ids ( rid, sentence, rn ) AS (
SELECT ROWID,
sentence,
( SELECT COUNT(*) + 1 FROM ordered_words )
FROM temp
UNION ALL
SELECT s.rid,
REGEXP_REPLACE(
REGEXP_REPLACE(
s.sentence,
'(^|\W)' || w.word || '($|\W)',
'\1${'|| w.id ||'}\2'
),
'(^|\W)' || w.word || '($|\W)',
'\1${' || w.id || '}\2'
),
s.rn - 1
FROM sentences_with_ids s
INNER JOIN ordered_words w
ON ( s.rn - 1 = w.rn )
),
sentences_with_words ( rid, sentence, rn ) AS (
SELECT rid,
sentence,
( SELECT COUNT(*) + 1 FROM ordered_words )
FROM sentences_with_ids
WHERE rn = 1
UNION ALL
SELECT s.rid,
REPLACE(
s.sentence,
'${' || w.id || '}',
'http://localhost/' || w.id || '/<u>' || w.word || '</u>'
),
s.rn - 1
FROM sentences_with_words s
INNER JOIN ordered_words w
ON ( s.rn - 1 = w.rn )
)
SELECT rid, sentence
FROM sentences_with_words
WHERE rn = 1
) src
ON ( dst.ROWID = src.RID )
WHEN MATCHED THEN
UPDATE
SET sentence = src.sentence;

Output:


ID | WORD | SENTENCE
-: | :---------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | automation testing | automtestingation http://localhost/2/<u>testing</u> is popular kind of http://localhost/2/<u>testing</u>
2 | testing | http://localhost/3/<u>manual testing</u>
3 | manual testing | this is an old method of http://localhost/2/<u>testing</u>
4 | punctuation | http://localhost/1/<u>automation testing</u>,http://localhost/3/<u>manual testing</u>,http://localhost/4/<u>punctuation</u>,automanual http://localhost/2/<u>testing</u>-http://localhost/2/<u>testing</u>
5 | B-number analysis | http://localhost/6/<u>B-number analysis table</u>
6 | B-number analysis table | http://localhost/2/<u>testing</u> http://localhost/5/<u>B-number analysis</u>
7 | Not Matched | http://localhost/2/<u>testing</u> http://localhost/2/<u>testing</u> http://localhost/2/<u>testing</u>



Update:

Escape any special regular expression characters in the words:

MERGE INTO temp dst
USING (
WITH ordered_words ( rn, id, word, regex_safe_word ) AS (
SELECT ROW_NUMBER() OVER ( ORDER BY LENGTH( word ) ASC, word DESC ),
id,
word,
REGEXP_REPLACE( word, '([][)(}{|^$\.*+?])', '\\\1' )
FROM temp
),
sentences_with_ids ( rid, sentence, rn ) AS (
SELECT ROWID,
sentence,
( SELECT COUNT(*) + 1 FROM ordered_words )
FROM temp
UNION ALL
SELECT s.rid,
REGEXP_REPLACE(
REGEXP_REPLACE(
s.sentence,
'(^|\W)' || w.regex_safe_word || '($|\W)',
'\1${'|| w.id ||'}\2'
),
'(^|\W)' || w.regex_safe_word || '($|\W)',
'\1${' || w.id || '}\2'
),
s.rn - 1
FROM sentences_with_ids s
INNER JOIN ordered_words w
ON ( s.rn - 1 = w.rn )
),
sentences_with_words ( rid, sentence, rn ) AS (
SELECT rid,
sentence,
( SELECT COUNT(*) + 1 FROM ordered_words )
FROM sentences_with_ids
WHERE rn = 1
UNION ALL
SELECT s.rid,
REPLACE(
s.sentence,
'${' || w.id || '}',
'http://localhost/' || w.id || '/<u>' || w.word || '</u>'
),
s.rn - 1
FROM sentences_with_words s
INNER JOIN ordered_words w
ON ( s.rn - 1 = w.rn )
)
SELECT rid, sentence
FROM sentences_with_words
WHERE rn = 1
) src
ON ( dst.ROWID = src.RID )
WHEN MATCHED THEN
UPDATE
SET sentence = src.sentence;

Output (this assumes an eighth row, with the word ^[($, has been added to the sample data):


ID | WORD | SENTENCE
-: | :---------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | automation testing | automtestingation http://localhost/2/<u>testing</u> is popular kind of http://localhost/2/<u>testing</u>
2 | testing | http://localhost/3/<u>manual testing</u>
3 | manual testing | this is an old method of http://localhost/2/<u>testing</u>
4 | punctuation | http://localhost/1/<u>automation testing</u>,http://localhost/3/<u>manual testing</u>,http://localhost/4/<u>punctuation</u>,automanual http://localhost/2/<u>testing</u>-http://localhost/2/<u>testing</u>
5 | B-number analysis | http://localhost/6/<u>B-number analysis table</u>
6 | B-number analysis table | http://localhost/2/<u>testing</u> http://localhost/5/<u>B-number analysis</u>
7 | Not Matched | http://localhost/2/<u>testing</u> http://localhost/2/<u>testing</u> http://localhost/2/<u>testing</u>
8 | ^[($ | http://localhost/2/<u>testing</u> characters http://localhost/8/<u>^[($</u> that need escaping in a regular expression
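The escaping step can be checked in isolation. The query below applies the same REGEXP_REPLACE to the word from row 8; the column alias is illustrative.

```sql
-- Prefix each regular expression metacharacter with a backslash.
-- Inside the bracket expression ] comes first so it is taken literally,
-- and \ is an ordinary character there.
SELECT REGEXP_REPLACE('^[($', '([][)(}{|^$\.*+?])', '\\\1') AS regex_safe_word
FROM DUAL;
```

This returns \^\[\(\$, which can then be concatenated into a pattern without changing its meaning.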


How to select rows with 4-byte UTF-8 chars in Oracle DB?

You can use the UNISTR function; the character is codepoint U+2070E, which in UTF-16 is D841DF0E. As the documentation notes:

Supplementary characters are encoded as two code units, the first from the high-surrogates range (U+D800 to U+DBFF), and the second from the low-surrogates range (U+DC00 to U+DFFF).

Which means you can represent it with:

select unistr('\D841\DF0E') from dual;

UNISTR('\D841\DF0E')
--------------------
𠜎
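The surrogate pair for U+2070E can be derived with plain arithmetic: subtract 0x10000, then add the top 10 bits of the result to 0xD800 and the low 10 bits to 0xDC00. A small sketch of that calculation, with illustrative column aliases:

```sql
-- 55296 = 0xD800, 56320 = 0xDC00, 65536 = 0x10000, 1024 = 2^10.
SELECT TO_CHAR(55296 + TRUNC((TO_NUMBER('2070E', 'XXXXX') - 65536) / 1024),
               'FMXXXX') AS high_surrogate,  -- D841
       TO_CHAR(56320 + MOD(TO_NUMBER('2070E', 'XXXXX') - 65536, 1024),
               'FMXXXX') AS low_surrogate    -- DF0E
FROM DUAL;
```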

You can then use UNISTR to construct your range:

select REGEXP_REPLACE('asdaasd', 
'['
|| UNISTR('\D800\DC00')
|| '-'
|| UNISTR('\DBFF\DFFF')
|| ']', '')
from dual;

REGEXP_REPLACE('ASDAASD','['||UNISTR('\D800\DC00')||'-'||UNISTR('\DBFF\DFFF')||']','')
----------------------------------------------------------------------------------------
asdaasd

This assumes you want to strip all supplementary characters; you can narrow the range if you only care about some of them.
