Finding Rows That Don't Contain Numeric Data in Oracle

Oracle SQL: Filtering rows with non-numeric characters

I know you specifically asked for a regex solution, but translate can solve these kind of questions as well (and usually faster because regexes use more processing power):

select count(1)
from tbl
where translate(a, 'x0123456789', 'x') is null
and translate(b, 'x0123456789', 'x') is null;

What this does: translate the characters 0123456789 to null, and if the result is null, then the input must have been all digits. The 'x' is just there because the third argument to translate can not be null.

Thought I should add this here, might be helpful to other readers.

How to find non-numeric columns containing only numeric data?

Create a function as this:

create or replace function isNumeric(val in VARCHAR2) return INTEGER AS
res NUMBER;
begin
res := TO_NUMBER(val);
RETURN 1;
EXCEPTION
WHEN OTHERS THEN
RETURN 0;
END;

Then you can use it like this:

DECLARE
r integer;
BEGIN
For aCol in (SELECT TABLE_NAME, COLUMN_NAME FROM user_tab_cols WHERE DATA_TYPE = 'VARCHAR2') LOOP
-- What about CHAR and CLOB data types?
execute immediate 'select count(*) from '||aCol.TABLE_NAME||' WHERE isNumeric('||aCol.COLUMN_NAME||') = 0' into r;
if r = 0 then
DBMS_OUTPUT.put_line(aCol.TABLE_NAME ||' '||aCol.COLUMN_NAME ||' contains numeric values only');
end if;
end loop;
end;

Note, the performance of this PL/SQL block will be poor. Hopefully this is a one-time-job only.

Finding non-numeric values in varchar column

The precision means that you want at most allowed_precision digits in the number (strictly speaking, not counting leading zeros, but I'll ignore that). The scale means that at most allowed_scale can be after the decimal point.

This suggests a regular expression such as:

[-]?[0-9]{1,<before>}[.]?[0-9]{0,<after>}

You can construct the regular expression:

NOT REGEXP_LIKE(COLUMN_NAME,
REPLACE(REPLACE('[-]?[0-9]{1,<before>}[.]?[0-9]{0,<after>}', '<before>', allowed_precision - allowed_scale
), '<after>', allowed_scale)

Now, variable regular expressions are highly inefficient. You can do the logic using like and other functions as well. I think the conditions are:

(column_name not like '%.%.%' and
column_name not like '_%-%' and
translate(column_name, '0123456789-.x', 'x') is null and
length(translate(column_name, '-.x', 'x') <= allowed_precision and
length(translate(column_name, '-.x', 'x') >= 1 and
instr(translate(column_name, '-.x', 'x'), '.') <= allowed_precision - allowed_scale
)

How can you tell if a value is not numeric in Oracle?


REGEXP_LIKE(column, '^[[:digit:]]+$')

returns TRUE if column holds only numeric characters

how to query a table to find rows where there are no numbers in a column

The carat (^) inside a [] bracket means match any character except for those in the list. So if you want to match any character except for 0-9, you use [^0-9] and the LIKE (without the NOT) will match all other characters.

This should work:

SELECT TOP 10 *
FROM @WorkOrderHistory WO
WHERE WO.WorkOrder LIKE '%[^0-9]%'

Of course, you'll match all punctuation, and unprintable characters as well.

Via http://technet.microsoft.com/en-us/library/ms174214(v=sql.110).aspx



Related Topics



Leave a reply



Submit