How to Replace Accented Letters in a VARCHAR2 Column in Oracle

How to replace accented letters in a VARCHAR2 column in Oracle

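Use the CONVERT function with the appropriate character set. Characters that have no equivalent in the destination character set are replaced with a replacement character, so check the result for your data.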

select CONVERT('JUAN ROMÄN', 'US7ASCII') from dual;

Commonly used character sets in Oracle include:

US7ASCII: US 7-bit ASCII character set
WE8DEC: West European 8-bit character set
WE8HP: HP West European Laserjet 8-bit character set
F7DEC: DEC French 7-bit character set
WE8EBCDIC500: IBM West European EBCDIC Code Page 500
WE8PC850: IBM PC Code Page 850
WE8ISO8859P1: ISO 8859-1 West European 8-bit character set
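
CONVERT also accepts an optional third argument naming the source character set; when it is omitted, Oracle assumes the database character set. A small sketch, assuming you can query the NLS_DATABASE_PARAMETERS view, for checking which character set that is before relying on the two-argument form. The source character set in the second query is only a placeholder:

```sql
-- Show the database character set, which CONVERT uses as the default source.
SELECT value
FROM nls_database_parameters
WHERE parameter = 'NLS_CHARACTERSET';

-- Same conversion with the source character set spelled out.
-- 'WE8ISO8859P1' is an assumption; substitute the value returned above.
SELECT CONVERT('JUAN ROMÄN', 'US7ASCII', 'WE8ISO8859P1') FROM dual;
```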

Remove accents from string in Oracle

You can use TRANSLATE(your_string, from_chars, to_chars): https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions196.htm

Put all the accented characters in the from_chars string and their replacements, in the same order, in to_chars. Each character in from_chars is replaced by the character at the same position in to_chars.
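
As a minimal sketch of that approach (the sample string and the short character lists are illustrative, not taken from the original answer):

```sql
-- Illustrative only: a short mapping for a handful of accented letters.
-- Character N of the second argument is replaced by character N of the third.
SELECT TRANSLATE('JUAN ROMÄN, José', 'ÄÁÉáé', 'AAEae') AS unaccented
FROM dual;
```

With a correctly configured client character set this should return JUAN ROMAN, Jose. Two things to keep in mind: if to_chars is shorter than from_chars, the characters without a counterpart are removed from the result, and TRANSLATE maps one character to one character only, so a replacement such as Œ to OE needs REPLACE instead.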

How to search for accented letters in a VARCHAR2 column in Oracle

Unfortunately, I don't know the alphabet and language you are using well enough to generate additional test cases. What you are looking for, though, is an accent-insensitive comparison, which is selected by adding the _AI suffix to the value of the NLS_SORT parameter (for example, BINARY_AI). For more information, see Oracle's Globalization Support Guide.

Assuming this sample data:

select *
from t

ID  UNAME
--  -----
1   آصفی
2   اصفی
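
The answer breaks off after the sample data. A hedged sketch of how an accent-insensitive search is typically set up, assuming the table t has the columns ID and UNAME as in the sample. The search literal is an assumption, and whether these particular rows compare as equal under BINARY_AI should be verified on your own database:

```sql
-- Option 1: make comparisons in this session linguistic and accent-insensitive.
ALTER SESSION SET NLS_COMP = LINGUISTIC;
ALTER SESSION SET NLS_SORT = BINARY_AI;

SELECT * FROM t WHERE uname = 'اصفی';

-- Option 2: leave the session settings alone and compare sort keys explicitly.
SELECT *
FROM t
WHERE NLSSORT(uname, 'NLS_SORT = BINARY_AI')
    = NLSSORT('اصفی', 'NLS_SORT = BINARY_AI');
```

The session-level form affects every comparison in the session, while the NLSSORT form limits the accent-insensitive behavior to one predicate.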

Strip non English characters in Oracle SQL

Have you tried TRANSLATE()?

translate(text,
'ÂÃÄÀÁÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸàáâãäåçèéêëìíîïñòóôõöøùúûüýÿ',
'AAAAAACEEEEIIIINOOOOOOUUUUYYaaaaaaceeeeiiiinoooooouuuuyy')

How can I remove accents on a string?

Try using COLLATE (note that this answer is for SQL Server, not Oracle):

select 'áéíóú' collate SQL_Latin1_General_Cp1251_CS_AS

For Unicode data, try the following:

select cast(N'áéíóú' as varchar(max)) collate SQL_Latin1_General_Cp1251_CS_AS

I am not sure what you may lose in the translation when using the second approach.

Update

It looks like œ is a special case, and we have to handle upper and lower case separately. You can do it like this (this code is a good candidate for a user-defined function):

declare @str nvarchar(max) = N'ñaàeéêèioô; Œuf un œuf'
select cast(
    replace((
        replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
    ) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
) collate SQL_Latin1_General_Cp1251_CS_AS
-- Output:
-- naaeeeeioo; Oeuf un oeuf

User Defined Function

create function dbo.fnRemoveAccents(@str nvarchar(max))
returns varchar(max) as
begin
    return cast(
        replace((
            replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
        ) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
    ) collate SQL_Latin1_General_Cp1251_CS_AS
end
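
A short usage sketch for the function, reusing the sample string from the update above. This is T-SQL for SQL Server, and the column alias is only illustrative:

```sql
-- T-SQL (SQL Server): call the function defined above on the sample string.
select dbo.fnRemoveAccents(N'ñaàeéêèioô; Œuf un œuf') as cleaned
```

Going by the output shown for the inline version, this should give naaeeeeioo; Oeuf un oeuf.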

Regexp for all accented characters in Oracle

After some more experimenting, I have found that this seems to work ok:

select *
from xml_tmp
where regexp_like(XMLType.getClobVal(xml_data),'[^[:graph:][:space:]]')

I had thought that [:graph:] would include all upper and lower case characters, with or without accents, but it seems that it only matches unaccented characters.


Further experimentation shows that this might not work in all cases. Try these queries:

select *
from dual
where regexp_like (unistr('\0090'),'[^[:graph:][:space:]]');

DUMMY
-------
X
(the match succeeded)

So it looks like the character that's been causing me trouble matches this pattern.

select *
from dual
where regexp_like ('É','[^[:graph:][:space:]]');

DUMMY
-------

(the match failed)

When I try to run this query with the accented E as copied-and-pasted, the match fails! I guess whatever I copied-and-pasted is actually different. Ugh, I think I now hate working with changing character encodings.
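
One way to check what was actually pasted is to look at the underlying bytes rather than the rendered glyph. A minimal sketch using Oracle's DUMP function; format 1016 requests hexadecimal output together with the character set name, and the literal is just the character from the query above:

```sql
-- Show the character set and the hex bytes of the pasted literal.
SELECT DUMP('É', 1016) AS pasted_bytes FROM dual;

-- For comparison, build the precomposed character from its code point (U+00C9)
-- and convert it to the database character set before dumping it.
SELECT DUMP(TO_CHAR(UNISTR('\00C9')), 1016) AS expected_bytes FROM dual;
```

If the two results differ, the pasted text is not the single precomposed character. It may, for example, be a plain E followed by a combining accent, or it may have been altered by the client's character set conversion.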


