Remove Accents Db2

Based on the mapping in this post https://stackoverflow.com/a/9667817/9525344,
the following Db2 function replaces most (though possibly not all) Unicode characters that carry diacritic marks with their plain Latin equivalents. The result may or may not match the replacement actually used in a given language; e.g. in German, ü is usually replaced with ue, not u.

CREATE OR REPLACE FUNCTION DB_STRIP_DIACRITICS(string VARCHAR(32000))
RETURNS VARCHAR(32000)
LANGUAGE SQL CONTAINS SQL DETERMINISTIC NO EXTERNAL ACTION
RETURN
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
string,
'[ÁĂẮẶẰẲẴǍÂẤẬẦẨẪÄǞȦǠẠȀÀẢȂĀĄÅǺḀȺÃⱯᴀ]', 'A'),
'[Ꜳ]', 'AA'),
'[ÆǼǢᴁ]', 'AE'),
'[Ꜵ]', 'AO'),
'[Ꜷ]', 'AU'),
'[ꜸꜺ]', 'AV'),
'[Ꜽ]', 'AY'),
'[ḂḄƁḆɃƂʙᴃ]', 'B'),
'[ĆČÇḈĈĊƇȻꜾᴄ]', 'C'),
'[ĎḐḒḊḌƊḎǲǅĐƋꝹᴅ]', 'D'),
'[ǱǄ]', 'DZ'),
'[ÉĔĚȨḜÊẾỆỀỂỄḘËĖẸȄÈẺȆĒḖḔĘɆẼḚƐƎᴇⱻ]', 'E'),
'[Ꝫ]', 'ET'),
'[ḞƑꝻꜰ]', 'F'),
'[ǴĞǦĢĜĠƓḠǤꝽɢʛ]', 'G'),
'[ḪȞḨĤⱧḦḢḤĦʜ]', 'H'),
'[ÍĬǏÎÏḮİỊȈÌỈȊĪĮƗĨḬɪ]', 'I'),
'[Ĳ]', 'IJ'),
'[Ꝭ]', 'IS'),
'[ĴɈᴊ]', 'J'),
'[ḰǨĶⱩꝂḲƘḴꝀꝄᴋ]', 'K'),
'[ĹȽĽĻḼḶḸⱠꝈḺĿⱢǈŁꞀʟᴌ]', 'L'),
'[Ǉ]', 'LJ'),
'[ḾṀṂⱮƜᴍ]', 'M'),
'[ŃŇŅṊṄṆǸƝṈȠǋÑɴᴎ]', 'N'),
'[Ǌ]', 'NJ'),
'[ÓŎǑÔỐỘỒỔỖÖȪȮȰỌŐȌÒỎƠỚỢỜỞỠȎꝊꝌŌṒṐƟǪǬØǾÕṌṎȬƆᴏᴐ]', 'O'),
'[Œɶ]', 'OE'),
'[Ƣ]', 'OI'),
'[Ꝏ]', 'OO'),
'[Ȣᴕ]', 'OU'),
'[ṔṖꝒƤꝔⱣꝐᴘ]', 'P'),
'[ꝘꝖ]', 'Q'),
'[ꞂŔŘŖṘṚṜȐȒṞɌⱤʁʀᴙᴚ]', 'R'),
'[ꞄŚṤŠṦŞŜȘṠṢṨꜱ]', 'S'),
'[ꞆŤŢṰȚȾṪṬƬṮƮŦᴛ]', 'T'),
'[Ꜩ]', 'TZ'),
'[ÚŬǓÛṶÜǗǙǛǕṲỤŰȔÙỦƯỨỰỪỬỮȖŪṺŲŮŨṸṴᴜ]', 'U'),
'[ɅꝞṾƲṼᴠ]', 'V'),
'[Ꝡ]', 'VY'),
'[ẂŴẄẆẈẀⱲᴡ]', 'W'),
'[ẌẊ]', 'X'),
'[ÝŶŸẎỴỲƳỶỾȲɎỸʏ]', 'Y'),
'[ŹŽẐⱫŻẒȤẔƵᴢ]', 'Z'),
'[áăắặằẳẵǎâấậầẩẫäǟȧǡạȁàảȃāąᶏẚåǻḁⱥãɐₐ]', 'a'),
'[ꜳ]', 'aa'),
'[æǽǣᴂ]', 'ae'),
'[ꜵ]', 'ao'),
'[ꜷ]', 'au'),
'[ꜹꜻ]', 'av'),
'[ꜽ]', 'ay'),
'[ḃḅɓḇᵬᶀƀƃ]', 'b'),
'[ćčçḉĉɕċƈȼↄꜿ]', 'c'),
'[ďḑḓȡḋḍɗᶑḏᵭᶁđɖƌꝺ]', 'd'),
'[ǳǆ]', 'dz'),
'[éĕěȩḝêếệềểễḙëėẹȅèẻȇēḗḕⱸęᶒɇẽḛɛᶓɘǝₑ]', 'e'),
'[ꝫ]', 'et'),
'[ḟƒᵮᶂꝼ]', 'f'),
'[ﬀ]', 'ff'),
'[ﬃ]', 'ffi'),
'[ﬄ]', 'ffl'),
'[ﬁ]', 'fi'),
'[ﬂ]', 'fl'),
'[ǵğǧģĝġɠḡᶃǥᵹɡᵷ]', 'g'),
'[ḫȟḩĥⱨḧḣḥɦẖħɥʮʯ]', 'h'),
'[ƕ]', 'hv'),
'[ıíĭǐîïḯịȉìỉȋīįᶖɨĩḭᴉᵢ]', 'i'),
'[ĳ]', 'ij'),
'[ꝭ]', 'is'),
'[ȷɟʄǰĵʝɉⱼ]', 'j'),
'[ḱǩķⱪꝃḳƙḵᶄꝁꝅʞ]', 'k'),
'[ĺƚɬľļḽȴḷḹⱡꝉḻŀɫᶅɭłꞁ]', 'l'),
'[ǉ]', 'lj'),
'[ḿṁṃɱᵯᶆɯɰ]', 'm'),
'[ńňņṋȵṅṇǹɲṉƞᵰᶇɳñ]', 'n'),
'[ǌ]', 'nj'),
'[ɵóŏǒôốộồổỗöȫȯȱọőȍòỏơớợờởỡȏꝋꝍⱺōṓṑǫǭøǿõṍṏȭɔᶗᴑᴓₒ]', 'o'),
'[ᴔœ]', 'oe'),
'[ƣ]', 'oi'),
'[ꝏ]', 'oo'),
'[ȣ]', 'ou'),
'[ṕṗꝓƥᵱᶈꝕᵽꝑ]', 'p'),
'[ꝙʠɋꝗ]', 'q'),
'[ꞃŕřŗṙṛṝȑɾᵳȓṟɼᵲᶉɍɽɿɹɻɺⱹᵣ]', 'r'),
'[ꞅſẜẛẝśṥšṧşŝșṡṣṩʂᵴᶊȿ]', 's'),
'[ﬆ]', 'st'),
'[ꞇťţṱțȶẗⱦṫṭƭṯᵵƫʈŧʇ]', 't'),
'[ᵺ]', 'th'),
'[ꜩ]', 'tz'),
'[ᴝúŭǔûṷüǘǚǜǖṳụűȕùủưứựừửữȗūṻųᶙůũṹṵᵤ]', 'u'),
'[ᵫ]', 'ue'),
'[ꝸ]', 'um'),
'[ʌⱴꝟṿʋᶌⱱṽᵥ]', 'v'),
'[ꝡ]', 'vy'),
'[ʍẃŵẅẇẉẁⱳẘ]', 'w'),
'[ẍẋᶍₓ]', 'x'),
'[ʎýŷÿẏỵỳƴỷỿȳẙɏỹ]', 'y'),
'[źžẑʑⱬżẓȥẕᵶᶎʐƶɀ]', 'z')
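
As a usage sketch, assuming the function above has been created in a Unicode database, it can be called like any other scalar function. The inline VALUES table and the column name `name` are made up for illustration. The second query shows one possible way to get the German-style result mentioned above: apply REPLACE for ü, ö and ä first, then strip whatever marks remain.

```sql
-- Hypothetical sample data; assumes a Unicode (UTF-8) database.
SELECT name,
       DB_STRIP_DIACRITICS(name) AS plain_name
FROM (VALUES ('Crème'), ('Über'), ('Čapek')) AS t(name);

-- German-style transliteration first, then strip the remaining marks.
SELECT DB_STRIP_DIACRITICS(
         REPLACE(REPLACE(REPLACE(name, 'ü', 'ue'), 'ö', 'oe'), 'ä', 'ae')
       ) AS plain_name
FROM (VALUES ('Müller'), ('Köln')) AS t(name);
```

The first query is expected to return Creme, Uber and Capek. Uppercase Ü, Ö and Ä would need their own REPLACE calls if the German convention should apply to them as well.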

How can I remove accents on a string?

Try using COLLATE:

select 'áéíóú' collate SQL_Latin1_General_Cp1251_CS_AS

For Unicode data, try the following:

select cast(N'áéíóú' as varchar(max)) collate SQL_Latin1_General_Cp1251_CS_AS

I am not sure what you may lose in the translation when using the second approach.

Update

It looks like œ is a special case, and we have to handle upper and lower case separately. You can do it like this (this code is a good candidate for a user-defined function):

declare @str nvarchar(max) = N'ñaàeéêèioô; Œuf un œuf'
select cast(
replace((
replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
) collate SQL_Latin1_General_Cp1251_CS_AS
-- Output:
-- naaeeeeioo; Oeuf un oeuf

User Defined Function

create function dbo.fnRemoveAccents(@str nvarchar(max))  
returns varchar(max) as
begin
return cast(
replace((
replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
) collate SQL_Latin1_General_Cp1251_CS_AS
end

Translate or convert French accented characters to base ASCII characters

You can use the TRANSLATE function:

  translate(upper(ColName),'AAAEEEIIIOOOUUU','ÁÀÄÉÈËÍÌÏÓÒÖÚÙÜ')
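
A minimal standalone sketch of the same call, assuming Db2 and a Unicode database. In Db2 the argument order is TRANSLATE(string, to-string, from-string), so the plain letters come before the accented ones. The sample value 'déjà vu' is arbitrary.

```sql
-- Db2 argument order: TRANSLATE(string, to-string, from-string).
-- Each character of the third argument is replaced by the character
-- at the same position in the second argument.
VALUES TRANSLATE(UPPER('déjà vu'),
                 'AAAEEEIIIOOOUUU',
                 'ÁÀÄÉÈËÍÌÏÓÒÖÚÙÜ');
```

The lists above contain no circumflex forms and no Ç, so characters such as Ê or Ç pass through unchanged. To cover them, extend both strings in step so the positions keep lining up.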

How to remove accents and all chars other than a..z in sql-server?

You can avoid hard-coded REPLACE statements by using a COLLATE clause with an accent-insensitive collation to compare each character of the input against the unaccented letters A-Z and a-z. Characters that match no letter (digits, punctuation, spaces) drop out of the join:

DECLARE 
@s1 NVARCHAR(200),
@s2 NVARCHAR(200)

SET @s1 = N'aèàç=.32s df'

SET @s2 = N''
SELECT @s2 = @s2 + no_accent
FROM (
SELECT
SUBSTRING(@s1, number, 1) AS accent,
number
FROM master.dbo.spt_values
WHERE TYPE = 'P'
AND number BETWEEN 1 AND LEN(@s1)
) s1
INNER JOIN (
SELECT NCHAR(number) AS no_accent
FROM master.dbo.spt_values
WHERE type = 'P'
AND (number BETWEEN 65 AND 90 OR number BETWEEN 97 AND 122)
) s2
ON s1.accent COLLATE LATIN1_GENERAL_CS_AI = s2.no_accent
ORDER BY number

SELECT @s1
SELECT @s2

/*
aèàç=.32s df
aeacsdf
*/
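
The join above relies on how the CS_AI collation compares single characters. A small T-SQL check, offered only as an illustration of that behaviour, makes it visible:

```sql
-- CS = case-sensitive, AI = accent-insensitive.
SELECT
    CASE WHEN N'è' COLLATE Latin1_General_CS_AI = N'e' THEN 1 ELSE 0 END AS accent_ignored,
    CASE WHEN N'E' COLLATE Latin1_General_CS_AI = N'e' THEN 1 ELSE 0 END AS case_ignored;
```

This should return 1 and 0: the accent is ignored, so è matches e, while the case difference is still respected. That is why each input letter joins to exactly one unaccented letter of the same case.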

DB2 accent insensitive queries

You could change the case of the column and the value you are looking for.

For example

Select upper(lastname)
From myTable
where upper(name) = upper('John')

You can use:

  • Upper
  • Translate
  • Ucase

However, I am not sure if these functions are available in your very old DB2 version. Plan to upgrade!
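
A sketch of how Upper and Translate might be combined so that the comparison ignores both case and accents. The table and column follow the example above; the accent lists are illustrative and cover only a few French vowels, and the search value 'Hélène' is made up.

```sql
-- Fold case with UPPER, then map accented uppercase vowels to plain ones.
-- Db2 argument order: TRANSLATE(string, to-string, from-string).
SELECT lastname
FROM myTable
WHERE TRANSLATE(UPPER(lastname), 'AAEEEEIIOUU', 'ÀÂÉÈÊËÎÏÔÙÛ')
    = TRANSLATE(UPPER('Hélène'), 'AAEEEEIIOUU', 'ÀÂÉÈÊËÎÏÔÙÛ');
```

Wrapping the column in functions like this usually prevents a plain index on lastname from being used, so it is best suited to small tables or ad hoc queries.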

DB2 remove trailing 0 and

cast (0.1 as varchar (x)) returns .1, not 0.1.

But cast (decfloat (0.1) as varchar (x)) returns 0.1.

So try the following:

cast (decfloat (i) / 10000 as varchar (20))
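
To see the difference side by side, the two casts described above can be run in a single statement. This is just a sketch for comparison:

```sql
-- First column: DECIMAL cast to VARCHAR (leading zero dropped, per the text above).
-- Second column: DECFLOAT cast to VARCHAR (leading zero kept).
VALUES (CAST(0.1 AS VARCHAR(20)),
        CAST(DECFLOAT(0.1) AS VARCHAR(20)));
```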

How to find special characters in DB2?

You can use the DB2 TRANSLATE() function to isolate non-alphanumeric characters. Note that this will not work in the Oracle compatibility mode, because in that case DB2 will treat empty strings as NULLs, as Oracle would do.

SELECT *
FROM yourtable
WHERE LENGTH(TRANSLATE(
yourcolumn,
'', -- empty string
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
)) > 0 -- after translating ASCII characters to empty strings
-- there's still something left
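
As an alternative sketch, not part of the answer above, the same filter can be written with a regular expression. This assumes a Db2 version that provides REGEXP_LIKE (the REGEXP_REPLACE function used earlier on this page belongs to the same family of functions); the table and column names are the placeholders from the query above.

```sql
-- Match rows containing at least one character outside A-Z, a-z and 0-9.
SELECT *
FROM yourtable
WHERE REGEXP_LIKE(yourcolumn, '[^A-Za-z0-9]');
```

Because this form does not depend on empty strings, it should not be affected by the Oracle compatibility caveat mentioned above.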

