Remove Accents DB2
Based on the mapping in this post https://stackoverflow.com/a/9667817/9525344, the following Db2 function replaces most (though possibly not all) Unicode characters that carry diacritic marks with their simple Latin equivalents. This may or may not be the replacement actually used in a given language. E.g. in German, ü is usually replaced with ue, not u.
CREATE OR REPLACE FUNCTION DB_STRIP_DIACRITICS(string VARCHAR(32000))
RETURNS VARCHAR(32000)
LANGUAGE SQL CONTAINS SQL DETERMINISTIC NO EXTERNAL ACTION
RETURN
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
string,
'[ÁĂẮẶẰẲẴǍÂẤẬẦẨẪÄǞȦǠẠȀÀẢȂĀĄÅǺḀȺÃⱯᴀ]', 'A'),
'[Ꜳ]', 'AA'),
'[ÆǼǢᴁ]', 'AE'),
'[Ꜵ]', 'AO'),
'[Ꜷ]', 'AU'),
'[ꜸꜺ]', 'AV'),
'[Ꜽ]', 'AY'),
'[ḂḄƁḆɃƂʙᴃ]', 'B'),
'[ĆČÇḈĈĊƇȻꜾᴄ]', 'C'),
'[ĎḐḒḊḌƊḎǲǅĐƋꝹᴅ]', 'D'),
'[ǱǄ]', 'DZ'),
'[ÉĔĚȨḜÊẾỆỀỂỄḘËĖẸȄÈẺȆĒḖḔĘɆẼḚƐƎᴇⱻ]', 'E'),
'[Ꝫ]', 'ET'),
'[ḞƑꝻꜰ]', 'F'),
'[ǴĞǦĢĜĠƓḠǤꝽɢʛ]', 'G'),
'[ḪȞḨĤⱧḦḢḤĦʜ]', 'H'),
'[ÍĬǏÎÏḮİỊȈÌỈȊĪĮƗĨḬɪ]', 'I'),
'[Ĳ]', 'IJ'),
'[Ꝭ]', 'IS'),
'[ĴɈᴊ]', 'J'),
'[ḰǨĶⱩꝂḲƘḴꝀꝄᴋ]', 'K'),
'[ĹȽĽĻḼḶḸⱠꝈḺĿⱢǈŁꞀʟᴌ]', 'L'),
'[Ǉ]', 'LJ'),
'[ḾṀṂⱮƜᴍ]', 'M'),
'[ŃŇŅṊṄṆǸƝṈȠǋÑɴᴎ]', 'N'),
'[Ǌ]', 'NJ'),
'[ÓŎǑÔỐỘỒỔỖÖȪȮȰỌŐȌÒỎƠỚỢỜỞỠȎꝊꝌŌṒṐƟǪǬØǾÕṌṎȬƆᴏᴐ]', 'O'),
'[Œɶ]', 'OE'),
'[Ƣ]', 'OI'),
'[Ꝏ]', 'OO'),
'[Ȣᴕ]', 'OU'),
'[ṔṖꝒƤꝔⱣꝐᴘ]', 'P'),
'[ꝘꝖ]', 'Q'),
'[ꞂŔŘŖṘṚṜȐȒṞɌⱤʁʀᴙᴚ]', 'R'),
'[ꞄŚṤŠṦŞŜȘṠṢṨꜱ]', 'S'),
'[ꞆŤŢṰȚȾṪṬƬṮƮŦᴛ]', 'T'),
'[Ꜩ]', 'TZ'),
'[ÚŬǓÛṶÜǗǙǛǕṲỤŰȔÙỦƯỨỰỪỬỮȖŪṺŲŮŨṸṴᴜ]', 'U'),
'[ɅꝞṾƲṼᴠ]', 'V'),
'[Ꝡ]', 'VY'),
'[ẂŴẄẆẈẀⱲᴡ]', 'W'),
'[ẌẊ]', 'X'),
'[ÝŶŸẎỴỲƳỶỾȲɎỸʏ]', 'Y'),
'[ŹŽẐⱫŻẒȤẔƵᴢ]', 'Z'),
'[áăắặằẳẵǎâấậầẩẫäǟȧǡạȁàảȃāąᶏẚåǻḁⱥãɐₐ]', 'a'),
'[ꜳ]', 'aa'),
'[æǽǣᴂ]', 'ae'),
'[ꜵ]', 'ao'),
'[ꜷ]', 'au'),
'[ꜹꜻ]', 'av'),
'[ꜽ]', 'ay'),
'[ḃḅɓḇᵬᶀƀƃ]', 'b'),
'[ćčçḉĉɕċƈȼↄꜿ]', 'c'),
'[ďḑḓȡḋḍɗᶑḏᵭᶁđɖƌꝺ]', 'd'),
'[ǳǆ]', 'dz'),
'[éĕěȩḝêếệềểễḙëėẹȅèẻȇēḗḕⱸęᶒɇẽḛɛᶓɘǝₑ]', 'e'),
'[ꝫ]', 'et'),
'[ḟƒᵮᶂꝼ]', 'f'),
'[ﬀ]', 'ff'),
'[ﬃ]', 'ffi'),
'[ﬄ]', 'ffl'),
'[ﬁ]', 'fi'),
'[ﬂ]', 'fl'),
'[ǵğǧģĝġɠḡᶃǥᵹɡᵷ]', 'g'),
'[ḫȟḩĥⱨḧḣḥɦẖħɥʮʯ]', 'h'),
'[ƕ]', 'hv'),
'[ıíĭǐîïḯịȉìỉȋīįᶖɨĩḭᴉᵢ]', 'i'),
'[ĳ]', 'ij'),
'[ꝭ]', 'is'),
'[ȷɟʄǰĵʝɉⱼ]', 'j'),
'[ḱǩķⱪꝃḳƙḵᶄꝁꝅʞ]', 'k'),
'[ĺƚɬľļḽȴḷḹⱡꝉḻŀɫᶅɭłꞁ]', 'l'),
'[ǉ]', 'lj'),
'[ḿṁṃɱᵯᶆɯɰ]', 'm'),
'[ńňņṋȵṅṇǹɲṉƞᵰᶇɳñ]', 'n'),
'[ǌ]', 'nj'),
'[ɵóŏǒôốộồổỗöȫȯȱọőȍòỏơớợờởỡȏꝋꝍⱺōṓṑǫǭøǿõṍṏȭɔᶗᴑᴓₒ]', 'o'),
'[ᴔœ]', 'oe'),
'[ƣ]', 'oi'),
'[ꝏ]', 'oo'),
'[ȣ]', 'ou'),
'[ṕṗꝓƥᵱᶈꝕᵽꝑ]', 'p'),
'[ꝙʠɋꝗ]', 'q'),
'[ꞃŕřŗṙṛṝȑɾᵳȓṟɼᵲᶉɍɽɿɹɻɺⱹᵣ]', 'r'),
'[ꞅſẜẛẝśṥšṧşŝșṡṣṩʂᵴᶊȿ]', 's'),
'[ﬆ]', 'st'),
'[ꞇťţṱțȶẗⱦṫṭƭṯᵵƫʈŧʇ]', 't'),
'[ᵺ]', 'th'),
'[ꜩ]', 'tz'),
'[ᴝúŭǔûṷüǘǚǜǖṳụűȕùủưứựừửữȗūṻųᶙůũṹṵᵤ]', 'u'),
'[ᵫ]', 'ue'),
'[ꝸ]', 'um'),
'[ʌⱴꝟṿʋᶌⱱṽᵥ]', 'v'),
'[ꝡ]', 'vy'),
'[ʍẃŵẅẇẉẁⱳẘ]', 'w'),
'[ẍẋᶍₓ]', 'x'),
'[ʎýŷÿẏỵỳƴỷỿȳẙɏỹ]', 'y'),
'[źžẑʑⱬżẓȥẕᵶᶎʐƶɀ]', 'z')
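The function above is a long chain of character-class replacements, one per target string. As a rough illustration of the same technique outside the database, here is a minimal Python sketch. The MAPPING list is a small, hand-picked subset chosen for this example (not the full table), and the name strip_diacritics is hypothetical.

```python
import re

# Abbreviated, illustrative subset of the mapping; the Db2 function above
# covers far more characters. Each entry is (character class, replacement).
MAPPING = [
    ("[ÁÀÂÄÃÅ]", "A"),
    ("[ÆǼǢ]", "AE"),
    ("[ÉÈÊË]", "E"),
    ("[ÓÒÔÖÕØ]", "O"),
    ("[áàâäãå]", "a"),
    ("[æǽǣ]", "ae"),
    ("[ç]", "c"),
    ("[éèêë]", "e"),
    ("[óòôöõø]", "o"),
    ("[úùûü]", "u"),
    ("[ﬁ]", "fi"),
]


def strip_diacritics(text: str) -> str:
    """Apply every replacement in turn, like the nested REGEXP_REPLACE calls."""
    for pattern, replacement in MAPPING:
        text = re.sub(pattern, replacement, text)
    return text


print(strip_diacritics("Ærøskøbing café"))  # AEroskobing cafe
```

Each class must contain only the single characters to be replaced. A class such as [ff] written with two plain letters would match every ordinary f, which is why the ligature entries need the actual ligature code points.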
How can I remove accents on a string?
Try using COLLATE:
select 'áéíóú' collate SQL_Latin1_General_Cp1251_CS_AS
For Unicode data, try the following:
select cast(N'áéíóú' as varchar(max)) collate SQL_Latin1_General_Cp1251_CS_AS
I am not sure what you may lose in the translation when using the second approach.
Update
It looks like œ is a special case, and we have to handle upper and lower case separately. You can do it like this (this code is a good candidate for a user-defined function):
declare @str nvarchar(max) = N'ñaàeéêèioô; Œuf un œuf'
select cast(
replace((
replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
) collate SQL_Latin1_General_Cp1251_CS_AS
-- Output:
-- naaeeeeioo; Oeuf un oeuf
User Defined Function
create function dbo.fnRemoveAccents(@str nvarchar(max))
returns varchar(max) as
begin
return cast(
replace((
replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
) collate SQL_Latin1_General_Cp1251_CS_AS
end
Translate or convert French accented characters to base ASCII characters
You can use the TRANSLATE function. In Db2's argument order (expression, to-string, from-string) it looks like this:
translate(upper(ColName),'AAAEEEIIIOOOUUU','ÁÀÄÉÈËÍÌÏÓÒÖÚÙÜ')
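TRANSLATE maps characters position by position, so the two strings must line up exactly and any character missing from the list passes through unchanged. A minimal Python sketch of the same idea, assuming the character lists from the call above (the helper name translate_upper is made up for this example):

```python
# Same character lists as the TRANSLATE call; position i of FROM_CHARS
# is mapped to position i of TO_CHARS.
FROM_CHARS = "ÁÀÄÉÈËÍÌÏÓÒÖÚÙÜ"
TO_CHARS = "AAAEEEIIIOOOUUU"
TABLE = str.maketrans(FROM_CHARS, TO_CHARS)


def translate_upper(value: str) -> str:
    """Upper-case the value, then map the listed accented capitals to ASCII."""
    return value.upper().translate(TABLE)


print(translate_upper("Noël à Genève"))  # NOEL A GENEVE
print(translate_upper("têtu"))           # TÊTU (Ê is not in the list)
```

The second call shows the limitation: circumflexed vowels and Ç are not in the list, so for French text the lists would need to be extended.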
How to remove accents and all chars a..z in sql-server?
You can avoid hard-coded REPLACE statements by using a COLLATE clause with an accent-insensitive collation to compare the accented alphabetic characters to non-accented ones:
DECLARE
@s1 NVARCHAR(200),
@s2 NVARCHAR(200)
SET @s1 = N'aèàç=.32s df'
SET @s2 = N''
SELECT @s2 = @s2 + no_accent
FROM (
SELECT
SUBSTRING(@s1, number, 1) AS accent,
number
FROM master.dbo.spt_values
WHERE TYPE = 'P'
AND number BETWEEN 1 AND LEN(@s1)
) s1
INNER JOIN (
SELECT NCHAR(number) AS no_accent
FROM master.dbo.spt_values
WHERE type = 'P'
AND (number BETWEEN 65 AND 90 OR number BETWEEN 97 AND 122)
) s2
ON s1.accent COLLATE LATIN1_GENERAL_CS_AI = s2.no_accent
ORDER BY number
SELECT @s1
SELECT @s2
/*
aèàç=.32s df
aeacsdf
*/
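The query above keeps a character of @s1 only when it matches an unaccented letter A-Z or a-z under an accent-insensitive collation, and emits that unaccented letter. A comparable effect can be sketched in Python with Unicode decomposition. This is an analogy rather than what the collation does internally, and the helper name keep_base_letters is hypothetical.

```python
import unicodedata


def keep_base_letters(text: str) -> str:
    """Keep only characters whose base letter is an ASCII letter, without accents."""
    kept = []
    for ch in text:
        # NFD splits an accented letter into base letter + combining marks;
        # the first code point is the base letter.
        base = unicodedata.normalize("NFD", ch)[0]
        if base.isascii() and base.isalpha():
            kept.append(base)
    return "".join(kept)


print(keep_base_letters("aèàç=.32s df"))  # aeacsdf
```

Digits, punctuation and spaces are dropped, as in the SQL version. Letters that have no canonical decomposition (for example ø) are also dropped by this sketch.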
DB2 accent insensitive queries
You could change the case of the column and the value you are looking for. UPPER on its own makes the comparison case-insensitive only; to ignore accents as well, combine it with TRANSLATE.
For example:
Select upper(lastname)
From myTable
where upper(lastname) = upper('John')
You can use:
- Upper
- Translate
- Ucase
However, I am not sure if these functions are available in your very old DB2 version. Plan to upgrade!
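Normalizing both sides of a comparison is the core of this approach. The following Python sketch illustrates the principle with case folding plus accent stripping; the function name fold and the sample rows are invented for the example, and it does not reproduce Db2 behaviour.

```python
import unicodedata


def fold(value: str) -> str:
    """Return an upper-cased copy of value with combining accents removed."""
    decomposed = unicodedata.normalize("NFD", value)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_marks.upper()


# Hypothetical sample data standing in for the lastname column.
rows = ["John", "JOHN", "Jöhn", "Jane"]

# Apply the same normalization to the column value and to the search term.
matches = [r for r in rows if fold(r) == fold("john")]
print(matches)  # ['John', 'JOHN', 'Jöhn']
```

Upper-casing alone would match only the first two rows; stripping the combining marks is what lets "Jöhn" compare equal as well.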
DB2 remove trailing 0 and
cast (0.1 as varchar (x)) returns .1, not 0.1.
But cast (decfloat (0.1) as varchar (x)) returns 0.1.
So try the following:
cast (decfloat (i) / 10000 as varchar (20))
How to find special characters in DB2?
You can use the DB2 TRANSLATE() function to isolate non-alphanumeric characters. Note that this will not work in the Oracle compatibility mode, because in that case DB2 will treat empty strings as NULLs, as Oracle would do.
SELECT *
FROM yourtable
WHERE LENGTH(TRIM(TRANSLATE(
yourcolumn,
'', -- empty to-string: each matched character becomes the pad character (a blank)
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
))) > 0 -- after blanking the ASCII alphanumerics and trimming,
-- there's still something left
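The check works by blanking out every permitted character and seeing whether anything other than blanks remains. When the to-string is shorter than the from-string, Db2 pads it with blanks, so the sketch below maps each alphanumeric to a space and then trims. It is a Python illustration only; the names BLANK_ALNUM, leftover and the sample rows are assumptions made for this example.

```python
import string

ALNUM = string.ascii_letters + string.digits
# Map every ASCII letter and digit to a blank, mirroring a blank-padded to-string.
BLANK_ALNUM = str.maketrans(ALNUM, " " * len(ALNUM))


def leftover(value: str) -> str:
    """Return what remains after blanking alphanumerics and trimming blanks."""
    return value.translate(BLANK_ALNUM).strip()


# Hypothetical sample data standing in for yourcolumn.
rows = ["abc123", "naïve", "a-b", "two words"]
flagged = [r for r in rows if len(leftover(r)) > 0]
print(flagged)  # ['naïve', 'a-b']
```

With this approach plain spaces do not count as special characters, so "two words" is not flagged.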