How to Remove Extended Ascii Characters from a String in T-Sql

How do I remove extended ASCII characters from a string in T-SQL?

OK, give this a try. It seems the same issue they have. Anyway you need to modify it based on your requirements.

CREATE FUNCTION RemoveNonASCII 
(
    @nstring nvarchar(255)
)
RETURNS varchar(255)
AS
BEGIN

    DECLARE @Result varchar(255)
    SET @Result = ''

    DECLARE @nchar nvarchar(1)
    DECLARE @position int

    SET @position = 1
    WHILE @position <= LEN(@nstring)
    BEGIN
        SET @nchar = SUBSTRING(@nstring, @position, 1)
        --Unicode & ASCII are the same from 1 to 255.
        --Only Unicode goes beyond 255
        --0 to 31 are non-printable characters
        IF UNICODE(@nchar) between 32 and 255
            SET @Result = @Result + @nchar
        SET @position = @position + 1
    END

    RETURN @Result

END
GO

Check it out at SqlServerCentral

T SQL Remove ASCII character from SELECT

DECLARE @str VARCHAR(20) = CONCAT('This is a tab--> ', '<--');
SELECT @str, REPLACE(@str, CHAR(9), '');


SELECT [E_Code]
FROM [Person] P
WHERE P.E_Code NOT IN (
        SELECT REPLACE(dsc, CHAR(9), '')
        FROM OPENQUERY(svr01, 'select "dsc" from TST.eth')
    )

SQL Server - Remove all non-printable ASCII characters

Another Option.

This function will replace control characters and correct any residual repeating spaces. For example Jane Smith{13}was here will not be returned as Jane Smithwas here, but rather Jane Smith was here

CREATE FUNCTION [dbo].[udf-Str-Strip-Control](@S varchar(max))
Returns varchar(max)
Begin
    ;with  cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
           cte2(C) As (Select Top (32) Char(Row_Number() over (Order By (Select NULL))-1) From cte1 a,cte1 b)
    Select @S = Replace(@S,C,' ')
     From  cte2

    Return ltrim(rtrim(replace(replace(replace(@S,' ','†‡'),'‡†',''),'†‡',' ')))
End
--Select [dbo].[udf-Str-Strip-Control]('Michael        '+char(13)+char(10)+'LastName')  --Returns: Michael LastName

Replacing non-ascii characters in MSSQL

You cannot do it any other way than the old-fashioned, "hard" way (in any language, even, not only SQL).

Since in many (spoken/written) languages, accented characters are not the same as non-accented ones, it's actually just a visual similarity, so there is no true correspondance. Some letters can look like just the same with a symbol added, but actually have a completely different "sound" and different rules (example).

Just build a table, or array, or whatever of the correspondances you know, or find, or can think of.


char | replacement
------------------
à    | a
è    | e
é    | e
ñ    | n
ç    | c
ß    | B
...

Then, loop on your data and replacement chars to replace all characters by their correspondance.

How to remove accents and all chars a..z in sql-server?

You can avoid hard-coded REPLACE statements by using a COLLATE clause with an accent-insensitive collation to compare the accented alphabetic characters to non-alphabetic ones:

DECLARE 
  @s1 NVARCHAR(200),
  @s2 NVARCHAR(200)

SET @s1 = N'aèàç=.32s df' 

SET @s2 = N''
SELECT @s2 = @s2 + no_accent 
FROM ( 
  SELECT 
    SUBSTRING(@s1, number, 1) AS accent,
    number
  FROM master.dbo.spt_values 
  WHERE TYPE = 'P'
  AND number BETWEEN 1 AND LEN(@s1) 
) s1 
INNER JOIN (
  SELECT NCHAR(number) AS no_accent
  FROM master.dbo.spt_values 
  WHERE type = 'P'
  AND (number BETWEEN 65 AND 90 OR number BETWEEN 97 AND 122) 
) s2 
  ON s1.accent COLLATE LATIN1_GENERAL_CS_AI = s2.no_accent 
ORDER BY number

SELECT @s1 
SELECT @s2 

/*
aèàç=.32s df
aeacsdf
*/

Replace Unicode characters in T-SQL

The þ character (Extended ASCII { via ISO-8859-1 and ANSI Code Page 1252 } & UNICODE value of 254) is known as "thorn" and in some languages equates directly to th:

Technical info on the character here: http://unicode-table.com/en/00FE/
Explanation of that character and collations here: http://userguide.icu-project.org/collation/customization. Search the page — typically Control-F — for "Complex Tailoring Examples" and you will see the following:
The letter 'þ' (THORN) is normally treated by UCA/root collation as a separate letter that has primary-level sorting after 'z'. However, in Swedish and some other Scandinavian languages, 'þ' and 'Þ' should be treated as just a tertiary-level difference from the letters "th" and "TH" respectively.

If you do not want þ to equate to th, then force a Binary collation as follows:

SELECT REPLACE(N'this is the news with a þ' COLLATE Latin1_General_100_BIN2,
                 N'þ', N'__');

Returns:

this is the news with a __

For more info on working with Collations, Unicode, encodings, etc, please visit: Collations Info