Remove Weird Characters ( a with Hat) from SQL Server Varchar Column

Remove weird characters ( A with hat) from SQL Server varchar column

You can use .net regular expression functions. For example, using Regex.Replace:

Regex.Replace(s, @"[^\u0000-\u007F]", string.Empty);

As there is no support for regular expressions in SQL Server you need to create a SQL CLR function. More information about the .net integration in SQL Server can be found here:

  • String Utility Functions Sample - full working examples
  • Stairway to SQLCLR - still in progress
  • Introduction to SQL Server CLR Integration - official documentation

In your case:

  1. Open Visual Studio and create Class Library Project:

    Sample Image

  2. Then rename the class to StackOverflow and paste the following code in its file:

    using Microsoft.SqlServer.Server;
    using System;
    using System.Collections.Generic;
    using System.Data.SqlTypes;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.Threading.Tasks;

    public class StackOverflow
    {
    [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, Name = "RegexReplace")]
    public static SqlString Replace(SqlString sqlInput, SqlString sqlPattern, SqlString sqlReplacement)
    {
    string input = (sqlInput.IsNull) ? string.Empty : sqlInput.Value;
    string pattern = (sqlPattern.IsNull) ? string.Empty : sqlPattern.Value;
    string replacement = (sqlReplacement.IsNull) ? string.Empty : sqlReplacement.Value;
    return new SqlString(Regex.Replace(input, pattern, replacement));
    }
    }
  3. Now, build the project. Open the SQL Server Management Studio. Select your database and replace the path value of the following FROM clause to match your StackOverflow.dll:

    CREATE ASSEMBLY [StackOverflow] FROM 'C:\Users\gotqn\Desktop\StackOverflow\StackOverflow\bin\Debug\StackOverflow.dll';
  4. Finally, create the SQL CLR function:

    CREATE FUNCTION [dbo].[StackOverflowRegexReplace] (@input NVARCHAR(MAX),@pattern NVARCHAR(MAX), @replacement NVARCHAR(MAX))
    RETURNS NVARCHAR(4000)
    AS EXTERNAL NAME [StackOverflow].[StackOverflow].[Replace]
    GO

You are ready to use RegexReplace .net function directly in your T-SQL statements:

    SELECT [dbo].[StackOverflowRegexReplace] ('Hello Kitty Essential Accessory Kit', '[^\u0000-\u007F]', '')

//Hello Kitty Essential Accessory Kit

How to remove all special characters from a column name string using regular expressions

Try This:

DECLARE @str VARCHAR(400)='S$d#@gh'
DECLARE @expres VARCHAR(50) = '%[~,@,#,$,%,&,*,(,),.,!]%'

WHILE PATINDEX( @expres, @str ) > 0
SET @str = Replace(REPLACE( @str, SUBSTRING( @str, PATINDEX( @expres, @str ), 1 ),''),'-',' ')
SELECT @str

You can Create a SCALAR FUNCTION & pass this Column

CREATE FUNCTION dbo.Remove_SpecialCharacters( @str VARCHAR(MAX))
RETURNS VARCHAR(MAX) AS
BEGIN
DECLARE @expres VARCHAR(50) = '%[~,@,#,$,%,&,*,(,),.,!]%'

WHILE PATINDEX( @expres, @str ) > 0
SET @str = Replace(REPLACE( @str, SUBSTRING( @str, PATINDEX( @expres, @str ), 1 ),''),'-',' ')
RETURN @str
END

O/P

SELECT dbo.Remove_SpecialCharacters(COLUMNNAME),* FROM TABLENAME

How to remove accents and all chars a..z in sql-server?

You can avoid hard-coded REPLACE statements by using a COLLATE clause with an accent-insensitive collation to compare the accented alphabetic characters to non-alphabetic ones:

DECLARE 
@s1 NVARCHAR(200),
@s2 NVARCHAR(200)

SET @s1 = N'aèàç=.32s df'

SET @s2 = N''
SELECT @s2 = @s2 + no_accent
FROM (
SELECT
SUBSTRING(@s1, number, 1) AS accent,
number
FROM master.dbo.spt_values
WHERE TYPE = 'P'
AND number BETWEEN 1 AND LEN(@s1)
) s1
INNER JOIN (
SELECT NCHAR(number) AS no_accent
FROM master.dbo.spt_values
WHERE type = 'P'
AND (number BETWEEN 65 AND 90 OR number BETWEEN 97 AND 122)
) s2
ON s1.accent COLLATE LATIN1_GENERAL_CS_AI = s2.no_accent
ORDER BY number

SELECT @s1
SELECT @s2

/*
aèàç=.32s df
aeacsdf
*/

Select query to remove non-numeric characters

See this blog post on extracting numbers from strings in SQL Server. Below is a sample using a string in your example:

DECLARE @textval NVARCHAR(30)
SET @textval = 'AB ABCDE # 123'

SELECT LEFT(SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000),
PATINDEX('%[^0-9.-]%', SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000) + 'X') -1)

Remove special characters from a database field

update mytable
set FieldName = REPLACE(FieldName,'/','')

That's a good place to start.

PL/SQL code to remove all the special characters from a particular column of a table

You can use below SQL query to get this output:

with raw_Text as (select '3890,john,melbourne,  ,developer,45' r from dual  union all select '2895,david,sydney,ana''lyst,61' from dual union all select '5198,kelly,perth,mana@ger,78' from dual
union all select '7071,tim,canberra,tes/ter,61' from dual
union all select '8132,mike,brisbane,leader,51' from dual ),
cleaned_text as(select regexp_replace(regexp_replace(r,'[^A-Za-z0-9,]',''),',,',',') ct from raw_text)
select substr(ct, 1,instr(ct,',',1,1)-1) id,
substr(ct,instr(ct,',',1,1)+1,instr(ct,',',1,2)-instr(ct,',',1,1)-1) name,
substr(ct,instr(ct,',',1,2)+1,instr(ct,',',1,3)-instr(ct,',',1,2)-1) location,
substr(ct,instr(ct,',',1,3)+1,instr(ct,',',1,4)-instr(ct,',',1,3)-1) profession,
substr(ct,instr(ct,',',1,4)+1) age
from cleaned_text```

--Edit to add PL/SQL code:
DECLARE
F UTL_FILE.FILE_TYPE;
V_LINE VARCHAR2 (32767);
V_id VARCHAR2(1000);
V_name VARCHAR2(1000);
V_location VARCHAR2(1000);
V_profession VARCHAR2(1000);
V_age VARCHAR2(1000);
v_ct varchar2(32767);

BEGIN
F := UTL_FILE.FOPEN ('DATA_PUMP_DIR', 'person_list.CSV', 'R');
IF UTL_FILE.IS_OPEN(F) THEN
LOOP
BEGIN
UTL_FILE.GET_LINE(F, V_LINE, 1000);
IF V_LINE IS NULL THEN
EXIT;
END IF;

v_ct:= regexp_replace(regexp_replace(V_LINE,'[^A-Za-z0-9,]',''),',,',',') ;
v_Id:=substr(ct, 1,instr(ct,',',1,1)-1) ;
v_name:=substr(ct,instr(ct,',',1,1)+1,instr(ct,',',1,2)-instr(ct,',',1,1)-1) ;
v_location:=substr(ct,instr(ct,',',1,2)+1,instr(ct,',',1,3)-instr(ct,',',1,2)-1) ;
v_profession:=substr(ct,instr(ct,',',1,3)+1,instr(ct,',',1,4)-instr(ct,',',1,3)-1) ;
v_age:=substr(ct,instr(ct,',',1,4)+1);

INSERT INTO person_list
values(v_id,v_name,v_location,v_profession,v_age);
COMMIT;
EXCEPTION
WHEN NO_DATA_FOUND THEN
EXIT;
END;
END LOOP;
END IF;
UTL_FILE.FCLOSE(F);
END;
/

How to strip all non-alphabetic characters from string in SQL Server?

Try this function:

Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin

Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-z]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')

Return @Temp
End

Call it like this:

Select dbo.RemoveNonAlphaCharacters('abc1234def5678ghi90jkl')

Once you understand the code, you should see that it is relatively simple to change it to remove other characters, too. You could even make this dynamic enough to pass in your search pattern.

How can I remove accents on a string?

Try using COLLATE:

select 'áéíóú' collate SQL_Latin1_General_Cp1251_CS_AS

For Unicode data, try the following:

select cast(N'áéíóú' as varchar(max)) collate SQL_Latin1_General_Cp1251_CS_AS

I am not sure what you may lose in the translation when using the second approach.

Update

It looks like œ is a special case, and we have to handle upper and lower case separately. You can do it like this (this code is a good candidate for a user-defined function):

declare @str nvarchar(max) = N'ñaàeéêèioô; Œuf un œuf'
select cast(
replace((
replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
) collate SQL_Latin1_General_Cp1251_CS_AS
-- Output:
-- naaeeeeioo; Oeuf un oeuf

User Defined Function

create function dbo.fnRemoveAccents(@str nvarchar(max))  
returns varchar(max) as
begin
return cast(
replace((
replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
) collate SQL_Latin1_General_Cp1251_CS_AS
end

How can I see all the special characters permissible in a varchar or char field in SQL Server?

You probably just need to see the ASCII and EXTENDED ASCII character sets. As far as I know any of these are allowed in a char/varchar field.

If you use nchar/nvarchar then it's pretty much any character in any unicode set in the world.

Sample Image

Sample Image



Related Topics



Leave a reply



Submit