Remove weird characters ( A with hat) from SQL Server varchar column
You can use .net
regular expression functions. For example, using Regex.Replace
:
Regex.Replace(s, @"[^\u0000-\u007F]", string.Empty);
As there is no support for regular expressions in SQL Server
you need to create a SQL CLR
function. More information about the .net
integration in SQL Server
can be found here:
- String Utility Functions Sample - full working examples
- Stairway to SQLCLR - still in progress
- Introduction to SQL Server CLR Integration - official documentation
In your case:
Open
Visual Studio
and createClass Library Project
:Then rename the class to
StackOverflow
and paste the following code in its file:using Microsoft.SqlServer.Server;
using System;
using System.Collections.Generic;
using System.Data.SqlTypes;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
public class StackOverflow
{
[SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, Name = "RegexReplace")]
public static SqlString Replace(SqlString sqlInput, SqlString sqlPattern, SqlString sqlReplacement)
{
string input = (sqlInput.IsNull) ? string.Empty : sqlInput.Value;
string pattern = (sqlPattern.IsNull) ? string.Empty : sqlPattern.Value;
string replacement = (sqlReplacement.IsNull) ? string.Empty : sqlReplacement.Value;
return new SqlString(Regex.Replace(input, pattern, replacement));
}
}Now, build the project. Open the
SQL Server Management Studio
. Select your database and replace the path value of the followingFROM
clause to match yourStackOverflow.dll
:CREATE ASSEMBLY [StackOverflow] FROM 'C:\Users\gotqn\Desktop\StackOverflow\StackOverflow\bin\Debug\StackOverflow.dll';
Finally, create the
SQL CLR
function:CREATE FUNCTION [dbo].[StackOverflowRegexReplace] (@input NVARCHAR(MAX),@pattern NVARCHAR(MAX), @replacement NVARCHAR(MAX))
RETURNS NVARCHAR(4000)
AS EXTERNAL NAME [StackOverflow].[StackOverflow].[Replace]
GO
You are ready to use RegexReplace
.net
function directly in your T-SQL
statements:
SELECT [dbo].[StackOverflowRegexReplace] ('Hello Kitty Essential Accessory Kit', '[^\u0000-\u007F]', '')
//Hello Kitty Essential Accessory Kit
How to remove all special characters from a column name string using regular expressions
Try This:
DECLARE @str VARCHAR(400)='S$d#@gh'
DECLARE @expres VARCHAR(50) = '%[~,@,#,$,%,&,*,(,),.,!]%'
WHILE PATINDEX( @expres, @str ) > 0
SET @str = Replace(REPLACE( @str, SUBSTRING( @str, PATINDEX( @expres, @str ), 1 ),''),'-',' ')
SELECT @str
You can Create a SCALAR FUNCTION & pass this Column
CREATE FUNCTION dbo.Remove_SpecialCharacters( @str VARCHAR(MAX))
RETURNS VARCHAR(MAX) AS
BEGIN
DECLARE @expres VARCHAR(50) = '%[~,@,#,$,%,&,*,(,),.,!]%'
WHILE PATINDEX( @expres, @str ) > 0
SET @str = Replace(REPLACE( @str, SUBSTRING( @str, PATINDEX( @expres, @str ), 1 ),''),'-',' ')
RETURN @str
END
O/P
SELECT dbo.Remove_SpecialCharacters(COLUMNNAME),* FROM TABLENAME
How to remove accents and all chars a..z in sql-server?
You can avoid hard-coded REPLACE
statements by using a COLLATE
clause with an accent-insensitive collation to compare the accented alphabetic characters to non-alphabetic ones:
DECLARE
@s1 NVARCHAR(200),
@s2 NVARCHAR(200)
SET @s1 = N'aèàç=.32s df'
SET @s2 = N''
SELECT @s2 = @s2 + no_accent
FROM (
SELECT
SUBSTRING(@s1, number, 1) AS accent,
number
FROM master.dbo.spt_values
WHERE TYPE = 'P'
AND number BETWEEN 1 AND LEN(@s1)
) s1
INNER JOIN (
SELECT NCHAR(number) AS no_accent
FROM master.dbo.spt_values
WHERE type = 'P'
AND (number BETWEEN 65 AND 90 OR number BETWEEN 97 AND 122)
) s2
ON s1.accent COLLATE LATIN1_GENERAL_CS_AI = s2.no_accent
ORDER BY number
SELECT @s1
SELECT @s2
/*
aèàç=.32s df
aeacsdf
*/
Select query to remove non-numeric characters
See this blog post on extracting numbers from strings in SQL Server. Below is a sample using a string in your example:
DECLARE @textval NVARCHAR(30)
SET @textval = 'AB ABCDE # 123'
SELECT LEFT(SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000),
PATINDEX('%[^0-9.-]%', SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000) + 'X') -1)
Remove special characters from a database field
update mytable
set FieldName = REPLACE(FieldName,'/','')
That's a good place to start.
PL/SQL code to remove all the special characters from a particular column of a table
You can use below SQL query to get this output:
with raw_Text as (select '3890,john,melbourne, ,developer,45' r from dual union all select '2895,david,sydney,ana''lyst,61' from dual union all select '5198,kelly,perth,mana@ger,78' from dual
union all select '7071,tim,canberra,tes/ter,61' from dual
union all select '8132,mike,brisbane,leader,51' from dual ),
cleaned_text as(select regexp_replace(regexp_replace(r,'[^A-Za-z0-9,]',''),',,',',') ct from raw_text)
select substr(ct, 1,instr(ct,',',1,1)-1) id,
substr(ct,instr(ct,',',1,1)+1,instr(ct,',',1,2)-instr(ct,',',1,1)-1) name,
substr(ct,instr(ct,',',1,2)+1,instr(ct,',',1,3)-instr(ct,',',1,2)-1) location,
substr(ct,instr(ct,',',1,3)+1,instr(ct,',',1,4)-instr(ct,',',1,3)-1) profession,
substr(ct,instr(ct,',',1,4)+1) age
from cleaned_text```
--Edit to add PL/SQL code:
DECLARE
F UTL_FILE.FILE_TYPE;
V_LINE VARCHAR2 (32767);
V_id VARCHAR2(1000);
V_name VARCHAR2(1000);
V_location VARCHAR2(1000);
V_profession VARCHAR2(1000);
V_age VARCHAR2(1000);
v_ct varchar2(32767);
BEGIN
F := UTL_FILE.FOPEN ('DATA_PUMP_DIR', 'person_list.CSV', 'R');
IF UTL_FILE.IS_OPEN(F) THEN
LOOP
BEGIN
UTL_FILE.GET_LINE(F, V_LINE, 1000);
IF V_LINE IS NULL THEN
EXIT;
END IF;
v_ct:= regexp_replace(regexp_replace(V_LINE,'[^A-Za-z0-9,]',''),',,',',') ;
v_Id:=substr(ct, 1,instr(ct,',',1,1)-1) ;
v_name:=substr(ct,instr(ct,',',1,1)+1,instr(ct,',',1,2)-instr(ct,',',1,1)-1) ;
v_location:=substr(ct,instr(ct,',',1,2)+1,instr(ct,',',1,3)-instr(ct,',',1,2)-1) ;
v_profession:=substr(ct,instr(ct,',',1,3)+1,instr(ct,',',1,4)-instr(ct,',',1,3)-1) ;
v_age:=substr(ct,instr(ct,',',1,4)+1);
INSERT INTO person_list
values(v_id,v_name,v_location,v_profession,v_age);
COMMIT;
EXCEPTION
WHEN NO_DATA_FOUND THEN
EXIT;
END;
END LOOP;
END IF;
UTL_FILE.FCLOSE(F);
END;
/
How to strip all non-alphabetic characters from string in SQL Server?
Try this function:
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-z]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
Return @Temp
End
Call it like this:
Select dbo.RemoveNonAlphaCharacters('abc1234def5678ghi90jkl')
Once you understand the code, you should see that it is relatively simple to change it to remove other characters, too. You could even make this dynamic enough to pass in your search pattern.
How can I remove accents on a string?
Try using COLLATE
:
select 'áéíóú' collate SQL_Latin1_General_Cp1251_CS_AS
For Unicode data, try the following:
select cast(N'áéíóú' as varchar(max)) collate SQL_Latin1_General_Cp1251_CS_AS
I am not sure what you may lose in the translation when using the second approach.
Update
It looks like œ
is a special case, and we have to handle upper and lower case separately. You can do it like this (this code is a good candidate for a user-defined function):
declare @str nvarchar(max) = N'ñaàeéêèioô; Œuf un œuf'
select cast(
replace((
replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
) collate SQL_Latin1_General_Cp1251_CS_AS
-- Output:
-- naaeeeeioo; Oeuf un oeuf
User Defined Function
create function dbo.fnRemoveAccents(@str nvarchar(max))
returns varchar(max) as
begin
return cast(
replace((
replace(@str collate Latin1_General_CS_AS, 'Œ' collate Latin1_General_CS_AS, 'OE' collate Latin1_General_CS_AS)
) collate Latin1_General_CS_AS, 'œ' collate Latin1_General_CS_AS, 'oe' collate Latin1_General_CS_AS) as varchar(max)
) collate SQL_Latin1_General_Cp1251_CS_AS
end
How can I see all the special characters permissible in a varchar or char field in SQL Server?
You probably just need to see the ASCII
and EXTENDED ASCII
character sets. As far as I know any of these are allowed in a char
/varchar
field.
If you use nchar
/nvarchar
then it's pretty much any character in any unicode set in the world.
Related Topics
"The Controls Collection Cannot Be Modified Because the Control Contains Code Blocks"
Conditional Compilation and Framework Targets
How to Add System.Windows.Interactivity to Project
How to Get the Full Path of Running Process
How to Get Windows Display Settings
How to Decompile a .Net Exe into Readable C# Source Code
ASP.NET MVC Custom Error Handling Application_Error Global.Asax
How to Use Reflection to Inspect the Code in a Method
Easiest Way to Parse JSON Response
How to Shut Down the Computer from C#
Serialize Class Containing Dictionary Member
How to Get a List of Users from Active Directory
How to Use Visual Studio Code to Develop Unity3D Projects in Ubuntu