How to Strip All Non-Alphabetic Characters from String in SQL Server

How to strip all non-alphabetic characters from string in SQL Server?

Try this function:

Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin

Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-z]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')

Return @Temp
End

Call it like this:

Select dbo.RemoveNonAlphaCharacters('abc1234def5678ghi90jkl')

Once you understand the code, you should see that it is relatively simple to change it to remove other characters, too. You could even make this dynamic enough to pass in your search pattern.

Function to remove all Non-alpha-numeric characters, superscripts, and subscripts, except a dash '-'

Check this link. This removes all alpha numeric characters. You can include '-' also to the included list.

How to strip all non-alphabetic characters from string in SQL Server?

In this example for the answer from @George Mastros, use '%[^a-zA-Z0-9-]%' for regular expression instead of '%[^a-z]%'

Here is the reformatted function to include '-' and numeric characters:

-- Reformatted function
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS

Begin
Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-zA-Z0-9\-]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')

Return @Temp
End

--Call function


Select dbo.RemoveNonAlphaCharacters('Hello® World™ -123 !@#$%^')

OUTPUT: HelloWorld-123

How to remove all non-alphabetic characters from a column in SQL?

Firstly if this is for peoples' names then this is a terrible idea.

Globally names can certainly contain spaces, punctuation characters, accented characters and non Latin characters (and even numbers).

Nonetheless there may be other legitimate use cases for wanting to "remove all non-alphabetic characters from a column"

@lptr posted an interesting idea in the comments that heavily inspires this answer.

First use TRANSLATE to get a list of characters that need to be cleaned. Then call TRANSLATE again using this list and replacing them all with a space. Finally remove all spaces and convert to uppercase as desired in your case.

DECLARE @t TABLE
(
colX VARCHAR(100)
);

INSERT INTO @t
(colX)
VALUES ('@#£ab£cd&123x/=+xz'),
('%-+=/;:,.abc@#£&*()'),
('abc@#£&*() z')

SELECT *,
Cleaned = UPPER(REPLACE(translate(colx, bad_chars, SPACE(len(bad_chars))), ' ', ''))
FROM @t
CROSS APPLY (VALUES(replace(translate(colx, 'abcdefghijklmnopqrstuvwxyz' COLLATE Latin1_General_100_CI_AS, replicate('a', 26)), 'a', '') + '~')) V(bad_chars)

Returns

+---------------------+-------------------+---------+
| colX | bad_chars | Cleaned |
+---------------------+-------------------+---------+
| @#£ab£cd&123x/=+xz | @#££&123/=+~ | ABCDXXZ |
| %-+=/;:,.abc@#£&*() | %-+=/;:,.@#£&*()~ | ABC |
| abc@#£&*() z | @#£&*() ~ | ABCZ |
+---------------------+-------------------+---------+

Replace non-alpha characters only from the beginning of a string

A pattern could be used to replace non-alphabetic characters from a string as below:

Declare @Temp as varchar(50)
set @Temp = '6Hai'
Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-z]%'
While PatIndex(@KeepValues, @Temp) = 1
Set @Temp = Stuff(@Temp, 1, 1, '')

select @Temp

Or a function could be defined as :

Create Function [dbo].[RemoveNonAlphaCharactersFromStart](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin

Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-z]%'
While PatIndex(@KeepValues, @Temp) = 1
Set @Temp = Stuff(@Temp, 1, 1, '')

Return @Temp
End

and invoke the function as :

Select dbo.RemoveNonAlphaCharactersFromStart('6Hai')

Refer this answer to remove all non-alpha characters : https://stackoverflow.com/a/1008566/7804477

I have updated the function such that only the non-alphabetic characters from the beginning are removed:

Stuff and PatIndex are built-in functions.

This is the crucial condition to check only at the beginning of the string: While PatIndex(@KeepValues, @Temp) = 1

Stuff(@Temp, 1, 1, '') - this replaces the first character in the string with an empty character ''. (the second argument is the starting position and the third argument denotes the number of characters to replace)

T-SQL strip all non-alpha and non-numeric characters

One flexible-ish way;

CREATE FUNCTION [dbo].[fnRemovePatternFromString](@BUFFER VARCHAR(MAX), @PATTERN VARCHAR(128)) RETURNS VARCHAR(MAX) AS
BEGIN
DECLARE @POS INT = PATINDEX(@PATTERN, @BUFFER)
WHILE @POS > 0 BEGIN
SET @BUFFER = STUFF(@BUFFER, @POS, 1, '')
SET @POS = PATINDEX(@PATTERN, @BUFFER)
END
RETURN @BUFFER
END

select dbo.fnRemovePatternFromString('cake & beer $3.99!?c', '%[$&.!?]%')

(No column name)
cake beer 399c

How to remove non-alphanumeric characters in SQL using Regex methods?

You could use REGEXP_REPLACE with a regex of [^a-z0-9]+$, replacing that with nothing. The regex will match any number of non-alphanumeric characters before the end of the string.

SELECT REGEXP_REPLACE(my_column, '[^a-z0-9]+$', '', 1, 'i')
FROM my_table

SQL Server : how to remove leading/trailing non-alphanumeric characters from string?

Well, if you know they are only at the beginning and end, you can do:

with t as (
select *
from (values ('www.google.com'), ('''www.google.com'''), ('/www.google.com')) v(text)
)
select t.text, v2.text2
from t cross apply
(values (stuff(t.text, 1, patindex('%[a-zA-Z0-9]%', t.text) - 1, ''))
) v(text1) cross apply
(values (case when v.text1 like '%[^a-zA-Z0-9]'
then stuff(v.text1, len(text) + 1 - patindex('%[a-zA-Z0-9]%', reverse(v.text1)), len(v.text1), '')
else v.text1
end)
) v2(text2);

Here is a db<>fiddle.



Related Topics



Leave a reply



Submit