How to remove all special characters in Linux text
Remove everything except the printable characters (character class [:print:]
), with sed
:
sed $'s/[^[:print:]\t]//g' file.txt
[:print:]
includes:
[:alnum:]
(alpha-numerics)[:punct:]
(punctuations)- space
The ANSI C quoting ($''
) is used for interpreting \t
as literal tab inside $''
(in bash
and alike).
Remove all special characters and case from string in bash
cat yourfile.txt | tr -dc '[:alnum:]\n\r' | tr '[:upper:]' '[:lower:]'
The first tr
deletes special characters. d
means delete, c
means complement (invert the character set). So, -dc
means delete all characters except those specified. The \n
and \r
are included to preserve linux or windows style newlines, which I assume you want.
The second one translates uppercase characters to lowercase.
Removing all special characters from a string in Bash
You can use tr
to print only the printable characters from a string like below. Just use the below command on your input file.
tr -cd "[:print:]\n" < file1
The flag -d
is meant to the delete the character sets defined in the arguments on the input stream, and -c
is for complementing those (invert what's provided). So without -c
the command would delete all printable characters from the input stream and using it complements it by removing the non-printable characters. We also keep the newline character \n
to preserve the line endings in the input file. Removing it would just produce the final output in one big line.
The [:print:]
is just a POSIX bracket expression which is a combination of expressions [:alnum:]
, [:punct:]
and space. The [:alnum:]
is same as [0-9A-Za-z]
and [:punct:]
includes characters !
"
#
$
%
&
'
(
)
*
+
,
-
.
/
:
;
<
=
>
?
@
[
\
]
^
_
`
{
|
}
~
how to remove special characters using sed
If you want sed
to not consider e.g. Arabic characters to be alphabetic (which they are), you need to set a locale that does not consider them thus.
The "C" locale only considers the basic character set, i.e. only [A-Za-z]
are alphabetic. I am assuming what you want is to delete everything that's not a character from that range (your question is fuzzy about what you really want):
echo -e "A \xd8\xa8" | LC_CTYPE=C sed -r "s/[^[:alpha:]]//g" | hexdump -C
Output:
00000000 41 0a
00000002
Remove special Characters from a specific field
Assuming you want to remove all characters that are not upper case or lower case letters or digits ([A-Za-z0-9]
) from the last field of every line you can use
awk -F '|' -v 'OFS=|' '{ gsub(/[^A-Za-z0-9]/,"",$NF); print}' inputfile > outputfile
From the input line in the question this creates exactly the requested output line.
How to remove special characters from text file
Remove characters that are not within the ascii table (11,12,40-176)
\11 = tab
\12 = new line
\40-176 = ( to ~ this range includes all letters and symbols present in the keyboard
cat test.txt | tr -cd '\11\12\40-\176' > temp && mv temp test.txt
NOTE: If your data has special characters that are not in the ascii table, they might be removed as well
Remove all special characters even 'éèô' from string
The simplest would be to run the command with the C
locale:
echo "SamPlE_@tExT%, reééééally ?" | LANG=C sed 's/[^a-zA-Z]//g'
Output:
SamPlEtExTreally
how to remove the special characters from a variable using shell
The following solution uses the tr
command:
$ str=`echo '"#$hello,)&^this I!s> m@ani: /& "'`
$ echo $str | tr -cd "[:alnum:]\"\n"
"hellothisIsmani"
All letters and digits, all " and new lines are allowed. If you want more or less characters to be allowed change the command.
Related Topics
Gurus Say That Ld_Library_Path Is Bad - What's the Alternative
Loading Multiple Shared Libraries with Different Versions
How to Runtime Debug Shared Libraries
How to Automatically Start a Node.Js Application in Amazon Linux Ami on Aws
Simulate a Faulty Block Device with Read Errors
Identifying Received Signal Name in Bash
Postgresql -Bash: Psql: Command Not Found
Installing R from Cran Ubuntu Repository: No Public Key Error
Error with Gradlew: /Usr/Bin/Env: Bash: No Such File or Directory
What Should Linux/Unix 'Make Install' Consist Of
Shell Script to Get the Process Id on Linux
Fuzzy File Search in Linux Console
Print Date for the Monday of the Current Week (In Bash)
Matlab on Linux Can't Plot Anything(Can't Load Libstdc++.So.6: Version 'Cxxabi_1.3.8' Not Found)