How to Remove Special Characters in File Names

How to remove special characters in file names?

Your rename would work if you added the g modifier, which performs all substitutions instead of only the first one:

$ echo "$file"
foo bar,spam.egg

$ rename -n 's/[^a-zA-Z0-9_-]//' "$file"
foo bar,spam.egg renamed as foobar,spam.egg

$ rename -n 's/[^a-zA-Z0-9_-]//g' "$file"
foo bar,spam.egg renamed as foobarspamegg

You can do this with bash alone, using parameter expansion:

  • To remove everything except letters, digits, _ and - from a file name stored in the variable file, use the character class [:alnum:], which matches all alphabetic characters and digits in the current locale:

    "${file//[^[:alnum:]_-]/}"

    or, with LC_COLLATE set to C, list the ranges explicitly:

    "${file//[^a-zA-Z0-9_-]/}"

Example:

$ file='foo bar,spam.egg'

$ echo "${file//[^[:alnum:]_-]/}"
foobarspamegg
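
The ${var//pattern/} expansion is a bash feature and is not part of POSIX sh. Where a script has to run under a plain sh, roughly the same result can be sketched with tr. The helper name sanitize_name is hypothetical, and tr is swapped in for the parameter expansion shown above:

```shell
# sanitize_name: hypothetical helper (not from the answers above).
# Prints its first argument with every character outside
# a-z, A-Z, 0-9, _ and - removed.
sanitize_name() {
    # LC_ALL=C makes the ranges byte-based, whatever the current locale is.
    # -c complements the set, -d deletes everything in the complement.
    printf '%s' "$1" | LC_ALL=C tr -cd 'a-zA-Z0-9_-'
}

sanitize_name 'foo bar,spam.egg'   # prints foobarspamegg (no trailing newline)
```

Unlike the [:alnum:] form, this version always drops non-ASCII letters, which may or may not be what you want.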

How to remove illegal characters from path and filenames?

Try something like this instead:

using System.IO;

string illegal = "\"M\"\\a/ry/ h**ad:>> a\\/:*?\"| li*tt|le|| la\"mb.?";
// Every character that is invalid in a file name or in a path on the current platform.
string invalid = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());

foreach (char c in invalid)
{
    illegal = illegal.Replace(c.ToString(), "");
}

But I have to agree with the comments, I'd probably try to deal with the source of the illegal paths, rather than try to mangle an illegal path into a legitimate but probably unintended one.

Edit: Or a potentially 'better' solution, using a regex:

using System.IO;
using System.Text.RegularExpressions;
string illegal = "\"M\"\\a/ry/ h**ad:>> a\\/:*?\"| li*tt|le|| la\"mb.?";
string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
// Regex.Escape keeps metacharacters such as \ literal inside the character class.
Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
illegal = r.Replace(illegal, "");

Still, the question remains why you're doing this in the first place.

Remove illegal characters from a file name but leave spaces

To replace the illegal characters (/ \ ? % * : | " < >), use the regex /[/\\?%*:|"<>]/g like this:

var filename = "f?:i/le>  n%a|m\\e.ext";
filename = filename.replace(/[/\\?%*:|"<>]/g, '-');
console.log(filename);

Remove special chars from filenames

Based on the answer of @matias-barrios, I wrote my own solution:

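#!/bin/bash
# Strip every character except a-z, A-Z, 0-9, '-', '_' and '.' from the names
# of all files and directories below the current directory.
# -depth lists a directory's contents before the directory itself, so children
# are renamed before their parents. Names containing newlines are not handled,
# and there is no check for two names collapsing into the same result.
find . -mindepth 1 -depth |
while IFS= read -r path; do
    dirName=$(dirname "$path")
    fileName=$(basename "$path")
    newFileName="$dirName/$(printf '%s' "$fileName" | tr -cd 'a-zA-Z0-9._-')"
    # Skip names that need no change or that would become empty.
    if [ "$path" = "$newFileName" ] || [ "$newFileName" = "$dirName/" ]; then continue; fi
    echo "From: $path"
    echo "To: $newFileName"
    mv "$path" "$newFileName"
done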

How to mass remove files that contain special characters in file name

This will delete every file whose name ends in (1), recursively:

find . -name '*(1)' -exec rm {} +
  • Use -name '*(1) (1)' to delete only files whose names end in (1) (1).
  • -name '*([0-9])' matches any single digit in the parentheses.
  • Use find . -mindepth 1 -maxdepth 1 -name '*(1)' -exec rm {} + for no recursion.
  • I would first run find . -name '*(1)' -exec echo rm {} \; to print a neat list of the rm commands to be executed. Review it, then remove echo to run for real.
  • Changing \; back to + is optional, but + is more efficient because it passes many file names to a single rm.
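
To see which names these patterns actually match before pointing them at real data, the commands can be tried in a throwaway directory. The following is only a sketch: the directory and file names are made up for the demonstration, and -type f is an extra restriction to regular files that the commands above do not use.

```shell
# Scratch directory, so the demo cannot touch real files.
demo_dir=$(mktemp -d)
touch "$demo_dir/report" "$demo_dir/report(1)" \
      "$demo_dir/report(2)" "$demo_dir/report(1) (1)"

# Dry run: print what '*(1)' matches instead of deleting it.
find "$demo_dir" -type f -name '*(1)' -print

# Delete every regular file whose name ends in one digit in parentheses.
find "$demo_dir" -type f -name '*([0-9])' -exec rm -- {} +

# Only the file without a "(n)" suffix is left.
remaining=$(ls "$demo_dir")
echo "$remaining"   # prints: report

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

The parentheses need no escaping inside the quoted -name pattern, because find treats them as literal characters.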

I want to remove special characters from File name without affecting extension in c#

Use Path to split the file name:

using System.IO;
using System.Text.RegularExpressions;

var fileName = "Hello%@Im&an#Full-Stack+.Developer.pdf";
var fileNameWoExt = Path.GetFileNameWithoutExtension(fileName);
var ext = Path.GetExtension(fileName);
// Replace every non-word character in the base name with an underscore.
fileNameWoExt = Regex.Replace(fileNameWoExt, @"[^\w]", "_");
var result = fileNameWoExt + ext;
// "Hello__Im_an_Full_Stack__Developer.pdf"

Regex remove special characters in filename except extension

You can match any characters other than word and dot characters with [^\w.], and any dot that is not the last dot in the string with \.(?![^.]+$), then replace each run of matches with a hyphen:

filename = filename.replace(/(?:\.(?![^.]+$)|[^\w.])+/g, "-");

Details

  • (?: - start of a non-capturing group:

    • \.(?![^.]+$) - any dot not followed with 1+ non-dot chars at the end of the string
    • | - or
    • [^\w.] - any char other than a word char and a dot char
  • )+ - end of the group, repeat 1 or more times.

Another solution (if extensions are always present): split out the extension, run your simpler regex on the first chunk then join back:

var filename = "manuel fernandex – Index Prot.bla.otype 5 (pepito grillo).jpg";
var ext = filename.substr(filename.lastIndexOf('.') + 1);
var name = filename.substr(0, filename.lastIndexOf('.'));
console.log(name.replace(/\W+/g, "-") + "." + ext);
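
The same split, clean, and rejoin idea can be sketched in POSIX shell for use in rename scripts. The function name sanitize_keep_ext is hypothetical, and tr stands in for the regex. In the C locale it treats the input as bytes, so a multi-byte character such as the en dash above collapses into a single hyphen along with its neighbours.

```shell
# sanitize_keep_ext: hypothetical helper (not from the answers above).
# Replaces each run of characters other than ASCII letters, digits and
# underscore in the base name with a single '-', and leaves the text
# after the last dot (the extension) untouched.
sanitize_keep_ext() {
    name=$1
    case $name in
        *.*) base=${name%.*}; ext=.${name##*.} ;;   # split at the last dot
        *)   base=$name;      ext= ;;               # no extension present
    esac
    # -c complements the set, -s squeezes repeated '-' into one.
    clean=$(printf '%s' "$base" | LC_ALL=C tr -cs 'a-zA-Z0-9_' '-')
    printf '%s%s\n' "$clean" "$ext"
}

sanitize_keep_ext 'my file (1).tar.gz'   # prints my-file-1-tar.gz
```

Only the text after the last dot is treated as the extension, so in the example .tar becomes part of the cleaned base name. Plain POSIX sh has no local variables, so name, base, ext and clean are global here.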

Removing special characters from filename

Using preg_replace:

$target_file = $target_dir . preg_replace("/[^a-z0-9\_\-\.]/i", '', basename($_FILES['fileToUpload']["name"]));

This will remove every character that is not a letter (a-z), a digit (0-9), a dash, an underscore or a dot (we want to keep the file extension). The i flag at the end makes the match case-insensitive.

Update

To make the expression shorter, you can replace the a-z0-9\_ part with the word token \w.

The pattern would then be: /[^\w\-\.]/. Here we don't need the i flag, since \w already matches both uppercase and lowercase letters.


