How to add file extensions based on file type on Linux/Unix?
Here's mimetypes' version:
#!/usr/bin/env python
"""It is a `filename -> filename.ext` filter.
`ext` is mime-based.
"""
import fileinput
import mimetypes
import os
import sys
from subprocess import Popen, PIPE
if len(sys.argv) > 1 and sys.argv[1] == '--rename':
    do_rename = True
    del sys.argv[1]
else:
    do_rename = False

for filename in (line.rstrip() for line in fileinput.input()):
    # universal_newlines=True makes `file` return text on Python 3 as well
    output, _ = Popen(['file', '-bi', filename], stdout=PIPE,
                      universal_newlines=True).communicate()
    mime = output.split(';', 1)[0].lower().strip()
    ext = mimetypes.guess_extension(mime, strict=False)
    if ext is None:
        ext = os.path.extsep + 'undefined'
    filename_ext = filename + ext
    print(filename_ext)
    if do_rename:
        os.rename(filename, filename_ext)
Example:
$ ls *.file? | python add-ext.py --rename
avi.file.avi
djvu.file.undefined
doc.file.dot
gif.file.gif
html.file.html
ico.file.obj
jpg.file.jpe
m3u.file.ksh
mp3.file.mp3
mpg.file.m1v
pdf.file.pdf
pdf.file2.pdf
pdf.file3.pdf
png.file.png
tar.bz2.file.undefined
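Results such as .jpe or .m1v in the listing above most likely arise because several extensions are registered for one MIME type and mimetypes.guess_extension returns only one of them. Below is a minimal sketch of one way around that, assuming Python 3 and a small hand-written override table. The names PREFERRED and pick_extension are hypothetical and not part of the script above.

```python
import mimetypes

# Hypothetical override table: the extension we would rather see for a MIME type.
PREFERRED = {
    "image/jpeg": ".jpg",
    "video/mpeg": ".mpg",
    "text/plain": ".txt",
}


def pick_extension(mime, default=".undefined"):
    """Return the preferred extension for `mime`, else ask mimetypes."""
    if mime in PREFERRED:
        return PREFERRED[mime]
    ext = mimetypes.guess_extension(mime, strict=False)
    # guess_extension returns None for MIME types it does not know
    return ext if ext is not None else default
```

Calling pick_extension("image/jpeg") would then give ".jpg" regardless of what the local mimetypes database lists first, while unknown types still fall back to ".undefined".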
Following @Phil H's response, which follows @csl's response:
#!/usr/bin/env python
"""It is a `filename -> filename.ext` filter.
`ext` is mime-based.
"""
# Mapping of mime-types to extensions is taken from here:
# http://as3corelib.googlecode.com/svn/trunk/src/com/adobe/net/MimeTypeMap.as
mime2exts_list = [
["application/andrew-inset","ez"],
["application/atom+xml","atom"],
["application/mac-binhex40","hqx"],
["application/mac-compactpro","cpt"],
["application/mathml+xml","mathml"],
["application/msword","doc"],
["application/octet-stream","bin","dms","lha","lzh","exe","class","so","dll","dmg"],
["application/oda","oda"],
["application/ogg","ogg"],
["application/pdf","pdf"],
["application/postscript","ai","eps","ps"],
["application/rdf+xml","rdf"],
["application/smil","smi","smil"],
["application/srgs","gram"],
["application/srgs+xml","grxml"],
["application/vnd.adobe.apollo-application-installer-package+zip","air"],
["application/vnd.mif","mif"],
["application/vnd.mozilla.xul+xml","xul"],
["application/vnd.ms-excel","xls"],
["application/vnd.ms-powerpoint","ppt"],
["application/vnd.rn-realmedia","rm"],
["application/vnd.wap.wbxml","wbxml"],
["application/vnd.wap.wmlc","wmlc"],
["application/vnd.wap.wmlscriptc","wmlsc"],
["application/voicexml+xml","vxml"],
["application/x-bcpio","bcpio"],
["application/x-cdlink","vcd"],
["application/x-chess-pgn","pgn"],
["application/x-cpio","cpio"],
["application/x-csh","csh"],
["application/x-director","dcr","dir","dxr"],
["application/x-dvi","dvi"],
["application/x-futuresplash","spl"],
["application/x-gtar","gtar"],
["application/x-hdf","hdf"],
["application/x-javascript","js"],
["application/x-koan","skp","skd","skt","skm"],
["application/x-latex","latex"],
["application/x-netcdf","nc","cdf"],
["application/x-sh","sh"],
["application/x-shar","shar"],
["application/x-shockwave-flash","swf"],
["application/x-stuffit","sit"],
["application/x-sv4cpio","sv4cpio"],
["application/x-sv4crc","sv4crc"],
["application/x-tar","tar"],
["application/x-tcl","tcl"],
["application/x-tex","tex"],
["application/x-texinfo","texinfo","texi"],
["application/x-troff","t","tr","roff"],
["application/x-troff-man","man"],
["application/x-troff-me","me"],
["application/x-troff-ms","ms"],
["application/x-ustar","ustar"],
["application/x-wais-source","src"],
["application/xhtml+xml","xhtml","xht"],
["application/xml","xml","xsl"],
["application/xml-dtd","dtd"],
["application/xslt+xml","xslt"],
["application/zip","zip"],
["audio/basic","au","snd"],
["audio/midi","mid","midi","kar"],
["audio/mpeg","mp3","mpga","mp2"],
["audio/x-aiff","aif","aiff","aifc"],
["audio/x-mpegurl","m3u"],
["audio/x-pn-realaudio","ram","ra"],
["audio/x-wav","wav"],
["chemical/x-pdb","pdb"],
["chemical/x-xyz","xyz"],
["image/bmp","bmp"],
["image/cgm","cgm"],
["image/gif","gif"],
["image/ief","ief"],
["image/jpeg","jpg","jpeg","jpe"],
["image/png","png"],
["image/svg+xml","svg"],
["image/tiff","tiff","tif"],
["image/vnd.djvu","djvu","djv"],
["image/vnd.wap.wbmp","wbmp"],
["image/x-cmu-raster","ras"],
["image/x-icon","ico"],
["image/x-portable-anymap","pnm"],
["image/x-portable-bitmap","pbm"],
["image/x-portable-graymap","pgm"],
["image/x-portable-pixmap","ppm"],
["image/x-rgb","rgb"],
["image/x-xbitmap","xbm"],
["image/x-xpixmap","xpm"],
["image/x-xwindowdump","xwd"],
["model/iges","igs","iges"],
["model/mesh","msh","mesh","silo"],
["model/vrml","wrl","vrml"],
["text/calendar","ics","ifb"],
["text/css","css"],
["text/html","html","htm"],
["text/plain","txt","asc"],
["text/richtext","rtx"],
["text/rtf","rtf"],
["text/sgml","sgml","sgm"],
["text/tab-separated-values","tsv"],
["text/vnd.wap.wml","wml"],
["text/vnd.wap.wmlscript","wmls"],
["text/x-setext","etx"],
["video/mpeg","mpg","mpeg","mpe"],
["video/quicktime","mov","qt"],
["video/vnd.mpegurl","m4u","mxu"],
["video/x-flv","flv"],
["video/x-msvideo","avi"],
["video/x-sgi-movie","movie"],
["x-conference/x-cooltalk","ice"]]
#NOTE: take only the first extension
mime2ext = dict(x[:2] for x in mime2exts_list)
if __name__ == '__main__':
    import fileinput, os.path
    from subprocess import Popen, PIPE

    for filename in (line.rstrip() for line in fileinput.input()):
        # universal_newlines=True makes `file` return text on Python 3 as well
        output, _ = Popen(['file', '-bi', filename], stdout=PIPE,
                          universal_newlines=True).communicate()
        mime = output.split(';', 1)[0].lower().strip()
        print(filename + os.path.extsep + mime2ext.get(mime, 'undefined'))
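The parsing and lookup steps in the script above can be tried without calling the file command. This is a minimal Python 3 sketch, assuming file -bi prints something like "text/plain; charset=us-ascii". The helper names and the two-entry MIME2EXT table are hypothetical stand-ins for the full mime2ext mapping.

```python
# Hypothetical two-entry excerpt standing in for the full mime2ext table.
MIME2EXT = {"text/plain": "txt", "image/jpeg": "jpg"}


def mime_from_file_output(output):
    """Extract the bare MIME type from one line printed by `file -bi`."""
    # Everything after the first ';' is a parameter such as the charset.
    return output.split(";", 1)[0].strip().lower()


def add_extension(filename, mime, table=MIME2EXT, default="undefined"):
    """Append the mapped extension, or `default` when the type is unknown."""
    return filename + "." + table.get(mime, default)
```

For example, add_extension("notes", mime_from_file_output("text/plain; charset=us-ascii\n")) would produce "notes.txt".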
Here's a snippet for old Python versions (not tested):
#NOTE: take only the first extension
mime2ext = {}
for x in mime2exts_list:
    mime2ext[x[0]] = x[1]

if __name__ == '__main__':
    import os
    import sys
    # this version supports only stdin (part of fileinput.input() functionality)
    lines = sys.stdin.read().split('\n')
    for line in lines:
        filename = line.rstrip()
        if not filename:
            continue  # skip blank lines, e.g. after the final newline
        # NOTE: the filename is passed to the shell unquoted
        output = os.popen('file -bi ' + filename).read()
        mime = output.split(';')[0].lower().strip()
        try:
            ext = mime2ext[mime]
        except KeyError:
            ext = 'undefined'
        print filename + '.' + ext
It should work on Python 2.3.5 (I guess).
Creating Files Of Specific Type
There is no "type" of regular file in Linux/Unix. You can register any file extension of your own so that files with that extension open with the application you choose.
Please look at the answers to the question: Register file extensions / mime types in Linux
Also, to understand another association mechanism on Linux, "shebangs" (#!), you could read this: File extensions and association with programs in linux
Adding extension to file using file command in linux
Something like this should get you started:
#!/bin/bash
for f in "$@"; do
    # Naive check to make sure we don't add duplicate extensions
    if [[ $f == *'.'* ]]; then continue; fi
    ext=''
    case $(file -b "$f") in
        *ASCII*) ext='.txt' ;;
        *JPEG*) ext='.jpg' ;;
        *PDF*) ext='.pdf' ;;
        # etc...
        *) continue ;;
    esac
    mv "${f}" "${f}${ext}"
done
You'll have to check the output of file for each potential file type to find an appropriate case label.
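The same idea, matching a substring of the description printed by file -b, can be sketched in Python 3. The RULES table and the function name are hypothetical, and the sample descriptions used in the usage note are only typical examples of what file prints, so check the real output on your system.

```python
# Hypothetical (substring, extension) rules, checked in order like the case labels.
RULES = [
    ("ASCII", ".txt"),
    ("JPEG", ".jpg"),
    ("PDF", ".pdf"),
]


def ext_for_description(description):
    """Return the extension for a `file -b` description, or None if no rule matches."""
    for needle, ext in RULES:
        if needle in description:
            return ext
    return None
```

With these rules, a description such as "PDF document, version 1.4" maps to ".pdf", and anything unrecognised returns None so the caller can skip the file.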
Identify and append file extension to files using bash
Your main problem seems to be in how you are assigning variables. When you assign a value to a variable:
- The variable name should not be quoted (the opposite of when you use $ for parameter expansion of the variable).
- There should only be one equals sign (a second equals sign would be treated as part of the string value).
- There should be no spaces on either side of the equals sign; otherwise, the variable name will be interpreted as a command name.
The following should do what you intend:
for i in *; do
    ext=$(file "$i" --mime-type -b | sed 's#.*/##')
    mv -v "$i" "$i.$ext"
done
Note: this code makes the same assumptions as your original code, i.e., that all files in the current directory (including any non-regular files such as directories) should be renamed, and that they will be renamed as per their MIME type, so plain text files will have a .plain suffix.
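To see why plain text ends up with a .plain suffix, here is a small Python 3 sketch of what the sed expression does to a MIME type: it drops everything up to and including the last slash. The function name is hypothetical.

```python
def suffix_from_mime(mime):
    """Keep only the part after the last '/', like sed 's#.*/##'."""
    return mime.strip().rsplit("/", 1)[-1]
```

So "text/plain" becomes "plain" and "image/jpeg" becomes "jpeg", which is why the renamed files carry MIME subtypes rather than conventional extensions.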
Add extra file extension to all filenames in a directory via Linux command line
Your code should work if no file has spaces (or other "special" character) in the name and if the directory is not pathologically big.
In those cases, you can use something like this:
ls | grep '\.utf8$' | while IFS= read -r i; do mv "$i" "$i.sbd"; done
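If parsing ls output feels fragile, the same rename can be sketched in Python 3 with pathlib, which copes with spaces in names. This is only an illustration under the assumption that the files end in .utf8 and should gain .sbd. The function names are hypothetical, and demo() works in a throwaway temporary directory.

```python
import tempfile
from pathlib import Path


def append_extension(directory, suffix=".utf8", extra=".sbd"):
    """Append `extra` to every regular file in `directory` whose name ends in `suffix`."""
    renamed = []
    # sorted() snapshots the listing, so renaming does not disturb the iteration
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.endswith(suffix):
            target = path.with_name(path.name + extra)
            path.rename(target)
            renamed.append(target.name)
    return renamed


def demo():
    """Run append_extension on two sample files in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "a b.utf8").write_text("x")
        (Path(tmp) / "c.txt").write_text("y")
        return append_extension(tmp)
```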
How can I change the extension of files of a type using find with Bash?
Combining the find -exec bash idea with the bash loop idea, you can use the + terminator on the -exec to tell find to pass multiple filenames to a single invocation of the bash command. Pass the new type as the first argument (which shows up in $0 and so is conveniently skipped by a for loop over the rest of the command-line arguments) and you have a pretty efficient solution:
find . -type f -iname "*.$arg1" -exec bash -c \
'for arg; do mv "$arg" "${arg%.*}.$0"; done' "$arg2" {} +
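For comparison, roughly the same operation can be sketched in Python 3: walk the tree, match the old extension case-insensitively, and swap it for the new one. The function names are hypothetical, and unlike find -type f this sketch does not filter out symlinks or special files.

```python
import os
import tempfile


def change_extension(root, old_ext, new_ext):
    """Rename files under `root` ending in .old_ext (any case) to end in .new_ext."""
    renamed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext.lower() == "." + old_ext.lower():
                new_name = stem + "." + new_ext
                os.rename(os.path.join(dirpath, name),
                          os.path.join(dirpath, new_name))
                renamed.append(new_name)
    return sorted(renamed)


def demo():
    """Try change_extension on a small temporary tree."""
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "sub"))
        for rel in ("a.TXT", os.path.join("sub", "b.txt"), "c.md"):
            open(os.path.join(tmp, rel), "w").close()
        return change_extension(tmp, "txt", "bak")
```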
Alternatively, if you have either version of the Linux rename command, you can use that. The Perl one (a.k.a. prename, installed by default on Ubuntu and other Debian-based distributions; also available for OS X from Homebrew via brew install rename) can be used like this:
find . -type f -iname "*.$arg1" -exec rename 's/\Q'"$arg1"'\E$/'"$arg2"'/' {} +
That looks a bit ugly, but it's really just the s/old/new/ substitution command familiar from many UNIX tools. The \Q and \E around $arg1 keep any weird characters inside the suffix from being interpreted as regular expression metacharacters that might match something unexpected; the $ after the \E makes sure the pattern only matches at the end of the filename.
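The effect of quoting the suffix and anchoring it at the end can be illustrated in Python 3, where re.escape plays a role similar to Perl's \Q...\E. This is a sketch with a hypothetical function name, not a replacement for the rename command.

```python
import re


def replace_suffix(name, old, new):
    """Replace `old` with `new` only where `old` ends the name, treating `old` literally."""
    pattern = re.escape(old) + r"$"
    # A function replacement keeps backslashes in `new` from being interpreted.
    return re.sub(pattern, lambda _match: new, name)
```

With escaping, a suffix such as "t.t" matches only the literal characters t.t, so "a.txt" is left alone, and the $ anchor means only the final "com" in "stackoverflow.com_scanner.com" is replaced.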
The pattern-based version installed by default on Red Hat-based Linux distros (Fedora, CentOS, etc) is simpler:
find . -type f -iname "*.$arg1" -exec rename ".$arg1" ".$arg2" {} +
but it's also dumber: it replaces the first occurrence of the string wherever it appears in the name, so if you rename .com .exe stackoverflow.com_scanner.com, you'll get a file named stackoverflow.exe_scanner.com.
Unix shell - find all different types of file extensions under one folder
Or you could write a script in, say, Perl to do it all.
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw/say/;
use File::Find;
# Takes directory/directories to scan as a command line argument
# or current directory if none given
@ARGV = qw/./ unless @ARGV;
my %extensions;
find(sub {
    # $_ holds the bare file name, so dots in directory names are not
    # mistaken for extensions
    if (/\.([^.]+)$/) {
        $extensions{$1} += 1;
    }
}, @ARGV);
for my $ext (sort { $a cmp $b } keys %extensions) {
    say "$ext\t$extensions{$ext}";
}
or using bash:
#!/usr/bin/env bash
shopt -s dotglob globstar
declare -A extensions
# Scans the current directory
allfiles=( **/*.* )
for ext in "${allfiles[@]##*.}"; do
    extensions["$ext"]=$(( ${extensions["$ext"]:-0} + 1))
done
for ext in "${!extensions[@]}"; do
    printf "%s\t%d\n" "$ext" "${extensions[$ext]}"
done | sort -k1,1
or any shell (Won't work well with filenames with newlines; there are ways around that if using, say, GNU userland tools, though):
find . -name "*.*" | sed 's/.*\.\([^.]\{1,\}\)$/\1/' | sort | uniq -c
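The same count can also be sketched in Python 3 with os.walk and collections.Counter. Filenames containing newlines are not a problem here because nothing is parsed from text output. The function names are hypothetical, and this version counts only files, not directories whose names contain a dot.

```python
import os
import tempfile
from collections import Counter


def count_extensions(root="."):
    """Count files under `root` by the text after the last dot in their name."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            _, dot, ext = name.rpartition(".")
            if dot and ext:  # skip names without a dot or ending in a dot
                counts[ext] += 1
    return counts


def demo():
    """Count extensions in a small temporary tree."""
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "sub.d"))
        for rel in ("a.txt", "b.txt", os.path.join("sub.d", "c.jpg"), "README"):
            open(os.path.join(tmp, rel), "w").close()
        return dict(count_extensions(tmp))
```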
recursively add file extension to all files
Alternative command without an explicit loop (man find):
find . -type f -exec mv '{}' '{}'.jpg \;
Explanation: this recursively finds all files (-type f) starting from the current directory (.) and applies the move command (mv) to each of them. Note also the quotes around {}, so that filenames with spaces (and even newlines...) are properly handled.
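A Python 3 sketch of the same recursive rename is shown below, under the assumption that every file should simply gain the given extension. The function names are hypothetical. The list of paths is collected before any renaming so that already renamed files are not visited again.

```python
import os
import tempfile


def add_extension_recursively(root, ext=".jpg"):
    """Append `ext` to every file under `root` and return how many were renamed."""
    targets = [os.path.join(dirpath, name)
               for dirpath, _dirnames, filenames in os.walk(root)
               for name in filenames]
    for path in targets:
        os.rename(path, path + ext)
    return len(targets)


def demo():
    """Rename two sample files in a temporary tree and list the results."""
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "sub"))
        for rel in ("a", os.path.join("sub", "b c")):
            open(os.path.join(tmp, rel), "w").close()
        add_extension_recursively(tmp)
        return sorted(
            os.path.relpath(os.path.join(d, n), tmp).replace(os.sep, "/")
            for d, _dirs, files in os.walk(tmp) for n in files
        )
```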