How to Add File Extensions Based on File Type on Linux/Unix

How to add file extensions based on file type on Linux/Unix?

Here's a version based on Python's mimetypes module:

#!/usr/bin/env python
"""It is a `filename -> filename.ext` filter.

`ext` is mime-based.

"""
import fileinput
import mimetypes
import os
import sys
from subprocess import Popen, PIPE

# An optional leading --rename flag switches from printing to renaming.
if len(sys.argv) > 1 and sys.argv[1] == '--rename':
    do_rename = True
    del sys.argv[1]
else:
    do_rename = False

for filename in (line.rstrip() for line in fileinput.input()):
    # universal_newlines=True makes `output` text on both Python 2 and 3.
    output, _ = Popen(['file', '-bi', filename], stdout=PIPE,
                      universal_newlines=True).communicate()
    mime = output.split(';', 1)[0].lower().strip()
    ext = mimetypes.guess_extension(mime, strict=False)
    if ext is None:
        ext = os.path.extsep + 'undefined'
    filename_ext = filename + ext
    print(filename_ext)
    if do_rename:
        os.rename(filename, filename_ext)

Example:


$ ls *.file? | python add-ext.py --rename
avi.file.avi
djvu.file.undefined
doc.file.dot
gif.file.gif
html.file.html
ico.file.obj
jpg.file.jpe
m3u.file.ksh
mp3.file.mp3
mpg.file.m1v
pdf.file.pdf
pdf.file2.pdf
pdf.file3.pdf
png.file.png
tar.bz2.file.undefined
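
Some of the picks above (.jpe for JPEG, .m1v for MPEG) happen because several extensions are registered for one MIME type and, at least on older Python versions, mimetypes.guess_extension does not necessarily return the most common one. A minimal Python 3 sketch of one workaround, assuming you are willing to hard-code a few preferences; the PREFERRED table and the pick_extension name are hypothetical and not part of the script above.

```python
import mimetypes

# Hypothetical override table: MIME types for which a specific
# extension is wanted regardless of what mimetypes would pick.
PREFERRED = {
    "image/jpeg": ".jpg",
    "video/mpeg": ".mpg",
    "application/msword": ".doc",
}


def pick_extension(mime, default=".undefined"):
    """Return the preferred extension for `mime`, else ask mimetypes."""
    if mime in PREFERRED:
        return PREFERRED[mime]
    ext = mimetypes.guess_extension(mime, strict=False)
    return ext if ext is not None else default
```

In the script above, the call to mimetypes.guess_extension could be swapped for pick_extension(mime) if this behaviour is wanted.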

Following @Phil H's response, which builds on @csl's response:

#!/usr/bin/env python
"""It is a `filename -> filename.ext` filter.

`ext` is mime-based.
"""
# Mapping of mime-types to extensions is taken from here:
# http://as3corelib.googlecode.com/svn/trunk/src/com/adobe/net/MimeTypeMap.as
mime2exts_list = [
["application/andrew-inset","ez"],
["application/atom+xml","atom"],
["application/mac-binhex40","hqx"],
["application/mac-compactpro","cpt"],
["application/mathml+xml","mathml"],
["application/msword","doc"],
["application/octet-stream","bin","dms","lha","lzh","exe","class","so","dll","dmg"],
["application/oda","oda"],
["application/ogg","ogg"],
["application/pdf","pdf"],
["application/postscript","ai","eps","ps"],
["application/rdf+xml","rdf"],
["application/smil","smi","smil"],
["application/srgs","gram"],
["application/srgs+xml","grxml"],
["application/vnd.adobe.apollo-application-installer-package+zip","air"],
["application/vnd.mif","mif"],
["application/vnd.mozilla.xul+xml","xul"],
["application/vnd.ms-excel","xls"],
["application/vnd.ms-powerpoint","ppt"],
["application/vnd.rn-realmedia","rm"],
["application/vnd.wap.wbxml","wbxml"],
["application/vnd.wap.wmlc","wmlc"],
["application/vnd.wap.wmlscriptc","wmlsc"],
["application/voicexml+xml","vxml"],
["application/x-bcpio","bcpio"],
["application/x-cdlink","vcd"],
["application/x-chess-pgn","pgn"],
["application/x-cpio","cpio"],
["application/x-csh","csh"],
["application/x-director","dcr","dir","dxr"],
["application/x-dvi","dvi"],
["application/x-futuresplash","spl"],
["application/x-gtar","gtar"],
["application/x-hdf","hdf"],
["application/x-javascript","js"],
["application/x-koan","skp","skd","skt","skm"],
["application/x-latex","latex"],
["application/x-netcdf","nc","cdf"],
["application/x-sh","sh"],
["application/x-shar","shar"],
["application/x-shockwave-flash","swf"],
["application/x-stuffit","sit"],
["application/x-sv4cpio","sv4cpio"],
["application/x-sv4crc","sv4crc"],
["application/x-tar","tar"],
["application/x-tcl","tcl"],
["application/x-tex","tex"],
["application/x-texinfo","texinfo","texi"],
["application/x-troff","t","tr","roff"],
["application/x-troff-man","man"],
["application/x-troff-me","me"],
["application/x-troff-ms","ms"],
["application/x-ustar","ustar"],
["application/x-wais-source","src"],
["application/xhtml+xml","xhtml","xht"],
["application/xml","xml","xsl"],
["application/xml-dtd","dtd"],
["application/xslt+xml","xslt"],
["application/zip","zip"],
["audio/basic","au","snd"],
["audio/midi","mid","midi","kar"],
["audio/mpeg","mp3","mpga","mp2"],
["audio/x-aiff","aif","aiff","aifc"],
["audio/x-mpegurl","m3u"],
["audio/x-pn-realaudio","ram","ra"],
["audio/x-wav","wav"],
["chemical/x-pdb","pdb"],
["chemical/x-xyz","xyz"],
["image/bmp","bmp"],
["image/cgm","cgm"],
["image/gif","gif"],
["image/ief","ief"],
["image/jpeg","jpg","jpeg","jpe"],
["image/png","png"],
["image/svg+xml","svg"],
["image/tiff","tiff","tif"],
["image/vnd.djvu","djvu","djv"],
["image/vnd.wap.wbmp","wbmp"],
["image/x-cmu-raster","ras"],
["image/x-icon","ico"],
["image/x-portable-anymap","pnm"],
["image/x-portable-bitmap","pbm"],
["image/x-portable-graymap","pgm"],
["image/x-portable-pixmap","ppm"],
["image/x-rgb","rgb"],
["image/x-xbitmap","xbm"],
["image/x-xpixmap","xpm"],
["image/x-xwindowdump","xwd"],
["model/iges","igs","iges"],
["model/mesh","msh","mesh","silo"],
["model/vrml","wrl","vrml"],
["text/calendar","ics","ifb"],
["text/css","css"],
["text/html","html","htm"],
["text/plain","txt","asc"],
["text/richtext","rtx"],
["text/rtf","rtf"],
["text/sgml","sgml","sgm"],
["text/tab-separated-values","tsv"],
["text/vnd.wap.wml","wml"],
["text/vnd.wap.wmlscript","wmls"],
["text/x-setext","etx"],
["video/mpeg","mpg","mpeg","mpe"],
["video/quicktime","mov","qt"],
["video/vnd.mpegurl","m4u","mxu"],
["video/x-flv","flv"],
["video/x-msvideo","avi"],
["video/x-sgi-movie","movie"],
["x-conference/x-cooltalk","ice"]]

#NOTE: take only the first extension
mime2ext = dict(x[:2] for x in mime2exts_list)

if __name__ == '__main__':
    import fileinput, os.path
    from subprocess import Popen, PIPE

    for filename in (line.rstrip() for line in fileinput.input()):
        # universal_newlines=True makes `output` text on both Python 2 and 3.
        output, _ = Popen(['file', '-bi', filename], stdout=PIPE,
                          universal_newlines=True).communicate()
        mime = output.split(';', 1)[0].lower().strip()
        print(filename + os.path.extsep + mime2ext.get(mime, 'undefined'))

Here's a snippet for old Python versions (not tested):

#NOTE: take only the first extension
mime2ext = {}
for x in mime2exts_list:
    mime2ext[x[0]] = x[1]

if __name__ == '__main__':
    import os
    import sys

    # this version supports only stdin (part of fileinput.input() functionality)
    lines = sys.stdin.read().split('\n')
    for line in lines:
        filename = line.rstrip()
        if not filename:
            continue  # skip blank lines, e.g. the one after the final newline
        # NOTE: the filename reaches the shell unquoted, so names with
        # spaces or shell metacharacters will break this version
        output = os.popen('file -bi ' + filename).read()
        mime = output.split(';')[0].lower().strip()
        try:
            ext = mime2ext[mime]
        except KeyError:
            ext = 'undefined'
        print filename + '.' + ext

It should work on Python 2.3.5 (I guess).
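
All of the scripts above reduce the output of file -bi to a bare MIME type before the lookup. A small Python 3 sketch of that one step in isolation, in case you want to reuse or test it separately; the function name parse_mime is hypothetical, and accepting bytes is an assumption made because subprocess returns bytes by default on Python 3.

```python
def parse_mime(file_output):
    """Extract the bare MIME type from `file -bi` style output.

    Accepts str or bytes; bytes are decoded leniently as UTF-8.
    """
    if isinstance(file_output, bytes):
        file_output = file_output.decode("utf-8", errors="replace")
    # Drop parameters such as "; charset=us-ascii", then normalise
    # whitespace and case so the result can be used as a dict key.
    return file_output.split(";", 1)[0].strip().lower()
```

For example, parse_mime("text/plain; charset=us-ascii\n") gives "text/plain", which can then be looked up in mime2ext.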

Creating Files Of Specific Type

There is no "type" of a regular file on Linux/Unix: the extension is just part of the file name. You can register any extension of your own so that files with that extension open in the application you choose.

See the answers to the question: Register file extensions / mime types in Linux

To understand another association mechanism on Linux, the shebang (#!), see: File extensions and association with programs in linux

Adding extension to file using file command in linux

Something like this should get you started:

#!/bin/bash
for f in "$@"; do
    if [[ $f == *'.'* ]]; then continue; fi # Naive check to make sure we don't add duplicate extensions
    ext=''
    case $(file -b "$f") in
        *ASCII*) ext='.txt' ;;
        *JPEG*) ext='.jpg' ;;
        *PDF*) ext='.pdf' ;;
        # etc...
        *) continue ;;
    esac
    mv "${f}" "${f}${ext}"
done

You'll have to check the output of file for each potential file type to find an appropriate case label.
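
The same keyword matching can be sketched in Python 3 as an ordered lookup table. The table contents and the name ext_for_description are hypothetical; the sample descriptions in the usage note are only typical of what file -b prints, so check the output on your own files.

```python
# Hypothetical table: substring of `file -b` output -> extension to append.
# Order matters: the first matching keyword wins, like the case labels.
DESCRIPTION_TO_EXT = [
    ("JPEG", ".jpg"),
    ("PDF", ".pdf"),
    ("ASCII", ".txt"),
]


def ext_for_description(description):
    """Return the extension for a `file -b` description, or None if unknown."""
    for keyword, ext in DESCRIPTION_TO_EXT:
        if keyword in description:
            return ext
    return None
```

A description such as "PDF document, version 1.4" maps to ".pdf", while anything not in the table returns None, which corresponds to the *) continue branch.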

Identify and append file extension to files using bash

Your main problem seems to be in how you are assigning variables. When you assign a value to a variable:

  • The variable name should not be quoted (the opposite of parameter expansion with $, which should be quoted)
  • There should only be one equal sign (the second equal sign would be considered as part of the string value)
  • There should be no spaces on either side of the equals sign; otherwise, the variable name will be interpreted as a command name.

The following should do what you intend:

for i in *; do
    ext=$(file --mime-type -b "$i" | sed 's#.*/##')
    mv -v "$i" "$i.$ext"
done

Note: this code makes the same assumptions as your original code, i.e., that all files in the current directory (including any non-regular files such as directories) should be renamed – and that they will be renamed as per their MIME type so plain text files will have a .plain suffix.
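
If renaming directories is not what you want, a dry-run variant can be sketched in Python 3. This is only an illustration under stated assumptions: plan_renames is a hypothetical name, and detect_mime is a caller-supplied function (for example a wrapper around file --mime-type -b) rather than anything from the code above.

```python
import os


def plan_renames(directory, detect_mime):
    """Return (old_path, new_path) pairs for regular files only.

    `detect_mime` is a caller-supplied function: path -> MIME type string.
    Nothing is renamed here; the caller decides what to do with the plan.
    """
    plan = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip directories, FIFOs and other non-regular files
        # "text/plain" -> "plain", the same as sed 's#.*/##'
        subtype = detect_mime(path).rsplit("/", 1)[-1]
        plan.append((path, path + "." + subtype))
    return plan
```

Returning the plan instead of renaming immediately makes it easy to inspect the result first and then apply it with os.rename.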

Add extra file extension to all filenames in a directory via Linux command line

Your code should work if no file has spaces (or other "special" characters) in its name and if the directory is not pathologically big.

Otherwise, you can use something like this (note that grep takes a regular expression, not a glob, so the dot must be escaped and there is no leading *):

ls | grep '\.utf8$' | while IFS= read -r i; do mv "$i" "$i.sbd"; done
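
A Python 3 sketch of the same rename that avoids parsing ls output altogether, so spaces and other special characters in names are not an issue. The function name add_suffix and its defaults are hypothetical.

```python
from pathlib import Path


def add_suffix(directory, match_suffix=".utf8", extra=".sbd"):
    """Append `extra` to every regular file in `directory` whose name
    ends in `match_suffix`; return the new names."""
    renamed = []
    # sorted() materialises the listing before any rename happens.
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.endswith(match_suffix):
            target = path.with_name(path.name + extra)
            path.rename(target)
            renamed.append(target.name)
    return renamed
```

Calling add_suffix(".") would rename a file such as "a b.utf8" to "a b.utf8.sbd" and leave everything else alone.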

How can I change the extension of files of a type using find with Bash?

Combining the find -exec bash idea with the bash loop idea, you can use the + terminator on the -exec to tell find to pass multiple filenames to a single invocation of the bash command. Pass the new type as the first argument - which shows up in $0 and so is conveniently skipped by a for loop over the rest of the command-line arguments - and you have a pretty efficient solution:

find . -type f -iname "*.$arg1" -exec bash -c \
'for arg; do mv "$arg" "${arg%.*}.$0"; done' "$arg2" {} +

Alternatively, if you have either version of the Linux rename command, you can use that. The Perl one (a.k.a. prename, installed by default on Ubuntu and other Debian-based distributions; also available for OS X from Homebrew via brew install rename) can be used like this:

find . -type f -iname "*.$arg1" -exec rename 's/\Q'"$arg1"'\E$/'"$arg2"'/' {} +

That looks a bit ugly, but it's really just the s/old/new/ substitution command familiar from many UNIX tools. The \Q and \E around $arg1 keep any weird characters inside the suffix from being interpreted as regular expression metacharacters that might match something unexpected; the $ after the \E makes sure the pattern only matches at the end of the filename.

The pattern-based version installed by default on Red Hat-based Linux distros (Fedora, CentOS, etc) is simpler:

find . -type f -iname "*.$arg1" -exec rename ".$arg1" ".$arg2" {} +

but it's also dumber: it replaces only the first occurrence of the pattern, wherever that is in the name, so if you rename .com .exe stackoverflow.com_scanner.com, you'll get a file named stackoverflow.exe_scanner.com.
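
For comparison, a Python 3 sketch of the same recursive rename using pathlib; the function name change_extension is hypothetical. Path.with_suffix replaces only the final suffix, so stackoverflow.com_scanner.com comes out as stackoverflow.com_scanner.exe.

```python
from pathlib import Path


def change_extension(root, old_ext, new_ext):
    """Recursively rename *.old_ext to *.new_ext under `root`.

    The match is case-insensitive, like find's -iname. Extensions are
    given without the leading dot. Returns the new paths.
    """
    renamed = []
    # sorted() materialises the tree listing before any rename happens.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() == "." + old_ext.lower():
            target = path.with_suffix("." + new_ext)
            path.rename(target)
            renamed.append(target)
    return renamed
```

Only files are renamed, so directory paths collected earlier in the listing stay valid while the loop runs.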

Unix shell - find all different types of file extensions under one folder

Or you could write a script in, say, Perl to do it all.

#!/usr/bin/env perl
use strict;
use warnings;
use feature qw/say/;
use File::Find;

# Takes directory/directories to scan as a command line argument
# or current directory if none given
@ARGV = qw/./ unless @ARGV;

my %extensions;
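find(sub {
    # Excluding "/" keeps the leading "./" of an extensionless path
    # such as ./README from being counted as an extension.
    if ($File::Find::name =~ /\.([^.\/]+)$/) {
        $extensions{$1} += 1;
    }
}, @ARGV);
for my $ext (sort { $a cmp $b } keys %extensions) {
    say "$ext\t$extensions{$ext}";
}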

or using bash:

#!/usr/bin/env bash

shopt -s dotglob globstar
declare -A extensions

# Scans the current directory

allfiles=( **/*.* )

for ext in "${allfiles[@]##*.}"; do
    extensions["$ext"]=$(( ${extensions["$ext"]:-0} + 1))
done

for ext in "${!extensions[@]}"; do
    printf "%s\t%d\n" "$ext" "${extensions[$ext]}"
done | sort -k1,1

or any shell (Won't work well with filenames with newlines; there are ways around that if using, say, GNU userland tools, though):

find . -name "*.*" | sed 's/.*\.\([^.]\{1,\}\)$/\1/' | sort | uniq -c
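
The same count can be sketched in Python 3 with os.walk and collections.Counter; the function name count_extensions is hypothetical. Two differences from the versions above are worth knowing: it counts files only, not directories, and os.path.splitext treats a name like .bashrc as having no extension.

```python
import os
from collections import Counter


def count_extensions(root="."):
    """Count files under `root` by extension (without the leading dot)."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1]  # ".txt", or "" if none
            if ext:
                counts[ext[1:]] += 1
    return counts
```

Printing sorted(count_extensions().items()) gives output comparable to the sort | uniq -c pipeline, and newlines in filenames are not a problem here.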

recursively add file extension to all files

Alternative command without an explicit loop (man find):

find . -type f -exec mv '{}' '{}'.jpg \;

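Explanation: this recursively finds all regular files (-type f) starting from the current directory (.) and applies the move command (mv) to each of them. Because find passes each filename to mv as a single argument without involving a shell, filenames with spaces (and even newlines) are handled properly; the quotes around {} only keep the shell from interpreting the braces. Note that substituting {} inside a longer argument such as {}.jpg works with GNU find but is not guaranteed by POSIX.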
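
A Python 3 sketch of the same recursive rename, for systems where embedding {} in a longer argument is not supported or where more control is wanted. The function name add_extension_recursively is hypothetical.

```python
import os


def add_extension_recursively(root, ext=".jpg"):
    """Append `ext` to every regular file under `root`; return the count."""
    renamed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk has already listed this directory, so renaming files
        # here does not disturb the traversal.
        for name in filenames:
            old = os.path.join(dirpath, name)
            # Mirror find's -type f: regular files only, no symlinks.
            if os.path.isfile(old) and not os.path.islink(old):
                os.rename(old, old + ext)
                renamed += 1
    return renamed
```

As with the find command, running it twice appends the extension twice, so a check such as name.endswith(ext) may be worth adding.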


