How to Check Type of Files Without Extensions

How to check type of files without extensions?

There are Python libraries that can recognize files based on their content (usually a header / magic number) and that don't rely on the file name or extension.

If you need to handle many different file types, you can use python-magic, a Python binding for the well-established libmagic library (the one behind the Unix file command). It has a good reputation and (small endorsement) in the limited use I've made of it, it has been solid.

There are also libraries for more specialized file types. For example, the Python standard library has the imghdr module, which does the same thing just for image file types. Note that imghdr was deprecated in Python 3.11 and removed in Python 3.13.

If you need dependency-free (pure Python) file type checking, see filetype.

Determine file type of a file without extension

You can use vim this way:

vim -c ':silent execute ":!echo " . &ft . " > /dev/stdout"' -c ':q!' the_file

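The first -c builds a shell command by string concatenation: it appends the filetype vim detected (&ft) to echo and runs the result, which prints the filetype to standard output. The second -c quits vim without saving.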

How do you check a file type when there is no extension in C#

I've heard of reading the first few bytes of a file's contents and making an educated guess at the file's format. This link seems promising:

Using .NET, how can you find the mime type of a file based on the file signature not the extension

Identifying the type of a file without extension from binary data

You could read the first few bytes of the file and look for a "magic number". The Wikipedia page on magic numbers suggests that PDF files begin with ASCII %PDF and doc files begin with hex D0 CF 11 E0.

Identifying text files is going to be pretty tough in the general case, because a lot of standard magic numbers are actually ASCII text at the beginning of a binary file. In your case, if you can guarantee that you won't get anything but PDF, DOC, or TXT, you could probably get away with checking for the PDF and DOC magic numbers and assuming the file is text if it matches neither.
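As a minimal sketch of that approach, assuming the input really is limited to PDF, DOC, or TXT, the check could look like this in Python. The function names classify_header and classify_file are made up for illustration; only the two magic numbers come from the answer above.

```python
PDF_MAGIC = b"%PDF"                      # ASCII "%PDF"
DOC_MAGIC = bytes.fromhex("D0CF11E0")    # hex D0 CF 11 E0


def classify_header(head: bytes) -> str:
    """Classify the leading bytes of a file as 'pdf', 'doc', or 'txt'."""
    if head.startswith(PDF_MAGIC):
        return "pdf"
    if head.startswith(DOC_MAGIC):
        return "doc"
    # Assumption: anything that is neither PDF nor DOC is treated as text.
    return "txt"


def classify_file(path: str) -> str:
    """Read the first four bytes of the file and classify them."""
    with open(path, "rb") as f:
        return classify_header(f.read(4))
```

Calling classify_file("some_upload") returns one of the three labels. The "txt" result is only a fallback, so it is as reliable as the guarantee that no other file types show up.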

C++: How to check type of files without extension

Use libmagic.

Libmagic is available on all major platforms (and many minor ones).

#include <boost/filesystem.hpp>
#include <boost/range.hpp>
#include <iostream>
#include <magic.h>

using namespace boost;
namespace fs = filesystem;

int main() {
    // Open a libmagic handle; MAGIC_COMPRESS also looks inside compressed files.
    auto handle = ::magic_open(MAGIC_NONE|MAGIC_COMPRESS);
    if (!handle) {
        std::cerr << "magic_open failed\n";
        return 1;
    }
    // NULL loads the default magic database.
    ::magic_load(handle, NULL);

    // Classify every entry in the current directory.
    for (fs::directory_entry const& x : make_iterator_range(fs::directory_iterator("."), {})) {
        auto type = ::magic_file(handle, x.path().native().c_str());
        std::cout << x.path() << "\t" << (type? type : "UNKNOWN") << "\n";
    }

    ::magic_close(handle);
}

Prints, e.g.

sehe@desktop:~/custom/boost/status$ /tmp/test 
"./Jamfile.v2" ASCII text
"./explicit-failures.xsd" XML document text
"./expected_results.xml" XML document text
"./explicit-failures-markup.xml" XML document text

You can use the flags to control the detail of classification, e.g. MAGIC_MIME:

sehe@desktop:~/custom/boost/status$ /tmp/test 
"./Jamfile.v2" text/plain; charset=us-ascii
"./explicit-failures.xsd" application/xml; charset=us-ascii
"./expected_results.xml" application/xml; charset=us-ascii
"./explicit-failures-markup.xml" application/xml; charset=utf-8

Or loading just /etc/magic:

sehe@desktop:~/custom/boost/status$ /tmp/test 
"./Jamfile.v2" ASCII text
"./explicit-failures.xsd" ASCII text
"./expected_results.xml" ASCII text, with very long lines
"./explicit-failures-markup.xml" UTF-8 Unicode text

Given a file without extension, how to determine the type of that file

If the files are text, as in the examples given, you need to parse the file content to determine the type. Other file types have a predefined structure that includes headers. For example, .dll and .exe files follow the PE format (executables on most Unix-like systems use ELF instead).
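To illustrate what "a predefined structure with headers" means in practice, here is a hedged Python sketch that tells PE and ELF files apart from their leading bytes. The function name executable_format is hypothetical. The checks rely on the documented layouts: ELF files start with the bytes 7F 45 4C 46, and PE files start with "MZ" and store the offset of the "PE\0\0" signature as a little-endian 32-bit value at offset 0x3C.

```python
import struct


def executable_format(data: bytes) -> str:
    """Guess 'ELF', 'PE', or 'unknown' from the leading bytes of a file."""
    if data.startswith(b"\x7fELF"):
        return "ELF"
    # PE files begin with a DOS stub ("MZ"). The 4-byte little-endian value
    # at offset 0x3C gives the position of the "PE\0\0" signature.
    if data.startswith(b"MZ") and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return "unknown"
```

The caller has to pass enough of the file for the PE signature to be reachable. Passing the whole file content is the simplest option for small files.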

How to know file type without extension

Yes, you can figure out the type without an extension by using the magic number.
Also, the file command actually figures it out through a three-step check:

  1. Check filesystem properties to identify empty files, folders, etc.
  2. Check the magic number mentioned above
  3. For text files, check which language the text is written in
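The three steps can be sketched in Python roughly as follows. This is not how the file command is implemented; the function name describe, the tiny signature table, and the text rules are all assumptions chosen to keep the example short.

```python
import codecs
import os

# Hypothetical, deliberately tiny signature table for step 2.
MAGIC = {
    b"%PDF": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
}


def describe(path: str) -> str:
    # Step 1: filesystem properties (directories, empty files).
    if os.path.isdir(path):
        return "directory"
    if os.path.getsize(path) == 0:
        return "empty"

    with open(path, "rb") as f:
        head = f.read(512)

    # Step 2: magic numbers at the start of the file.
    for signature, name in MAGIC.items():
        if head.startswith(signature):
            return name

    # Step 3: if the bytes look like text, guess what kind of text it is.
    if b"\x00" in head:
        return "data"
    try:
        # The incremental decoder tolerates a multi-byte character that was
        # cut off at the end of the 512-byte sample.
        text = codecs.getincrementaldecoder("utf-8")().decode(head)
    except UnicodeDecodeError:
        return "data"
    if text.startswith("#!"):
        return "script text"
    if text.lstrip().startswith("<?xml"):
        return "XML document text"
    return "plain text"
```

The order matters: the cheap filesystem checks run first, and the text heuristics only run when no signature matched.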

Here's a library that'll help you with Magic Numbers: jmimemagic

Is there a way to find file type?

Yes, it is possible to determine the file type without using the file extension. You can do this by reading the file header, sometimes also called the file signature, which occupies the first few bytes of the file.

How many bytes does the file header/signature occupy? That varies from file type to file type, so you should look up the details of the header/signature for each specific file type you want to identify.

You can find a list of the more common signatures here: List of file signatures - Wikipedia
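Because signatures differ in length and do not always sit at offset 0, a lookup table usually stores an offset alongside the bytes. The Python sketch below assumes a small hand-picked table (PNG, JPEG, ZIP, and tar, whose "ustar" marker sits at offset 257). The names SIGNATURES, BYTES_NEEDED, and identify are illustrative only.

```python
# (offset, signature bytes, label) for a few well-known formats.
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "png"),   # 8 bytes
    (0, b"\xff\xd8\xff", "jpeg"),       # 3 bytes
    (0, b"PK\x03\x04", "zip"),          # 4 bytes
    (257, b"ustar", "tar"),             # 5 bytes, not at the start of the file
]

# How much of the file must be read so that every signature can be checked.
BYTES_NEEDED = max(offset + len(sig) for offset, sig, _ in SIGNATURES)


def identify(data: bytes):
    """Return the label of the first matching signature, or None."""
    for offset, sig, label in SIGNATURES:
        if data[offset:offset + len(sig)] == sig:
            return label
    return None
```

In use, you would read BYTES_NEEDED bytes from the file in binary mode and pass them to identify. A None result means none of the listed signatures matched, not that the file is invalid.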

PS: Most programs stopped relying only on file extensions for determining the file type way back when the first versions of Windows came out. The main reason was that file extensions were originally limited to three characters (a limit of old file systems such as FAT12 and FAT16), so the world quickly ran out of possible extensions, and multiple programs began to use the same extension for completely different file types. Storing a header/signature at the beginning of the file gets around this file system limitation.

How to get content type of a file which has no extension?

Use Files.probeContentType(path), available since JDK 7. Or detect the file type yourself according to the relevant file type specification; for PDF, for example, see http://www.adobe.com/devnet/pdf/pdf_reference.html


