How to Determine a File's True Extension/Type Programmatically

How can I determine a file's true extension/type programmatically?

Not reliably from the file name alone, no.

You will need to read the first few bytes of each file and interpret them as a header for a finite set of known filetypes. Most files have distinct headers: some sort of metadata in the first few bytes, or in the first few kilobytes in the case of MP3.

Your program will have to simply try parsing the file for each of your accepted filetypes.
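As a rough illustration of the header check described above, here is a minimal sketch in Java that compares the leading bytes of a file against a small table of well-known signatures. The class name SignatureSniffer, the four formats chosen, and the 16-byte read size are my own assumptions for the example, and readNBytes needs Java 11 or later. A matching signature only tells you what the file claims to be at the byte level; it does not prove the rest of the file is well-formed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps well-known leading bytes to a short type name.
final class SignatureSniffer {
    private static final Map<String, byte[]> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("png", new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A});
        SIGNATURES.put("jpeg", new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF});
        SIGNATURES.put("gif", new byte[] {'G', 'I', 'F', '8'});
        SIGNATURES.put("pdf", new byte[] {'%', 'P', 'D', 'F', '-'});
    }

    /** Returns a short type name for the given leading bytes, or "unknown". */
    static String detect(byte[] head) {
        for (Map.Entry<String, byte[]> entry : SIGNATURES.entrySet()) {
            if (startsWith(head, entry.getValue())) {
                return entry.getKey();
            }
        }
        return "unknown";
    }

    /** True when data begins with exactly the bytes in prefix. */
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads at most 16 bytes from the file and classifies them (Java 11+). */
    static String detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(16));
        }
    }
}
```

Only the first 16 bytes are read, so the check stays cheap even for large uploads.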

For my program, I send the uploaded image to ImageMagick in a try-catch block, and if it blows up, I assume it was a bad image. This should be considered insecure, because I am loading arbitrary (user-supplied) binary data into an external program, which is generally an attack vector. Here, I am trusting ImageMagick not to do anything harmful to my system.

I recommend writing your own handlers for the significant filetypes you intend to use, to avoid any attack vectors.

Edit: I see in PHP there are some tools to do this for you.

Also, MIME types are what the user's browser claims the file to be. It is handy to read them and act on them in your code, but it is not a secure method, because anyone sending you bad files can easily fake the MIME headers. It's a front-line defense that keeps code expecting a JPEG from barfing on a PNG, but someone who embeds a virus in a .exe and names it .jpg has no reason not to spoof the MIME type as well.
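To make the point about spoofed MIME types concrete, here is a small sketch that accepts a claimed type only when the leading bytes agree with it. The class and method names (ClaimedTypeCheck, sniff, claimMatchesContent) are hypothetical, and it recognizes only JPEG and PNG to keep the example short.

```java
import java.util.Locale;

// Hypothetical helper: cross-checks a client-supplied MIME type against the bytes.
final class ClaimedTypeCheck {
    /** Sniffs a MIME type from leading bytes; returns null when unrecognized. */
    static String sniff(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xFF
                && (head[1] & 0xFF) == 0xD8
                && (head[2] & 0xFF) == 0xFF) {
            return "image/jpeg";
        }
        if (head.length >= 4
                && (head[0] & 0xFF) == 0x89
                && head[1] == 'P' && head[2] == 'N' && head[3] == 'G') {
            return "image/png";
        }
        return null;
    }

    /** True only when the client's claim agrees with what the bytes look like. */
    static boolean claimMatchesContent(String claimedMime, byte[] head) {
        if (claimedMime == null) {
            return false;
        }
        // Drop parameters such as "; charset=..." and normalize case.
        String claimed = claimedMime.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return claimed.equals(sniff(head));
    }
}
```

A renamed .exe (which starts with the bytes "MZ") sent with a claimed type of image/jpeg fails this check, because the bytes do not look like a JPEG.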

Find out exact file type in C#

You can try checking for certain file signatures, or magic numbers, in the files. Here's a link to a list of known file signatures that seems quite up to date:

There is another way of doing the same. Use Winista MIME Detector.

An XML file, mime-types.xml, contains information about file types and the signatures used to identify the content type. You will need this file to create an instance of the MimeTypes object. Once you have created the MimeTypes object, call the GetMimeType method to get the MimeType of the stream. If the MIME type cannot be determined, the method returns null. The following code snippet demonstrates use of the library.

Example:

MimeTypes g_MimeTypes = new MimeTypes("mime-types.xml");
sbyte[] fileData = null;
using (System.IO.FileStream srcFile =
    new System.IO.FileStream(strFile, System.IO.FileMode.Open))
{
    byte[] data = new byte[srcFile.Length];
    srcFile.Read(data, 0, (Int32)srcFile.Length);
    fileData = Winista.Mime.SupportUtil.ToSByteArray(data);
}
MimeType oMimeType = g_MimeTypes.GetMimeType(fileData);

How to check the File Type in java

You can use Guava's Files utility and its method Files.getFileExtension(String fullName):

System.out.println(Files.getFileExtension("C:\\fileName.txt"));

The output is:

txt

The source code is pretty simple, though:

public static String getFileExtension(String fullName) {
    checkNotNull(fullName);
    String fileName = new File(fullName).getName();
    int dotIndex = fileName.lastIndexOf('.');
    return (dotIndex == -1) ? "" : fileName.substring(dotIndex + 1);
}

How to determine the file extension of a file from a uri

First, I want to make sure you know it's impossible to find out what kind of file a URI links to, since a link ending in .jpg might give you a .exe file (this is especially true for URLs, due to symbolic links and .htaccess files). Fetching the extension from the URI is therefore not a rock-solid way to limit allowed file types, if that is what you're going for. So I assume you just want to know what extension a file has based on its URI, even though this isn't completely trustworthy.

You can get the extension from any URI, URL or file path using the method below. You don't need any libraries or extensions, since this is basic Java functionality. This solution gets the position of the last . (period) in the URI string and creates a substring that starts at that position and runs to the end of the string.

String uri = "http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/integrating_apps/images/google_logo.png";
String extension = uri.substring(uri.lastIndexOf("."));

The code sample above stores the .png extension from the URI in the extension variable. Note that the . (period) is included in the extension; if you want the file extension without the leading period, increase the substring index by one, like this:

String extension = uri.substring(uri.lastIndexOf(".") + 1);

One advantage of this method over regular expressions (which other people use a lot) is that it is much cheaper to execute while giving the same result.

Additionally, you might want to make sure the URI contains a period character. Use the following code to achieve this:

String uri = "http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/integrating_apps/images/google_logo.png";
if (uri.contains(".")) {
    String extension = uri.substring(uri.lastIndexOf("."));
}
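One caveat with the period check: almost every URL has a period in its host name, and a query string can contain one too, so a URI with no file extension at all can still produce a bogus result. A possible variation, sketched here under my own assumptions, parses the string with java.net.URI first and looks only at the last path segment. The class name UriExtension is hypothetical. This version returns the extension without the leading period and treats dot-files such as .htaccess as having no extension.

```java
import java.net.URI;

// Hypothetical helper: extension of the last path segment of a URI.
final class UriExtension {
    /** Returns the extension without the dot, or "" if there is none. */
    static String of(String uriText) {
        // getPath() excludes the host, the query string and the fragment.
        // URI.create throws IllegalArgumentException on invalid syntax.
        String path = URI.create(uriText).getPath();
        if (path == null) {
            return ""; // opaque URIs such as mailto: have no path
        }
        String lastSegment = path.substring(path.lastIndexOf('/') + 1);
        int dot = lastSegment.lastIndexOf('.');
        // dot <= 0 covers both "no dot" and names that start with a dot.
        return (dot <= 0) ? "" : lastSegment.substring(dot + 1);
    }
}
```

With this approach, "http://www.google.com/download" yields an empty string instead of a piece of the host name.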

You might want to improve the functionality even further to create a more robust system. Two examples might be:

  • Validate the URI by checking it exists, or by making sure the syntax of the URI is valid, possibly using a regular expression.
  • Trim the extension to remove unwanted white spaces.

I won't cover the solutions for these two features in here, because that isn't what was being asked in the first place.

Hope this helps!

Is there a way to find file type?

Yes, it is possible to determine a file's type without using the file extension. You can do this by reading the file header, sometimes also called the file signature, which occupies the first few bytes of the file.

How many bytes does a file header/signature occupy? That varies from file type to file type, so you should look up detailed information about the header/signature of each specific file type you want to identify.

You can find a list of the more popular signatures at: List of file signatures - Wikipedia
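Since signatures differ in length, and some do not even sit at the very start of the file, one way to handle that is to store an offset alongside each signature. The following is only a sketch: the class names and the three formats are my own picks for illustration (PNG uses 8 bytes at offset 0, ZIP's local file header uses 4 bytes at offset 0, and POSIX tar puts "ustar" at offset 257).

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: signatures of different lengths at different offsets.
final class OffsetSignatures {
    /** A type name, the byte offset where its signature sits, and the signature bytes. */
    static final class Signature {
        final String type;
        final int offset;
        final byte[] magic;

        Signature(String type, int offset, byte[] magic) {
            this.type = type;
            this.offset = offset;
            this.magic = magic;
        }

        boolean matches(byte[] data) {
            if (data.length < offset + magic.length) {
                return false;
            }
            for (int i = 0; i < magic.length; i++) {
                if (data[offset + i] != magic[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    private static final List<Signature> KNOWN = List.of(
        new Signature("png", 0, new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A}),
        // ZIP local file header; an empty archive starts differently.
        new Signature("zip", 0, new byte[] {'P', 'K', 3, 4}),
        // POSIX tar archives carry "ustar" at offset 257, not at the start.
        new Signature("tar", 257, "ustar".getBytes(StandardCharsets.US_ASCII)));

    /** How many leading bytes must be read so every known signature can be checked. */
    static int bytesNeeded() {
        int max = 0;
        for (Signature s : KNOWN) {
            max = Math.max(max, s.offset + s.magic.length);
        }
        return max;
    }

    /** Returns the first matching type name, or "unknown". */
    static String detect(byte[] data) {
        for (Signature s : KNOWN) {
            if (s.matches(data)) {
                return s.type;
            }
        }
        return "unknown";
    }
}
```

bytesNeeded() tells the caller how much of the file to read up front, which is the practical consequence of the "it depends on the file type" answer above.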

PS: Most programs stopped relying only on file extensions to determine file type way back when the first Windows came out. The main reason was that file extensions were originally limited to three characters (a limit of old file systems such as FAT12 and FAT16), so the world quickly ran out of possible extensions, and multiple programs began using the same extension for completely different file types. Storing a header/signature at the beginning of the file means you are no longer bound by that file system limitation.

How do I get the file extension of a file in Java?

In this case, use FilenameUtils.getExtension from Apache Commons IO.

Here is an example of how to use it (you may specify either full path or just file name):

import org.apache.commons.io.FilenameUtils;

// ...

String ext1 = FilenameUtils.getExtension("/path/to/file/foo.txt"); // returns "txt"
String ext2 = FilenameUtils.getExtension("bar.exe"); // returns "exe"

Maven dependency:

<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.6</version>
</dependency>

Gradle Groovy DSL:

implementation 'commons-io:commons-io:2.6'

Gradle Kotlin DSL:

implementation("commons-io:commons-io:2.6")

Others https://search.maven.org/artifact/commons-io/commons-io/2.6/jar

How to check file extension on Android

I would get the file name as a String, split it into an array using "." as the delimiter, and then take the last element of the array, which is the file extension. For example:

public class Main {
    public static void main(String[] args) {
        String filename = "image.jpg";
        String[] filenameArray = filename.split("\\.");
        String extension = filenameArray[filenameArray.length - 1];
        System.out.println(extension);
    }
}

Which outputs:

jpg
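One thing to watch with the split approach: when the name contains no period at all, the array has a single element, so the "extension" comes back as the whole file name. Here is a sketch of the same idea with that case guarded. The class name SplitExtension is hypothetical, and treating dot-files such as .gitignore as having no extension is my own choice, not something the approach requires.

```java
// Hypothetical helper: the split-based approach with the no-dot case guarded.
final class SplitExtension {
    /** Extension after the last dot, or "" when the name has no usable dot. */
    static String of(String filename) {
        // String.split drops trailing empty strings, so "file." gives one part.
        String[] parts = filename.split("\\.");
        // Fewer than two parts means no extension; an empty first part with
        // exactly two parts is a dot-file, treated here as extensionless.
        if (parts.length < 2 || (parts.length == 2 && parts[0].isEmpty())) {
            return "";
        }
        return parts[parts.length - 1];
    }
}
```

For a name such as archive.tar.gz this returns only the final part, gz.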

How to find the extension of a file in C#?

Path.GetExtension

string myFilePath = @"C:\MyFile.txt";
string ext = Path.GetExtension(myFilePath);
// ext would be ".txt"

How do you check a file type when there is no extension in c#

I've heard of reading the first few bytes of a file's contents and making an educated guess at the file's format. This link seems promising:

Using .NET, how can you find the mime type of a file based on the file signature not the extension


