How to Get the File Extension from a URL

How to determine the file extension of a file from a URI

First, be aware that you cannot reliably tell what kind of file a URI points to from the URI alone: a link ending in .jpg might actually serve an .exe file (this is especially true for URLs, because of symbolic links and .htaccess files). Reading the extension from the URI is therefore not a rock-solid way to restrict allowed file types, if that is what you are after. So I assume you just want to know which extension a file has based on its URI, even though the result isn't completely trustworthy.

You can get the extension from any URI, URL or file path using the method below. You don't need any libraries or extensions, since this is basic Java functionality. The solution finds the position of the last . (period) in the URI string and creates a substring that starts at that position and runs to the end of the URI string.

String uri = "http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/integrating_apps/images/google_logo.png";
String extension = uri.substring(uri.lastIndexOf("."));

The code sample above stores the .png extension in the extension variable. Note that the . (period) is included in the extension; if you want the file extension without the leading period, increase the substring index by one, like this:

String extension = uri.substring(uri.lastIndexOf(".") + 1);

One advantage of this method over regular expressions (an approach many people use) is that it is much cheaper to execute while giving the same result.

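Additionally, you might want to make sure the URI actually contains a period character before taking the substring. Keep in mind that this only catches strings with no period at all: a URL such as http://www.example.com/resource still contains periods in its host name. The following code performs the check: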

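String uri = "http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/integrating_apps/images/google_logo.png";
String extension = ""; // declared outside the if block so it stays usable afterwards
int lastPeriod = uri.lastIndexOf(".");
// lastIndexOf returns -1 when the URI contains no period at all
if (lastPeriod != -1) {
    extension = uri.substring(lastPeriod);
}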

You might want to improve the functionality even further to create a more robust system. Two examples might be:

  • Validate the URI by checking it exists, or by making sure the syntax of the URI is valid, possibly using a regular expression.
  • Trim the extension to remove unwanted whitespace.

I won't cover the solutions for these two features in here, because that isn't what was being asked in the first place.
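
For readers who do want those two refinements, here is a minimal sketch of the same last-period idea with a syntax check and whitespace trimming added. It is written in Python (the language used in a later answer on this page) rather than Java, and the function name extension_from_uri is purely illustrative. It assumes an absolute URI and only looks at the last path segment, so periods in the host name and anything in the query string are ignored.

```python
from urllib.parse import urlsplit


def extension_from_uri(uri):
    """Return the extension (with leading period) of the last path segment, or ''."""
    # Trim stray whitespace before parsing.
    parts = urlsplit(uri.strip())
    # Very light syntax check: require both a scheme and a host.
    if not parts.scheme or not parts.netloc:
        raise ValueError("not an absolute URI: %r" % uri)
    # Only the last path segment can carry a file extension.
    filename = parts.path.rsplit("/", 1)[-1]
    dot = filename.rfind(".")
    # No period, or only a leading period as in ".profile", means no extension.
    return filename[dot:] if dot > 0 else ""
```

With this sketch, "http://www.example.com/resource" yields an empty string instead of ".com/resource", which is what the plain substring approach would return.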

Hope this helps!

How to safely get the file extension from a URL?

The proper way is to not rely on file extensions at all. Send a GET (or HEAD) request to the URL in question and use the returned "Content-Type" HTTP header to determine the content type. File extensions are unreliable.

See MIME types (IANA media types) for more information and a list of useful MIME types.
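
As a rough illustration of this approach, the sketch below sends a HEAD request and maps the returned Content-Type to a likely extension using Python's standard mimetypes module. The function names extension_for_content_type and extension_from_head are hypothetical, the 10-second timeout is an arbitrary assumption, and some servers do not answer HEAD requests or report a generic type, so treat the result as a hint rather than a guarantee.

```python
import mimetypes
from urllib.request import Request, urlopen


def extension_for_content_type(content_type):
    """Map a Content-Type header value to a likely extension, or ''."""
    if not content_type:
        return ""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # guess_extension returns None for media types it does not know.
    return mimetypes.guess_extension(media_type) or ""


def extension_from_head(url, timeout=10):
    """Send a HEAD request to url and guess an extension from its Content-Type."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return extension_for_content_type(response.headers.get("Content-Type"))
```

Keeping the header-to-extension mapping in its own function means it can be checked without network access, for example extension_for_content_type("image/png").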

Getting file extension from http url using Java

You can use the java.net.URL class from the Java standard library, which is very useful in cases like this. You should do something like this:

String url = "https://your_url/logo.svg?position=5";
URL fileIneed = new URL(url);

The "fileIneed" variable then gives you a number of getter methods. In your case, "getPath()" returns the path without the query string:

fileIneed.getPath() ---> "/logo.svg"

Then pass that path to FilenameUtils from Apache Commons IO (the Apache library you are already using), and you will get the "svg" String.

FilenameUtils.getExtension(fileIneed.getPath()) ---> "svg"

Java URL class documentation:
https://docs.oracle.com/javase/7/docs/api/java/net/URL.html

Identify the file extension of a URL

The urlparse module (urllib.parse in Python 3) provides tools for working with URLs. Although it doesn't provide a way to extract the file extension from a URL, it's possible to do so by combining it with os.path.splitext:

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2
from os.path import splitext

def get_ext(url):
    """Return the filename extension from url, or ''."""
    parsed = urlparse(url)
    root, ext = splitext(parsed.path)
    return ext  # or ext[1:] if you don't want the leading '.'

Example usage:

>>> get_ext("www.example.com/image.jpg")
'.jpg'
>>> get_ext("https://www.example.com/page.html?foo=1&bar=2#fragment")
'.html'
>>> get_ext("https://www.example.com/resource")
''

how to get file extension from url

As suggested by @melpomene, you can make a HEAD request for the file and read the Content-Type from the response headers:

fetch("https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcTA_Rg2GwJVJEmOGGoYFev_eTSZAjkp_stpi4cUXpjWbE6Wh7gSpCvldg", {method:"HEAD"}).then(response => response.headers.get("Content-Type")).then(type => console.log(`.${type.replace(/.+\/|;.+/g, "")}`));

How to get the file extension from a url?

Use File.extname

File.extname("test.rb")       #=> ".rb"
File.extname("a/b/d/test.rb") #=> ".rb"
File.extname("test")          #=> ""
File.extname(".profile")      #=> ""

To format the string

"http://www.example.com/%s.%s" % [filename, extension]

How can I get file extensions with JavaScript?

Newer Edit: Lots of things have changed since this question was initially posted - there's a lot of really good information in wallacer's revised answer as well as VisioN's excellent breakdown


Edit: Since this is the accepted answer, note that wallacer's answer is indeed much better:

return filename.split('.').pop();

My old answer:

return /[^.]+$/.exec(filename);

Should do it.

Edit: In response to PhiLho's comment, use something like:

return (/[.]/.exec(filename)) ? /[^.]+$/.exec(filename) : undefined;

How to get the File Extension from a string Path

You can use the extension function in the path package to get the extension from a file path:

import 'package:path/path.dart' as p;

final path = '/some/path/to/file/file.dart';

final extension = p.extension(path); // '.dart'

If your file has multiple extensions, like file.dart.js, you can specify the optional level parameter:

final extension = p.extension('file.dart.js', 2); // '.dart.js'

