Get File Name from Url

Get file name from URL

Instead of reinventing the wheel, how about using Apache commons-io:

import java.net.URL;

import org.apache.commons.io.FilenameUtils;

public class FilenameUtilTest {

public static void main(String[] args) throws Exception {
URL url = new URL("http://www.example.com/some/path/to/a/file.xml?foo=bar#test");

System.out.println(FilenameUtils.getBaseName(url.getPath())); // -> file
System.out.println(FilenameUtils.getExtension(url.getPath())); // -> xml
System.out.println(FilenameUtils.getName(url.getPath())); // -> file.xml
}

}

js function to get filename from url

var filename = url.split('/').pop()

Extract file name from Java URL (file: and http/https protocol)?

The URI class properly parses the parts of a URI. For most URLs, you want the path of the URI. In the case of a URI with no slashes, there won’t be any parsing of the parts, so you’ll have to rely on the entire scheme-specific part:

URI uri = new URI(b);
String path = uri.getPath();
if (path == null) {
path = uri.getSchemeSpecificPart();
}
String filename = path.substring(path.lastIndexOf('/') + 1);

The above should work for all of your URLs.

Need to extract filename from URL

You could also do it without regular expressions like so:

String url = "https://abc.xyz.com/path/somefilename.xy";
String fileName = url.substring(url.lastIndexOf('/') + 1);
// fileName is now "somefilename.xy"

EDIT (credit to @SomethingSomething): If you also need to support URLs with parameters, like https://abc.xyz.com/path/somefilename.xy?param1=blie&param2=bla, you could use this instead:

String url = "https://abc.xyz.com/path/somefilename.xy?param1=blie&param2=bla";
java.net.URL urlObj = new java.net.URL(url);
String urlPath = urlObj.getPath();
String fileName = urlPath.substring(urlPath.lastIndexOf('/') + 1);
// fileName is now "somefilename.xy"

How to extract a filename from a URL and append a word to it?

You can use urllib.parse.urlparse with os.path.basename:

import os
from urllib.parse import urlparse

url = "http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg"
a = urlparse(url)
print(a.path) # Output: /kyle/09-09-201315-47-571378756077.jpg
print(os.path.basename(a.path)) # Output: 09-09-201315-47-571378756077.jpg
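
As a rough sketch of how this behaves on other inputs: the query string and fragment are not part of the parsed path, so they never end up in the name, while a URL that ends in a slash yields an empty string. The helper name basename_from_url and the example.com URLs are made up for illustration.

```python
import posixpath
from urllib.parse import urlparse


def basename_from_url(url):
    """Return the last path segment of url (hypothetical helper name)."""
    # posixpath always treats '/' as the separator, whatever the OS
    return posixpath.basename(urlparse(url).path)


print(basename_from_url("http://www.example.com/a/file.xml?foo=bar#test"))  # file.xml
print(repr(basename_from_url("http://www.example.com/a/dir/")))  # ''
```

If an empty result is not acceptable in your case, you would need to check for it explicitly.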

Your URL might contain percent-encoded characters like %20 for space or %E7%89%B9%E8%89%B2 for "特色". If that's the case, you'll need to unquote (or unquote_plus) them. You can also use pathlib.Path().name instead of os.path.basename, which could help to add a suffix in the name (like asked in the original question):

from pathlib import Path
from urllib.parse import urlparse, unquote

url = "http://photographs.500px.com/kyle/09-09-2013%20-%2015-47-571378756077.jpg"
url_parsed = urlparse(url)
print(unquote(url_parsed.path)) # Output: /kyle/09-09-2013 - 15-47-571378756077.jpg
file_path = Path("/home/ubuntu/Desktop/") / unquote(Path(url_parsed.path).name)
print(file_path) # Output: /home/ubuntu/Desktop/09-09-2013 - 15-47-571378756077.jpg

new_file = file_path.with_stem(file_path.stem + "_small")  # with_stem requires Python 3.9+
print(new_file) # Output: /home/ubuntu/Desktop/09-09-2013 - 15-47-571378756077_small.jpg

Also, an alternative is to use unquote(urlparse(url).path.split("/")[-1]).
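
Putting those pieces together, a minimal sketch of a helper that extracts the name, decodes it, and appends a word before the extension could look like this. The function name suffixed_filename and its suffix parameter are made up for illustration, and it uses posixpath.splitext rather than with_stem so it also runs on Python versions older than 3.9.

```python
import posixpath
from urllib.parse import urlparse, unquote


def suffixed_filename(url, suffix="_small"):
    """Return the decoded file name from url with suffix added before the extension."""
    # Take the last segment while it is still encoded, so an encoded "/" (%2F) cannot split it
    raw_name = urlparse(url).path.split("/")[-1]
    stem, ext = posixpath.splitext(unquote(raw_name))
    return stem + suffix + ext


url = "http://photographs.500px.com/kyle/09-09-2013%20-%2015-47-571378756077.jpg"
print(suffixed_filename(url))  # 09-09-2013 - 15-47-571378756077_small.jpg
```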

Get Filename from URL and Strip File Extension

You can do the following in JavaScript. pop() returns the last element of the array, which is a string, and you can then use replace() to get just the filename without .html on the end.

function getFilename () {
return {{ Page Path }}.split('/').pop().replace('.html', '');
}

{{ Page Path }} looks like a placeholder from some templating language, but you could modify the script above to read the current URL and get the filename like so:

function getFilename () {
return window.location.href.split('/').pop().replace('.html', '');
}

Furthermore, you could make it more dynamic so that it handles any file extension. Find the index of the last period with lastIndexOf, then take the substring from the start of the filename up to that position. Note that if the name contains no period, lastIndexOf returns -1 and the result is an empty string.

function getFilename () {
var filename = window.location.href.split('/').pop();
return filename.substring(0, filename.lastIndexOf('.'));
}

Get file name from URI string in C#

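You can just make a System.Uri object, use IsFile to verify that it is a file URI, then use Uri.LocalPath to extract the filename. Note that IsFile is true only for URIs with the file: scheme, so an http or https URL will not pass this check.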

This is much safer, as it provides you a means to check the validity of the URI as well.


Edit in response to comment:

To get just the full filename, I'd use:

Uri uri = new Uri(hreflink);
if (uri.IsFile) {
string filename = System.IO.Path.GetFileName(uri.LocalPath);
}

This does all of the error checking for you, and is platform-neutral. All of the special cases get handled for you quickly and easily.

How to extract the filename from a URL in Elixir?

I think you only need Path.basename/1:

"https://randomWebsite.com/folder/filename.jpeg"
|> Path.basename()

http.Request: get file name from url

I believe you are looking for path.Base: "Base returns the last element of path."

r, _ := http.NewRequest("GET", "http://localhost/slow/one.json", nil)
fmt.Println(path.Base(r.URL.Path))
// one.json
