Extracting extension from filename in Python
Use os.path.splitext:
>>> import os
>>> filename, file_extension = os.path.splitext('/path/to/somefile.ext')
>>> filename
'/path/to/somefile'
>>> file_extension
'.ext'
Unlike most manual string-splitting attempts, os.path.splitext will correctly treat /a/b.c/d as having no extension instead of having extension .c/d, and it will treat .bashrc as having no extension instead of having extension .bashrc:
>>> os.path.splitext('/a/b.c/d')
('/a/b.c/d', '')
>>> os.path.splitext('.bashrc')
('.bashrc', '')
Splitting a file name into name,extension
Use strsplit:
R> strsplit("name1.csv", "\\.")[[1]]
[1] "name1" "csv"
R>
Note that you a) need to escape the dot (as it is a metacharacter for regular expressions) and b) have to deal with the fact that strsplit() returns a list, of which typically only the first element is of interest.
A more general solution involves regular expressions where you can extract the matches.
For the special case of filenames you also have:
R> library(tools) # unless already loaded, comes with base R
R> file_ext("name1.csv")
[1] "csv"
R>
and
R> file_path_sans_ext("name1.csv")
[1] "name1"
R>
as these are such common tasks (cf. basename in the shell, etc.).
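The regular-expression route mentioned earlier, where you extract matches instead of splitting, is not specific to R. As a rough sketch in Python (the function name extract_name_ext and the choice to return an empty extension when nothing matches are assumptions made for this example):

```python
import re

# Greedy (.+) pushes the split to the last dot; [^.]+ requires a non-empty extension.
NAME_EXT = re.compile(r"(.+)\.([^.]+)")

def extract_name_ext(filename):
    """Return (name, extension) taken from capture groups rather than a split."""
    match = NAME_EXT.fullmatch(filename)
    if match is None:
        # No usable dot: treat the whole string as the name.
        return filename, ""
    return match.group(1), match.group(2)

print(extract_name_ext("name1.csv"))
print(extract_name_ext("name1.backup.csv"))
print(extract_name_ext("name1"))
```

Unlike a plain split on every dot, this keeps earlier dots inside the name part.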
Java: splitting the filename into a base and extension
I know others have mentioned String.split, but here is a variant that only yields two tokens (the base and the extension):
String[] tokens = fileName.split("\\.(?=[^\\.]+$)");
For example:
"test.cool.awesome.txt".split("\\.(?=[^\\.]+$)");
Yields:
["test.cool.awesome", "txt"]
The regular expression tells Java to split on any period that is followed by one or more non-periods, followed by the end of input. There is only one period that matches this definition (namely, the last period).
In regex terms, this technique is called a zero-width positive lookahead.
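The lookahead idea is not Java-specific. As a small sketch, the same pattern can be handed to Python's re.split (the helper name split_last_dot is made up here):

```python
import re

# Split on a dot only when nothing but non-dot characters follow it up to the end.
LAST_DOT = re.compile(r"\.(?=[^.]+$)")

def split_last_dot(file_name):
    """Return a list of one or two tokens: the base and, if present, the extension."""
    return LAST_DOT.split(file_name)

print(split_last_dot("test.cool.awesome.txt"))
print(split_last_dot("noextension"))
```

Because the lookahead consumes nothing, only the dot itself is removed from the result; a name without a dot comes back as a single-element list.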
BTW, if you want to split a path into its directory part and the full filename (including, but not limited to, the dot extension), using a path with forward slashes, split at the empty position just after the last slash:
String[] tokens = dir.split("(?<=/)(?=[^/]+$)");
For example:
String dir = "/foo/bar/bam/boozled";
String[] tokens = dir.split("(?<=/)(?=[^/]+$)");
// [ "/foo/bar/bam/", "boozled" ]
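For comparison, the same directory/filename split can be sketched without any regular expression. The Python below uses str.rpartition; the function name split_dir_file is hypothetical, and the sketch assumes forward-slash paths only:

```python
def split_dir_file(path):
    """Split at the last forward slash, keeping the slash on the directory part."""
    head, sep, tail = path.rpartition("/")
    # With no slash at all, head and sep are empty and tail is the whole string.
    return head + sep, tail

print(split_dir_file("/foo/bar/bam/boozled"))
print(split_dir_file("boozled"))
```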
Extract filename and extension in Bash
First, get file name without the path:
filename=$(basename -- "$fullfile")
extension="${filename##*.}"
filename="${filename%.*}"
Alternatively, you can focus on the last '/' of the path instead of the '.' which should work even if you have unpredictable file extensions:
filename="${fullfile##*/}"
You may want to check the documentation:
- On the web, in the section "3.5.3 Shell Parameter Expansion"
- In the bash manpage, in the section called "Parameter Expansion"
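To see what the two expansions do on edge cases, here is a rough Python emulation. It is only a sketch: the function name bash_style_split is made up, and os.path.basename is assumed to be close enough to the basename command for paths without a trailing slash:

```python
import os.path

def bash_style_split(fullfile):
    """Mimic the basename / ${filename%.*} / ${filename##*.} steps."""
    filename = os.path.basename(fullfile)  # roughly: basename -- "$fullfile"
    # ${filename##*.}: drop the longest prefix ending in "."; unchanged if no dot.
    extension = filename.rpartition(".")[2]
    # ${filename%.*}: drop the shortest suffix starting with "."; unchanged if no dot.
    stem = filename.rpartition(".")[0] if "." in filename else filename
    return stem, extension

print(bash_style_split("/tmp/archive.tar.gz"))
print(bash_style_split("/tmp/README"))
print(bash_style_split("/home/user/.bashrc"))
```

Note the second and third cases: with no dot the "extension" ends up equal to the whole file name, and a dotfile ends up with an empty name, so a script that must handle such files needs an extra check.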
How to separate the file name and the extension of a file in C#
You can use Path.GetExtension:
var extension = Path.GetExtension("C:\\sample.txt"); // returns ".txt" (the leading dot is included)
..and Path.GetFileNameWithoutExtension:
var fileNameWithoutExtension = Path.GetFileNameWithoutExtension("C:\\sample.txt"); // returns "sample"
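For readers comparing across languages, Python's pathlib offers a similar pair of accessors. This is a sketch for comparison only, not part of the C# answer; PureWindowsPath is used so that the Windows-style path parses the same way on any operating system:

```python
from pathlib import PureWindowsPath

path = PureWindowsPath("C:\\sample.txt")

extension = path.suffix            # the extension, leading dot included
name_without_extension = path.stem  # the final path component minus its suffix

print(extension)
print(name_without_extension)
```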
How to split file name into base and extension
Final solution:
String pat = "(?!^)\\.(?=[^.]*$)|(?<=^\\.[^.]{0,1000})$|$";
The pattern consists of 3 alternatives to split with:
- (?!^)\\.(?=[^.]*$) - split with a dot that is not the first character in the string ((?!^)) and that has 0+ characters other than . to the right of it up to the string end ($)
- (?<=^\\.[^.]{0,1000})$ - split at the end of string if the string starts with a literal . followed by 0 to 1000 characters other than . (maybe setting the quantifier to {1,256} is enough, but there are longer file names, please adjust accordingly)
- $ - split at the end of string (replace with \\z if you do not want a match before a trailing \n when the string ends with \n)
When you pass 2 as the limit argument to the split method, the result contains at most two elements, so the string is split only once; see the Java demo:
System.out.println(Arrays.toString(".MyFile".split(pat,2))); // [.MyFile, ]
System.out.println(Arrays.toString("MyFile.ext".split(pat,2))); // [MyFile, ext]
System.out.println(Arrays.toString("Another.MyFile.ext".split(pat,2))); // [Another.MyFile, ext]
System.out.println(Arrays.toString("MyFile.".split(pat,2))); // [MyFile, ]
System.out.println(Arrays.toString("MyFile".split(pat,2))); // [MyFile, ]
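The rules encoded by the three alternatives can also be written without a regular expression. The Python sketch below is an assumption-laden restatement, not a port of the pattern (Python's re module rejects the variable-width lookbehind used above); the name split_base_ext is made up for this example:

```python
def split_base_ext(name):
    """Split at the last dot unless that dot is the first character."""
    idx = name.rfind(".")
    if idx > 0:
        # A dot past position 0: everything after it is the extension.
        return name[:idx], name[idx + 1:]
    # No dot, or only a leading dot (e.g. ".MyFile"): no extension.
    return name, ""

for sample in (".MyFile", "MyFile.ext", "Another.MyFile.ext", "MyFile.", "MyFile"):
    print(sample, split_base_ext(sample))
```

For the five sample names this yields the same base/extension pairs as the Java demo, always as a two-element result.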
Original answer
I believe you are looking for
(?!^)\\.(?=[^.]*$)|(?<=^\\.[^.]{0,1000})$
One note: the pattern that can be used with split uses a constrained-width lookbehind that assumes the file name cannot be longer than 1000 characters. Increase the value as needed.
See the IDEONE demo:
String pat = "(?!^)\\.(?=[^.]*$)|(?<=^\\.[^.]{0,1000})$";
String s = ".MyFile";
System.out.println(Arrays.toString(s.split(pat,-1)));
s = "MyFile.ext";
System.out.println(Arrays.toString(s.split(pat,-1)));
s = "Another.MyFile.ext";
System.out.println(Arrays.toString(s.split(pat,-1)));
s = "MyFile.";
System.out.println(Arrays.toString(s.split(pat,-1)));
Results:
".MyFile" => [.MyFile, ]
"MyFile.ext" => [MyFile, ext]
"Another.MyFile.ext" => [Another.MyFile, ext]
"MyFile." => [MyFile, ]