Get the metadata of a file
There is a basic set of metadata that you can get from a file.
Path file = ...;
BasicFileAttributes attr = Files.readAttributes(file, BasicFileAttributes.class);
System.out.println("creationTime: " + attr.creationTime());
System.out.println("lastAccessTime: " + attr.lastAccessTime());
System.out.println("lastModifiedTime: " + attr.lastModifiedTime());
System.out.println("isDirectory: " + attr.isDirectory());
System.out.println("isOther: " + attr.isOther());
System.out.println("isRegularFile: " + attr.isRegularFile());
System.out.println("isSymbolicLink: " + attr.isSymbolicLink());
System.out.println("size: " + attr.size());
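Some attributes are platform dependent, so a call may throw an exception or return an unexpected result. For example, on file systems that do not store a creation time, creationTime() returns an implementation-specific default such as the last-modified time.
You can read more at Managing Metadata (File and File Store Attributes).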
How to get metadata from a file in C
Did you try fstat()?
Man page link: https://linux.die.net/man/2/fstat
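To show what fstat() hands back without writing a full C program, here is a minimal sketch using Python's os.fstat, which wraps the same system call. The helper name describe_open_file and the temporary file are assumptions made for the illustration; in C the same fields come from struct stat (st_size, st_mode, st_mtime).

```python
import os
import stat
import tempfile


def describe_open_file(fd: int) -> dict:
    """Return a few fields of the stat structure for an open file descriptor."""
    st = os.fstat(fd)  # wraps the fstat() system call
    return {
        "size": st.st_size,                      # size in bytes
        "mode": stat.filemode(st.st_mode),       # e.g. "-rw-------"
        "is_regular": stat.S_ISREG(st.st_mode),  # regular file or not
        "mtime": st.st_mtime,                    # last modification, seconds since the epoch
    }


# Hypothetical usage: write five bytes to a temporary file and inspect it.
with tempfile.TemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()  # make sure the bytes reach the OS before calling fstat
    info = describe_open_file(tmp.fileno())
    print(info["size"], info["mode"], info["is_regular"])
```

Because fstat() works on a descriptor rather than a path, it describes the file that is actually open, even if the path has been renamed in the meantime.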
How to get file metadata from Azure File Share?
You can use the getProperties method to fetch metadata for both files and directories. Here is the definition of this method:
Returns all user-defined metadata, standard HTTP properties, and
system properties for the file. It does not return the content of the
file.
So in your code, inside for await (const item of dirIter), you need to determine whether the item is a file or a directory and then call its getProperties() method. The sample code looks like this:
for await (const item of dirIter) {
  if (item.kind === "directory") {
    const myDirectory = directoryClient.getDirectoryClient(item.name);
    const directoryProperties = await myDirectory.getProperties();
    // for testing, print out the metadata
    console.log(directoryProperties.metadata);
    // here you can add the metadata to your list
  } else {
    const myFile = directoryClient.getFileClient(item.name);
    const fileProperties = await myFile.getProperties();
    // for testing, print out the metadata
    console.log(fileProperties.metadata);
    // here you can add the metadata to your list
  }
}
How to get files metadata, when retrieving data from HDFS?
The easiest way to do so is with Spark's input_file_name function.
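import scala.collection.mutable.Map
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.input_file_name
import org.apache.spark.sql.types.LongType
// spark.implicits._ (needed for the $"..." syntax) is imported automatically in spark-shell

// Tag every row with its source file and keep rows of the same file in one partition
val df = spark.read.text("<path>")
  .withColumn("input_file_name", input_file_name())
  .repartition($"input_file_name")

// Append the file's modification time to each row, looking it up once per file per partition
def getMetadata(rdd: Iterator[Row]) = {
  val map = Map[String, Long]()
  val fs = FileSystem.get(new Configuration())
  rdd.map(row => {
    val path = row.getString(row.size - 1)
    if (!map.contains(path)) {
      map.put(path, fs.listStatus(new Path(path))(0).getModificationTime())
    }
    Row.fromSeq(row.toSeq ++ Array[Any](map(path)))
  })
}

spark.createDataFrame(df.rdd.mapPartitions(getMetadata), df.schema.add("modified_ts", LongType)).show(10000, false)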
Here modified_ts is the mtime of the file, in milliseconds since the epoch.
Depending on the size of the data, you can also do it with a join. The logic will look something like this:
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.functions._
val mtime = (path: String) => FileSystem.get(new Configuration()).listStatus(new Path(path)).head.getModificationTime
val mtimeUDF = udf(mtime)
val df = spark.read.text("<path>").withColumn("input_file_name", input_file_name())
val metadata_df = df.select($"input_file_name").distinct().withColumn("mtime", mtimeUDF($"input_file_name"))
val rows_with_metadata = df.join(metadata_df, "input_file_name")
rows_with_metadata.show(false)
PHP: get the last-modified time from a file's metadata
Stream metadata is not the same as the file metadata, since not all streams are connected to files.
Use fstat() on the open file handle to get the metadata of the file.
$meta = fstat($file);
echo date('Y-m-d H:i:s', $meta['mtime']);
How to get image file metadata from its path
I was able to resolve it after reading this documentation:
import cover from "../assets/cover.png"; // importing img file
let blob = await fetch(cover).then((r) => r.blob()); //creating blob object
const file = new File([blob], "cover.png", {
type: "image/png",
});
console.log(file);
// output
// {
// lastModified: 1656486792733
// lastModifiedDate: Wed Jun 29 2022 12:43:12 GMT+0530 (India Standard Time) {}
// name: "cover.png"
// size: 1446458
// type: "image/png"
// webkitRelativePath: ""
// }
PowerShell script to get the "Writing application" metadata field
Edit: actually, this seems more reliable. So far it has worked with every file that MediaInfo can read.
$FILE = "C:\test.mkv"
$content = (Get-Content -Path $FILE -First 100) + (Get-Content -Path $FILE -Tail 100)
if(($content -match '\*data')[0] -match '\*data\W*([\w\n\s\.]*)'){
write-host "Writing Application:" $Matches[1]
exit
}elseif(($content -match 'M€.*WA(.*)s¤')[0] -match 'M€.*WA(.*)s¤'){
write-host "Writing Application:" $Matches[1]
}
It looks like the last bytes in the file, after *data, specify the writer, so try this:
(Get-Content -Path "c:\video.mkv" -Tail 1) -match '\*data\W*(.*)$' | out-null
write-host "Writing Application:" $Matches[1]
On my test file that resulted in "HandBrake 1.5.1 2022011000".
Sorry, I'm not sure which standard specifies this. There is also a host of useful information on the first line of data in the file, e.g.:
ftypmp42 mp42iso2avc1mp41 free6dÊmdat ôÿÿðÜEé½æÙH·–,Ø Ù#îïx264 - core 164 r3065 ae03d92 - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=1 deblock=1:0:0 analyse=0x1:0x111 me=hex subme=2 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=18 lookahead_threads=5 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=10 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 vbv_maxrate=14000 vbv_bufsize=14000 crf_max=0.0 nal_hrd=none filler=0 ip_ratio=1.40 aq=1:1.00
I couldn't replicate your success viewing the info with Windows Explorer. The field is invisible for me, even though I can view it with MediaInfo and similar tools.
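As a rough cross-platform sketch of the same idea in Python: read the first and last few kilobytes of the file and list the printable ASCII runs, which is where strings such as the encoder settings or the writing application tend to show up. The helper names (head_and_tail, printable_runs), the 4096-byte window, the minimum run length and the sample bytes are all assumptions for illustration. This is not a real MP4 or Matroska parser.

```python
import os
import re
import tempfile


def head_and_tail(path: str, n: int = 4096) -> bytes:
    """Return the first n and last n bytes of a file, without overlap."""
    with open(path, "rb") as fh:
        head = fh.read(n)
        fh.seek(0, os.SEEK_END)
        size = fh.tell()
        # Start the tail after the head if the two windows would overlap.
        fh.seek(max(size - n, len(head)))
        tail = fh.read()
    return head + tail


def printable_runs(data: bytes, min_len: int = 6) -> list:
    """Return runs of printable ASCII characters that are at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]


# Hypothetical sample that imitates the layout described above.
sample = (
    b"\x00\x00\x00\x18ftypmp42" + b"\x00" * 8
    + b"x264 - core 164" + b"\xff\xfe"
    + b"*data\x00\x00HandBrake 1.5.1 2022011000"
)
with tempfile.TemporaryDirectory() as tmpdir:
    sample_path = os.path.join(tmpdir, "sample.bin")
    with open(sample_path, "wb") as out:
        out.write(sample)
    runs = printable_runs(head_and_tail(sample_path))

print(runs)
```

Working on bytes rather than text lines avoids depending on how the shell happens to decode binary data.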
Obtaining the "Where from" metadata of a file on Mac
TL;DR: Get an extended attribute such as macOS's "Where from" by installing a package such as pyxattr (pip install pyxattr) and calling xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms").
Extended Attributes on files
Extended file attributes, like the "Where from" attribute in macOS (available since 10.4), store metadata that the filesystem does not interpret. They exist on several operating systems.
Using the command line
You can also query them on the command line with tools like:
- exiftool:
  exiftool -MDItemWhereFroms -MDItemTitle -MDItemAuthors -MDItemDownloadedDate /path/to/file
- xattr (apparently the macOS version is also a Python script):
  xattr -p -l -x /path/to/file
On macOS many attributes are stored in property-list format, so use the -x option to obtain hexadecimal output.
Using Python
Ture Pålsson pointed out the missing keywords. Such common and appropriate terms are helpful when searching the Python Package Index (PyPI).
Search PyPI by keywords such as "extended file attributes" or "metadata":
- xattr
- pyxattr
- osxmetadata (requires Python 3.7+, macOS only)
For example, to list and get attributes (adapted from pyxattr's official docs):
import xattr
xattr.listxattr("file.pdf")
# ['user.mime_type', 'com.apple.metadata:kMDItemWhereFroms']
xattr.getxattr("file.pdf", "user.mime_type")
# 'text/plain'
xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms")
# ['https://example.com/downloads/file.pdf']
However, you will have to convert the macOS-specific metadata, which is stored in plist format, e.g. using plistlib.
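A minimal sketch of that conversion step, assuming the raw attribute value is a property list holding an array of URL strings. The helper name decode_where_froms and the sample bytes (built here with plistlib.dumps so that the snippet runs on any platform) are assumptions. On a Mac the raw bytes would come from xattr.getxattr instead.

```python
import plistlib


def decode_where_froms(raw: bytes) -> list:
    """Decode the raw bytes of a kMDItemWhereFroms attribute into a list of strings."""
    # plistlib.loads detects binary and XML property lists on its own.
    value = plistlib.loads(raw)
    return list(value)


# Stand-in for: raw = xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms")
sample_raw = plistlib.dumps(
    ["https://example.com/downloads/file.pdf", "https://example.com/"],
    fmt=plistlib.FMT_BINARY,
)

urls = decode_where_froms(sample_raw)
print(urls)
```

plistlib is part of the standard library, so no extra install is needed for the decoding step.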
File metadata on macOS
Mac OS X 10.4 (Tiger) introduced Spotlight, a system for extracting (or harvesting), storing, indexing, and querying metadata. It provides an integrated system-wide service for searching and indexing.
This metadata is stored as extended file attributes whose keys are prefixed with com.apple.metadata:. The "Where from" attribute, for example, has the key com.apple.metadata:kMDItemWhereFroms.
Using Python
Use osxmetadata for functionality similar to macOS's md* utilities:
from osxmetadata import OSXMetaData
filename = 'file.pdf'
meta = OSXMetaData(filename)
# get and print "Where from" list, downloaded date, title
print(meta.wherefroms, meta.downloadeddate, meta.title)
See also
- MacIssues (2014): How to look up file metadata in OS X
- OSXDaily (2018): How to View & Remove Extended Attributes from a File on Mac OS
- Ask Different: filesystem - What all file metadata is available in macOS?
- Query Spotlight for a range of dates via PyObjC
- Mac OS X : add a custom meta data field to any file