Get the Metadata of a File

There is a basic set of metadata that you can get from a file.

Path file = ...;
BasicFileAttributes attr = Files.readAttributes(file, BasicFileAttributes.class);

System.out.println("creationTime: " + attr.creationTime());
System.out.println("lastAccessTime: " + attr.lastAccessTime());
System.out.println("lastModifiedTime: " + attr.lastModifiedTime());

System.out.println("isDirectory: " + attr.isDirectory());
System.out.println("isOther: " + attr.isOther());
System.out.println("isRegularFile: " + attr.isRegularFile());
System.out.println("isSymbolicLink: " + attr.isSymbolicLink());
System.out.println("size: " + attr.size());

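Some attributes are platform dependent and may throw exceptions or return unexpected results. For example, on file systems that do not record a creation time, creationTime() returns an implementation-specific default, typically the last-modified time.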

You can read more at Managing Metadata (File and File Store Attributes).

How to get metadata from a file in C

Did you try fstat()?

Man page link: https://linux.die.net/man/2/fstat
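
To see what fstat() reports without writing a full C program, the same system call is exposed in Python as os.fstat. The following is only a minimal sketch: it runs against a throwaway temporary file, and the helper name describe_fd is hypothetical. The fields correspond to the st_size, st_mtime and st_mode members of C's struct stat.

```python
import os
import stat
import tempfile


def describe_fd(fd):
    """Return a few of the fields fstat() reports for an open descriptor."""
    info = os.fstat(fd)  # wraps the fstat(2) system call on POSIX systems
    return {
        "size": info.st_size,                      # size in bytes
        "mtime": info.st_mtime,                    # seconds since the epoch
        "is_regular": stat.S_ISREG(info.st_mode),  # file type comes from st_mode
        "is_dir": stat.S_ISDIR(info.st_mode),
    }


# Demo on a temporary file so the sketch is self-contained.
with tempfile.TemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()  # make sure the bytes reach the file before calling fstat
    meta = describe_fd(tmp.fileno())
    print(meta)
```

In C the equivalent is to pass the descriptor and a struct stat pointer to fstat() and read the same members from the struct.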

How to get file metadata from Azure File Share?

You can use the getProperties method to fetch the metadata of a file or a directory. Here is the definition of this method:

Returns all user-defined metadata, standard HTTP properties, and
system properties for the file. It does not return the content of the
file.

So in your code, inside for await (const item of dirIter), you need to determine whether the item is a file or a directory, then call the getProperties() method on the matching client. The sample code looks like this:

for await (const item of dirIter) {
  if (item.kind === "directory") {
    const mydirectory = directoryClient.getDirectoryClient(item.name);
    const directory_properties = await mydirectory.getProperties();

    // for testing, you can print out the metadata
    console.log(directory_properties.metadata);

    // here, you can write code to add the metadata to your list
  } else {
    const myfile = directoryClient.getFileClient(item.name);
    const the_properties = await myfile.getProperties();

    // for testing, you can print out the metadata
    console.log(the_properties.metadata);

    // here, you can write code to add the metadata to your list
  }
}

How to get files metadata, when retrieving data from HDFS?

The easiest way to do so is with Spark's built-in input_file_name function.

import scala.collection.mutable.Map
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.input_file_name
import org.apache.spark.sql.types.LongType

// tag each row with its source file and keep rows of the same file together
val df = spark.read.text("<path>")
  .withColumn("input_file_name", input_file_name())
  .repartition($"input_file_name")

def getMetadata(rdd: Iterator[Row]) = {
  val map = Map[String, Long]() // cache: file path -> modification time
  val fs = FileSystem.get(new Configuration())
  rdd.map(row => {
    val path = row.getString(row.size - 1) // input_file_name is the last column
    if (!map.contains(path)) {
      map.put(path, fs.listStatus(new Path(path))(0).getModificationTime())
    }
    Row.fromSeq(row.toSeq ++ Array[Any](map(path)))
  })
}

spark.createDataFrame(df.rdd.mapPartitions(getMetadata), df.schema.add("modified_ts", LongType)).show(10000, false)

Here modified_ts is the mtime for the file.

Depending on the size of the data, you can also do it with a join. The logic will look something like this:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.functions._

val mtime = (path: String) => FileSystem.get(new Configuration()).listStatus(new Path(path)).head.getModificationTime
val mtimeUDF = udf(mtime)

val df = spark.read.text("<path>").withColumn("input_file_name", input_file_name())

val metadata_df = df.select($"input_file_name").distinct().withColumn("mtime", mtimeUDF($"input_file_name"))

val rows_with_metadata = df.join(metadata_df , "input_file_name")
rows_with_metadata.show(false)

PHP: get the last-modified time of a file from its metadata

Stream metadata is not the same as the file metadata, since not all streams are connected to files.

Use fstat() on the open file handle to get the metadata of the file.

$meta = fstat($file);
echo date('Y-m-d H:i:s', $meta['mtime']);

How to get image file metadata from its path

I was able to resolve it after reading this documentation:

import cover from "../assets/cover.png"; // importing img file

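let blob = await fetch(cover).then((r) => r.blob()); // creating blob object

// note: lastModified defaults to the time this File object is created,
// not the modification time of the original image
const file = new File([blob], "cover.png", {
  type: "image/png",
});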

console.log(file);

// output
// {
// lastModified: 1656486792733
// lastModifiedDate: Wed Jun 29 2022 12:43:12 GMT+0530 (India Standard Time) {}
// name: "cover.png"
// size: 1446458
// type: "image/png"
// webkitRelativePath: ""
// }

PowerShell script to get the "Writing application" metadata field

Edit: the following seems more reliable. So far it has worked with every file that MediaInfo can read.

$FILE = "C:\test.mkv"
$content = (Get-Content -Path $FILE -First 100) + (Get-Content -Path $FILE -Tail 100)
if (($content -match '\*data')[0] -match '\*data\W*([\w\n\s\.]*)') {
    Write-Host "Writing Application:" $Matches[1]
    exit
} elseif (($content -match 'M€.*WA(.*)s¤')[0] -match 'M€.*WA(.*)s¤') {
    Write-Host "Writing Application:" $Matches[1]
}

It looks like the last bytes in the file, after *data, specify the writer, so try this:

(Get-Content -Path "c:\video.mkv" -Tail 1) -match '\*data\W*(.*)$' | out-null
write-host "Writing Application:" $Matches[1]

On my test file that resulted in "HandBrake 1.5.1 2022011000"

Sorry, I'm not sure which standard specifies this. There is also a host of useful information in the first line of data in the file, e.g.:

ftypmp42 mp42iso2avc1mp41 free6dÊmdat ôÿÿðÜEé½æÙH·–,Ø Ù#îïx264 - core 164 r3065 ae03d92 - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=1 deblock=1:0:0 analyse=0x1:0x111 me=hex subme=2 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadz
one=21,11 fast_pskip=1 chroma_qp_offset=0 threads=18 lookahead_threads=5 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=10 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin
=0 qpmax=69 qpstep=4 vbv_maxrate=14000 vbv_bufsize=14000 crf_max=0.0 nal_hrd=none filler=0 ip_ratio=1.40 aq=1:1.00

I couldn't replicate your success viewing the info with Windows Explorer. The field is invisible for me, even though I can view it with MediaInfo and similar tools.

Obtaining the "Where from" metadata of a file on Mac

TL;DR: Get an extended attribute such as MacOS's "Where from" by, for example, running pip install pyxattr and calling xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms").

Extended Attributes on files

Extended file attributes, such as the "Where from" field in MacOS (since 10.4), store metadata that the filesystem does not interpret. They exist on several operating systems.

using the command-line

You can also query them on the command-line with tools like:

  • exiftool:
exiftool -MDItemWhereFroms -MDItemTitle -MDItemAuthors -MDItemDownloadedDate /path/to/file
  • xattr (apparently MacOS also uses a Python-script)
xattr -p -l -x /path/to/file

On MacOS many attribute values are stored in property-list format, so use the -x option to get hexadecimal output.

using Python

Ture Pålsson pointed out the missing keywords. Such common and appropriate terms are helpful when searching the Python Package Index (PyPI).

Searching PyPI for keywords such as extended file attributes and metadata finds:

  • xattr
  • pyxattr
  • osxmetadata, requires Python 3.7+, MacOS only

For example, to list and get attributes, use the following (adapted from pyxattr's official docs):

import xattr

xattr.listxattr("file.pdf")
# ['user.mime_type', 'com.apple.metadata:kMDItemWhereFroms']
xattr.getxattr("file.pdf", "user.mime_type")
# 'text/plain'
xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms")
# raw plist bytes; once decoded: ['https://example.com/downloads/file.pdf']

However, you will have to convert the MacOS-specific metadata, which is stored in plist format, e.g. using plistlib.
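
A minimal sketch of that conversion with the standard-library plistlib follows. It assumes the attribute value arrives as plist-encoded bytes, as a getxattr call would return for com.apple.metadata:kMDItemWhereFroms. The sample bytes are built locally so the sketch runs on any platform, and decode_where_froms is a hypothetical helper name.

```python
import plistlib


def decode_where_froms(raw):
    """Decode a plist-encoded attribute value into Python objects."""
    return plistlib.loads(raw)  # binary or XML plist format is auto-detected


# Stand-in for the bytes returned by
# xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms")
raw = plistlib.dumps(
    ["https://example.com/downloads/file.pdf"],
    fmt=plistlib.FMT_BINARY,
)

urls = decode_where_froms(raw)
print(urls)
```

On a real file, pass the bytes read from the extended attribute to decode_where_froms instead of the locally built sample.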

File metadata on MacOS

Mac OS X 10.4 (Tiger) introduced Spotlight, a system for extracting (or harvesting), storing, indexing, and querying metadata. It provides an integrated system-wide service for searching and indexing.

This metadata is stored as extended file attributes having keys prefixed with com.apple.metadata:. The "Where from" attribute for example has the key com.apple.metadata:kMDItemWhereFroms.

using Python

Use osxmetadata to get functionality similar to MacOS's md* utilities:

from osxmetadata import OSXMetaData

filename = 'file.pdf'
meta = OSXMetaData(filename)

# get and print "Where from" list, downloaded date, title
print(meta.wherefroms, meta.downloadeddate, meta.title)

See also

  • MacIssues (2014): How to look up file metadata in OS X
  • OSXDaily (2018): How to View & Remove Extended Attributes from a File on Mac OS
  • Ask Different: filesystem - What all file metadata is available in macOS?
  • Query Spotlight for a range of dates via PyObjC
  • Mac OS X : add a custom meta data field to any file
