How Does Linux Recognize a File as a Certain File Type, and How to Programmatically Change It

How do I programmatically change file permissions?

Full control over file attributes is available in Java 7, as part of the "new" New IO facility (NIO.2). For example, POSIX permissions can be set on an existing file with setPosixFilePermissions(), or atomically at file creation with methods like createFile() or newByteChannel().

You can create a set of permissions using EnumSet.of(), but the helper method PosixFilePermissions.fromString() uses a conventional format that will be more readable to many developers. For APIs that accept a FileAttribute, you can wrap the set of permissions with PosixFilePermissions.asFileAttribute().

Set<PosixFilePermission> ownerWritable = PosixFilePermissions.fromString("rw-r--r--");
FileAttribute<?> permissions = PosixFilePermissions.asFileAttribute(ownerWritable);
Files.createFile(path, permissions);

In earlier versions of Java, common approaches are to write native code of your own or to exec command-line utilities such as chmod.

How to read linux file permission programmatically in C/C++

The stat(2) system call fills in a struct stat that contains an st_mode member. These are the mode bits that ls -l displays.

On my system, the man 2 stat page says:

   The following flags are defined for the st_mode field:

   S_IFMT     0170000   bitmask for the file type bitfields
   S_IFSOCK   0140000   socket
   S_IFLNK    0120000   symbolic link
   S_IFREG    0100000   regular file
   S_IFBLK    0060000   block device
   S_IFDIR    0040000   directory
   S_IFCHR    0020000   character device
   S_IFIFO    0010000   FIFO
   S_ISUID    0004000   set UID bit
   S_ISGID    0002000   set-group-ID bit (see below)
   S_ISVTX    0001000   sticky bit (see below)
   S_IRWXU    00700     mask for file owner permissions
   S_IRUSR    00400     owner has read permission
   S_IWUSR    00200     owner has write permission
   S_IXUSR    00100     owner has execute permission
   S_IRWXG    00070     mask for group permissions
   S_IRGRP    00040     group has read permission
   S_IWGRP    00020     group has write permission
   S_IXGRP    00010     group has execute permission
   S_IRWXO    00007     mask for permissions for others (not in group)
   S_IROTH    00004     others have read permission
   S_IWOTH    00002     others have write permission
   S_IXOTH    00001     others have execute permission

How to check the File Type in java

You can use Guava's Files utility class and its Files.getFileExtension(String fullName) method:

System.out.println(Files.getFileExtension("C:\\fileName.txt"));

The output is:

txt

The source code is pretty simple:

public static String getFileExtension(String fullName) {
    checkNotNull(fullName);
    String fileName = new File(fullName).getName();
    int dotIndex = fileName.lastIndexOf('.');
    return (dotIndex == -1) ? "" : fileName.substring(dotIndex + 1);
}

How do I know programmatically the attributes of a file given its `Path` or `PathBuf`

You can use the Metadata object to acquire the file size:

fn main() {
    // Acquire the object from an existing File instance
    let metadata = std::fs::File::open("./demo").unwrap()
        .metadata().unwrap();

    // Or just get it directly
    let metadata = std::fs::metadata("./demo").unwrap();

    println!("Size: {}", metadata.len())
}
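On Unix, the same Metadata value also carries the raw mode bits (file type plus permissions). Here is a minimal sketch, assuming a Unix target and the same placeholder path ./demo. The helper mode_string is hypothetical, not a standard-library function, and it ignores the set-UID, set-GID and sticky bits:

```rust
use std::fs;
use std::os::unix::fs::MetadataExt; // Unix-only extension trait

/// Render a raw `st_mode` value in the style of `ls -l`.
/// Hypothetical helper: set-UID, set-GID and sticky bits are ignored.
fn mode_string(mode: u32) -> String {
    // The file type lives in the bits selected by the 0o170000 mask
    let kind = match mode & 0o170000 {
        0o140000 => 's', // socket
        0o120000 => 'l', // symbolic link
        0o100000 => '-', // regular file
        0o060000 => 'b', // block device
        0o040000 => 'd', // directory
        0o020000 => 'c', // character device
        0o010000 => 'p', // FIFO
        _ => '?',
    };
    let mut out = String::with_capacity(10);
    out.push(kind);
    // Walk the nine permission bits from owner-read (0o400) down to others-execute (0o001)
    for (i, ch) in "rwxrwxrwx".chars().enumerate() {
        let bit = 0o400 >> i;
        out.push(if mode & bit != 0 { ch } else { '-' });
    }
    out
}

fn main() {
    match fs::metadata("./demo") {
        Ok(metadata) => println!(
            "{} (is_dir: {}, size: {})",
            mode_string(metadata.mode()),
            metadata.is_dir(),
            metadata.len()
        ),
        Err(e) => eprintln!("failed to read metadata: {}", e),
    }
}
```

For a regular file with permissions 644, mode_string returns "-rw-r--r--".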

But the rest of your question is not Rust-specific. Nothing built into Rust, or into most other languages, reliably tells you what type of content a file holds.

A simple solution is to maintain a map from file extension to type, but that is easily fooled: rename a .jpg to .txt and it will report the wrong type.
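A minimal sketch of such an extension map might look like this. The function name type_from_extension and the three table entries are illustrative assumptions, not an existing API:

```rust
use std::path::Path;

/// Guess a type from the file extension alone.
/// Hypothetical helper; the table is deliberately tiny.
fn type_from_extension(path: &Path) -> Option<&'static str> {
    // No extension, or a non-UTF-8 extension, yields None
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "png" => Some("image/png"),
        "jpg" | "jpeg" => Some("image/jpeg"),
        "txt" => Some("text/plain"),
        _ => None,
    }
}

fn main() {
    for name in ["photo.jpg", "notes.txt", "archive"] {
        println!("{} -> {:?}", name, type_from_extension(Path::new(name)));
    }
}
```

Because only the name is inspected, a JPEG renamed to notes.txt is reported as text/plain, which is exactly the weakness described above.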

Another option is to read the first few bytes of the file and compare them to known magic numbers. For instance, all PNGs start with 89 50 4e 47 in hex, and JPEGs start with ff d8 ff (often followed by e0 or e1).

So you can combine analyzing headers, footers, extensions and other means to truly identify the type.
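The magic-number check can be sketched as follows. The function name sniff and the returned labels are assumptions made for illustration, only two signatures are covered, and ./demo is a placeholder path:

```rust
use std::fs::File;
use std::io::Read;

/// Guess a content type from the leading "magic" bytes of a file.
/// Hypothetical helper that only knows PNG and JPEG.
fn sniff(header: &[u8]) -> Option<&'static str> {
    // Full 8-byte PNG signature
    const PNG: &[u8] = &[0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
    // JPEG: start-of-image marker ff d8, then the ff that opens the next marker
    const JPEG: &[u8] = &[0xff, 0xd8, 0xff];

    if header.starts_with(PNG) {
        Some("image/png")
    } else if header.starts_with(JPEG) {
        Some("image/jpeg")
    } else {
        None
    }
}

fn main() {
    let file = match File::open("./demo") {
        Ok(file) => file,
        Err(e) => {
            eprintln!("failed to open file: {}", e);
            return;
        }
    };

    // Only the first few bytes are needed for these signatures
    let mut header = Vec::with_capacity(16);
    if let Err(e) = file.take(16).read_to_end(&mut header) {
        eprintln!("failed to read from file: {}", e);
        return;
    }

    match sniff(&header) {
        Some(kind) => println!("Looks like {}", kind),
        None => println!("Unknown type"),
    }
}
```

A real detector would carry a much larger signature table, which is what the crates listed below are meant to provide.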

Or you can use some already available crates. I've not used them, so I do not know how good and accurate they are:

  • https://github.com/flier/rust-mime-sniffer
  • https://docs.rs/tree_magic/0.2.3/tree_magic/

Update

If you only want to know if it's a text file, then you just need to read some bytes from it and try to interpret them as a string:

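use std::fs::File;
use std::io::Read;

fn main() {
    let file = File::open("./demo").expect("failed to open file");

    // Read at most the first 32 bytes of the file
    let mut buffer = Vec::with_capacity(32);
    file.take(32)
        .read_to_end(&mut buffer)
        .expect("failed to read from file");

    match std::str::from_utf8(&buffer) {
        Ok(_) => println!("It's a text file"),
        // A multi-byte character cut off by the 32-byte limit is not a real error
        Err(e) if e.error_len().is_none() => println!("It's a text file"),
        Err(_) => println!("It's NOT a text file"),
    }
}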

This will work for UTF-8/ASCII text files. If you want to support other encodings you have to use some additional crate that provides support for that specific code-page and basically do the same thing again.

PS: Nautilus works in the way I've explained above.

How can I programmatically change file encoding linux?

iconv will take care of that. Use it like this:

iconv -f ISO-8859-1 -t UTF-8 in.txt > out.txt

where ISO-8859-1 (Latin-1) is one of the most common 8-bit encodings, which might (or might not) be your input encoding. iconv writes to standard output, so the result has to be redirected to a file (or named with -o).

If you don't know the input charset, you can detect it with the standard file command or the Python-based chardet. For instance:

iconv -f "$(file -bi in.txt | sed -e 's/.*[ ]charset=//')" -t UTF-8 in.txt > out.txt

You may want something more robust than this one-liner, such as skipping files whose encoding cannot be determined (file then reports a charset like binary or unknown-8bit).

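From here, to iterate over multiple files, you can do something like the following. iconv should not write to the file it is still reading, so each file is converted to a temporary file that then replaces the original:

find . -iname '*.txt' -exec sh -c 'iconv -f ISO-8859-1 -t UTF-8 "$1" > "$1.tmp" && mv "$1.tmp" "$1"' _ {} \;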

I didn't check this, so you might want to google iconv and find, read about them here on SO, or simply read their man pages.


