View contents of a file in HDFS (Hadoop)
I believe hadoop fs -cat <file>
should do the job.
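For instance, to peek at just the beginning of a large file you can pipe the output through head (the path below is made up for illustration):
# print only the first 20 lines of the file
hadoop fs -cat /user/hadoop/logs/part-00000 | head -n 20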
How to copy a file from HDFS to the local file system
bin/hadoop fs -get /hdfs/source/path /localfs/destination/path
bin/hadoop fs -copyToLocal /hdfs/source/path /localfs/destination/path
- Point your web browser to the HDFS Web UI (namenode_machine:50070), browse to the file you intend to copy, scroll down the page and click on download the file.
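If you would rather script that browser download, and assuming WebHDFS is enabled on the cluster, roughly the same thing can be done with curl (host, port and paths are placeholders; depending on the security setup you may also need to append &user.name=<your_user>):
# the OPEN operation redirects to the DataNode that actually serves the data, hence -L
curl -L "http://namenode_machine:50070/webhdfs/v1/hdfs/source/path?op=OPEN" -o /localfs/destination/path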
View contents of a file in HDFS
The only way to see the content of a file is hadoop fs -cat /path/to/your/file. You have to provide the path to a file, not to a folder. It looks like you used hadoop fs -cat /tej/, which will not work.
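As a concrete illustration (the file name here is hypothetical), first list the directory to see what it contains, then cat a specific file inside it:
hadoop fs -ls /tej/
hadoop fs -cat /tej/part-00000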
How to navigate directories in Hadoop HDFS
There is no cd (change directory) command in the HDFS file system. You can only list directories and use their paths to reach the next level.
You have to navigate manually by providing the complete path to the ls command.
hdfs dfs -ls /user/username/app1/subdir/
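If you do not know the full path in advance, a recursive listing can stand in for walking the tree one directory at a time (the path is just an example):
hdfs dfs -ls -R /user/username/app1/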
How files or directories are stored in Hadoop HDFS
HDFS is a distributed storage system: the storage location is virtual and is built from the disk space of all the DataNodes. While installing Hadoop, you must have specified paths for dfs.namenode.name.dir and dfs.datanode.data.dir. These are the locations at which all HDFS-related files are stored on the individual nodes.
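If you do not remember what those properties were set to, you can read the effective values straight from the configuration; this is a simple check, assuming the hdfs client is on your PATH:
hdfs getconf -confKey dfs.namenode.name.dir
hdfs getconf -confKey dfs.datanode.data.dir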
While storing data on HDFS, it is stored as blocks of a specified size (default 128 MB in Hadoop 2.x). When you use hdfs dfs commands you will see complete files, but internally HDFS stores these files as blocks. If you check the above-mentioned paths on your local file system, you will see a bunch of files which correspond to the files on your HDFS. But again, you will not see them as actual files, since they are split into blocks.
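To see which block size (and replication factor) a particular file was actually written with, hadoop fs -stat can be used; the path below is only an illustration:
# %o = block size in bytes, %r = replication factor, %b = file length in bytes
hadoop fs -stat "%o %r %b" /user/hadoop/logs/part-00000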
Check the output of the command below to get more details on how much space from each DataNode is used to build the virtual HDFS storage.
hdfs dfsadmin -report
#Or
sudo -u hdfs hdfs dfsadmin -report
HTH
URI to access a file in HDFS
The default port is 8020.
You can access "hdfs" paths in 3 different ways.
Simply use "/" as the root path
For example:
E:\HadoopTests\target>hadoop fs -ls /
Found 6 items
drwxrwxrwt - hadoop hdfs 0 2015-08-17 18:43 /app-logs
drwxr-xr-x - mballur hdfs 0 2015-11-24 15:36 /tmp
drwxrwxr-x - mballur hdfs 0 2015-10-20 15:27 /user
Use "hdfs:///"
For example:
E:\HadoopTests\target>hadoop fs -ls hdfs:///
Found 6 items
drwxrwxrwt - hadoop hdfs 0 2015-08-17 18:43 hdfs:///app-logs
drwxr-xr-x - mballur hdfs 0 2015-11-24 15:36 hdfs:///tmp
drwxrwxr-x - mballur hdfs 0 2015-10-20 15:27 hdfs:///user
Use "hdfs://{NameNodeHost}:8020/"
For example:
E:\HadoopTests\target>hadoop fs -ls hdfs://MBALLUR:8020/
Found 6 items
drwxrwxrwt - hadoop hdfs 0 2015-08-17 18:43 hdfs://MBALLUR:8020/app-logs
drwxr-xr-x - mballur hdfs 0 2015-11-24 15:36 hdfs://MBALLUR:8020/tmp
drwxrwxr-x - mballur hdfs 0 2015-10-20 15:27 hdfs://MBALLUR:8020/user
In this case, "MBALLUR" is the name of my NameNode host.
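The host and port that the short forms fall back to come from the fs.defaultFS property (fs.default.name on older releases) in core-site.xml. If you are not sure what your cluster uses, you can print it:
hdfs getconf -confKey fs.defaultFS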
Find a file in the Hadoop filesystem
If you are looking for an equivalent of the Linux locate command, no such option exists in Hadoop. But if you are looking for a way to find a specific file, you can use the -name parameter of the fs -find command:
hadoop fs -find /some_directory -name some_file_name
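On older Hadoop releases where fs -find is not available, a recursive listing piped through grep is a common workaround (directory and file name are placeholders):
hadoop fs -ls -R /some_directory | grep some_file_name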
If you are looking for the actual location of an HDFS file on your local file system, you can use the fsck command:
hdfs fsck /some_directory/some_file_name -files -blocks -locations