How to Add a User to the HDFS Supergroup in Linux

How do permission groups work in HDFS by default? Why do all users' files belong to supergroup?

Well, long story short, it looks like security was disabled after all. I just didn't know that server-side services do not use /etc/hadoop/conf; each has its own configs inside /var/run/cloudera-scm-agent/process/_process-name/. These can also be seen in the CM UI, e.g. CM -> HDFS -> Instances -> NameNode -> Processes -> hdfs-site.xml.
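The per-process configuration directories can be inspected directly on a host running the Cloudera agent. A minimal sketch, assuming the default agent path; the directory glob is illustrative, since actual names are prefixed with a process number:

```shell
# List the per-process config directories generated by the Cloudera agent
ls /var/run/cloudera-scm-agent/process/
# Show the hdfs-site.xml a NameNode process is actually using
# (directory pattern is an assumption; names look like <N>-hdfs-NAMENODE)
cat /var/run/cloudera-scm-agent/process/*-NAMENODE/hdfs-site.xml
```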

http://community.cloudera.com/t5/Storage-Random-Access-HDFS/HDFS-default-permissioning-workes-weird-CDH5-1/m-p/24137#U24137

Difference between Superuser and supergroup in Hadoop

Superuser

Based on the Hadoop official documentation:

The super-user is the user with the same identity as the name node process itself. Loosely, if you started the name node, then you are the super-user. The super-user can do anything in that permissions checks never fail for the super-user.

Supergroup

Supergroup is the group of superusers. This group is used to ensure that the Hadoop client has superuser access. It can be configured using the dfs.permissions.superusergroup property in the hdfs-site.xml file.
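Putting this together, adding a user to the supergroup comes down to adding them to the Linux group named by dfs.permissions.superusergroup on the NameNode host, since that is where HDFS resolves group membership. A sketch, where "alice" is a hypothetical user and the group name is the default:

```shell
# Create the group named by dfs.permissions.superusergroup (default: supergroup)
groupadd supergroup
# Add an existing user to it ("alice" is a placeholder)
usermod -aG supergroup alice
# Verify how HDFS sees the user's groups (run on/against the NameNode)
hdfs groups alice
```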


References

  • Hadoop superuser and supergroup

Hadoop: Pseudo Distributed mode for multiple users

Adding a dedicated Hadoop system user

We will use a dedicated Hadoop user account for running Hadoop. While that is not required, it is recommended because it helps to separate the Hadoop installation from other software applications and user accounts running on the same machine (think: security, permissions, backups, etc.).

#addgroup hadoop
#adduser --ingroup hadoop hadoop1
#adduser --ingroup hadoop hadoop2

This will add the users hadoop1 and hadoop2 and the group hadoop to your local machine.
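You can confirm that the accounts were created with the intended primary group (assuming the adduser commands above succeeded):

```shell
# Show uid/gid for the new account; the gid should map to the hadoop group
id hadoop1
# List the members of the hadoop group
getent group hadoop
```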

Change the ownership of your Hadoop installation directory

chown -R hadoop1:hadoop hadoop

And lastly, change the Hadoop temporary directory permissions.

If your temp directory is /app/hadoop/tmp

#mkdir -p /app/hadoop/tmp
#chown hadoop1:hadoop /app/hadoop/tmp

And if you want to tighten up security, change the permissions from 755 to 750:

#chmod 750 /app/hadoop/tmp
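The effect of tightening the mode can be checked with stat. This sketch uses a scratch path under /tmp so it runs without root; substitute your real temp directory:

```shell
# Create a stand-in temp directory and restrict it to owner (rwx) and group (r-x)
mkdir -p /tmp/demo-hadoop-tmp
chmod 750 /tmp/demo-hadoop-tmp
# Print the octal mode to confirm
stat -c '%a' /tmp/demo-hadoop-tmp
```

With 750, users outside the hadoop group cannot list or enter the directory at all.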

Adding ec2-user to use hadoop

SSH in as hadoop@(publicIP) for Amazon EMR.

From there you can do anything you like with HDFS without having to "su." I just did an mkdir and ran distcp and a streaming job. I do everything as hadoop@, as per the EMR instructions.
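A typical session after logging in as the hadoop user might look like this; the bucket and path names are placeholders, not values from the original post:

```shell
# Create a working directory in HDFS (no su needed as the hadoop user)
hdfs dfs -mkdir -p /user/hadoop/work
# Copy data in from S3 with distcp (bucket/prefix are hypothetical)
hadoop distcp s3://example-bucket/input hdfs:///user/hadoop/work/input
```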


