Select the features that Kerberos provides.
1. A central KDC server authenticates users before they can access cluster resources.
2. Root access to the cluster for the users hdfs and mapred, but non-root access for clients.
3. Encryption of data in transit between the Mappers and Reducers.
4. Users are authenticated on all remote procedure calls (RPCs).
5. Data is encrypted before it is stored on disk.
1. 1,4
2. 2,4
3. 2,3
4. 4,5
5. 1,5
Ans : 1
Exp : Kerberos principals for Hadoop daemons and users: Kerberos principals are
required to run the Hadoop service daemons in secure mode. Each service reads
its authentication information from a keytab file that has appropriate
permissions. HTTP web consoles should be served by a principal different from
the one used for RPC.
Kerberos is a security system that provides authentication to the Hadoop
cluster. It has nothing to do with data encryption, nor does it control login
to the cluster nodes themselves. That rules out options 3 and 5, which describe
encryption. Instead, Kerberos ensures that only authorized users can access the
cluster via HDFS and MapReduce. (It also controls other access, such as via
Hive and Sqoop.)
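The "appropriate permission" requirement for keytab files can be illustrated with a small check. This is only a sketch: it assumes that "appropriate" means no group or other access bits are set, and the helper name keytab_is_private is hypothetical, not part of Hadoop or Kerberos. It is written for POSIX file permissions.

```python
import os
import stat
import tempfile


def keytab_is_private(path):
    """Return True if no group/other permission bits are set on the file.

    Assumption: a keytab should be accessible by its owning service user only.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # A throwaway file stands in for a real keytab in this demonstration.
    fd, demo = tempfile.mkstemp(suffix=".keytab")
    os.close(fd)
    try:
        os.chmod(demo, 0o400)   # owner read-only
        print("0400 private?", keytab_is_private(demo))
        os.chmod(demo, 0o644)   # readable by everyone
        print("0644 private?", keytab_is_private(demo))
    finally:
        os.remove(demo)
```

A check like this could be run against each service's keytab before starting the daemons; the actual paths depend on the cluster setup and are not given here.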
Hadoop supports strong authentication using the
Kerberos protocol. Kerberos was developed by a team at MIT to provide strong
authentication of clients to a server and is well-known to many enterprises.
When operating in secure mode, all clients must provide a valid Kerberos ticket
that can be verified by the server. In addition to clients being authenticated,
daemons are also verified. In the case of HDFS, for instance, a datanode is not
permitted to connect to the namenode unless it provides a valid ticket within
each RPC. All of this amounts to an environment where every daemon and client
application can be cryptographically verified as a known entity prior to
allowing any operations to be performed, a desirable feature of any data
storage and processing system.
The idea is to configure Cloudera Manager and CDH to talk to the KDC which was
set up specifically for Hadoop and then create a unidirectional cross-realm
trust between this KDC and the (production) Active Directory or KDC.
Kerberos integration using a one-way cross-realm trust is the solution
recommended by Cloudera. Why? Because the Hadoop services themselves also need
Kerberos principals and tickets once Kerberos is enabled. On a large enough
cluster this means a considerable number of ticket requests and of additional
users on the Kerberos server (one per service, i.e. hdfs, mapred, etc., per
server). Without a one-way cross-realm trust, all these users would end up in
the production Active Directory or KDC server. With it, the Kerberos server
dedicated to the Hadoop cluster holds all the Hadoop-related users (mostly
services and hosts) and relies on the production server only for the actual
users of the cluster. It also acts as a shield: it absorbs all the ticket
requests from the Hadoop services and hosts, and the production Active
Directory or KDC server is contacted only when a 'real' user wants to connect.
To see why Cloudera recommends setting up a KDC with one-way cross-realm trust,
read the Integrating Hadoop Security with Active Directory site.
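To get a feel for how quickly the "one per service per server" principals add up, the count can be sketched as follows. The host names and the realm HADOOP.EXAMPLE.COM are made up for illustration. The service/host@REALM form is the usual Kerberos naming convention for service principals.

```python
def service_principals(services, hosts, realm):
    """Build one principal per (service, host) pair as service/host@REALM."""
    return [f"{svc}/{host}@{realm}" for host in hosts for svc in services]


# Hypothetical cluster: two services on three nodes.
services = ["hdfs", "mapred"]
hosts = [f"node{i}.hadoop.example.com" for i in range(1, 4)]
principals = service_principals(services, hosts, "HADOOP.EXAMPLE.COM")

for p in principals:
    print(p)
print(len(principals), "service principals for", len(hosts), "hosts")
```

The number of principals grows as services times hosts, which is why keeping them in a KDC dedicated to the cluster keeps the production directory uncluttered.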