Technical Blog

All the latest technical and engineering news from the world of Guavus

Authorization of Services using Knox, Ranger and LDAP on Hadoop Cluster

by Harsh Takkar, Technical Lead, Application Engineering, Guavus, Inc.

What is Apache Knox?

Apache Knox is a reverse proxy application gateway for the REST services in the Hadoop ecosystem. It provides a single point of authentication and pluggable policy enforcement for services running in the Apache Hadoop ecosystem.

Authorization of a resource or service is a basic requirement for every product. If you are building REST APIs or a user interface (be it for a customer or for in-house use), the first challenge that arises is how to authenticate and authorize the users. One approach is to build your own authentication and authorization. The other, simpler approach is to use a pre-existing tool for both.

Apache Knox, along with Apache Ranger, can be used to authenticate and authorize the users.

The above image depicts the integration of a pre-existing API (for example, My-Service) with Apache Knox, which authenticates the user against LDAP/AD and authorizes the user through Apache Ranger.

The above image is a sequence diagram depicting a request through Knox.

Before starting, ensure that the following requirements are fulfilled:

  • The Apache Hadoop cluster must be running with Apache Knox and Apache Ranger.
  • A REST API to access a resource must be present. If you do not have one, use Python’s simple HTTP server, which can be started with the following command (on Python 3, the equivalent is python3 -m http.server <port_number>):

python -m SimpleHTTPServer <port_number>

Let us assume you have an HTTP service running at a URL (for example: http://mycluster:9097/myservice).
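
If you would rather have a backend that actually answers under /myservice, a minimal sketch using only the Python 3 standard library could look like the following. The handler name, the response text and the helper function are illustrative assumptions; only the port (9097) and the path (/myservice) come from the example URL above.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class MyServiceHandler(BaseHTTPRequestHandler):
    """Answers GET requests under /myservice and returns 404 for anything else."""

    def do_GET(self):
        under_service = self.path == "/myservice" or self.path.startswith(
            ("/myservice/", "/myservice?")
        )
        if under_service:
            status, body = 200, ("hello from " + self.path).encode("utf-8")
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def start_server(host="0.0.0.0", port=9097):
    """Start the service in a background daemon thread and return the server."""
    server = HTTPServer((host, port), MyServiceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Because the server runs in a daemon thread, a standalone script has to keep the main thread alive after calling start_server(), for example by waiting on input() or on a threading.Event.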

Adding a Service to Knox

Let us first add a service that will be proxied through Apache Knox. We must prepare two files, as follows:

  • service.xml: This file contains the routes (paths) provided by the service and the rewrite rule bindings for those paths.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<service role="MY-SERVICE" name="myservice" version="0.0.1">
<routes>
<route path="/myservice">
<rewrite apply="MY-SERVICE/myservice/inbound/root" to="request.url"/>
</route>
<route path="/myservice/**">
<rewrite apply="MY-SERVICE/myservice/inbound/path" to="request.url"/>
</route>
<route path="/myservice/**?**">
<rewrite apply="MY-SERVICE/myservice/inbound/query" to="request.url"/>
</route>
</routes>
<dispatch classname="org.apache.hadoop.gateway.dispatch.PassAllHeadersDispatch"/>
</service>
  • rewrite.xml: This file contains the rewrite rules for the service.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<rules>
<rule dir="IN" name="MY-SERVICE/myservice/inbound/root" pattern="*://*:*/**/myservice">
<rewrite template="{$serviceUrl[MY-SERVICE]}"/>
</rule>
<rule dir="IN" name="MY-SERVICE/myservice/inbound/path" pattern="*://*:*/**/myservice/{**}">
<rewrite template="{$serviceUrl[MY-SERVICE]}/{**}"/>
</rule>
<rule dir="IN" name="MY-SERVICE/myservice/inbound/query" pattern="*://*:*/**/myservice/{**}?{**}">
<rewrite template="{$serviceUrl[MY-SERVICE]}/{**}?{**}"/>
</rule>
</rules>

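To get a feel for what the three inbound rules do, the following Python sketch models them with regular expressions. It is a simplified illustration, not Knox's actual template matcher: the regular expressions are an approximation of the patterns above, the gateway host name in the usage note is hypothetical, and SERVICE_URL stands in for the value that $serviceUrl[MY-SERVICE] resolves to.

```python
import re

# Stand-in for $serviceUrl[MY-SERVICE]; value taken from the article's example.
SERVICE_URL = "http://mycluster:9097/myservice"

# Approximation of "*://*:*/**/myservice": scheme, host, port, any leading
# path segments, then the literal "myservice".
_PREFIX = r"^[a-z][a-z0-9+.-]*://[^/:?]+:\d+/(?:[^/?]+/)*myservice"

# (rule name, approximate pattern, rewrite template), in the order of rewrite.xml.
RULES = [
    ("MY-SERVICE/myservice/inbound/root",
     re.compile(_PREFIX + r"$"), "{url}"),
    ("MY-SERVICE/myservice/inbound/path",
     re.compile(_PREFIX + r"/(?P<path>[^?]+)$"), "{url}/{path}"),
    ("MY-SERVICE/myservice/inbound/query",
     re.compile(_PREFIX + r"/(?P<path>[^?]+)\?(?P<query>.*)$"), "{url}/{path}?{query}"),
]


def rewrite_inbound(gateway_url, service_url=SERVICE_URL):
    """Return (rule name, backend URL) for the first matching rule, else None."""
    for name, pattern, template in RULES:
        match = pattern.match(gateway_url)
        if match:
            return name, template.format(url=service_url, **match.groupdict())
    return None
```

For example, rewrite_inbound("https://knox.example.com:8443/gateway/default/myservice/reports?day=1") selects the query rule in this model and yields http://mycluster:9097/myservice/reports?day=1.
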
Copy the above files to the following location on the server where Knox is installed: /usr/hdp/current/knox-server/data/services/my-service/0.0.1/

Services
…Service name
……Version
………service.xml
………rewrite.xml
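
As a rough sketch of the copy step, the helper below creates the service name and version directories and places both files in them. The function name and its parameters are hypothetical; the services directory to pass in is the one shown above (/usr/hdp/current/knox-server/data/services), and the script needs write permission on it.

```python
import shutil
from pathlib import Path


def install_service_definition(service_xml, rewrite_xml, services_dir,
                               name="my-service", version="0.0.1"):
    """Copy both definition files into <services_dir>/<name>/<version>/."""
    target = Path(services_dir) / name / version
    # Create the service name and version directories if they are missing.
    target.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(service_xml, target / "service.xml")
    shutil.copyfile(rewrite_xml, target / "rewrite.xml")
    return target
```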

For more details, refer to: https://knox.apache.org/books/knox-1-3-0/dev-guide.html#Service+Definition+Directory+Structure

In rewrite.xml, you may have noticed the variable $serviceUrl[MY-SERVICE]. Knox resolves it from the service entry in the topology file, so let us define it there:

<service>
<role>MY-SERVICE</role>
<url>http://mycluster:9097/myservice</url>
</service>

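A quick way to confirm that a topology declares the role and URL you expect is to parse it. The sketch below assumes you have the topology XML as a string (for example, read from the file Knox deploys); the sample document and the function name are illustrative only.

```python
import xml.etree.ElementTree as ET

# Illustrative topology containing only the service entry from this article.
SAMPLE_TOPOLOGY = """
<topology>
  <gateway/>
  <service>
    <role>MY-SERVICE</role>
    <url>http://mycluster:9097/myservice</url>
  </service>
</topology>
"""


def service_urls(topology_xml):
    """Map each service role in a topology document to its list of URLs."""
    root = ET.fromstring(topology_xml)
    return {
        service.findtext("role"): [url.text for url in service.findall("url")]
        for service in root.iter("service")
    }
```

If service_urls(...) has no MY-SERVICE key, or its URL list is empty, there is nothing for $serviceUrl[MY-SERVICE] to resolve to.
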
Authentication and Authorization of Service using Knox and Ranger

  • Enable Knox authorization using Ranger:

In the Advanced topology file, add the following under the <gateway> tag:

<provider>
<role>authorization</role>
<name>XASecurePDPKnox</name>
<enabled>true</enabled>
</provider>
  • Add LDAP parameters to enable user and group search:

In the Advanced topology file, add or update the following parameters under the provider with the role “authentication”, adjusting the values to match your LDAP server configuration. They enable LDAP user and group lookup.
With this configuration, Knox authenticates the user against LDAP before checking authorization with Ranger.

<param>
<name>main.ldapRealm</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
</param>
<param>
<name>main.ldapContextFactory</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value>
</param>
<param>
<name>main.ldapRealm.contextFactory</name>
<value>$ldapContextFactory</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.url</name>
<value>ldap://[ldap ip : port]</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.authenticationMechanism</name>
<value>simple</value>
</param>
<param>
<name>urls./**</name>
<value>authcBasic</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.systemUsername</name>
<value>cn=ldapadm,dc=lab,dc=openldap,dc=com</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.systemPassword</name>
<value>[ldap system user password]</value>
</param>
<param>
<name>main.ldapRealm.memberAttributeValueTemplate</name>
<value>uid={0},ou=People,dc=lab,dc=openldap,dc=com</value>
</param>
<param>
<name>main.ldapRealm.searchBase</name>
<value>dc=lab,dc=openldap,dc=com</value>
</param>
<param>
<name>main.ldapRealm.userObjectClass</name>
<value>posixAccount</value>
</param>
<param>
<name>main.ldapRealm.userSearchAttributeName</name>
<value>cn</value>
</param>
<param>
<name>main.ldapRealm.authorizationEnabled</name>
<value>true</value>
</param>
<param>
<name>main.ldapRealm.groupSearchBase</name>
<value>ou=Group,dc=lab,dc=openldap,dc=com</value>
</param>
<param>
<name>main.ldapRealm.groupObjectClass</name>
<value>posixGroup</value>
</param>
<param>
<name>main.ldapRealm.groupIdAttribute</name>
<value>cn</value>
</param>
<param>
<name>main.ldapRealm.memberAttribute</name>
<value>memberUid</value>
</param>

For configuration related to AD and a description of each property, refer to the following link: https://developer.ibm.com/hadoop/2016/08/03/ldap-integration-with-apache-knox/
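
The {0} placeholder in main.ldapRealm.memberAttributeValueTemplate stands for the login name. The following sketch shows that substitution in isolation, assuming a plain string replacement; the function name is hypothetical, and the escaping covers only the common DN special characters (not leading or trailing spaces or a leading #).

```python
# Characters escaped with a backslash inside a DN attribute value (partial list).
_DN_SPECIALS = ',+"\\<>;'


def expand_member_template(template, username):
    """Replace {0} in the template with the backslash-escaped user name."""
    escaped = "".join("\\" + ch if ch in _DN_SPECIALS else ch for ch in username)
    return template.replace("{0}", escaped)


if __name__ == "__main__":
    template = "uid={0},ou=People,dc=lab,dc=openldap,dc=com"
    print(expand_member_template(template, "alice"))
```

Comparing the expanded value with what your directory actually stores in the group's member attribute is a useful sanity check when group-based Ranger policies do not match.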

  • Adding Ranger Policies:

Open the Ranger UI, click Knox policies, and add a new policy as follows:

Policy Name: my-service
Knox Topology: default
Knox Service: MY-SERVICE
Under the Allow Conditions section, select the group or user to whom you want to give access to your service.
Select the option “Allow” under Permissions.

You are all set now. Access your service using the following URL; it will prompt you for authentication.
https://<knox fqdn or ip address>:8443/gateway/default/myservice
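
To call the service from a script instead of a browser, the request needs an HTTP Basic Authorization header, since the topology maps all URLs to authcBasic. Here is a minimal sketch using the Python standard library; the function name, host and credentials are placeholders, while the port (8443) and the gateway/default path come from the URL above.

```python
import base64
import urllib.request


def build_gateway_request(host, username, password, path="myservice",
                          topology="default", port=8443):
    """Build a request for the Knox gateway URL with a Basic auth header."""
    url = "https://{}:{}/gateway/{}/{}".format(host, port, topology, path)
    token = base64.b64encode(
        "{}:{}".format(username, password).encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request
```

Passing the result to urllib.request.urlopen() sends the call. If the gateway uses a self-signed certificate, you will also need to supply an ssl context that trusts it. Wrong credentials would normally produce a 401 response, and a user without a matching Ranger policy would typically be refused with a 403.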

References

Apache Knox 1.3.0
