Blog for the Open Source JavaHotel project

Monday, 23 September 2019

How to obtain an active NameNode remotely

Problem
When using the WebHDFS REST API, the client talks directly to the NameNode. In an HA (high availability) environment, only one NameNode is active and the other is standby. A request addressed to the standby NameNode is rejected, so the client must know which NameNode is active in order to construct a valid URL. But how can the active NameNode be discovered remotely and automatically, so that the client does not have to be redirected by hand after a failover?
Strange as it sounds, there is no Ambari REST API to detect the active NameNode.
One obvious solution is to access WebHDFS through the Knox Gateway which, assuming it is configured properly, forwards the query to the active NameNode.
Solution
There are two convenient methods to discover the active NameNode without the Knox Gateway: a JMX query and the hdfs haadmin command.
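To make the JMX method concrete, here is a minimal Python sketch, not the script from this post. It assumes each NameNode exposes the standard /jmx servlet on its HTTP port and that the NameNodeStatus bean carries a State field with the value "active" or "standby". The host names and port 50070 in the usage note are placeholders (newer Hadoop releases use a different default port), and the fetch parameter is a hypothetical hook added so the logic can be tried without a cluster.

```python
import json
import urllib.request
from typing import Callable, Iterable, Optional

# JMX bean that reports the HA state of a NameNode ("active" or "standby").
JMX_QUERY = "/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"


def state_from_jmx(payload: str) -> Optional[str]:
    """Return the 'State' field of the first bean that has one, else None."""
    for bean in json.loads(payload).get("beans", []):
        if "State" in bean:
            return bean["State"]
    return None


def http_get(url: str, timeout: float = 5.0) -> str:
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def find_active_namenode(
    base_urls: Iterable[str],
    fetch: Callable[[str], str] = http_get,  # hypothetical hook, handy for testing
) -> Optional[str]:
    """Return the first base URL whose NameNode reports State == 'active'."""
    for base in base_urls:
        try:
            state = state_from_jmx(fetch(base.rstrip("/") + JMX_QUERY))
        except (OSError, ValueError):
            continue  # host unreachable or unexpected response: try the next one
        if state == "active":
            return base
    return None
```

A call such as find_active_namenode(["http://nn1.example.com:50070", "http://nn2.example.com:50070"]) returns the base URL of the active NameNode, or None if neither candidate reports itself as active. A secured (Kerberos/SPNEGO) cluster would need an authenticated fetch function instead of the plain http_get.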
The solution is described here in more detail. I also added a convenient bash script that extracts the active NameNode using either method, JMX query or hdfs haadmin. The script can be easily customized. With the hdfs haadmin method, the script can be executed only inside the cluster, so a remote shell call has to be implemented.
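The hdfs haadmin variant with a remote shell call could look roughly like the Python sketch below. It is an illustration under assumptions, not the bash script mentioned above: it relies on hdfs haadmin -getServiceState printing "active" or "standby", the service IDs (nn1, nn2) and the SSH target are hypothetical placeholders, and password-less SSH to a cluster node is assumed. The runner parameter is a hypothetical hook for trying the logic without a cluster.

```python
import subprocess
from typing import Callable, Iterable, List, Optional


def haadmin_command(service_id: str, ssh_target: Optional[str] = None) -> List[str]:
    """Build the haadmin call, wrapped in ssh when a remote target is given."""
    cmd = ["hdfs", "haadmin", "-getServiceState", service_id]
    return (["ssh", ssh_target] + cmd) if ssh_target else cmd


def run(cmd: List[str]) -> str:
    """Run a command and return its standard output."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    return result.stdout


def find_active_service(
    service_ids: Iterable[str],
    ssh_target: Optional[str] = None,
    runner: Callable[[List[str]], str] = run,  # hypothetical hook, handy for testing
) -> Optional[str]:
    """Return the first NameNode service ID reported as 'active', else None."""
    for service_id in service_ids:
        try:
            output = runner(haadmin_command(service_id, ssh_target))
        except (OSError, subprocess.SubprocessError):
            continue  # ssh/hdfs missing or the call timed out: try the next ID
        if output.strip() == "active":
            return service_id
    return None
```

For example, find_active_service(["nn1", "nn2"], ssh_target="hdfs@edge.example.com") would run the command over SSH on a cluster node and return the ID of the active NameNode. The ID still has to be mapped to a host name, for instance through the dfs.namenode.http-address settings in hdfs-site.xml.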