Blog for the Open Source JavaHotel project

Tuesday, October 22, 2019

BigSQL and HDP upgrade

Problem
I spent several sleepless nights trying to resolve a really nasty problem. It appeared after upgrading from HDP 2.6.4 and BigSQL 5.0 to HDP 3.1 and BigSQL 6.0. Everything ran smoothly, and even the BigSQL Healthcheck was smiling. The only exception was the "LOAD HADOOP" command, which failed. BigSQL runs on top of Hive tables, but it is an alternative SQL engine that uses the HCatalog service to access them. To ingest data into Hive tables, it launches a separate MapReduce job.
An example command:
db2 "begin execute immediate 'load hadoop using file url ''/tmp/data_1211057166.txt'' with source properties (''field.delimiter''=''|'', ''ignore.extra.fields''=''true'') into table testuser.smoke_hadoop2_2248299375'; end"
A closer examination of the MapReduce logs brought up a more detailed error message.
Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NumberFormatException: For input string: "30s"
Good uncle Google suggests that this can be caused by an old MapReduce engine running against new configuration files. But how could that happen when everything related to HDP 2.6.4 and BigSQL 5.0 had been meticulously annihilated?
What is more, another HDP 3.1/BigSQL 6.0 installation executed the LOAD HADOOP command without any problem. Comparing the configuration of both environments did not reveal any difference.
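The error message points at a value such as "30s" being read as a plain number. As a rough sketch, the function below lists the properties in a Hadoop-style XML configuration file whose value is a number followed by a time-unit suffix. The function name is hypothetical, and it assumes that each <name> and <value> element sits on a single line. It does not tell which property the failing job actually read.

```shell
# List properties whose value is a number with a time-unit suffix (e.g. 30s).
# Reads a Hadoop-style XML configuration file on stdin.
# Assumption: every <name> and <value> element fits on one line.
find_suffixed_durations() {
  awk '
    /<name>/ {
      name = $0
      sub(/.*<name>/, "", name); sub(/<\/name>.*/, "", name)
    }
    /<value>/ {
      value = $0
      sub(/.*<value>/, "", value); sub(/<\/value>.*/, "", value)
      # Keep only values such as 30s, 500ms, 5m, 1h, 2d
      if (value ~ /^[0-9]+(ms|s|m|h|d)$/) print name "=" value
    }
  '
}
```

It could be used as find_suffixed_durations < /etc/hadoop/conf/hdfs-site.xml, where the file path is an assumption about the local layout.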
On closer examination, I discovered that the MapReduce job launched by LOAD HADOOP was running in the HDP 2.6.4 environment, including legacy jar files. Pay attention to the 2.6.4.0-91 value of the hdp.version parameter.
exec /bin/bash -c "$JAVA_HOME/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.6.4.0-91 -Xmx545m 
Also, the corresponding local cache seemed to be populated with the old jar files.
ll /data/hadoop/yarn/local/filecache/14/mapreduce.tar.gz/hadoop/
drwxr-xr-x. 2 yarn hadoop 4096 Jan 4 2018 bin
drwxr-xr-x. 3 yarn hadoop 4096 Jan 4 2018 etc
drwxr-xr-x. 2 yarn hadoop 4096 Jan 4 2018 include
drwxr-xr-x. 3 yarn hadoop 4096 Jan 4 2018 lib
drwxr-xr-x. 2 yarn hadoop 4096 Jan 4 2018 libexec

-r-xr-xr-x. 1 yarn hadoop 87303 Jan 4 2018 LICENSE.txt
-r-xr-xr-x. 1 yarn hadoop 15753 Jan 4 2018 NOTICE.txt
-r-xr-xr-x. 1 yarn hadoop 1366 Jan 4 2018 README.txt
drwxr-xr-x. 2 yarn hadoop 4096 Jan 4 2018 sbin
drwxr-xr-x. 4 yarn hadoop 4096 Jan 4 2018 share

But how was that possible when all remnants of the old HDP had been wiped out and there was no sign of any reference to 2.6.4? I even ran the grep command against every directory suspected of retaining this nefarious mark.
grep -R '2\.6\.4' /etc/hadoop
Solution
What cracked the nut turned out to be the BigSQL/DB2 db2set command.
db2set
DB2_BIGSQL_JVM_STARTARGS=-Dhdp.version=3.1.0.0-78 -Dlog4j.configuration=file:///usr/ibmpacks/bigsql/6.0.0.0/bigsql/conf/log4j.properties -Dbigsql.logid.prefix=BSL-${DB2NODE}
DB2_DEFERRED_PREPARE_SEMANTICS=YES
DB2_ATS_ENABLE=YES
DB2_COMPATIBILITY_VECTOR=40B
DB2RSHTIMEOUT=60
DB2RSHCMD=/usr/bin/ssh
DB2FODC=CORESHM=OFF
DB2_JVM_STARTARGS=-Xnocompressedrefs -Dhdp.version=2.6.4.0-91 -Dlog4j.configuration=file:///usr/ibmpacks/bigsql/5.0.4.0/bigsql/conf/log4j.properties -Dbigsql.logid.prefix=BSL-${DB2NODE}
DB2_EXTENDED_OPTIMIZATION=BI_INFER_CC ON
DB2COMM=TCPIP
DB2AUTOSTART=NO

Apparently, DB2_JVM_STARTARGS took precedence over DB2_BIGSQL_JVM_STARTARGS, and that was why the old MapReduce framework was resurrected. The legacy jar files were downloaded from the HDFS /hdp/apps directory.
hdfs dfs -ls /hdp/apps
Found 2 items
drwxr-xr-x   - hdfs hdfs          0 2019-10-07 22:09 /hdp/apps/2.6.4.0-91
drwxr-xr-x   - hdfs hdfs          0 2019-10-12 00:16 /hdp/apps/3.1.0.0-78
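A listing like the one above can be checked automatically. The sketch below prints every directory under /hdp/apps whose name differs from the expected HDP version. The function name is hypothetical, and it assumes the path is the last column of the hdfs dfs -ls output, as in the listing.

```shell
# Print every /hdp/apps/<version> directory that does not match the expected
# HDP version. Reads the output of `hdfs dfs -ls /hdp/apps` on stdin.
stale_hdp_app_dirs() {
  local expected="$1"
  awk -v expected="$expected" '
    # Only lines whose last field is a path under /hdp/apps/ are considered,
    # which skips the "Found N items" header.
    $NF ~ /^\/hdp\/apps\// {
      n = split($NF, parts, "/")
      if (parts[n] != expected) print $NF
    }
  '
}
```

For example: hdfs dfs -ls /hdp/apps | stale_hdp_app_dirs 3.1.0.0-78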

The problem was solved with a single command that unsets the offending DB2_JVM_STARTARGS variable, followed by a BigSQL restart for the change to take effect.
db2set DB2_JVM_STARTARGS=
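To confirm that no registry variable still carries a stale version, the db2set output can be scanned for -Dhdp.version. This is only a sketch: the function name is hypothetical, and it assumes the VARIABLE=value line format shown in the db2set listing earlier.

```shell
# Print "<VARIABLE> <hdp.version>" for each line of db2set output (read from
# stdin) that passes -Dhdp.version to a JVM.
hdp_versions_in_db2set() {
  sed -n 's/^\([A-Za-z0-9_]*\)=.*-Dhdp\.version=\([^ ]*\).*$/\1 \2/p'
}
```

Running db2set | hdp_versions_in_db2set should then report a single version. Two different versions, as in the original listing, indicate a leftover setting.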
I also removed the /hdp/apps/2.6.4.0-91 HDFS directory to make sure that the vampire was killed for good.
