Blog do projektu Open Source JavaHotel

sobota, 30 listopada 2013

Netezza Software Emulator, Linux and IBM InfoSphere Streams

Introduction
Netezza and InfoSphere Streams are the part of IBM Big Data offering. Although both are commercial products, good news is that for educational and non-production purpose also free versions are available.
Netezza Software Emulator can be downloaded here.
Non-production InfoSphere Streams package is available here. Although fresh installation of IBM InfoSphere Streams is not complicated it is a good idea to start with ready to use VMware image.
Netezza Software Emulator
Netezza in production requires dedicated hardware but for developing and discovering one can play with VMware machines (Host and SPU) behaving exactly like the real Netezza database. The Netezza Emulator is available only for Windows but it is possible to execute it also under Linux. But keep in mind that IBM supports only Windows version so in case of any problems you are left on your own.
Netezza Software Emulator for Linux
Download and execute INSEfDsetup.exe (it requires at least 13GB free disk space). After downloading transfer unpacked VMware machines (c:\Program Files\IBM\Netezza Software Emulator for Developer\VMS) to the Linux machine. Before starting them create additional vmnet3 network.

This network connects Host and SPU machine. Important: switch off DHCP and set IP address as 10.0.0.1.
Starting Netezza Emulator
Firstly start Host machine. After several minutes log in as nz user (standard password: nz) and execute nzstate. If output is like:
[nz@netezza ~]$ nzstate
System state is 'Discovering'.
[nz@netezza ~]$ nzhw
Description HW ID Location   Role   State
----------- ----- ---------- ------ -------
SPA         1001  spa1       None   Ok
Disk        1002  spa1.disk1 Active None
Disk        1003  spa1.disk2 Active None
Disk        1004  spa1.disk3 Active None
Disk        1005  spa1.disk4 Active None
Disk        1006  spa1.disk5 Spare  None
SPU         1007  spa1.spu1  Active Booting
[nz@netezza ~]$ 

In case of any problem simply launch nzstart command. Next step is to start SPU machine. The SPU machine should connect to Host and start networked booting. Do not bother about a good number of technical messages flying through the screen, it is as expected and does not mean that something wrong is happening. After some time execute again nzstate command in the Host machine. If the screen is like:
[nz@netezza ~]$ nzhw
Description HW ID Location   Role   State
----------- ----- ---------- ------ ------
SPA         1001  spa1       None   Ok
Disk        1002  spa1.disk1 Active OkSQL> create table testt (numb integer,name varchar(100))
SQLRowCount returns -1
SQL> select * from testt;
+------------+-----------------------------------------------------------------------------------------------------+
| NUMB       | NAME                                                                                                |
+------------+-----------------------------------------------------------------------------------------------------+
+------------+-----------------------------------------------------------------------------------------------------+
SQLRowCount returns 0

Disk        1003  spa1.disk2 Active Ok
Disk        1004  spa1.disk3 Active Ok
Disk        1005  spa1.disk4 Active Ok
Disk        1006  spa1.disk5 Spare  Ok
SPU         1007  spa1.spu1  Active Online
[nz@netezza ~]$ nzstate
System state is 'Online'.
[nz@netezza ~]$ 
it means that Netezza is started and ready to handle the requests. Then prepare a test database, a test database and a test user.
  nzsql
  create database test;
  create user testuser with password 'secret';
  grant all admin to testuser;
  grant all on test to testuser;
Check if testuser can log in into test database and execute some basic queries.
 nzsql test -U testuser -W secret
 create table x (x int);
 insert into x values(1);
 select * from x;
 drop table x;
Obtain Netezza connection software
This software can be downloaded from this place.Important: this software is available only for IBM customers and business partners.
Install and configure ODBC connection to Netezza on RedHat 6
Installing Netezza client is very simple and straightforward. Just unpack the Netezza client package (something like nz-linuxclient-v7.0.3-P2.tar.gz) and execute two unpack scripts, one in linux directory and the second in linux64 directory. The default installation location is /usr/local/nz.
Firstly test the connection by executing nzsqlodbc utility from /usr/local/nz/bin64 directory (netez is the host name of the HOST VMmachine in my environment).
[root@oc8442647460 bin64]# ./nzodbcsql -h netez test testuser secret

NZODBCSQL - program to test Netezza ODBC driver
            NOT FOR PRODUCTION USE


     Type 'quit' or '\q' or CTRL-D or ENTER at the prompt to quit
     NOTE: Max 100 rows are displayed (for selects)

 Driver version  : 'Release 7.0.3 (P-2) [Build 32506]'
 NPS version  : '07.00.0003 Release 7.0.3 (P-2) [Build 32506]'
 Database  : 'TEST'


nzodbc > 

Secondly configure unixODBC connection. That's very simple, just follow instructions in /usr/local/nz/lib64/ODBC_README note. In my environment I replaced host, database name, user name and password with netez,test,testuser and secret. Then test the connection using isql utility.
[sb@oc8442647460 ~]$ rlwrap isql NZSQL
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL> create table x (x int);
SQLRowCount returns -1
SQL> insert into x values(1);
SQLRowCount returns 1
SQL> select * from x;
+------------+
| X          |
+------------+
| 1          |
+------------+
SQLRowCount returns 1
1 rows fetched
SQL> 
Important: I failed trying to configure unixODBC Netezza connection in Ubuntu (although nzodbcsql utility is working). Probably there are some incompatibilities with unixODBC version installed as default into Ubuntu.
Configure InfoShophere Streams environment to work with Netezza 
It is described in detail here.
It looks complicated but it is enough to add three entries to .bashrc file.

export STREAMS_ADAPTERS_ODBC_NETEZZA=NETEZZA
export STREAMS_ADAPTERS_ODBC_INCPATH=/usr/include/
export STREAMS_ADAPTERS_ODBC_LIBPATH=/usr/lib64/
Important: InfoSphere Streams can work with one database only. So if a connection with Netezza is specified the access to other database (like DB2) is not possible.
Create a simple InfoSphere Streams application to load data to Netezza
Source code (StremsStudio project) is available here. There are two standard methods to load data to Netezza.

  • Using ODBCAppend operator.
  • Using specialized NetezzaLoad operator.
Recommended method is the second one because using the ODBCAppend could create a significant performance bottleneck. The advantage of the first method is that by changing the ODBC data source name in the configuration file one can easy switch the application from one database to another. Firstly create a simple table to be populated by our sample.
SQL> create table testt (numb integer,name varchar(100))
SQLRowCount returns -1
SQL> select * from testt;
+------------+-----------------------------------------------------------------------------------------------------+
| NUMB       | NAME                                                                                                |
+------------+-----------------------------------------------------------------------------------------------------+
+------------+-----------------------------------------------------------------------------------------------------+
SQLRowCount returns 0

Both methods require preparing connection.xml file. The connection.xml file used in this example is available here.
Hint. While working with connection.xml I found pretty difficult to identify error in this file because the StreamsStudio is not  very talkative on this. But you can execute xmlint (XML syntax checker) directly from command line to get more explanatory output. Just copy and past command line invocation from StremsStudio console and remove redirection of error output to null device.


[sb@oc8442647460 NetezzaLoader]$ pwd
/home/sb/workspace/testp/NetezzaLoader
[sb@oc8442647460 NetezzaLoader]$ xmllint --noout --schema /opt/ibm/InfoSphereStreams/toolkits/com.ibm.streams.db/com.ibm.streams.db/Common/connection.xsd ./etc/connections.xml
./etc/connections.xml:29: parser error : Opening and ending tag mismatch: table line 23 and access_specification
    </access_specification>
                           ^
./etc/connections.xml:32: parser error : expected '>'
  </access_specifications>
                        ^
./etc/connections.xml:34: parser error : Opening and ending tag mismatch: access_specifications line 10 and st:connections
</st:connections> 
                 ^
./etc/connections.xml:34: parser error : Premature end of data in tag connections line 1
</st:connections> 
                  ^
[sb@oc8442647460 NetezzaLoader]$ 
The application code is available here. It is very simple. Beacon operator produces a series of randomly generated tuples. By modifying the iteration parameter one can increase the number of tuples generated. Removing this parameter for good produce the infinite sequence. Then the output is directed to ODBCAppend operator and and the same output  is directed NetezzaPrepareLoad and NetezzaLoad operator putting it to the Netezza database.
Next step is to launch application (as distributed), wait for the moment and verify the content of testt table.

--+
SQLRowCount returns 20
20 rows fetched
SQL> select * from testt;
+------------+-----------------------------------------------------------------------------------------------------+
| NUMB       | NAME                                                                                                |
+------------+-----------------------------------------------------------------------------------------------------+
| 31099      | I'm here                                                                                            |
| 35         | Good bye, cruel world                                                                               |
| 16515      | Good bye, cruel world                                                                               |
| 31099      | I'm here                                                                                            |
| 35         | Good bye, cruel world                                                                               |
| 16515      | Good bye, cruel world                                                                               |
| 5334       | Good bye, cruel world                                                                               |
| 1586       | Good bye, cruel world                                                                               |
| 5334       | Good bye, cruel world                                                                               |
| 1586       | Good bye, cruel world                                                                               |
| 14369      | Hello Kitty                                                                                         |
| 14369      | Hello Kitty                                                                                         |
| 8129       | Good bye, cruel world                                                                               |
| 4465       | Hello world                                                                                         |
| 8129       | Good bye, cruel world                                                                               |
| 4465       | Hello world                                                                                         |
| 15072      | I'm here                                                                                            |
| 20664      | Good bye, cruel world                                                                               |
| 15072      | I'm here                                                                                            |
| 20664      | Good bye, cruel world                                                                               |
+------------+-----------------------------------------------------------------------------------------------------+
SQLRowCount returns 20
20 rows fetched
SQL> 

4 komentarze:

  1. NETEZZA Online Training
    Call Us-91-900-044-4287 21st Century Software Solutions Online Training
    IBM Netezza Overview
    Netezza Architecture
    Connecting to Netezza
    Databases
    Tables
    Database Objects
    Data Distribution
    Loading and Unloading Tables
    Statistics
    Zone Maps
    Clustered Base Tables
    Materialized Views
    http://www.21cssindia.com/courses/netezza-developer-online-training-21.html

    OdpowiedzUsuń
  2. Hi, You provided Good information about IBM Netezza it is very usefull to IBM Netezza Learners you course content also good.Netezza is widely credited for either inventing or bringing renewed attention to the data warehouse appliance category. We also providing IBM Netezza Online Training by Certified Professionals

    OdpowiedzUsuń

  3. This is the information that I was looking for.. Thanks for the efforts you put to gather such a nice content and posted here.

    http://www.iqonlinetraining.com/ibm-netezza-online-training/

    OdpowiedzUsuń
  4. I really appreciate information shared above. It’s of great help. If someone want to learn Online (Virtual) instructor lead live training in NETEZZA TRAINING, kindly contact us http://www.maxmunus.com/contact
    MaxMunus Offer World Class Virtual Instructor led training on TECHNOLOGY. We have industry expert trainer. We provide Training Material and Software Support. MaxMunus has successfully conducted 100000+ trainings in India, USA, UK, Australlia, Switzerland, Qatar, Saudi Arabia, Bangladesh, Bahrain and UAE etc.
    For Demo Contact us.
    Sangita Mohanty
    MaxMunus
    E-mail: sangita@maxmunus.com
    Skype id: training_maxmunus
    Ph:(0) 9738075708 / 080 - 41103383
    http://www.maxmunus.com/

    OdpowiedzUsuń