Blog do projektu Open Source JavaHotel

sobota, 30 listopada 2013

Netezza Software Emulator, Linux and IBM InfoSphere Streams

Introduction
Netezza and InfoSphere Streams are the part of IBM Big Data offering. Although both are commercial products, good news is that for educational and non-production purpose also free versions are available.
Netezza Software Emulator can be downloaded here.
Non-production InfoSphere Streams package is available here. Although fresh installation of IBM InfoSphere Streams is not complicated it is a good idea to start with ready to use VMware image.
Netezza Software Emulator
Netezza in production requires dedicated hardware but for developing and discovering one can play with VMware machines (Host and SPU) behaving exactly like the real Netezza database. The Netezza Emulator is available only for Windows but it is possible to execute it also under Linux. But keep in mind that IBM supports only Windows version so in case of any problems you are left on your own.
Netezza Software Emulator for Linux
Download and execute INSEfDsetup.exe (it requires at least 13GB free disk space). After downloading transfer unpacked VMware machines (c:\Program Files\IBM\Netezza Software Emulator for Developer\VMS) to the Linux machine. Before starting them create additional vmnet3 network.

This network connects Host and SPU machine. Important: switch off DHCP and set IP address as 10.0.0.1.
Starting Netezza Emulator
Firstly start Host machine. After several minutes log in as nz user (standard password: nz) and execute nzstate. If output is like:
[nz@netezza ~]$ nzstate
System state is 'Discovering'.
[nz@netezza ~]$ nzhw
Description HW ID Location   Role   State
----------- ----- ---------- ------ -------
SPA         1001  spa1       None   Ok
Disk        1002  spa1.disk1 Active None
Disk        1003  spa1.disk2 Active None
Disk        1004  spa1.disk3 Active None
Disk        1005  spa1.disk4 Active None
Disk        1006  spa1.disk5 Spare  None
SPU         1007  spa1.spu1  Active Booting
[nz@netezza ~]$ 

In case of any problem simply launch nzstart command. Next step is to start SPU machine. The SPU machine should connect to Host and start networked booting. Do not bother about a good number of technical messages flying through the screen, it is as expected and does not mean that something wrong is happening. After some time execute again nzstate command in the Host machine. If the screen is like:
[nz@netezza ~]$ nzhw
Description HW ID Location   Role   State
----------- ----- ---------- ------ ------
SPA         1001  spa1       None   Ok
Disk        1002  spa1.disk1 Active OkSQL> create table testt (numb integer,name varchar(100))
SQLRowCount returns -1
SQL> select * from testt;
+------------+-----------------------------------------------------------------------------------------------------+
| NUMB       | NAME                                                                                                |
+------------+-----------------------------------------------------------------------------------------------------+
+------------+-----------------------------------------------------------------------------------------------------+
SQLRowCount returns 0

Disk        1003  spa1.disk2 Active Ok
Disk        1004  spa1.disk3 Active Ok
Disk        1005  spa1.disk4 Active Ok
Disk        1006  spa1.disk5 Spare  Ok
SPU         1007  spa1.spu1  Active Online
[nz@netezza ~]$ nzstate
System state is 'Online'.
[nz@netezza ~]$ 
it means that Netezza is started and ready to handle the requests. Then prepare a test database, a test database and a test user.
  nzsql
  create database test;
  create user testuser with password 'secret';
  grant all admin to testuser;
  grant all on test to testuser;
Check if testuser can log in into test database and execute some basic queries.
 nzsql test -U testuser -W secret
 create table x (x int);
 insert into x values(1);
 select * from x;
 drop table x;
Obtain Netezza connection software
This software can be downloaded from this place.Important: this software is available only for IBM customers and business partners.
Install and configure ODBC connection to Netezza on RedHat 6
Installing Netezza client is very simple and straightforward. Just unpack the Netezza client package (something like nz-linuxclient-v7.0.3-P2.tar.gz) and execute two unpack scripts, one in linux directory and the second in linux64 directory. The default installation location is /usr/local/nz.
Firstly test the connection by executing nzsqlodbc utility from /usr/local/nz/bin64 directory (netez is the host name of the HOST VMmachine in my environment).
[root@oc8442647460 bin64]# ./nzodbcsql -h netez test testuser secret

NZODBCSQL - program to test Netezza ODBC driver
            NOT FOR PRODUCTION USE


     Type 'quit' or '\q' or CTRL-D or ENTER at the prompt to quit
     NOTE: Max 100 rows are displayed (for selects)

 Driver version  : 'Release 7.0.3 (P-2) [Build 32506]'
 NPS version  : '07.00.0003 Release 7.0.3 (P-2) [Build 32506]'
 Database  : 'TEST'


nzodbc > 

Secondly configure unixODBC connection. That's very simple, just follow instructions in /usr/local/nz/lib64/ODBC_README note. In my environment I replaced host, database name, user name and password with netez,test,testuser and secret. Then test the connection using isql utility.
[sb@oc8442647460 ~]$ rlwrap isql NZSQL
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL> create table x (x int);
SQLRowCount returns -1
SQL> insert into x values(1);
SQLRowCount returns 1
SQL> select * from x;
+------------+
| X          |
+------------+
| 1          |
+------------+
SQLRowCount returns 1
1 rows fetched
SQL> 
Important: I failed trying to configure unixODBC Netezza connection in Ubuntu (although nzodbcsql utility is working). Probably there are some incompatibilities with unixODBC version installed as default into Ubuntu.
Configure InfoShophere Streams environment to work with Netezza 
It is described in detail here.
It looks complicated but it is enough to add three entries to .bashrc file.

export STREAMS_ADAPTERS_ODBC_NETEZZA=NETEZZA
export STREAMS_ADAPTERS_ODBC_INCPATH=/usr/include/
export STREAMS_ADAPTERS_ODBC_LIBPATH=/usr/lib64/
Important: InfoSphere Streams can work with one database only. So if a connection with Netezza is specified the access to other database (like DB2) is not possible.
Create a simple InfoSphere Streams application to load data to Netezza
Source code (StremsStudio project) is available here. There are two standard methods to load data to Netezza.

  • Using ODBCAppend operator.
  • Using specialized NetezzaLoad operator.
Recommended method is the second one because using the ODBCAppend could create a significant performance bottleneck. The advantage of the first method is that by changing the ODBC data source name in the configuration file one can easy switch the application from one database to another. Firstly create a simple table to be populated by our sample.
SQL> create table testt (numb integer,name varchar(100))
SQLRowCount returns -1
SQL> select * from testt;
+------------+-----------------------------------------------------------------------------------------------------+
| NUMB       | NAME                                                                                                |
+------------+-----------------------------------------------------------------------------------------------------+
+------------+-----------------------------------------------------------------------------------------------------+
SQLRowCount returns 0

Both methods require preparing connection.xml file. The connection.xml file used in this example is available here.
Hint. While working with connection.xml I found pretty difficult to identify error in this file because the StreamsStudio is not  very talkative on this. But you can execute xmlint (XML syntax checker) directly from command line to get more explanatory output. Just copy and past command line invocation from StremsStudio console and remove redirection of error output to null device.


[sb@oc8442647460 NetezzaLoader]$ pwd
/home/sb/workspace/testp/NetezzaLoader
[sb@oc8442647460 NetezzaLoader]$ xmllint --noout --schema /opt/ibm/InfoSphereStreams/toolkits/com.ibm.streams.db/com.ibm.streams.db/Common/connection.xsd ./etc/connections.xml
./etc/connections.xml:29: parser error : Opening and ending tag mismatch: table line 23 and access_specification
    </access_specification>
                           ^
./etc/connections.xml:32: parser error : expected '>'
  </access_specifications>
                        ^
./etc/connections.xml:34: parser error : Opening and ending tag mismatch: access_specifications line 10 and st:connections
</st:connections> 
                 ^
./etc/connections.xml:34: parser error : Premature end of data in tag connections line 1
</st:connections> 
                  ^
[sb@oc8442647460 NetezzaLoader]$ 
The application code is available here. It is very simple. Beacon operator produces a series of randomly generated tuples. By modifying the iteration parameter one can increase the number of tuples generated. Removing this parameter for good produce the infinite sequence. Then the output is directed to ODBCAppend operator and and the same output  is directed NetezzaPrepareLoad and NetezzaLoad operator putting it to the Netezza database.
Next step is to launch application (as distributed), wait for the moment and verify the content of testt table.

--+
SQLRowCount returns 20
20 rows fetched
SQL> select * from testt;
+------------+-----------------------------------------------------------------------------------------------------+
| NUMB       | NAME                                                                                                |
+------------+-----------------------------------------------------------------------------------------------------+
| 31099      | I'm here                                                                                            |
| 35         | Good bye, cruel world                                                                               |
| 16515      | Good bye, cruel world                                                                               |
| 31099      | I'm here                                                                                            |
| 35         | Good bye, cruel world                                                                               |
| 16515      | Good bye, cruel world                                                                               |
| 5334       | Good bye, cruel world                                                                               |
| 1586       | Good bye, cruel world                                                                               |
| 5334       | Good bye, cruel world                                                                               |
| 1586       | Good bye, cruel world                                                                               |
| 14369      | Hello Kitty                                                                                         |
| 14369      | Hello Kitty                                                                                         |
| 8129       | Good bye, cruel world                                                                               |
| 4465       | Hello world                                                                                         |
| 8129       | Good bye, cruel world                                                                               |
| 4465       | Hello world                                                                                         |
| 15072      | I'm here                                                                                            |
| 20664      | Good bye, cruel world                                                                               |
| 15072      | I'm here                                                                                            |
| 20664      | Good bye, cruel world                                                                               |
+------------+-----------------------------------------------------------------------------------------------------+
SQLRowCount returns 20
20 rows fetched
SQL> 

niedziela, 24 listopada 2013

How to quickly create and remove Oracle database

Introduction
Sometimes it is necessary for me to quickly create an Oracle database, load data, perform some test and analysis and clear when it is done. I realized that it is much better to create a new database (instance) instead of constantly fleshing out the existing one. By removing this instance I can also reclaim disk space.
Create a database 
I want to create a tempdb database.
1. Prepare initialization parameter file.
tempdb.__oracle_base='/home/oracle/app/oracle'#ORACLE_BASE set from environment
*.db_create_file_dest='tempdb'
*.db_create_online_log_dest_1='tempdb'
*.db_name='tempdb'
*.db_recovery_file_dest='tempdb'
*.db_recovery_file_dest_size=10G
*.diagnostic_dest='tempdb'
2. Put this file in proper place (/home/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/inittempdb.ora in my environment)
vi /home/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/inittempdb.ora 
3. Make directory for instance data
mkdir /home/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/tempdb 
4. Set ORACLE_SID
export ORACLE_SID=tempdb
5. Start sqlplus as DBA
sqlplus / as sysdba 
6. Create database, test user and run system scripts.
create spfile from pfile; 
startup nomount 
create database tempdb; 
create user test identified by secret;
grant all provileges to test;
@?/rdbms/admin/catalog.sql
@?/rdbms/admin/catproc.sql
@?/sqlplus/admin/pupbld.sql
shut
7. Prepare settings for remote access (file /home/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/tnsnames.ora) Add entries like:
LISTENER_TEMPDB =
  (ADDRESS = (PROTOCOL = TCP)(HOST = think)(PORT = 1521))

TEMPDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sb-ThinkPad-T42)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = tempdb)
    )
  )
8. Add entry to /etc/oratab
tempdb:/home/oracle/app/oracle/product/11.2.0/dbhome_1:Y
9. Start database
dbstart
10. Prepare entry for sqldeveloper
11. Connect and act

Remove database
It is pretty easy (look also).

export ORACLE_SID=mydb
sqlplus / as sysdba
startup force mount
alter system enable restricted session;
drop database;
Remove also entries from /home/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/tnsnames.ora Reclaim disk space
rm -rf /home/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/tempdb

piątek, 15 listopada 2013

Clojure, tautology and logical expression

Introduction
I decided to create a simple Clojure project for verifying logical tautologies. The source code is available here.

For instance:

 ((A \land B) \to C) \Leftrightarrow (A \to (B \to C)).
The algorithm is simple:
  1. Parse the logical expression
  2. Generate all possible combination of values for literals in the expression
  3. Evaluate the results for any value
  4. If every result is true then we have a tautology
It is not an optimal method for proving the tautology because the complexity is exponential but if we have a small number of literals it can run pretty fast.

Definition
Grammar for logical expression to be analyzed:

<expression> ::=
    [~] < expression >
         | <literal>
         | ( < expression > <operator> <expression>)

< literal > ::= chain of letter and digits starting from letter
< operator > ::= 
    & : logical AND
    ^ : logical OR
    == : logical equivalence
    => : logical implication
    ~ : logical NOT

Remarks:
  • I assume that every subexpression (except literals) should be enclosed in parenthesis. For instance: (A & B) but not (A & B ^ C). This I'm avoiding a problem related to operator precedence. Because any logical expression can be "normalized" that way it is not a limitation which matters.
  • The tautology quoted above can be put down that way: (( A & B) => C) == (A => (B => C))
  • One of De Morgan's Laws: ~(A & B) == (~A ^ ~B)
Parser
The first thing to do is the "tokenizer" - creates a flow of separated components. It can be achieved simply by means of regular expression.

  [seqw (re-seq #"[A-Za-z][\w]*|[\^\(\)&~]|=>|==" str)]

Source code for parser is available here.
 
defn parse
"parse string expression
 returns three element vector (Polish notation: (operator, left operand, right operand)
 left/right operand can be an atomic operand (variable) or 
 subexpression (another triada vector)"
[str]

Parser decomposes input string to "Polish notation". Every part of the expression is transformed to a triad [operator, left operand, right operand]. In the case of negation (one parameter operator) the right operand is nil.

Evaluation

The evaluation (counting value) is very simple while the expression is decomposed.

(defn evaluate 
    [expr val]

"evaluate (calculate value) of the parsed expression using current values (as hash-map)
 returns true or false
"
    (defn getval [subexpr]
    (let 
      [operator (first subexpr)
       first (fnext subexpr)
       second (nth subexpr 2)
       firstval (if (vector? first) (getval first) (get val first))
       secondval (if (vector? second) (getval second) (get val second))
      ] 
     (cond 
        (= operator nil) firstval
        (= operator "&") (and firstval secondval)
        (= operator "^") (or firstval secondval)
        (= operator "==") (= firstval secondval)
        (= operator "=>") (if (and firstval (not secondval)) false true)
        (= operator "~") (not firstval)
        :else nil
     )
     )
     )
     (getval expr)
)
Just scan the parsed expression and calculate. "Values" is the map: { "literal" : true/false }.
Verify the tautology
 By "verifying the tautology" I mean calculating the value for every combination of literals and making sure that it evaluates as "true" all the time. First thing is to extract all literals from the parsed string. It is received by a simple statement (assuming that 'e' is parsed expression)
distinct (filter #(and (not (nil? %)) (Character/isLetter  (first %))) (flatten e))
The number of all literal combination is 2 power {number of literals}. So generating all combinations can be achieved by iterating all integers from 0 to 2 power {number of literal} and extracting literal values from binary representation of the current integer (1 as true and 0 as false.
(defn genvalues [lvals num ]
     (defn getsinglevalue [k]
        [ (nth lvals k) (odd? (bit-shift-right num k)) ]
     )
     (map getsinglevalue (range 0 (count lvals)))
  )

The last but not least is the main "verify" function. The function returns a vector of values producing false. So the expression is tautology if the output list is empty.
(defn varifytautology

"verify if logical expression is tautology
 returns the collection of values giving false result
 if list is empty then expression is tautology
"

  [str]
  (let [ vals (calculateall str)]
;    (seq (for [x vals] (println x)))
    (reduce #( if (not (second %2 )) (conj %1 %2) %1) [] vals)
  )
)

Unit tests
The series of unit tests is provided with the source. Example
; DeMorgan's Low
(deftest parse-taut2
  (let [sent "(~(p & q) == (~p ^ ~q))"
        lfalse (logical.expression/varifytautology sent)
       ]
;   (println lfalse)   
    (is (empty? lfalse))
  )
)

; DeMorgan's Low
(deftest parse-taut3
  (let [sent "(~(p ^ q) == (~p & ~q))"
        lfalse (logical.expression/varifytautology sent)
       ]
;   (println lfalse)   
    (is (empty? lfalse))
  )
)

(deftest parse-taut4
  (let [sent "((a&b)=>c) == (a=>(b=>c))))"
        lfalse (logical.expression/varifytautology sent)
       ]
;   (println lfalse)   
    (is (empty? lfalse))
  )
)

; the Distribution Law
(deftest parse-taut5
  (let [sent "(p^(q&r)) == ((p^q)&(p^r))"
        lfalse (logical.expression/varifytautology sent)
       ]
;   (println lfalse)   
    (is (empty? lfalse))
  )
)
I'm happy to announce the De Morgan's Laws still holds true.