Blog do projektu Open Source JavaHotel

środa, 22 sierpnia 2012

DB2, HADR and Java

Introduction
HADR (High Availability and Disaster Recovery) works well with a C/C++ client, but what about a Java client? No problem: DB2 also supports Java, for both DataSource and DriverManager clients. It is described in detail here.
Example for DriverManager client
I created a simple console Java application which connects to a DB2 HADR-controlled database and executes several JDBC actions. Full source code is available here. How to set up a simple HADR configuration is described here.
Sample application description
There is a Java interface encapsulating some JDBC actions.

public interface IJDBCAction {

        /**
         * Connect to database
         */
        public void connect();

        /**
         * Disconnect
         */
        public void disconnect();

        /**
         * Add new person to PERSON table.
         * 
         * @param re
         *            Person data to add (only first and family name should be
         *            filled)
         */
        public void addPerson(PersonRecord re);

        /**
         * Return the list of the persons
         * 
         * @return List of persons.
         */
        List<PersonRecord> getList();

        /**
         * Create PERSON table
         */
        public void createTable();

        /**
         * Drop PERSON table
         */
        public void dropTable();

        /**
         * Get current connection status
         * 
         * @return String describing the current status
         */
        public String getStatus();

        /**
         * Get the status of the last executed statement
         * 
         * @return String describing the status of the last statement
         */
        public String getLastResult();

        /**
         * Set autocommit mode
         * @param on true to switch autocommit on, false to switch it off
         */
        public void setAutocommit(boolean on);
}
There is also a concrete implementation of this interface, a factory providing this implementation and a simple console application with a UI for these JDBC actions. Note that it is the responsibility of the IJDBCAction user to provide a valid Connection reference. This way the IJDBCAction implementation is free of any configuration details and can be reused in different contexts.
The application itself:

Connection status: Not connected
1) Connect
2) Disconnect
3) Read list
4) Add person
5) Create PERSON table
6) Drop PERSON table
7) Set autocommit on
8) Set autocommit off
99) Exit
There is nothing special in it; everything is a typical JDBC application. Because the Java client does not use CLP connection pools, the connection URL string should be extended to make it HADR enabled. It is described here.
The code sample used in the application:

private static class GetConn implements IGetConnection {

    private final String user = "db2had1";
    private final String password = "db2had1";
    private final String url = "jdbc:db2://think:50001/sample";

    /** Parameters for the HADR standby: server name and port. */
    private final String serverAlternate = "think";
    private final int portAlternate = 50002;

    @Override
    public Connection getCon() throws SQLException, ClassNotFoundException {
        Class.forName("com.ibm.db2.jcc.DB2Driver");
        // Construct URL referencing the HADR environment
        String u = url
                + ":"
                + com.ibm.db2.jcc.DB2BaseDataSource.propertyKey_clientRerouteAlternateServerName
                + "=" + serverAlternate + ";";
        u += com.ibm.db2.jcc.DB2BaseDataSource.propertyKey_clientRerouteAlternatePortNumber
                + "=" + portAlternate + ";";
        // Connect!
        Connection con = DriverManager.getConnection(u, user, password);
        return con;
    }
}
To make the connection HADR enabled one only has to add two additional parameters: the server name and port of the HADR standby server. Nothing more is necessary.
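With the two reroute properties appended, the URL finally passed to DriverManager (using the host names and ports from the sample above) looks like:

```
jdbc:db2://think:50001/sample:clientRerouteAlternateServerName=think;clientRerouteAlternatePortNumber=50002;
```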
Making the application HADR failover transparent
Extending the URL string with HADR-related parameters only causes the application to be automatically redirected to the other server after an HADR failover. The current transaction is still rolled back, with an SQL error message thrown:
[jcc][t4][2027][11212][4.7.85] A connection failed but has been reestablished. The hostname or IP address is "think" and the service name or port number is 50001.
Special registers may or may not be reattempted (Reason code = 1). ERRORCODE=-4498, SQLSTATE=08506
The solution is to replay the last transaction. But a question arises: how to accomplish this task automatically, without any user or DB admin intervention, so that the problem becomes completely transparent? There could be many different approaches; the idea implemented in this sample application is to use the command design pattern. To run any JDBC-related task one has to extend an abstract class:
/**
 * Command design pattern, abstract command
 * 
 * @author sbartkowski
 * 
 */
abstract class SqlCommand {

        protected final SQLCommandRunTime context;

        /** Error code 'connection reestablished' */
        private final int RETRYCODE = -4498;

        /**
         * Constructor
         * 
         * @param context
         *            SQLCommandRunTime context
         */
        SqlCommand(SQLCommandRunTime context) {
                this.context = context;
        }

        /**
         * Transactional command to be executed.
         * 
         * @throws SQLException
         *             SQLException thrown
         */
        abstract protected void command() throws SQLException;

        /**
         * Executes command, encapsulation around lower level transactional code.
         * Implements logic related to HADR failover
         */
        void runCommand() {
                if (context.con == null) {
                        context.lastResult = "Cannot execute, not connected";
                        return;
                }
                try {
                        command();
                        context.lastResult = "Command executed successfully";
                } catch (SQLException e) {
                        int errCode = e.getErrorCode();
                        if (errCode == RETRYCODE) {
                                // run again command in case of reestablish error
                                try {
                                        command();
                                        context.lastResult = "Command executed successfully";
                                } catch (SQLException e1) {
                                        // do not try again, throw exception unconditionally
                                        context.lastResult = e.getMessage();
                                        e.printStackTrace();
                                }
                        } else {
                                // throws exception if error code is not reestablish error
                                context.lastResult = e.getMessage();
                                e.printStackTrace();
                        }

                }
        }
}
This way all logic related to the 'connection reestablished' SQL error is implemented in one place. Note that the command is replayed only once; the second execution of the transaction should complete without any problem. If it does not, it is much better to raise an exception at once, without any delay.
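The replay-once logic can be reduced to a small standalone sketch (SqlAction and RetryOnce are hypothetical names used only for this illustration, not part of the sample project):

```java
import java.sql.SQLException;

// Minimal sketch of the replay-once logic described above.
// SqlAction and RetryOnce are hypothetical names.
interface SqlAction {
    void run() throws SQLException;
}

class RetryOnce {
    // Vendor error code for "connection failed but has been reestablished"
    private static final int RETRYCODE = -4498;

    static String execute(SqlAction action) {
        try {
            action.run();
            return "Command executed successfully";
        } catch (SQLException e) {
            if (e.getErrorCode() == RETRYCODE) {
                // connection was reestablished after failover: replay once
                try {
                    action.run();
                    return "Command executed successfully";
                } catch (SQLException e1) {
                    // do not try again; report the error unconditionally
                    return e1.getMessage();
                }
            }
            return e.getMessage();
        }
    }
}
```

The point of the wrapper is that concrete commands stay free of any failover logic; they only implement the transactional body.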
Conclusion

  • It is quite simple to enable a Java application for HADR; only a simple extension of the URL string is required.
  • This sample application keeps all connection details hardcoded in the source code. In most business applications database connection details are configured outside the source code, but it is no problem to extend such a configuration with the HADR-related parameters as well.
  • This example covers only connections obtained from DriverManager. A DataSource connection (typical for web applications) has to be enabled for HADR in a different way.
  • Making the application HADR failover transparent requires proper handling of the "A connection failed but has been reestablished." exception. It is quite simple if the application is designed for it from the very beginning (for example by using the solution described above) but could be complicated in the case of legacy code.

niedziela, 19 sierpnia 2012

JspWiki and Bach Cantatas

I started to collect Bach Cantatas recordings on YouTube, hoping to collect them all. The result of my research is available on my JSPWiki implementation.
There are a lot of recordings available on YouTube, but I'm not happy with many of them. Some are partial (only one movement), some are of very poor quality (for instance vinyl records ripped and uploaded to YouTube). There is an almost complete set of recordings of the Bach Cantatas made by Nikolaus Harnoncourt and Gustav Leonhardt, but the quality is not perfect. I mean, of course, the quality of the recordings uploaded to YouTube, not the quality of the performance, which is excellent.
So I started to collect recordings I'm happy with, using the following rules:

  • Complete (not a single movement).
  • Good recording quality. It goes without saying that 'quality' here reflects my personal taste.

środa, 8 sierpnia 2012

CellTree and CellTable

Introduction
I added support for CellTree to my JavaHotel framework. A demo is available here. Source code containing the implementation is available here.
Of course there is nothing special in it; samples are available together with the javadoc. My purpose was to make the table view "tree view" enabled. For some data a tree (hierarchical) view is often more appropriate. For instance, a chart of accounts has a hierarchical structure: some accounts are used as a collective for more specialized accounts related to them, or we can have ledger accounts and a subledger (subsidiary ledger) containing more specialized information.
On the other hand, a flat structure is also very convenient.
More information
So this is why this extension was created. We can make some tables (for instance a chart of accounts) "tree enabled" while keeping the ability to see them as a flat table.
The main problem was to include information about the tree structure inside the table itself, to avoid keeping additional metadata.
It was achieved by adding an additional numeric column to the table. This column carries two pieces of information: whether the record is a tree node or a leaf, and the depth of the node (leaf). The tree structure is read from the beginning to the end. If the column contains a number greater than or equal to 100, the record is regarded as a node, otherwise as a leaf. The number itself (minus 100 for nodes) is regarded as the node (leaf) depth. So the notation is as follows: a node comes first, and all following records with level + 1 are its children, until the first record (node or leaf) with a level less than or equal to the starting node's level is detected.
Example (* marks a node):


*A     (number = 100, node, level = 0)
 A1    (number = 1, leaf, level = 1)
 A2    (number = 1, next leaf, level = 1)
 *A3   (number = 101, node, level = 1)
  A31  (number = 2, leaf, level = 2, child of the node A3)
  A32  (number = 2, next leaf, level = 2)
 A4    (number = 1, leaf, level = 1, next child of node A)
B      (number = 0, leaf, level = 0, next child of the root)
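Decoding this numeric column is straightforward. A minimal sketch (TreeMarker is a hypothetical name, not taken from the framework):

```java
// Sketch of decoding the numeric tree column described above:
// values >= 100 mark nodes, and the remainder is the depth.
// TreeMarker is a hypothetical name.
class TreeMarker {
    final boolean isNode;
    final int depth;

    TreeMarker(int column) {
        isNode = column >= 100;
        depth = isNode ? column - 100 : column;
    }
}
```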
Additional info 

  • There is no need to display this additional column.
  • An additional menu entry is added to the table (tree) view header. It allows switching between table view and tree view.
  • The disadvantage of the tree view is that a record filter cannot be applied to it. After removing some records (for instance nodes) it is unlikely that the proper tree structure would be preserved. So when the table view is filtered, switching to the tree view is blocked.

poniedziałek, 30 lipca 2012

DB2, HADR and C++, make an application failover transparent

Introduction
When failover is executed the application receives an "SQL30108N" exception. The remedy is quite simple: just redo the last transaction. But how to accomplish it in a programmatic way? If our application contains tens or hundreds of transactions scattered through the code it could be complicated. One would have to identify all transaction boundaries and add additional logic to detect the "SQL30108N" exception after every commit and replay the transaction.
Solution example
There is no universal solution to this problem, but in the example provided before it is quite easy to make the application failover transparent. It can be done by virtue of the command pattern implemented there.
To implement all methods exposed in the header I designed a simple class with a signature like:

class SqlCommand {
    hadrmodel &ha;
protected:
    virtual SQLRETURN command(hadrmodel &ha, SQLHANDLE StmtHandle) = 0;
    bool errorHandleSTMT;

    bool directrunCommand(std::string info) {
        .....
        SQLRETURN RetCode = command(ha, StmtHandle);
        ....
    }
public:
    void runCommand(std::string info) ...
};
The client should implement the 'command' method and reuse some common stuff provided (for instance the statement handle created before). The main function (the runCommand method) is responsible for preparing the environment, launching 'command' and evaluating the result. We assume that every instance of SqlCommand is stateless and contains all necessary data. Because the main class evaluates the result, it is quite easy to recognize the "SQL30108N" exception (-30108 numerically) and rerun the command. So, assuming that 'directrunCommand' returns 'true' if replaying is suggested, the main method looks like:
  ....
  public:
    SqlCommand(hadrmodel &par) : ha(par) {}
    void runCommand(std::string info) {
      if (directrunCommand(info)) {
        directrunCommand(info);
      }
    }
  ....
The full source code is available here.
Because the next transaction after failover should execute correctly there is no loop, just simple logic: 'if (repeat suggested) then {run again once}'. It is unlikely to have a series of failovers and failbacks one after another, so if "SQL30108N" is caught again it is a different problem and it is much better to raise an error at once.
Only modifying the 'SqlCommand' class was necessary in this example. All 'client' code (meaning 'SqlCommand' implementations) and application code did not require any fixes.
Conclusion
Making the application failover transparent was very easy in this simple example. I dare say that even if the application were more complicated (meaning more SQL-related commands) it would also be simple and would involve fixes in only one place.
But it is not a universal solution; it was possible only because a proper design pattern was implemented from the very beginning.

niedziela, 29 lipca 2012

JSPWiki in the Clouds and new database schema

Introduction
I decided to change the database schema drastically. The database schema implemented so far was a typical relational one, with tables, rows and one-to-many relationships. But, on the other hand, what is the reason to have a dedicated table for wiki user groups if they are always read all at once and never searched at the table level? The same for users. Also, it does not make a lot of sense to have a table and separate rows for every wiki page version. We are interested only in the latest version, or in the whole history (all versions) at once. There is never a search for a particular version.
New database schema
Groups
The new database entity is defined here.


@MappedSuperclass
public abstract class AbstractEntBlob {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Key key;

    @Basic(optional = false)
    private Blob contentByte;

    public byte[] getContent() {
        return contentByte.getBytes();
    }

    public void setContent(byte[] memberList) {
        this.contentByte = new Blob(memberList);
    }

    public Key getKey() {
        return key;
    }
    
}
There is one record for all groups, with a Blob column keeping the list of all groups in a normalized form. So in order to read all groups it is enough to read one record instead of reading a list of rows.
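The "normalized form" idea can be sketched as follows, assuming (the actual wiring in the project may differ) that standard Java serialization packs the whole list into the byte array stored in the Blob column:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Sketch only: packs a whole list into one byte array (the Blob content)
// and restores it in a single read. Java serialization is an assumption.
class BlobCodec {
    static byte[] toBytes(List<String> items) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new ArrayList<>(items));
        }
        return bos.toByteArray();
    }

    @SuppressWarnings("unchecked")
    static List<String> fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (List<String>) ois.readObject();
        }
    }
}
```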

Users
The same as above: one Blob column keeps the list of all users in a normalized format.

Wiki page
The entity is defined here. It is the same superclass as before with an additional 'page name' attribute. All page versions are kept as a normalized list in a Blob column, so in order to read the whole page history it is enough to read one record. To get the current version, simply find the version with the greatest version number by applying a trivial algorithm written in Java.
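That trivial algorithm might look like this (a sketch; PageVersion and PageHistory are hypothetical names, the entity classes in the project differ):

```java
import java.util.List;

// Hypothetical sketch: finding the current page version in the
// deserialized version list by the greatest version number.
class PageVersion {
    final int version;
    PageVersion(int version) { this.version = version; }
}

class PageHistory {
    static PageVersion current(List<PageVersion> versions) {
        PageVersion best = null;
        for (PageVersion v : versions) {
            if (best == null || v.version > best.version) {
                best = v;
            }
        }
        return best;
    }
}
```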
The content of the page is stored in a separate entity, one row for every page and every version.
@NamedQueries({ @NamedQuery(name = "FindPageContent", query = "SELECT P FROM WikiOnePage P WHERE P.pageKey = ?1 AND P.version = ?2") })
@Entity
public class WikiOnePage {
  
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Key key;

    @Basic(optional = false)
    private long pageKey;

    @Basic(optional = false)
    private int version;
    
    @Basic(optional = false)
    private Text content;

    public void setPageKey(long pageKey) {
        this.pageKey = pageKey;
    }

    public int getVersion() {
        return version;
    }

    public void setVersion(int version) {
        this.version = version;
    }

    public String getContent() {
        return content.getValue();
    }

    public void setContent(String content) {
        this.content = new Text(content);
    }

}
Conclusion
The current version is available here. There is only a slight performance improvement, but the number of reads from the datastore is reduced 5-10 times. From the user's point of view there is no visible difference.

But this approach has several disadvantages. Because the relational columns are reduced, the possibility to run a relational query on the new schema is reduced as well. It does not matter here but could be a significant degradation in a different context.
Also, replacing rows in a relational table with a normalized Java list in a single blob column puts a limit on the number of entities; it depends on the size of a single entity and the limit on the size of a blob column. It does not matter here (it is unlikely to have millions of users or one wiki page with millions of versions) but can be important in a different context.

So escaping from a relational schema to gain some performance benefits is not always the best approach. One has to weigh the advantages and disadvantages.

poniedziałek, 23 lipca 2012

DB2, HADR and C++

Introduction
HADR works well with CLP (Command Line Processor), but what about applications? Could they also benefit from being connected to an HADR environment? To check it I created a simple C++/ODBC application.
Application
The application is available here. There is a simple class encapsulating access to DB2.

class person {
  friend class QueryCommand;
  person(int pid,const char *pname, const char *pfamilyname) : id(pid),name(pname),familyname(pfamilyname) {}
public:
  int id;
  std::string name;
  std::string familyname;
};

class hadrmodel
{
  friend class SqlCommand;
  friend class LogoutCommand;
  std::string lastActionStatus;
  SQLHANDLE  EnvHandle;
  SQLHANDLE  ConHandle;
public:
  hadrmodel() {
    EnvHandle = ConHandle = 0;
  }
  void connect();
  void disconnect();
  void createPersons();
  void dropPersons();
  void autoCommit(bool on);
  std::vector<person> getListOfPersons();
  void addPerson(const std::string &name, const std::string &familyName);
  SQLHANDLE getConHandle() const { return ConHandle; }
  std::string getConnectionStatus() const;
  std::string getLastActionStatus() const { return lastActionStatus; }
};
And simple console application making usage of the methods exposed.

#include <iostream>

#include "hadrmodel.h"

namespace {
void P(std::string s) {
    std::cout << s << std::endl;
}

void textInterface(hadrmodel &ha) {
    while (1) {
        P("=======================");
        P(ha.getConnectionStatus());
        P("1) Connect");
        P("2) Disconnect");
        P("3) Get list of persons");
        P("4) Add person");
        P("5) Create table PERSONS");
        P("6) Drop table PERSONS");
        P("7) Switch on autocommit");
        P("8) Switch off autocommit");
        P("99) Exit");
        int iChoice;
        std::cout << "Enter:";
        std::cin >> iChoice;
        bool action = true;
        switch (iChoice) {
        case 99:
            return;
        case 1:
            ha.connect();
            break;
        case 2:
            ha.disconnect();
            break;
        case 3:
        {
            std::vector<person> pList = ha.getListOfPersons();
            std::vector<person>::iterator i = pList.begin();
            for (; i != pList.end(); i++) {
                std::cout<< "id:" << i->id << " name: " << i->name << " family name:" << i->familyname << std::endl;
            }
        }
        break;
        case 4:
        {
            std::string name;
            std::string familyName;
            std::cout << "Name: ";
            std::cin >> name;
            std::cout << "Family name: ";
            std::cin >> familyName;
            ha.addPerson(name,familyName);
        }
        break;
        case 5:
          ha.createPersons();
          break;
        case 6:
          ha.dropPersons();
          break;
        case 7:
        case 8:
          ha.autoCommit(iChoice == 7 ? true : false);
          break;
        default:
            action = false;
            break;
        } // switch
        if (!action) {
            P("Invalid action.");
        }
        else {
            P(ha.getLastActionStatus());
        }
    } // while
}

} // namespace

int main(int argc, char **argv) {
    P("Hello, world!");
    hadrmodel ha;
    textInterface(ha);
    P("I'm done");
}
Test
One can replay all the tests described before and make sure that everything also works perfectly with this simple C++ application.
Autocommit on/off
The only difference in behavior is related to "autocommit" mode. More information about autocommit mode is here. Below is an example where autocommit mode is "on". The HADR takeover was performed at option 4), adding a new person to the person table.
(screenshot of the console session)
One interesting thing: there is nothing special on this screen. Although the takeover had been performed, the new person was added smoothly without any exception being thrown. So a C++/ODBC application with autocommit mode "on" is takeover resistant.
But most database applications must have autocommit switched "off" and mark transaction boundaries by issuing the "commit" command manually.
So run the same test again, but with autocommit mode off (option 8).
(screenshot of the console session)
This time the 'SQL30108N' exception is thrown and the transaction is rolled back. But the connection is still alive. We can repeat the last operation and this time it will be successful.
Conclusion
A C++ application can also benefit from HADR without any changes in the code. What is more, if the autocommit mode is "on" the application is takeover transparent. In case of autocommit mode "off" (probably most applications run in this mode) the exception is thrown exactly as in CLP, but the connection still exists. But there is a question: is it possible to create a C++/ODBC application that is takeover resistant also with autocommit mode "off"?



niedziela, 8 lipca 2012

JSPWiki in the Cloud and download/upload

Finally I added download and upload options to my JSPWiki cloud implementation. You have to click the "More .." option in the upper-right corner of the page tab: "Download" saves the wiki content as an XML file, and the opposite function is "Upload". Unfortunately, what seemed quite easy at the beginning ended up with a huge refactoring. During testing of the "Upload" function an additional refresh-references functionality is fired, and it revealed a circular dependency in the Spring beans created. It took me a lot of time to understand the dependencies and to disentangle them. It was necessary to modify the creation pattern for several beans, and I ended up with a bean factory.
Also, wiki uploading is not simple; there could be several usage scenarios.

  1. Remove the current wiki content and upload the new one. Something similar to the "backup" and "restore" options in databases: the new wiki content is a clone of the content read from the source file.
  2. Merge the existing content with the new one. But what to do if a page with the same name already exists? Maybe the solution is to add an "overwrite" option, set globally or per case. But there is a risk that the wiki content could become a chaotic mesh of old and new pages.
  3. Merge the content and, in the case of pages with the same name, add the page content read from the source as a new version of the existing page. It sounds much better than before, but there is a risk that the number of versions will double, quadruple etc. after every upload.
  4. Merge the content and, in the case of pages with the same name, recognize also the page version. Overwrite page content at the page and version level.
  5. Merge the content and add only the last version from the input.
  6. ....
There could be a lot of scenarios. Finally I decided to implement scenario number 4. I cannot tell for certain if it is the best one, but for the time being it seems to me the most sound.

The next and very important task is to change the database schema and reduce the number of reads by adding a cache for pages. The current implementation is very inefficient in terms of the number of reads, and the Google App Engine quota for free access is hit very quickly, freezing the application for 24 hours until the quotas are reset.