Blog for the Open Source JavaHotel project

Sunday, 29 July 2012

JSPWiki in the Clouds and new database schema

Introduction
I decided to change the database schema drastically. The schema implemented so far was a typical relational one, with tables, rows and one-to-many relationships. On the other hand, what is the point of a dedicated table for Wiki user groups if they are always read all at once and never searched at the table level? The same goes for users. It also does not make much sense to keep a table with a separate row for every wiki page version. We are interested only in the latest version or in the whole history (all versions) at once. There is never a search for a particular version.
New database schema
Groups
The new database entity is defined here.


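import javax.persistence.Basic;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.MappedSuperclass;

import com.google.appengine.api.datastore.Blob;
import com.google.appengine.api.datastore.Key;

@Entity 
@MappedSuperclass
public abstract class AbstractEntBlob {

    // Datastore key generated by App Engine
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Key key;

    // The whole list (groups, users or page versions) packed into one Blob
    @Basic(optional = false)
    private Blob contentByte;

    public byte[] getContent() {
        return contentByte.getBytes();
    }

    public void setContent(byte[] content) {
        this.contentByte = new Blob(content);
    }

    public Key getKey() {
        return key;
    }
    
}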
There is one record for all groups, with a Blob column keeping the list of all groups in a normalized form. So in order to read all groups it is enough to read one record instead of a list of rows.
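
The post does not show how the list is packed into the Blob. A minimal sketch, assuming plain Java serialization and group names kept as strings; GroupListCodec and its methods are hypothetical names used only for illustration, not part of the project:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: converts a list of group names to and from a byte array.
public class GroupListCodec {

    // Packs the whole list into one byte array, ready to be stored in a Blob.
    public static byte[] toBytes(List<String> groups) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            // Copy into ArrayList so the stored object is always serializable.
            out.writeObject(new ArrayList<>(groups));
        }
        return buffer.toByteArray();
    }

    // Restores the list from the bytes read back from the Blob.
    @SuppressWarnings("unchecked")
    public static List<String> fromBytes(byte[] content)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(content))) {
            return (List<String>) in.readObject();
        }
    }
}
```

With a helper like this, saving all groups would be a single setContent(GroupListCodec.toBytes(groups)) call on the entity, and reading them a single getContent() followed by fromBytes.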

Users
The same as above: one Blob column keeps the list of all users in a normalized form.

Wiki page
The entity is defined here. It uses the same superclass as before, with an additional 'page name' attribute. All page versions are kept as a normalized list in a Blob column, so reading the whole page history takes only one record. To get the current version, simply find the entry with the greatest version number using a trivial algorithm written in Java.
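
The search for the current version could look roughly like the sketch below. PageVersion is a hypothetical stand-in for one entry of the version list (the real class is not shown in this post), and the author field is an assumption added only to make the example less bare:

```java
import java.util.List;
import java.util.NoSuchElementException;

public class LatestVersion {

    // Hypothetical entry of the version list kept in the Blob.
    public static class PageVersion {
        public final int version;
        public final String author;

        public PageVersion(int version, String author) {
            this.version = version;
            this.author = author;
        }
    }

    // Linear scan: returns the entry with the greatest version number.
    public static PageVersion latest(List<PageVersion> history) {
        if (history.isEmpty()) {
            throw new NoSuchElementException("page has no versions");
        }
        PageVersion newest = history.get(0);
        for (PageVersion candidate : history) {
            if (candidate.version > newest.version) {
                newest = candidate;
            }
        }
        return newest;
    }
}
```

A linear scan is enough here because the whole history is already in memory after reading the single record.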
The content of the page is stored in a separate entity, one row for every page and every version.
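import javax.persistence.Basic;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.NamedQueries;
import javax.persistence.NamedQuery;

import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.Text;

// One row per page and per version; the page text itself lives here
@NamedQueries({ @NamedQuery(name = "FindPageContent", query = "SELECT P FROM WikiOnePage P WHERE P.pageKey = :1 AND P.version = :2") })
@Entity
public class WikiOnePage {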
  
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Key key;

    @Basic(optional = false)
    private long pageKey;

    @Basic(optional = false)
    private int version;
    
    @Basic(optional = false)
    private Text content;

    public void setPageKey(long pageKey) {
        this.pageKey = pageKey;
    }

    public int getVersion() {
        return version;
    }

    public void setVersion(int version) {
        this.version = version;
    }

    public String getContent() {
        return content.getValue();
    }

    public void setContent(String content) {
        this.content = new Text(content);
    }

}
Conclusion
The current version is available here. The performance improvement is only slight, but the number of reads from the datastore is reduced 5 to 10 times. From the user's point of view there is no difference.

But this approach has several disadvantages. Because the number of relational columns is reduced, the possibility of running relational queries against the new schema is reduced as well. It does not matter here, but it can be a significant degradation in a different context.
Also, replacing rows in a relational table with a normalized Java list in a single Blob column puts a limit on the number of entries: it depends on the size of a single entry and on the size limit of the Blob column. It does not matter here (it is unlikely to have millions of users or one wiki page with millions of versions), but it can be important in a different context.
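
To get a feeling for that limit, a back-of-the-envelope calculation is enough. The sketch below is only an illustration: the 1 MiB ceiling and the bytes-per-entry figure are assumptions, not numbers measured in this project, so check the actual limit of the datastore being used.

```java
public class BlobCapacity {

    // Assumed ceiling for a single stored value (1 MiB); not taken from the project.
    public static final long ASSUMED_LIMIT_BYTES = 1024L * 1024L;

    // How many entries of the given average size fit under the limit.
    public static long maxEntries(long limitBytes, long bytesPerEntry) {
        if (bytesPerEntry <= 0) {
            throw new IllegalArgumentException("bytesPerEntry must be positive");
        }
        return limitBytes / bytesPerEntry;
    }

    public static void main(String[] args) {
        // Example: entries of about 200 bytes each (an assumed figure).
        System.out.println(maxEntries(ASSUMED_LIMIT_BYTES, 200));
    }
}
```

Under these assumptions a single Blob holds a few thousand entries, so the smaller each entry is, the longer the single-record approach keeps working.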

So escaping from a relational schema to gain some performance is not always the best approach. One has to weigh the advantages and disadvantages.
