Blog do projektu Open Source JavaHotel

niedziela, 31 sierpnia 2014

MSSQL to DB2 migration, sp_executesql

Introduction
Assume migrating the following SP (stored procedure) from MSSQL to DB2 (full source code)

CREATE PROCEDURE getCustomers(
@CUSTID INTEGER = NULL,
@CUSTNAME VARCHAR(100) = NULL,
@CUSTLASTACT DATETIME = NULL
)
AS
BEGIN
DECLARE @SELECT NVARCHAR(MAX);
DECLARE @WHERE VARCHAR(MAX);

SET @SELECT = 'SELECT * FROM CUSTOMER';

SET @WHERE = '';

IF @CUSTID IS NOT NULL
SET @WHERE += ' AND CUSTID = @pCUSTID';

IF @CUSTNAME IS NOT NULL
SET @WHERE += ' AND CUSTNAME = @pCUSTNAME';

IF @CUSTLASTACT IS NOT NULL
SET @WHERE += ' AND CUSTLASTACT >= @pCUSTLASTACT';

IF LEN(@WHERE) > 0 BEGIN
PRINT @WHERE
SET @WHERE = SUBSTRING(@WHERE,5,9999);
PRINT @WHERE
SET @SELECT = @SELECT + ' WHERE ' + @WHERE;
END;

DECLARE @PARAMDEF NVARCHAR(MAX);

SET @PARAMDEF = '
@pCUSTID INT,
@pCUSTNAME VARCHAR(100),
@pCUSTLASTACT DATETIME';

PRINT @SELECT;
-- PRINT @PARAMDEF;

EXECUTE sp_executesql @SELECT, @PARAMDEF,
@pCUSTID = @CUSTID,
@pCUSTNAME = @CUSTNAME,
@pCUSTLASTACT = @CUSTLASTACT;


END
This SP prepares simple SQL query statement adjusting WHERE clause to the input parameters accordingly. There are 8 combinations of WHERE clause here. All combinations are covered by single sp_executesql statement.
At first glance it is very easy to translate this SP to DB2 syntax. The DB2 equivalence of sp_executesql is PREPARE, EXECUTE or OPEN {cursor} command. But unfortunately DB2 does support named parameter markers in PREPARE statement for dynamic statements.
The number of variables must be the same as the number of parameter markers in the prepared statement. The nth variable corresponds to the nth parameter marker in the prepared statement.
 Parameters are assigned to markers in order from left to right. The number of parameter markers should correspond to the number of parameter values in EXCUTE or OPEN{cursor} statement. Otherwise exception is raised. So single instance of sp_executesql should be replaced by 8 instances of EXECUTE statement covering all cases of WHERE clause.
Of course - it is not feasible.
First solution
One solution is to develop the SQL statement in the fly.
IF pCUSTLASTACT IS NOT NULL THEN
  SET WHERE = WHERE || ' AND CUSTLASTACT >= ' || pCUSTLASTACT;
END IF;
But this solution is vulnerable to SQL injection risk and open a security gap.
Another solution is to add ELSE clause to have a constant number of parameter markers (full source code).
 IF pCUSTLASTACT IS NOT NULL THEN
SET pWHERE = pWHERE || ' AND CUSTLASTACT >= ?';
ELSE
SET pWHERE = pWHERE || ' AND ? IS NULL';
END IF;

IF LENGTH(pWHERE) > 0 THEN
-- PRINT @WHERE
SET pWHERE = SUBSTR(pWHERE,5,9999);
-- PRINT @WHERE
SET pSELECT = pSELECT || ' WHERE ' || pWHERE;
END IF;

PREPARE STMT FROM pSELECT;
OPEN DC USING pCUSTID,pCUSTNAME,pCUSTLASTACT;
But I'm not happy with that and do not like blurring simple SQL query only to meet syntax requirements and decided to develop a simple DB2 implementation of sp_executesql routine.
Simple DB2 implementation of sp_executesql
The solution contains two SP : external Java and SQL
  • Keep original MS/SQL query statement untouched (with named parameter markers)
  • Prepare associate map: marker name -> parameter value (equivalence of list of parameters to the sp_executesql procedure)
  • Java external procedure scans query from left to right and replaces named parameter markers to DB2 ? marker and prepares list of parameter values in the proper order.
  • Example:  input query "WHERE CUSTID = @pCUSTID AND CUSTLASTACT=@pCUSTLASTACT" is replaced with "WHERE CUSTID=? AND CUSTLASTACT=?" and output list contains corresponding parameter values in proper order is created accordingly.
  • SQL_EXECUTESP routine declares a cursor and invokes OPEN DC clause regarding only the number of markers in the input query string.
PREPARE STMT FROM MQUERY;
IF PARS IS NULL THEN SET CARD = 0;
ELSE
SET CARD = CARDINALITY(PARS);
END IF;
IF CARD = 0 THEN
OPEN DC;
ELSEIF CARD = 1 THEN
OPEN DC USING PARS[1];
ELSEIF CARD = 2 THEN
OPEN DC USING PARS[1],PARS[2];
ELSEIF CARD = 3 THEN
OPEN DC USING PARS[1],PARS[2],PARS[3];
ELSE
ERRVAL = RAISE_ERROR('70001',CA || ' number of markers in the statement too big';
END IF; 
The source code :
  • deftype.sql  : contains type definition and external Java SP signature.
  • PrepareStm.java : external Java SP procedure body (called internally by SQL_EXECUTESP)
  • SQL_EXECUTESP : SQL main procedure
Finally our original MSSQL SP  after migrating to DB2 using solution above (source code).

CREATE OR REPLACE PROCEDURE getCustomers (
pCUSTID INTEGER DEFAULT NULL,
pCUSTNAME VARCHAR(100) DEFAULT NULL,
pCUSTLASTACT TIMESTAMP DEFAULT NULL
)
DYNAMIC RESULT SETS 1
BEGIN
DECLARE pSELECT VARCHAR(32000);
DECLARE pWHERE VARCHAR(32000);
DECLARE NAMEDPAR PARVARARGS;

SET pSELECT = 'SELECT * FROM CUSTOMER';

SET pWHERE = '';

IF pCUSTID IS NOT NULL THEN
SET pWHERE = pWHERE || ' AND CUSTID = @pCUSTID';
END IF;

IF pCUSTNAME IS NOT NULL THEN
SET pWHERE = pWHERE || ' AND CUSTNAME = @pCUSTNAME';
END IF;

IF pCUSTLASTACT IS NOT NULL THEN
SET pWHERE = pWHERE || ' AND CUSTLASTACT >= @pCUSTLASTACT';
END IF;

IF LENGTH(pWHERE) > 0 THEN
-- PRINT @WHERE
SET pWHERE = SUBSTR(pWHERE,5,9999);
-- PRINT @WHERE
SET pSELECT = pSELECT || ' WHERE ' || pWHERE;
END IF;

SET NAMEDPAR['@pCUSTID'] = pCUSTID;
SET NAMEDPAR['@pCUSTNAME'] = pCUSTNAME;
SET NAMEDPAR['@pCUSTLASTACT'] = pCUSTLASTACT;

CALL SQL_EXECUTESP(pSELECT,NAMEDPAR);

END

Final remarks

  • The main advantage of this approach is that original SQL query is kept untouched (opposite to the ELSE approach). Also the execution flow is the same. This lessens the danger of introducing regression errors during migration.
  • It is not necessary to replace :{marker name} with ? sign. DB2 also supports markers with semicolon as a prefix although the 'name' does not have any meaning.
  • It impacts the performance because additional algorithm is performed (parameter matching). In case of warehouse where the main workload is related to the query execution it does not matter a lot (additional execution time is a small constant value). But if performance penalty is too expensive it is possible to optimize the Java routine, replace Java with C++ external or even to get rid of Java routine at all and replace it with pure SQL code.
  • While calling Java external routine array of maps (PARVALS type) is replaced by two PARVARARGS arrays. Unfortunately, external Java interface support only ARRAY of simple type values (chars, integers etc) indexed by integers and does not support more complex ARRAYs.
  • SQL_EXECUTESP routine declares CURSOR .. WITH RETURN TO CLIENT (not CALLER). It means that result set can be caught only by the client application (Java, C++, CLI) and cannot be intercepted by another SQL calling routine.

Brak komentarzy:

Prześlij komentarz