Thursday, February 26, 2009

Top-N Queries and Analytic Functions

Noted Oracle expert Tom Kyte has written wonderful column on getting first n-rows and n-to-m rows using ROWNUM. ( References: 1,2 )

Personally, I'm big fan of Top-N query and found this very useful in PeopleSoft. Below are my top 3 picks:

a) In Streaming/ Parallel processing of AE programs: Top N-Query approach can provide breakthrough performance upgrade in case of Conversion Programs. Consider a situation where we've to upload hundreds of thousands of data rows using a CI based conversion App Engine program. Depending on volume of data, and mapping-rules and validations this single process will run for hours or even for days.
A simple solution to this complex problem would be launching multiple instances of this conversion program simultaneously for splitted data-set. We can split the whole data-set into chucks of 1000 rows ( using Top-N query) and assign a flag value with it . Now, with each flag value one conversion program will be scheduled using ProcessRequest class API. This approach will fasten the entire conversion process substantially.

b) In Web Service Pagination: Pagination is a common word in web-service world. In real-life scenario it works like this : For a particular input from end-user, say there are 105 rows of result data. Now, end user is requesting for 10 rows per page to be displayed. In this case- 100 rows will be splitted in 11 pages. 10 pages with 10 rows each and 11th page with 5 rows. User will be given an option to pick any one page out of these 11 pages. When user will select 5th page to be displayed, data rows from 51 till 60 will be displayed on the page. Likewise, user can change the request to see 20 rows per page and in this case he'll have option to see 1 out of 6 pages at a time.
Once you go through the recommend article ( mentioned above) you know how to get the pagination done. Though it would be certainly difficult to implement pagination using this method if your response message is a container having complex rowset based messages. ROWNUM approach will work well if the response message is simple rowset based and pagination criteria is not complex. ( The level from where chunking should be done- also determines the complexity of Pagination).

c) In Reports: Unlike XMLP whose RTF-template provides numerous built-in features, most of the other reporting tools just represents the fed data. ROWNUM approach can be very useful in creating composite-reports when we're using SQR, App Engine, PSQuery. Think about a report wherein we need to show 'List of the top 5 experienced employees from each departments and the average experience of employees in that department.'
We can always populate one staging table with the help of top-n query which will further feed the report along with other desired data-set.

Tuesday, February 24, 2009

Writing Efficient App Engine Program

Few high-level guidelines to be followed:

a) Apply Set Processing wherever it can be applied:
  • Use Group by, Having, Exists clauses effectively
  • Take extra care while writing sub queries and complex joins
  • Don't forget to join PROCESS_INSTANCE, in case you've made it a key
  • Fine tune the SQL ( Refer: Oracle Documentation)
b) Use Temp Table
  • It improves AE performance significantly- Best Suited for Set Based Processing
  • Facilitates parallel processing
  • Custom Indexes can be created to achieve faster result
  • Practice to make PROCESS_INSTANCE a key and employ the %Table meta-SQL
  • Automatic data cleanse
  • Dedicated Temp Table is preferred
c) Use Meta-SQL
  • Make a habit of using Meta-SQL like %Join, %CurrentDateIn, %Table. It gives program more flexibility.
  • It makes AE program more robust and platform independent and improves performance.
  • Be aware of the limitation of these Meta-SQL, e.g %EffdtCheck doesn't work well with PS_JOB table
  • %TruncateTable is faster than bulk delete statement. In case of Temp table use %TruncateTable(%Table(XXX_TAO))
d) Drop/Rebuild Indexes:
  • If you're planning for massive Insert, you may wish to drop indexes and triggers from the table first and recreate them once the insert is done.
  • It makes AE processing much faster. Just be careful that dropping indexes makes it prone to Duplicate rows. (You may leave primary key untouched).
e) Refrain using PeopleCode
  • If the goal can be achieved using SQL, do not use PeopleCode.
f) Setting Commits
  • It's recommended to use frequent and early Commit in case of Set based processing. It reduces load from Database server and enhance performance.
g) Reuse Statement
  • Valid Only for SQL actions
  • By dedicating a persistent cursor to that statement we can reuse the SQL Statement.
  • When we select the ReUse property for a SQL action, %BIND fields is converted into real bind variables (like :1, :2, etc). This enables PeopleSoft Application Engine to compile the statement only once and dedicate a cursor, and re-execute it with new data multiple times. This reduction in compile time can result in dramatic improvements to performance.
h) Use %UpdateStats
  • For better performance, refresh statistics on temp tables using %UpdateStats(record name ,[HIGH/LOW]) after each bulk load of data.
i) Use CollectGarbage()
  • Specific to AE using App Classes.
  • Use the CollectGarbage function to remove any unreachable application objects created by the Application Classes and hence release the memory. ( Also Refer: Unix Process Limit for PeopleSoft)

j) Use SQL Hints ( Tuning SQL)
  • SQL Hints e.g. /* +APPEND */,/* +FIRST_ROWS(n) */, provides a mechanism which instruct the CBO to choose a certain query execution plan based on the specific criteria. ( Refer: Oracle Documentation)
k) Streaming
  • Streaming is a process in which a large chunk of data is divided into multiple streams and all the stream processes in parallel.
l) Use Trace File
  • Use trace file generated by Trace/TOOLSTRACESQL/TOOLSTRACEPC advisably. Find out areas where process is taking longer processing-time or where performance can be improved.

SQLExec() ... Small Tips

As we know, SQLExec() returns only a single row of data and if the SQL statement retrieves more than one row of data, SQLExec sends only the first row to its output variables. Any subsequent rows are discarded. So, let's use SQLEXEC() function wisely.
We need to retrieve DEPTID, JOBCODE, from the most effective dated row less than current date, for &emplid=101 ( from PS_JOB). So, one of the easiest way to do that would be:

SQLExec("SELECT DEPTID, JOBCODE FROM PS_JOB WHERE EMPLID=:1 AND EFFDT<=%CurrentDateIn ORDER BY EFFDT DESC, EFFSEQ DESC", &emplid, &empl_dept, &empl_jobcd);
ORDER BY clause can play key role in case of SQLExec() function.
Can we use /*+ FIRST_ROWS */ hint for better performance when we've a complex SQL statement ? Think about it !

Monday, February 23, 2009

Using Julian Date in PeopleSoft

Julian Date at any given point in time is defined as a Time-Interval ( Day Count + Fraction of Day) between the given time and 1st Jan, 4713 BC Greenwich noon, BC, Julian proleptic calendar -4712. 1st Jan, 4713 BC Greenwich noon is considered to be day 0 and the count starts from there.

How to get Julian Day (Day Count)?
If I run queries like :
a) SELECT TO_CHAR(SYSDATE,'J') FROM DUAL;
The return value is: 2454886
b) SELECT TO_DATE(2454886,'J') FROM DUAL;
The return value is: 02/23/2009 ( Today's Date)
b) SELECT TO_CHAR(SYSDATE,'J')-TO_CHAR(SYSDATE-1,'J') FROM DUAL;
Return Value: 1

Now, since we know how to use Julian Date concept, let’s proceed with an example.

We need to get all the Employees who have been transferred today. (Normally we see such business requirement to produce similar reports by nightly batch process).

We'll make some assumptions here, which may vary as per business need:
a) Transfer is considered when there is a change in DEPTID or SETID_DEPT (in Job record)
b) JOB.ACTION='XFR'
c) JOB.EFFDT=SYSDATE or JOB.ACTION_DT=SYSDATE AND JOB.EFFDT < SYSDATE

One of the simplest solution for above requirement is :

SELECT A.EMPLID
, A.EMPL_RCD
, A.EFFDT
, A.EFFSEQ
FROM PS_JOB A
, PS_JOB J
WHERE A.EFFDT= (
SELECT MAX(A_ED.EFFDT)
FROM PS_JOB A_ED
WHERE A_ED.EMPLID=A.EMPLID
AND A_ED.EMPL_RCD=A.EMPL_RCD
AND A_ED.EFFDT<=%CurrentDateIn)
AND A.EFFSEQ=(
SELECT MAX(A_EF.EFFSEQ)
FROM PS_JOB A_EF
WHERE A_EF.EMPLID=A.EMPLID
AND A_EF.EMPL_RCD=A.EMPL_RCD
AND A_EF.EFFDT=A.EFFDT)
AND A.ACTION='XFR'
AND (A.EFFDT=%CurrentDateIn
OR (A.ACTION_DT=%CurrentDateIn
AND A.EFFDT<%CurrentDateIn)) AND A.EMPLID=J.EMPLID AND A.EMPL_RCD=J.EMPL_RCD AND (TO_CHAR(J.EFFDT,'J')||J.EFFSEQ) =
( SELECT MAX(TO_CHAR(J_ED.EFFDT,'J')||J_ED.EFFSEQ)
FROM PS_JOB J_ED
WHERE J.EMPLID = J_ED.EMPLID
AND J.EMPL_RCD = J_ED.EMPL_RCD
AND (TO_CHAR(J_ED.EFFDT,'J')||J_ED.EFFSEQ) < ( TO_CHAR(A.EFFDT,'J')||A.EFFSEQ))
AND (A.DEPTID<>J.DEPTID OR A.SETID_DEPT<>J.SETID_DEPT)
Here as you can see we first converted Effdt to Julian Date and then concatenated it with EffSeq. The resulting value was used as a boundry condition to get the most effective dated second last row.
Julian Date concept can be widely used in Effdt'd record to put a limiting criterion.

Sunday, February 22, 2009

%UpdateStats( ) and PLAN_TABLE

We all are well aware of the 'power' of Application Engine Trace. No matter its a simple or most difficult AE program, (most of the time) we put a Trace parameter to get one or more out of the followings :
  • Application Engine Step and SQL trace
  • Application Engine Statement Timings trace
  • PeopleCode Detail Timings trace
  • Database Optimizer trace
Recently, I was working on a custom AE program with over 100 sql actions. Temp Tables, Meta SQL, HINTS, %UpdateStats() and other standard approaches were followed to enhance the performance. During the course of fine-tuning, as I wanted to get most out the trace functionality, I put a trace value - TRACE 4351. To my surprise, the process went to no success. The error log read:
"SQL Error: ORA-00604: error occurred at recursive SQL level 1 ORA-12899: value too large for column "SYSADM"."PLAN_TABLE"."STATEMENT_ID" (actual: 31, maximum: 30) "

What went wrong ? After an hour of analysis my conclusion was:

a) Trace Value 4096( which I was using)—Initiates the database optimizer "explain", storing the results in the Explain Plan Table of the current database. I was on Oracle, so it was PLAN_TABLE.
b) When outputting to a table, PeopleSoft updates the trace rows as follows:
* EXPLAIN PLAN SET STATEMENT_ID: PeopleSoft updates the STATEMENT ID column:
EXPLAIN PLAN SET STATEMENT_ID = ApplId.Section.Step.Type FOR sqlstmt
* PLAN_TABLE’s REMARKS column: PeopleSoft updates the REMARKS column:
PLAN_TABLE's REMARKS column = 'ProcessInstance-RunControlId(QueryNo)'
queryno is a count of how many SQL statements have been traced up to a particular point.
c) In PLAN_TABLE, the field STATEMENT_ID is VARCHAR(30). One can see the table schema or refer Oracle Documentation.
d) PeopleSoft recommends - To output the trace to a table on Oracle, (i) a PLAN_TABLE table must exist, and (ii) the STATEMENT_ID must be of type VarChar2(254), instead of VarChar2(30).

Hence the Problem !

Sending Emails From SQR

My first encounter with SQR was very enriching. There was an existing SQR program which was generating a report based on some business rules. The assignment was to split single report to multiple based on DEPTID and to send out those reports to the concerned Department managers. Creating multiple reports was okay, but I wasn't sure how to email those reports to the correct recipient. I had an option to create a Job, with two processes- One will be SQR that will generate the output file and another will be App Engine that will send out the email with report as an attachment.

Though it was a good idea, somehow I wasn't convinced with it. Creating a new process just to send emails- I wasn't in favor of this. After few days of research, I came up with a procedure which could send emails from SQR. Below is sqr procedure for Windows NT :

!********************************************
!Procedure to send Email through SQR
!********************************************
Begin-PROCEDURE SEND-EMAIL

let $enter=chr(10)
let $EMAIL=


let $BODY = 'Hi ,'||$enter||' This is a test Mail . '
let $system_command ='psmail -TO"'||$EMAIL||'" -SUBJECT"Test Mail'||'" -BODY"'||$BODY||'"'

Call system using $system_command #call_status
if #call_status <> 0
show ' '
show 'Error Sending Mail'
show ''
show #call_status
show $system_command
show ' '
end-if

End-PROCEDURE ! SEND-EMAIL

!********************************************
!Procedure to send Email based on role
!********************************************
Begin-PROCEDURE SEND-EMAIL

let $enter=chr(10)
let $email_ids= ''

begin-select
Z.EMAILID

if $email_ids= ''
let $email_ids= &Z.EMAILID
else
let $email_ids= $email_ids ||';' ||&Z.EMAILID
end-if

FROM PS_ROLEUSER_VW Y, PSOPRDEFN Z
WHERE Y.ROLENAME = ''
AND Y.ROLEUSER = Z.OPRID
AND Z.EMAILID <> ' '
AND Z.ACCTLOCK=0

end-select

let $EMAIL=$email_ids
let $BODY = 'Hi ,'||$enter||' This is a test Mail . '
let $system_command ='psmail -TO"'||$EMAIL||'" -SUBJECT"Test Mail'||'" -BODY"'||$BODY||'"'

Call system using $system_command #call_status
if #call_status <> 0
show ' '
show 'Error Sending Mail'
show ''
show #call_status
show $system_command
show ' '
end-if

End-PROCEDURE ! EMAIL-Report

!*********************************************************
!Procedure to send Email with File as an attachment
!*********************************************************
Begin-PROCEDURE EMAIL-REPORT

! $outFileName- Complete File name (with Path)
! TestFile.txt- Alias Name for output file ( will be File Name in Email attachment )

let $BODY = 'Test Report'
let $system_command ='psmail -TO"'||
$emailid||
'" -SUBJECT" Test Attachment'||'" -BODY"'||$BODY||'" -FILE"'||$outFileName
||'" -ALIAS"TestFile_' ||'.txt'||'"'
Call system using $system_command #call_status


if #call_status <> 0
show ' '
show 'Error Sending Mail'
show ''
show #call_status
show $system_command
show ' '
end-if

End-PROCEDURE !EMAIL-REPORT
!*********************************************************

Saturday, February 21, 2009

PeopleSoft Can Be Fun...

Welcome to my Blog PeopleSoft Can Be Fun ! The title is inspired by one of my all-time favorite book- Physics Can Be Fun, By Ya Parelman. In this book, author talks about Tao-of-Physics in great deal taking reference to several classic scientific literatures. Likewise, this blog is an attempt to share with a larger community - my experience with PeopleSoft and the very little knowledge I possess. Happy Reading!