Thursday, February 26, 2009

Top-N Queries and Analytic Functions

Noted Oracle expert Tom Kyte has written wonderful column on getting first n-rows and n-to-m rows using ROWNUM. ( References: 1,2 )

Personally, I'm big fan of Top-N query and found this very useful in PeopleSoft. Below are my top 3 picks:

a) In Streaming/ Parallel processing of AE programs: Top N-Query approach can provide breakthrough performance upgrade in case of Conversion Programs. Consider a situation where we've to upload hundreds of thousands of data rows using a CI based conversion App Engine program. Depending on volume of data, and mapping-rules and validations this single process will run for hours or even for days.
A simple solution to this complex problem would be launching multiple instances of this conversion program simultaneously for splitted data-set. We can split the whole data-set into chucks of 1000 rows ( using Top-N query) and assign a flag value with it . Now, with each flag value one conversion program will be scheduled using ProcessRequest class API. This approach will fasten the entire conversion process substantially.

b) In Web Service Pagination: Pagination is a common word in web-service world. In real-life scenario it works like this : For a particular input from end-user, say there are 105 rows of result data. Now, end user is requesting for 10 rows per page to be displayed. In this case- 100 rows will be splitted in 11 pages. 10 pages with 10 rows each and 11th page with 5 rows. User will be given an option to pick any one page out of these 11 pages. When user will select 5th page to be displayed, data rows from 51 till 60 will be displayed on the page. Likewise, user can change the request to see 20 rows per page and in this case he'll have option to see 1 out of 6 pages at a time.
Once you go through the recommend article ( mentioned above) you know how to get the pagination done. Though it would be certainly difficult to implement pagination using this method if your response message is a container having complex rowset based messages. ROWNUM approach will work well if the response message is simple rowset based and pagination criteria is not complex. ( The level from where chunking should be done- also determines the complexity of Pagination).

c) In Reports: Unlike XMLP whose RTF-template provides numerous built-in features, most of the other reporting tools just represents the fed data. ROWNUM approach can be very useful in creating composite-reports when we're using SQR, App Engine, PSQuery. Think about a report wherein we need to show 'List of the top 5 experienced employees from each departments and the average experience of employees in that department.'
We can always populate one staging table with the help of top-n query which will further feed the report along with other desired data-set.

No comments:

Post a Comment