Monday, 13 January 2014

Display Complex Hierarchical Data with Server-Side Pagination SSRS

Displaying hierarchical data has been made easy with Reporting Services 2008.  It usually involves creating a self-referenced table with ParentID associated to the primary key (e.g., ManagerID to EmployeeID) and then setting the recursive parent in Reporting Services.  To format the hierarchical data, a common approach is to indent descendants and add visibility toggles to the ancestors:
















This seems to be straightforward enough; however, if you try to render a large dataset with thousands of rows, you will notice that the entire dataset is rendered on one page and causes performance issues.

The performance is poor because the automatic paging based on the report’s interactive size is disabled when the visibility toggle is turned on.   This limitation makes it difficult to efficiently display a complex hierarchy.  In this article, we will present an approach leveraging the HIERARCHYID data type to store the hierarchical data and performs custom paging in SQL server.


Prestage Data

To simulate complex hierarchical structural, we will use the Customer table from AdventureWorks database:



















SELECT
  CustomerID,
        FirstName + ' '
            + COALESCE(MiddleName, '')
            + ' ' + LastName
            + ' Inc.' AS CustomerName,
        ABS(CAST(NEWID() AS BINARY(6)) % 29484) AS ParentCustomerID
INTO    Customer$
FROM    AdventureWorks.Sales.vIndividualCustomer
GROUP BY CustomerID,
        FirstName,
        MiddleName,
        LastName

Note that we added a new field, ParentCustomerID populated by a random number between 0 and 29483 (max. of CustomerID)  to serve as our self-referencing ID.  Since AdventureWorks’ CustomerID starts at 11,000, anyParentCustomerID below 11,000  are not valid.  To fix this, we will add a root node to the table and set invalid ParentCustomerID to the newly added root node:


INSERT  Customer$ VALUES (-1, 'root', NULL)

UPDATE  Customer$
SET     ParentCustomerID = -1    -- root node
WHERE   ParentCustomerID < 11000 -- 11000 is MIN CustomerID


Model Hierarchical Data

To model our data using the HIERARCHYID data type, we need to populate the field with nodes mimicking a tree structure:


















This can be done by recursive CTE:



;WITH P(CustomerID, CustomerName, ParentCustomerID, Node)
AS
(
    SELECT  CustomerID,
            CustomerName,
            ParentCustomerID,
            HIERARCHYID::GetRoot() AS Node
    FROM    Customer$
    WHERE   ParentCustomerID IS NULL
    
    UNION ALL
 
    SELECT  C.CustomerID,
            C.CustomerName,
            C.ParentCustomerID,
            CAST(P.Node.ToString() + CAST(ROW_NUMBER() OVER (PARTITION BY C.ParentCustomerID ORDER B
 C.CustomerID) AS VARCHAR(30)) + '/' AS HIERARCHYID)
    FROM    Customer$ AS C
            INNER JOIN P ON C.ParentCustomerID = P.CustomerID
)
SELECT  *
INTO    Customer
FROM    P

To optimize the table for efficiency, add a primary key and indicies with the following code listing:

-- Add PK
ALTER TABLE Customer
ALTER COLUMN CustomerID INT NOT NULL
 
ALTER TABLE Customer
ADD CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerID ASC)
 
-- Add computed field for BFS index
ALTER TABLE Customer
ADD HierarchyLevel AS Node.GetLevel()
 
-- Add breadth-first index
CREATE UNIQUE INDEX IX_Customer_BFS
ON Customer(HierarchyLevel, Node)
 
-- Add depth-first index
CREATE UNIQUE INDEX IX_SiebelAccount_DFS
ON Customer(Node)


Display Hierarichal Data

With the hierarchal data populated, we can start creating a report.  For the demonstration purpose, we will simply use count of descendants as the aggregate in the report.

-- Add a stored procedure to SQL Server for retrieving customer hierarchies:
CREATE PROCEDURE dbo.GetCustomers_1
AS
BEGIN
    SELECT  CustomerID,
            CustomerName,
            CASE
                WHEN ParentCustomerID = -1 THEN NULL
                ELSE ParentCustomerID
            END AS ParentCustomerID
    FROM    Customer
    WHERE   HierarchyLevel > 0
END

-- Create a new report.  Add a new Data Source pointing to our database Organization, then add a new dataset Customers using the stored procedure dbo.GetCustomers_1.
-- In the report design view, drag a matrix to the canvas.
-- For the simplicity, delete [Column Group] from the column groups.
-- Drag CustomerID to the Rows quadrant of the tablix to create a row group on CustomerID. Change the row group name to RowGroup_CustomerID.  Change the row group text box association from CustomerID to CustomerName.
















-- Right click on [RowGroup_CustomerID], then click on Group Properties:
  • Under the Sorting tab, sort the group by [CustomerName].
  • Under Advanced tab, use set ParentCustomerID as recursive parent.
-- Add left paddings to textbox [CustomerName]: =CStr(5 + (Level()*12)) + "pt"
-- In the Data quadrant of the tablix, set the expression to =Count(Fields!CustomerID.Value, "RowGroup_CustomerID", Recursive).

Run the report and note that the report is broken up into 420 pages due to the default interactive size of 8.5in x 11in.
















Now, let’s add toggles to the ancestors to create drill-down effect.  Right click on [RowGroup_CustomerID], click on Group Properties.  Under the Visibility tab, set “When the report is initially run” to Hide, check “Display can be toggled by this report item”, and select CustomerName from the dropdown menu.  Run the report again.

It should be obvious that there is a problem.  Although the visibility toggle displays as expected, the pagination disappears.   In Visual Studio, the report renders our testing dataset up to Chad C Jai Inc., while when deployed, the report renders the entire dataset on one page.  This can cause a performance issue when working with complex hierarchy.  Specifically, in our testing environment, the average refresh time for clicking on the visibility toggle is around 410 ms.

Server-Side Pagination

The performance issue we see in the previous section can actually be resolved by server-side paging.   In this section, we will create a new stored procedure, dbo.GetCustomers_2, with the following code snippets to calculate page number for each record in SQL Server and the return that to Reporting Services.

Because of the nature of hierarchical data, it makes sense to group all descendants on the same page as their ancestors.  To that end, we need to know top-level ancestors’ page number first:

  



-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::
--  Prestage the returning dataset
-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::       
SELECT  CustomerID,
        CustomerName,
        CASE
            WHEN ParentCustomerID = -1 THEN NULL
            ELSE ParentCustomerID
        END AS ParentCustomerID,
        Node,
        NULL AS Page
INTO    #R
FROM    Customer
WHERE   HierarchyLevel > 0
  
-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::
--  Set page number for top-level nodes
-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::    
;WITH I AS
(
    SELECT  CustomerID,
            (ROW_NUMBER() OVER (ORDER BY CustomerName))/100
                + 1 AS Page
    FROM    #R
    WHERE   ParentCustomerID IS NULL
)
UPDATE  #R
SET     Page = I.Page
FROM    #R INNER JOIN I ON #R.CustomerID = I.CustomerID
For simplicity, we hardcoded page size to be 100 in this example, but the value can be parameterized in a real world application.  After the top-level ancestors’ page numbers are determined, we can now propagate the page numbers to their descendants.  To keep this process clear, we will first set aside the top-level ancestors' page numbers: 

SELECT  CustomerID AS TopAncestorId,
        Node AS TopAncestorNode,
        Page
INTO    #T
FROM    #R
WHERE   Page IS NOT NULL

Then we will use HIERARCHYID’s GetAncestor() method to JOIN descendants to their top-level ancestors and set the page numbers: 

UPDATE  R
SET     Page = T.Page
FROM    #R R
        INNER JOIN #T T
            ON R.Node.GetAncestor(R.Node.GetLevel() - 1)
            = TopAncestorNode
WHERE   R.Page IS NULL

Finally, return the temp table #R back to the Reporting Services

SELECT * FROM #R

Configure Report with Server-Side Pagination

After creating the new stored procedure to calculate page numbers, we will modify our report to add a new row group outside of[RowGroup_CustomerID] for pagination.  First, swap out the stored procedure for the dataset to usedbo.GetCustomers_2 and refresh fields.  Then right-click on [RowGroup_CustomerID], hover Add Group, then click on Parent Group.  In the Group by dropdown, select Page and then click on OK.  Right click on the newly added row group[Page] and click on Group Properties.  Change the default group name Page to RowGroup_Page.  Under the Page Breaks tab, check Between each instance of a group.  Click on OK to exit out the Group Properties panel.  You will notice that there is a new column added to the tablix for [RowGroup_Page].  Since we only need the group for paging not for display, right click on the column, click on Delete Columns, and choose Delete columns only.   Run the report again and you will notice that the dataset is nicely grouped into 70 pages.  In our testing environment, the average refresh time for clicking on the visibility toggle is reduced to around 5 ms.

















This article presents an approach to perform server-side pagination for displaying hierarchal data in Reporting Services.  By applying the technique, the report performance can be greatly improved.  In our testing environment, for example, the server-side pagination results in 98% improvement in efficiency.  We hope this article can benefit developers who find the need to display complex hierarchical data in Reporting Services.

About Author:
Sameer Kothari is enthusiast .net developer who works as associate consultant with Systems Plus Pvt. Ltd. He in free time reads and write on various web technologies. He can be contacted at: sameer.kothari@spluspl.com

No comments:

Post a Comment