Displaying
hierarchical data has been made easy with Reporting Services 2008. It
usually involves creating a self-referenced table with ParentID associated
to the primary key (e.g., ManagerID to EmployeeID) and then
setting the recursive parent in Reporting Services. To format the
hierarchical data, a common approach is to indent descendants and add
visibility toggles to the ancestors:
This seems to be straightforward enough; however, if you try to render a large dataset with thousands of rows, you will notice that the entire dataset is rendered on one page and causes performance issues.
The performance is poor because the automatic paging based on the report’s interactive size is disabled when the visibility toggle is turned on. This limitation makes it difficult to efficiently display a complex hierarchy. In this article, we will present an approach leveraging the HIERARCHYID data type to store the hierarchical data and performs custom paging in SQL server.
SELECT CustomerID,
This seems to be straightforward enough; however, if you try to render a large dataset with thousands of rows, you will notice that the entire dataset is rendered on one page and causes performance issues.
The performance is poor because the automatic paging based on the report’s interactive size is disabled when the visibility toggle is turned on. This limitation makes it difficult to efficiently display a complex hierarchy. In this article, we will present an approach leveraging the HIERARCHYID data type to store the hierarchical data and performs custom paging in SQL server.
Prestage Data
To
simulate complex hierarchical structural, we will use the Customer table from
AdventureWorks database:
SELECT CustomerID,
FirstName +
' '
+
COALESCE(MiddleName,
'')
+
' ' + LastName
+
' Inc.' AS CustomerName,
ABS(CAST(NEWID()
AS BINARY(6))
% 29484) AS ParentCustomerID
INTO Customer$
FROM AdventureWorks.Sales.vIndividualCustomer
GROUP BY
CustomerID,
FirstName,
MiddleName,
LastName
Note
that we added a new field, ParentCustomerID populated by a random
number between 0 and 29483 (max. of CustomerID) to serve as our
self-referencing ID. Since AdventureWorks’ CustomerID starts at
11,000, anyParentCustomerID below 11,000 are not valid. To fix
this, we will add a root node to the table and set invalid ParentCustomerID to
the newly added root node:
INSERT Customer$ VALUES (-1, 'root',
NULL)
UPDATE Customer$
SET ParentCustomerID = -1 -- root node
WHERE ParentCustomerID < 11000 -- 11000 is MIN CustomerID
Model Hierarchical Data
To
model our data using the HIERARCHYID data type, we need to populate the field
with nodes mimicking a tree structure:
This
can be done by recursive CTE:
;WITH P(CustomerID, CustomerName, ParentCustomerID, Node)
AS
(
SELECT CustomerID,
CustomerName,
ParentCustomerID,
HIERARCHYID::GetRoot() AS Node
FROM Customer$
WHERE ParentCustomerID IS NULL
UNION ALL
SELECT C.CustomerID,
C.CustomerName,
C.ParentCustomerID,
CAST(P.Node.ToString() + CAST(ROW_NUMBER() OVER (PARTITION BY C.ParentCustomerID ORDER B
C.CustomerID) AS VARCHAR(30)) + '/' AS HIERARCHYID)
FROM Customer$ AS C
INNER JOIN P ON C.ParentCustomerID = P.CustomerID
)
SELECT *
INTO Customer
FROM P
To
optimize the table for efficiency, add a primary key and indicies with the
following code listing:
-- Add PK
ALTER TABLE Customer
ALTER COLUMN CustomerID INT NOT NULL
ALTER TABLE Customer
ADD CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerID ASC)
-- Add computed field for BFS index
ALTER TABLE Customer
ADD HierarchyLevel AS Node.GetLevel()
-- Add breadth-first index
CREATE UNIQUE INDEX IX_Customer_BFS
ON Customer(HierarchyLevel, Node)
-- Add depth-first index
CREATE UNIQUE INDEX IX_SiebelAccount_DFS
ON Customer(Node)
Display Hierarichal Data
With
the hierarchal data populated, we can start creating a report. For
the demonstration purpose, we will simply use count of descendants as the
aggregate in the report.
-- Add
a stored procedure to SQL Server for retrieving customer hierarchies:
CREATE PROCEDURE dbo.GetCustomers_1
AS
BEGIN
SELECT CustomerID,
CustomerName,
CASE
WHEN ParentCustomerID = -1 THEN NULL
ELSE ParentCustomerID
END
AS ParentCustomerID
FROM Customer
WHERE HierarchyLevel > 0
END
-- Create
a new report. Add a new Data Source pointing to our database Organization,
then add a new dataset Customers using the stored procedure dbo.GetCustomers_1.
-- In
the report design view, drag a matrix to the canvas.
-- For
the simplicity, delete [Column Group] from the column groups.
-- Drag CustomerID to
the Rows quadrant of the tablix to create a row group on CustomerID.
Change the row group name to RowGroup_CustomerID. Change the row
group text box association from CustomerID to CustomerName.
-- Right
click on [RowGroup_CustomerID], then click on Group Properties:
- Under the Sorting tab, sort the group by [CustomerName].
- Under Advanced tab, use set ParentCustomerID as recursive parent.
-- Add
left paddings to textbox [CustomerName]: =CStr(5 + (Level()*12)) +
"pt"
-- In
the Data quadrant of the tablix, set the expression to =Count(Fields!CustomerID.Value,
"RowGroup_CustomerID", Recursive).
Run
the report and note that the report is broken up into 420 pages due to the
default interactive size of 8.5in x 11in.
Now,
let’s add toggles to the ancestors to create drill-down effect. Right
click on [RowGroup_CustomerID], click on Group Properties.
Under the Visibility tab, set “When the report is initially run” to Hide, check
“Display can be toggled by this report item”, and select CustomerName from the
dropdown menu. Run the report again.
It
should be obvious that there is a problem. Although the visibility toggle
displays as expected, the pagination disappears. In Visual Studio,
the report renders our testing dataset up to Chad C Jai Inc., while when deployed,
the report renders the entire dataset on one page. This can cause a
performance issue when working with complex hierarchy. Specifically, in
our testing environment, the average refresh time for clicking on the
visibility toggle is around 410 ms.
Server-Side Pagination
The
performance issue we see in the previous section can actually be resolved by
server-side paging. In this section, we will create a new stored
procedure, dbo.GetCustomers_2, with the following code snippets to
calculate page number for each record in SQL Server and the return that to
Reporting Services.
Because
of the nature of hierarchical data, it makes sense to group all descendants on
the same page as their ancestors. To that end, we need to know top-level
ancestors’ page number first:
-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::
-- Prestage the returning dataset
-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::
SELECT CustomerID,
CustomerName,
CASE
WHEN ParentCustomerID = -1 THEN NULL
ELSE ParentCustomerID
END AS ParentCustomerID,
Node,
NULL AS Page
INTO #R
FROM Customer
WHERE HierarchyLevel > 0
-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::
-- Set page number for top-level nodes
-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::
;WITH I AS
(
SELECT CustomerID,
(ROW_NUMBER() OVER (ORDER BY CustomerName))/100
+ 1 AS Page
FROM #R
WHERE ParentCustomerID IS NULL
)
UPDATE #R
SET Page = I.Page
FROM #R INNER JOIN I ON #R.CustomerID = I.CustomerID
For
simplicity, we hardcoded page size to be 100 in this example, but the value can
be parameterized in a real world application. After the top-level
ancestors’ page numbers are determined, we can now propagate the page numbers
to their descendants. To keep this process clear, we will first set aside
the top-level ancestors' page numbers:
SELECT CustomerID AS TopAncestorId,
Node AS TopAncestorNode,
Page
INTO #T
FROM #R
WHERE Page IS NOT NULL
Then
we will use HIERARCHYID’s GetAncestor() method to JOIN
descendants to their top-level ancestors and set the page numbers:
UPDATE R
SET Page = T.Page
FROM #R R
INNER JOIN #T T
ON R.Node.GetAncestor(R.Node.GetLevel() - 1)
= TopAncestorNode
WHERE R.Page IS NULL
Finally,
return the temp table #R back to the Reporting Services
SELECT * FROM #R
Configure Report with Server-Side Pagination
After
creating the new stored procedure to calculate page numbers, we will modify our
report to add a new row group outside of[RowGroup_CustomerID] for
pagination. First, swap out the stored procedure for the dataset to usedbo.GetCustomers_2 and
refresh fields. Then right-click on [RowGroup_CustomerID], hover Add
Group, then click on Parent Group. In the Group by dropdown, select Page
and then click on OK. Right click on the newly added row group[Page] and
click on Group Properties. Change the default group name Page to
RowGroup_Page. Under the Page Breaks tab, check Between each instance of
a group. Click on OK to exit out the Group Properties panel. You
will notice that there is a new column added to the tablix for [RowGroup_Page].
Since we only need the group for paging not for display, right click on the
column, click on Delete Columns, and choose Delete columns only.
Run the report again and you will notice that the dataset is nicely
grouped into 70 pages. In our testing environment, the average refresh
time for clicking on the visibility toggle is reduced to around 5 ms.
This
article presents an approach to perform server-side pagination for displaying
hierarchal data in Reporting Services. By applying the technique, the
report performance can be greatly improved. In our testing environment,
for example, the server-side pagination results in 98% improvement in
efficiency. We hope this article can benefit developers who find the need
to display complex hierarchical data in Reporting Services.
About Author:
Sameer Kothari is enthusiast .net developer who works as associate consultant with Systems Plus Pvt. Ltd. He in free time reads and write on various web technologies. He can be contacted at: sameer.kothari@spluspl.com
Sameer Kothari is enthusiast .net developer who works as associate consultant with Systems Plus Pvt. Ltd. He in free time reads and write on various web technologies. He can be contacted at: sameer.kothari@spluspl.com
No comments:
Post a Comment