Friday, June 17, 2016

Whats New in SSIS 2016 - Expose SSIS Package Data Externally as a View

Introduction

SQL 2016 has brought in lots of new and exciting features on the plate. One of the pretty cool feature that was added to SSIS in SQL 2016 is the ability to expose the data directly as a SQLServer database view object which can be accessed using T-SQL queries.

This is made possible by the introduction of two new components
1. A new linked server provider SSISOLEDB which is capable of exposing the SSIS data to SQLServer database
2. A new data flow component called Data Streaming Destination which will stream the data to be consumed through the linked server based on the above provider

Once you've installed SQL 2016 you can connect to the instance using management studio and when you expand the Server Objects - Linked Servers - Providers you can see this new provider listed as per below



Similarly you can launch SSDT (SQLServer Data Tools) 2015 and start an Integration Services project and create a sample package with a single data flow task. Inside data flow task tab you can see the new destination component available


Now that we have seen the two components lets see how they can be used together for exposing SSIS data as a SQLServer database object.

Illustration

Consider the scenario where we have data coming from different sources which needs to be exposed to  our database as an object.
For the sake of this article lets consider the below scenario
1. Customer details coming from a SQL database
2. Customet Satisfaction score data coming from an Excel spreadsheet

The requirement is to merge the data from the above sources to be accessed within third database most often a datawarehouse for analytical and reporting needs.
For this kind of requirement the obvious way to approach is to create a SSIS package which can extract the required data from the various source and do the merge operation based on the related columns.
Such a package for the above case would look like this


The data flow tasks will look like below


The sources involved here are SQL database which has the Customer details as per below

CREATE TABLE OrderDetails
(
OrderID int IDENTITY(1,1),
CustomerName varchar(100),
CustomerEMail varchar(100),
OrderDesc varchar(100),
OrderPrice Decimal(10,2),
OrderDate datetime
)
INSERT OrderDetails (CustomerName,CustomerEMail,OrderDesc,OrderPrice,OrderDate)
VALUES ('Customer 1','abc@xyz.com','Order ABC',130,'20151104'),
('Customer 2','kytrtetet@treby.com','Order PQR',2150.50,'20151203'),
('Customer 5','887rtert767@hakii.co.hk','Order for DEF and Co',1785,'20160112'),
('Customer 7','hjgjh@kaz.co.in','M/S Bhavya Comp Ltd',890,'20160120'),
('Customer 9','ewrewrr@oiuiiuo.co.in','XYZ',1125,'20160205'),
('Customer 1','abc@xyz.com','DEF',3400,'20160322'),
('Customer 9','ewrewrr@oiuiiuo.co.in','MNU',215,'20160312'),
('Customer 14','ytrytr@terry.co.in','PUK',345,'20160415'),
('Kirby Stevens','kirby.stevens@jackoss.co.eu','WKP',2300,'20160420'),
('Kerney Michelle','kerney@hkrky.co.ae','VER',485,'20160512'),
('Customer 1','abc@xyz.com','RTU',1320,'20160514'),
('Mark Bowerman','mark.2134@kayko.com','BJY',1130,'20160603'),
('Customer 9','ewrewrr@oiuiiuo.co.in','SAR',765,'20160610')

And the query used is as below to get the count and total value of orders per customer

SELECT CustomerName,
CustomerEMail,
COUNT(OrderDesc) AS OrderCount,
SUM(OrderPrice) AS TotalOrderPrice
FROM OrderDetails
GROUP BY CustomerName,
CustomerEMail
ORDER BY CustomerEMail

Source 2 is an excel spreadsheet with customer satisfaction data as below


And for this illustration we use the above query to get the average customer satisfaction score and  response count

SELECT email,
AVG(satscore) AS AvgSatScore,
COUNT(*) AS responseCount
FROM `Sheet1$`
GROUP BY email
ORDER BY email

From these two sources of data would be merged based on the common columns between the two resultsets as shown below



Once this is done we would get the required resultset. Now inorder to stream the resultset we will use Data Streaming Destination component and include the required columns to be accessed through the exposed view.


Data Streaming Destination will include the required columns as shown above
Once this is done we need to deploy the package to the Integration Services Catalog. 
The package will get deployed to the Integration Services Server as shown below


Once this is done we need to stream the package data as a view within a database in the associated instance.
This can be done using data feed publishing wizard which comes with the SQL 2016 installation. 


We can launch data feed publishing wizard and it will come up with the below window


Click Next and it will ask you to select the server and package to be published

Once selected it will also give you the option to modify the default values set for the connection managers and parameters in the package which I'm not changing for this example.



Clicking Next will bring it to Publish settings screen where we will specify the view name through which we need to expose the package data. 


One thing to note here is that Use 32Bit Runtime is set to True as one of the sources involved here is an Excel spreadsheet and the ACE driver installed is of 32 bit version in the server. In case you've the 64 bit version installed then you don't need to tweak this setting.
On clicking Next it goes to the validation screen and validates the various steps involved


Click on Next and finally click Finish to publish the package as a view


Once publishing is done you can connect to the server instance through SQL Management Studio and go to the specified database and expand views to see the view you published



Right click and Select data from the view and it will execute the package on the background and will provide you with the below result



You can confirm this by expanding Integration Services Catalog and finding the package within your deployed project and then right clicking and choosing the All Executions report which will show the log for successful execution of the package when view gets referred in the query



Conclusion

As shown by the illustration above you can use the new data feed publishing wizard available in SQL 2016 for exposing a SSIS package resultset as a view object within an associated relational database. This is a very useful feature which can be used for publishing a resultset consisting of data extracted from a variety of sources.




No comments:

Post a Comment