Saturday, August 5, 2017

SSIS Tips: Handling UTF 8 Based File Data

Introduction

Quite a few times there were cases where we had to transfer CSV data containing non English based characters to SQLServer database. Though it seems straightforward there are few things we need to consider while working with CSV containing non English characters. This post explains the steps that shall be followed while transferring UTF 8 based data from a CSV onto a SQLServer based relational database system

Scenario

Consider the case where we've a CSV file with the below data


As you see from the above the file has a column which stores non English based data. The requirement was to get this data transferred to a SQLServer database for doing further manipulations with them.
Given this requirement our straightforward way would be to use a data flow task with a flat file source and an OLEDB destination to connect to the SQLServer db.
Lets see what happens when we try to use the standard DFT task for this case
Given below is the screenshot of the standard data flow task using non-unicode datatype I tried to use


As you see from the above it complains of data conversion issue between unicode and non-unicode data types. 
Now lets try using an unicode datatype and see what happens


Now we get an error stating that it cant convert between the two code pages. This clearly implies we cant use the default OLEDB destination in the case of UTF - 8 based file data.
Now lets try using the next possible option i.e. with ADO .Net destination. 
The data flow for this case is as below


The column will be of unicode datatype (nchar, nvarchar) in the database.
Once executed you can check the table to see that data is getting populated correctly


As seen from the above results data got transferred correctly this time to the SQLServer table.

Conclusion

By virtue of the above illustration, its clear that for transferring UTF-8 based file data to SQLServer using SSIS we need to do the below

1. Use UTF-8 based codepage (65001) in the Flat File connection manager
2. Use ADO .Net destination task inside data flow task
3. Use a Unicode based datatype for the table column like nchar, nvarchar


References





Tuesday, June 20, 2017

SQL 2017 Tips: Avoiding Gaps in IDENTITY, SEQUENCE Based Column Values

Scenario


One of the common conversations I've come across in a typical work day is as below

Emp X: I just found that the IDENTITY column values for couple of tables have jumped to a new range
Emp Y: Can you check if there was any server restart?
Emp X: Oh yes! I see it now... There was a planned server restart at night.

The reason for this behavior is explicit.

Server restart causes identity based columns to skip values. This is due to the fact that identity columns pre-allocates (caches) next set of values and on server restart the cache gets flushed. Once server comes back it resumes at the next value from the maximum value in the cache which creates gaps in identity sequence. This behavior started from 2012 version onwards.

This blog discusses a new enhancement available in SQL 2017 CTP which can be used to solve the above issue

Solution

This behavior was explained to be by design and proposed solution was to use trace flag 272 which will allow the functionality to continue as per earlier version. The MS connect link which discusses this is given below


For setting this we had to go to SQLServer configuration manager and set trace flag 272 in the startup parameters as discussed here

http://sqlblog.com/blogs/kalen_delaney/archive/2014/06/17/lost-identity.aspx

In SQL 2017, we have a good news. Microsoft has included this as a new option in SCOPED CONFIGURATION options for the database. The option is called IDENTITY_CACHE and is included in the documentation below


This has two advantages over the previous trace flag based method

1. We can set it from management studio itself rather than setting it as a startup option from SQL Configuration Manager.
2. The most important advantage is that we can scope it to database level rather than setting it at the instance level. This makes it possible for us to have other databases co-existing in the same server instance with new behavior enabled i.e. with gaps.

The complete syntax for the feature is as below

ALTER DATABASE SCOPED CONFIGURATION SET IDENTITY_CACHE=OFF ;

Once you set it at a database level, the server restart doesn't cause any gaps in the IDENTITY objects in the database, which is in line with the behavior before SQL 2012 version.

This small but very useful enhancement will certainly help the developers to set IDENTITY value behavior in a database as per their requirement.

Friday, June 16, 2017

SQL 2017 Tips: Creating Table On The Fly Onto A Specific FileGroup

Introduction

One of the frequently used and the most popular features in SQLServer is the ability to create a table on the fly based on a query. This has saved my day on numerous occasions. Only thing I was missing in this was the ability to organize the table into its own filegroup of my choice

Why are filegroups necessary?

There are various advantages of organizing tables into separate filegroups, the main among them being  the below


  • For large tables it makes sense to move them to a separate filegroup
  • In case of tables where data size is almost same as non-clustered index size, we shall consider keeping table in one file group and index on another
  • Organizing tables based on portfolio so that tables can be easily migrated / archived if required.
  • Maintaining some data which is read only along with read write data in same db. In this case we can keep these tables in a separate read only file group

The Solution


The SQL 2017 CTP2 release includes a solution to the above limitation.

The SELECT ... INTO syntax for creating the table is now extended to include the filegroup information.
The modified syntax would be like below

SELECT column1, column2,.. columnN
INTO NewTableName
ON FileGroupName 
FROM Table1 t1
JOIN Table2 t2
ON condition...
...


If you see the code above, the highlighted part designates the filegroup to which this new table has to be associated. Only caveat is that the filegroup must pre-exist in the database where the above query is being executed otherwise SQLServer will throw an error

Illustration

Lets see how we can create a new table on the fly and associate it to a newly created filegroup on the database
Steps are as outlined below

1. Adding a new filegroup to existing database Test2017

ALTER DATABASE [Test2017] ADD FILEGROUP [SupplimentaryDataFG] 
GO

 You can check the db properties to ensure this is created as per the below




2. Associating a new file to the newly created filegroup

ALTER DATABASE [Test2017]   
ADD FILE   
(  
    NAME = test1dat3,  
    FILENAME = 'D:\SupplData.ndf',  
    SIZE = 5MB,  
    MAXSIZE = 100MB,  
    FILEGROWTH = 5MB  
)
TO FILEGROUP [SupplimentaryDataFG]; 

The file will get created as per below



3. Creating a test table on the fly on the newly created filegroup

SELECT * INTO TestTable
ON SupplimentaryDataFG
FROM 
(VALUES
('Test',1),
('Foo',2),
('Foobar',3)

)t(Description,Code)


Checking the table properties to see the filegroup info


Another way to check this is to use the script as option in SSMS. Script as CREATE option for the table will give the below code

USE [Test2017]
GO

/****** Object:  Table [dbo].[TestTable]    Script Date: 6/16/2017 7:09:03 AM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[TestTable](
[Description] [varchar](6) NOT NULL,
[Code] [int] NOT NULL
) ON [SupplimentaryDataFG]
GO

The highlighted part clearly shows the table to be in the new filegroup we created and associated it to

Conclusion

From the above illustration its pretty evident that we can create a table on the fly and associate it to any filegroup we have pre-created in the database. 
This is a very useful small enhancement which has been introduced in SQL 2017 which helps us to organize the table data into multiple filegroups simultaneously at the time of their creation on the fly using queries. 

References

Wednesday, May 31, 2017

Graph Tables in SQL 2017

Graph based tables is one of the most exciting features that has been included in the latest release of SQLServer. With graph tables its much easier to create data model to deal with complicated scenarios like multi linked hierarchies, many to many relationships, etc.
My latest article is on what graph tables are, how they're implemented and includes an illustration using a popular use case.
You can find the complete article here

https://social.technet.microsoft.com/wiki/contents/articles/37993.t-sql-graph-based-tables-in-sql-2017.aspx

Feel free to read and share and let me know your comments

Wednesday, May 3, 2017

SSRS Tips: Creating 2012/2014 BI Templates Using VS 2015

Introduction

With the release of VS 2015 shell, its now possible to utilize the BI templates available init for creating SSRS reports, SSIS packages and SSAS projects compatible with server versions 2012 till 2016 from the same unified environment.
This quick post discusses on two things that you need to consider while creating projects using VS 2015 shell which are targeted to deploy at a lower version server.

Scenario

I had recently used VS 2015 shell for development of SSRS reports and SSIS packages for one of my projects. Our production server was still on 2014 version. As per official documentation of Microsoft VS 2015 shell can be used for creating packages, reports etc which can be deployed onto 2012, 2014 or 2016 versions. The link below has the details


As per the above I used SSDT 2015 for creating SSRS reports. After creating the reports I tested locally and once I was happy I decided to deploy the reports to the server which was in 2014 version.
For this I went to Project  Properties and tweaked the below properties


My expectation was that by setting this property I would be able to deploy the reports to 2014 server. But to my surprise I got the below message when I tried to deploy the rdl file


The definition of this report is not valid or supported by this version of Reporting Services. The report definition may have been created with a later version of Reporting Services, or contain content that is not well-formed or not valid based on Reporting Services schemas


Solution

I didn't understand why this was happening and did some analysis on other settings to see if there's anything I could find. Fortunately i was able to spot the below property




This comes under Tools options and determines the Server Version to which report is to be deployed. The default value set was 13.0 which corresponds to SQL 2016. Once I changed it to 12.0 and tried deploying, the report got successfully deployed to the server. 

Something to note next time when you do deployment of SSRS,SSIS or SSAS project from VS 2015 shell to make sure deployment is successful. 

Conclusion

Hope this quick tip will save you sometime in avoiding the above specified issue while deploying BI projects using VS 2015 shell templates