Wednesday, June 4, 2014

Common T-SQL Tasks when building Data Warehouse

I’m often scrambling to find code that I use during the beginning of data warehouse design to accomplish tasks that are easily done in T-SQL. Almost all deal with data conversion/transformation.  Typically I’m creating Date or Unknown Surrogate Keys, Replace values in columns, and having to document this at the same time. 
 
Four SQL Tasks I do all the time (but commonly forget the syntax)...

1.      Converting dates to the YYYYMMDD format. I do this task constantly and recommend this naming convention for date dimension surrogate keys.. This example assumes DateSK will be the name of the Date Surrogate Key.

 SELECT CONVERT(VARCHAR(8), GETDATE(), 112) as DateSK

2.      Remove or replacing a string with a blank (or something else). In this example, I will replace the instances of XYZ with nothing (which is the same as removing).

                SELECT REPLACE (ColumnName,’XYZ’,’’)

3.      Pull all the column names as well as Database Name, Table Name, and data type to prepare for data dictionaries when building data warehouses.

SELECT
       TABLE_CATALOG as [Database],
       Table_name  as [Table Name] ,
       column_name  as [Column Name],
       DATA_TYPE as [Data Type]
                        FROM  information_schema.columns
                        WHERE  table_name = 'Resource_Table$ '
                        ORDER BY ordinal_position
 
4.      Sometimes you simply need to insert data into a dimension with a specific values on a column that specified as INDENTITY. This is common when you have dimension but now you need to add UNKNOWN entries where the Surrogate Key as -1 or some value that represent unknowns.
 
            SET IDENTITY_INSERT Products ON
            INSERT into Products (ProductsSK, ProductName)
            VALUES(-1, 'Hammer')
            SET IDENTITY_INSERT Products OFF

Those are the four tasks I commonly use as but often forget the syntax.

No comments: