Saturday, May 17, 2014

SSIS Lookups with Integers vs. non-integers


One of the most common SSIS Data Flow objects that a SSIS developer will use is the Lookup Data Flow task. When I’m developing for your typical data warehouse project where we need to populate the fact tables by joining the primary surrogate keys to foreign surrogate keys, or primary keys to foreign keys, we typically (or should always) use integers as surrogate or primarily key. Now this may be obviously since primary and surrogate keys often are identity type columns and the required data type is integer. However, there is nothing stopping you from using columns that are not integer data types as lookup relationships. In the image below, you can see the typical mapping relationship for SKU (In this example SKU was a Varchar data type)



Naturally, you would then create another data flow task that will sent matches columns to a slowing changing dimensions (SCD) data flow task, and if you’re using the SCD data flow task provided by SQL 2012, then your likely going to use the same column as your business key for the SCD wizard. This would subsequently create the OLE DB Command that uses the selected business key as the joining column. This is where the problem starts because you are now joining non-integer types in SSIS. This will become a major problem when you start to use lots of data as the updates will be extremely slow. The image below points to the Where statement that should only use integers



Now I’m not saying using non-integer data types won’t work, I’m, saying that it will work SLOWLY compared to integers. For example, just updating 65K rows had taking about 6 minutes compared to less than 10 seconds in my labs. Yes, it’s really a night and day difference!
Basically, avoid using alphanumeric/strings (or essentially any non-integer data type) for lookups relationships when possible. This is particularly true when using OLE DB Command (for your SCD operations) to update tables that match when doing the lookup.  You'll find DBAs that will say it makes no difference. However, when you'll joining fact tables that could contain millions of rows, using integers makes the big difference.

Sunday, May 4, 2014

Fun with SSRS Rectangle objects.


One of the most misunderstood features of SSRS is the use of the Rectangles. At first glance, it may appear that rectangles are simply shapes you can add to your reports in the same way you would add circles, lines, and other shapes. That can’t be further from the truth. In fact, rectangles are one of the more powerful objects in SSRS. The main purpose of rectangles is not only to display a box, but instead group all other SSRS objects (tablix, matrix, graphs, text box, etc.) in to a defined area on the report body to allow for “grouped” configurations. Here is an example of a use a rectangles. Let’s say we have three SSRS objects - a chart, a tablix, and a text box.
When we run the report, the spacing changes so that the first tablix is displayed and unused space is added unnecessarily. Without using rectangles as containers, we are at the mercy of where the report rendering process displays the object.
 

 If I added the chart and the second tablix in the rectangle (i.e. green box) in the design mode I force the SSRS to display the tablix at a certain location.


When we run it the second time, the two sections are not generating the excessive space between to two charts but instead keeping the two objects within the rectangle.



As we see in these examples, If we place these on the report body, and display the reports, SSRS will display the objects right to left in processing the display, and once that item is complete, it will show the next object. This can produce unexpected object locations.  Again, to force SSRS to display report objects together, we include the report objects inside the rectangle. This keeps all these items grouped together.
 
Another example of rectangle object usefulness is to allow you to move around sections of report objects when editing. This can be really a time saver when you have spent a lot of time lining up report objects in different sections through spacing and you need to move that entire grouping of report objects without have to drag individual sections and repositioning the reports. If you always start with grouping you report layout inside rectangles, you an easily move these reports object simply by moving the rectangles. That approach saves me hours across report projects.
I always felt if Microsoft would have named the rectangle object the “Item Grouping” object that would make it clear to the report author that this object is not simply to draw a rectangle, but as a container grouping feature to define how objects are displayed on the report.

Saturday, May 3, 2014

New Microsoft Certification

I just recently completed my final exam for the Microsoft Business Intelligence certification. Even though I had the option to do the upgrade path (taking two exams instead of five), I elected to take the five exam route simply because it would allow me to re-study some areas in T-SQL that have been enhanced in SQL 2012.

Some of the new T-SQL aggregate and programmability commands. Also, I was able to earn this nice MSCE logo for business purposes.



It took a lot of hard work and studying to obtain this certification. If fact, it was clearly the most difficult MS exam process/track that I've ever taken.