Once something takes more than an hour to figure out, it is always a good candidate for a new blog post. This one definitely took longer than an hour.
I am supporting a data mart with compensation data for an international company. There is a currency dimension table that gets populated from a view in an operational database. The currency key is an IDENTITY column and gets populated with sequential values, starting at 1 with an increment of 1.
The reports have regional specialties and a report template typically only works with one of the currencies. Because the list of currencies never changed, some report developer decided to hard-code the currency keys into the stored procedures that produce the data sets for the compensation reports. For example, I found joins like this one:
LEFT JOIN [CMP_vw_Exchange_Rate] cc ON aa.[Salary_Date_Key] = cc.Date_Key AND cc.[Currency_Key] = 5 -- for Euro
When the data mart was migrated from SQL Server 2008 to SQL Server 2012, some reports started to show compensation figures that were widely inaccurate. I eventually found out that the wrong exchange rates were applied, but why? Nothing had changed in the code or the data. When I dug in further, I noticed that the dimension keys were different.
As you can see, the AUD used to have key 2 in 2008, now it has key 1. Why is that?
To find an explanation, I looked at the SSIS package that populates the dimension table. The package does three separate things with the table:
- It deletes the old values (TRUNCATE)
- It inserts the “NA” row explicitly with the key -1, by wrapping the insert statement in a SET IDENTITY_INSERT ON statement
- It inserts the results of the currency view from the operational system
I was able to show that these 3 steps produce different results between SQL 2008 and SQL 2012 with the following script:
create table ##test ( currency_key int identity(1,1), currency_code char(10) ); set identity_insert ##test on insert ##test (currency_key, currency_code) values (-1, 'NA') set identity_insert ##test off insert into ##test values ('AUD'); insert into ##test values ('CAD'); select * from ##test; drop table ##test;
In SQL 2008, AUD has key 2, in SQL 2012 it has key 1. The reason is that the IDENTITY_INSERT leaves a gap in SQL 2008 when it probably shouldn’t. This was “fixed” in 2012.
Because legacy systems can contain countless traps like this one that are impossible to detect and fix, there is a substantial risk that other things are broken. I got my hopes up about a broad safety measure when I read about Trace Flag 272 here and here. When I applied it to my development instance of SQL 2012, my keys did not revert back to the SQL 2008 values, unfortunately. Either I did not make the configuration change correctly, or it has no influence on this particular issue.
As a crude workaround I am now inserting a dummy value after the IDENTITY_INSERT OFF operation, like so:
insert ##test values ('dummy') delete from ##test where currency_code = 'dummy'
This increments the seed by 1 and the remaining rows have the same keys as in the 2008 instance. However, other dimensions may have the same issue that may not surface for months or years. Fingers crossed. Thanks, Microsoft.