I am having quite a lot of problems when running a ProcessUpdate and then a ProcessIndex on a multidimensional cube. The ProcessIndex takes hours to run on a large cube.
So I wanted to test an alternative approach. I really wanted to use ProcessAdd on my cube. But I also know that my data will be added, changed and deleted. So how can I make this work?
First I wrote a simple MDX to see the internet sales.
select [Measures].[Internet Sales Amount] on 0 from [Adventure Works]
And the result I got was $29,358,677.22
To simulate an update I will add a new row that changes the original row. The original row might be a sales of 10 dollars. So if this row should be updated to a sale of 7 dollars I will add a new identical row with -3 dollars. So in the cube it will look like a 7 dollar sale.
To test this I added a new row in the database
INSERT INTO factinternetsales SELECT productkey, orderdatekey, duedatekey, shipdatekey, customerkey, promotionkey, currencykey, salesterritorykey, salesordernumber, 2, revisionnumber, orderquantity, unitprice, extendedamount, unitpricediscountpct, discountamount, productstandardcost, totalproductcost, -1000, taxamt, freight, carriertrackingnumber, customerponumber, orderdate, duedate, shipdate FROM factinternetsales WHERE ( salesordernumber = 'SO43697' )
Please note the sales amount of -1000 and the SalesOrderLineNumber that is set to 2. This will decrease my internet sales with 1000.
Then you must change your partition query so that this only returns your new row. I changed my partition query on “Internet_Sales_2011” to this
SELECT [dbo].[factinternetsales].[productkey], [dbo].[factinternetsales].[orderdatekey], [dbo].[factinternetsales].[duedatekey], [dbo].[factinternetsales].[shipdatekey], [dbo].[factinternetsales].[customerkey], [dbo].[factinternetsales].[promotionkey], [dbo].[factinternetsales].[currencykey], [dbo].[factinternetsales].[salesterritorykey], [dbo].[factinternetsales].[salesordernumber], [dbo].[factinternetsales].[salesorderlinenumber], [dbo].[factinternetsales].[revisionnumber], [dbo].[factinternetsales].[orderquantity], [dbo].[factinternetsales].[unitprice], [dbo].[factinternetsales].[extendedamount], [dbo].[factinternetsales].[unitpricediscountpct], [dbo].[factinternetsales].[discountamount], [dbo].[factinternetsales].[productstandardcost], [dbo].[factinternetsales].[totalproductcost], [dbo].[factinternetsales].[salesamount], [dbo].[factinternetsales].[taxamt], [dbo].[factinternetsales].[freight], [dbo].[factinternetsales].[carriertrackingnumber], [dbo].[factinternetsales].[customerponumber] FROM [dbo].[factinternetsales] WHERE salesordernumber = 'SO43697' AND salesorderlinenumber = 2
Then run a ProcessAdd using XMLA
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine"> <Parallel> <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" xmlns:ddl400="http://schemas.microsoft.com/analysisservices/2012/engine/400" xmlns:ddl400_400="http://schemas.microsoft.com/analysisservices/2012/engine/400/400"> <Object> <DatabaseID>Adventure Works</DatabaseID> <CubeID>Adventure Works</CubeID> <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID> <PartitionID>Internet_Sales_2005</PartitionID> </Object> <Type>ProcessAdd</Type> <WriteBackTableCreation>UseExisting</WriteBackTableCreation> </Process> </Parallel> </Batch>
If you then run the MDX it will return $29,357,677.22.
So how can I make this work in my deployment routine?
In my database I can split my fact table in 2. So that I have one table (current) with data that changes. And one table (archive) with data that does not change. I will do the same in my cube. I will make a partition for data that changes (from the current table). This partition will be processed with ProcessAdd and only return new rows. The other partition will have a regular SQL that returns everything from the archive table. The Archive partition should only be processed when new data is added. Maybe find a way that the archive table only contains real rows? And not the rows for updating and deleting?
If you only have small amounts of data you can of course keep your current table in your DWH with original rows from your source. And then drop and re-create your current SSAS partition. Then run a ProcessAdd.
Any comments?
I tested this in SSAS multidimensional but I am quite sure it will work in SSAS Tabular too