Using subqueries in MDX

Did you know that you can write subqueries in MDX? It might be handy if you have some complex dimension filters.

Lets take a basic sample

SELECT [Measures].[Internet Total Sales] ON 0,
[Date].[Calendar Year].[Calendar Year] ON 1
FROM [Adventure Works Internet Sales Model]

And the result will look like this

Subqueries 1

If you only want to get sales for the year of 2007 it can be achieved with a subquery.

SELECT [Measures].[Internet Total Sales] ON 0,
[Date].[Calendar Year].[Calendar Year] ON 1
FROM 
( 
	SELECT [Date].[Calendar Year].&[2007] ON 0
	FROM [Adventure Works Internet Sales Model]
)

Subqueries 2

Yes I know this can be achieved in simpler ways 🙂 But I had some problems figuring out better samples

Search for dimensions members in MDX using inStr()

Sometimes it might be handy to search dimensions members in MDX. The MDX below will show you an example on how to get all products that contains the name ‘Bike’

SELECT [Measures].[Sales Amount] on 0,
( 
  Filter( 
	[Product].[Product].[Product].ALLMEMBERS, 
	Instr( [Product].[Product].currentmember.Properties( 'Member_Caption' ), 'Bike' )  > 0  
  )
) on 1
from [Adventure Works]

And here is some other examples

List products that does not contain the word ‘Bike’

Instr( [Product].[Product].currentmember.Properties( 'Member_Caption' ), 'Bike' )  = 0  

List products that is named ‘Bike Wash – Dissolver’

Instr( [Product].[Product].currentmember.Properties( 'Member_Caption' ), 
'Bike Wash - Dissolver' )  = 1

Cheers,
Sindre

An alternative approach on how to get “ProcessAdd” to work on update / delete in your partitions

I am having quite a lot of problems when running a ProcessUpdate and then a ProcessIndex on a multidimensional cube. The ProcessIndex takes hours to run on a large cube.

So I wanted to test an alternative approach. I really wanted to use ProcessAdd on my cube. But I also know that my data will be added, changed and deleted. So how can I make this work?

First I wrote a simple MDX to see the internet sales.

select [Measures].[Internet Sales Amount] on 0
from [Adventure Works] 

And the result I got was $29,358,677.22

To simulate an update I will add a new row that changes the original row. The original row might be a sales of 10 dollars. So if this row should be updated to a sale of 7 dollars I will add a new identical row with -3 dollars. So in the cube it will look like a 7 dollar sale.

To test this I added a new row in the database

INSERT INTO factinternetsales 
SELECT productkey, 
       orderdatekey, 
       duedatekey, 
       shipdatekey, 
       customerkey, 
       promotionkey, 
       currencykey, 
       salesterritorykey, 
       salesordernumber, 
       2, 
       revisionnumber, 
       orderquantity, 
       unitprice, 
       extendedamount, 
       unitpricediscountpct, 
       discountamount, 
       productstandardcost, 
       totalproductcost, 
       -1000, 
       taxamt, 
       freight, 
       carriertrackingnumber, 
       customerponumber, 
       orderdate, 
       duedate, 
       shipdate 
FROM   factinternetsales 
WHERE  ( salesordernumber = 'SO43697' )   

Please note the sales amount of -1000 and the SalesOrderLineNumber that is set to 2. This will decrease my internet sales with 1000.

Then you must change your partition query so that this only returns your new row. I changed my partition query on “Internet_Sales_2011” to this

SELECT [dbo].[factinternetsales].[productkey], 
       [dbo].[factinternetsales].[orderdatekey], 
       [dbo].[factinternetsales].[duedatekey], 
       [dbo].[factinternetsales].[shipdatekey], 
       [dbo].[factinternetsales].[customerkey], 
       [dbo].[factinternetsales].[promotionkey], 
       [dbo].[factinternetsales].[currencykey], 
       [dbo].[factinternetsales].[salesterritorykey], 
       [dbo].[factinternetsales].[salesordernumber], 
       [dbo].[factinternetsales].[salesorderlinenumber], 
       [dbo].[factinternetsales].[revisionnumber], 
       [dbo].[factinternetsales].[orderquantity], 
       [dbo].[factinternetsales].[unitprice], 
       [dbo].[factinternetsales].[extendedamount], 
       [dbo].[factinternetsales].[unitpricediscountpct], 
       [dbo].[factinternetsales].[discountamount], 
       [dbo].[factinternetsales].[productstandardcost], 
       [dbo].[factinternetsales].[totalproductcost], 
       [dbo].[factinternetsales].[salesamount], 
       [dbo].[factinternetsales].[taxamt], 
       [dbo].[factinternetsales].[freight], 
       [dbo].[factinternetsales].[carriertrackingnumber], 
       [dbo].[factinternetsales].[customerponumber] 
FROM   [dbo].[factinternetsales] 
WHERE  salesordernumber = 'SO43697' 
       AND salesorderlinenumber = 2   

Then run a ProcessAdd using XMLA

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" xmlns:ddl400="http://schemas.microsoft.com/analysisservices/2012/engine/400" xmlns:ddl400_400="http://schemas.microsoft.com/analysisservices/2012/engine/400/400">
      <Object>
        <DatabaseID>Adventure Works</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales_2005</PartitionID>
      </Object>
      <Type>ProcessAdd</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

If you then run the MDX it will return $29,357,677.22.

So how can I make this work in my deployment routine?

In my database I can split my fact table in 2. So that I have one table (current) with data that changes. And one table (archive) with data that does not change. I will do the same in my cube. I will make a partition for data that changes (from the current table). This partition will be processed with ProcessAdd and only return new rows. The other partition will have a regular SQL that returns everything from the archive table. The Archive partition should only be processed when new data is added. Maybe find a way that the archive table only contains real rows? And not the rows for updating and deleting?

If you only have small amounts of data you can of course keep your current table in your DWH with original rows from your source. And then drop and re-create your current SSAS partition. Then run a ProcessAdd.

Any comments?

I tested this in SSAS multidimensional but I am quite sure it will work in SSAS Tabular too

Getting started with aggregations in SSAS multidimensional

In this post I will describe how to get started with aggregations in SSAS multidimensional. Since this is a rather complicated thing I will start with a really basic sample.

First I will modify my Adventure Works database. I will create a new aggregation design with no aggregations. This can be done by running the aggregation wizard and selecting “No aggregations”. Then save the aggregation design and assign it to the partitions.

It will look like something like this

Aggregations 1

Do a full process on the measure group

Then run a test query against this cube using one of the measures from that measure group

SELECT [Measures].[Internet Sales Amount] ON 0,
NON EMPTY [Product].[Subcategory].[Subcategory] ON 1
FROM [Adventure Works]
WHERE [Date].[Calendar].[Calendar Year].&[2007]

Subscube

As you can see here I get a query subcube event. I only get one of these events because this query is really simple. For more complex queries you get loads of them. In a typical production environment you will get loads of them before the aggregations are optimized.

My main goal now is to eliminate this subcube event by designing aggregations.

To do this I open the aggregations. Then chooses “Advanced view”. And then “AggregationDesign EMPTY”

Aggregations 2

Right click in the gray area in the middle and choose “New aggregation”

In the check boxes check the following. Product -> Subcategory and Date -> Calendar Year

Aggregations 3

Then save and do a full process

Re-run the previous query and watch profiler

Aggregations 4

Now you can see that my query did hit the aggregation.

If you choose to run the queries multiple times on each step make sure you clear the cache in between.

SSAS multidimensional: Are you sure your aggregations are okay?

A few days back I wanted to optimize a SSAS multidimensional cube. I was testing different queries to see if any of them used my aggregations. As I understand SSAS multidimensional provides results on different ways.

    1. From cache
    2. From aggregations
    3. Build the result from storage engine / formula engine

Number 1 is the fastest way and number 3 is the slowest. So it’s quite important for query performance that you have well considered aggregations. Please note that aggregations adds time to cube processing. When I saw that my queries did not hit any aggregations I started to Google around. And when reading some forum posts I started to suspect that my aggregations was unprocessed. I was running a query similar to this to check the size of my aggregations

SELECT * FROM SystemRestrictSchema($system.discover_partition_stat ,
DATABASE_NAME = 'Adventure Works' ,
CUBE_NAME = 'Adventure Works' ,
MEASURE_GROUP_NAME = 'Internet Sales' ,
PARTITION_NAME = 'Internet_Sales_2013')

Aggregations

This result looks to be okay. But the result from my cube was that almost every size was set to 0. How can I reproduce and identify this issue? I did read the documentation on processing options and figured out that “all flexible relationships we be dropped when performing a process update on a dimension”.

To figure out what was going on I tried to change a value in a dimension with a flexible relations and then run a process update on the dimension. Then followed by a full process of my test partition.

Aggregations 2

Now you can see that most of the size is set to 0. How can I get them back now? Process partion / measure group / cube with “Process index”. And the your aggregations should be fine. If you run “Process default” on your dimension or full process on the cube the aggregations should be fine.

I will certainly check my processing routines after this 🙂 Does anyone have any comments on this?

What is the difference between “non empty” and “nonempty” in MDX?

In this post I will demonstrate the difference between “non empty” and “nonempty” in MDX. Not everyone knows the difference between these functions.

Lets write a simple MDX

SELECT { 
	[Measures].[Reseller Sales Amount], 
	[Measures].[Discount Amount] 
 } ON COLUMNS, 
 [Geography].[City].[City] ON ROWS 
 FROM [Adventure Works]

And the result will look like this

nonempty 1

As you can see there are cities with no sales and no discount. If I add the “NONEMPTY” function I will remove all rows where both measures are NULL.

SELECT { 
	[Measures].[Reseller Sales Amount], 
	[Measures].[Discount Amount] 
 } ON COLUMNS, 
 NON EMPTY [Geography].[City].[City] ON ROWS 
 FROM [Adventure Works] 
 

nonempty 2

But to remove all rows with NULL in discount amount I must use the “nonempty”. To make this work I will have to create a new measure that prints NULL if the value are 0 🙂

WITH MEMBER [Measures].[Discount Amount2]
 AS
 IIF([Measures].[Discount Amount] = 0, NULL, [Measures].[Discount Amount])
 
 SELECT { 
	[Measures].[Reseller Sales Amount], 
	[Measures].[Discount Amount2] 
 } ON COLUMNS, 
 NONEMPTY ([Geography].[City].[City], [Measures].[Discount Amount2]) ON ROWS 
 FROM [Adventure Works] 

nonempty 3

The “NONEMPTY” and “NON EMPTY” can be replaced by a filter. Like this

WITH MEMBER [Measures].[Discount Amount2]
 AS
 IIF([Measures].[Discount Amount] = 0, NULL, [Measures].[Discount Amount])
 
 SELECT { 
	[Measures].[Reseller Sales Amount], 
	[Measures].[Discount Amount2] 
 } ON COLUMNS, 
 FILTER ([Geography].[City].[City], NOT IsEmpty([Measures].[Discount Amount2])) ON ROWS 
 FROM [Adventure Works]

Identify products with no sale in MDX

To identify products with no sales in MDX we can use the isBlank function and a filter.

Here is a short sample

SELECT [Measures].[Internet Sales Amount] ON 0,
FILTER(
	[Product].[Product].[Product], 
	ISEMPTY([Measures].[Internet Sales Amount])
) ON 1
FROM [Adventure Works]

And if you want to have products with sales you can use the NON EMPTY function.

SELECT [Measures].[Internet Sales Amount] ON 0,
NON EMPTY [Product].[Product].[Product] ON 1
FROM [Adventure Works]

Use “descendants” in MDX

In the sample today I will show you how to use the “descendants” function in MDX. In this sample I want to count how many days and months I have sold stuff. And then calculate the average sales per month and date.

WITH MEMBER DaysWithSales AS
  Count ( Descendants ( [Ship Date].[Calendar].CurrentMember, [Ship Date].[Calendar].[Date] ) )

MEMBER MonthWithSales AS
  Count ( Descendants ( [Ship Date].[Calendar].CurrentMember, [Ship Date].[Calendar].[Month] ) )

MEMBER [Avg Sales Per Month] AS
  divide ( [Measures].[Internet Sales Amount], [Measures].[MonthWithSales] )

MEMBER [Avg Sales Per Day] AS
  divide ( [Measures].[Internet Sales Amount], [Measures].[DaysWithSales] )

SELECT NON EMPTY [Ship Date].[Calendar].[Calendar Year].&[2013].Children ON 1,
{ 
   [Measures].[Internet Sales Amount], 
   [Measures].[MonthWithSales], 
   [Measures].[DaysWithSales], 
   [Measures].[Avg Sales Per Month], 
   [Measures].[Avg Sales Per Day] 
} ON 0
FROM [Adventure Works]

And the result will look like this

Descendants

“Descendants” have lots of different parameters. Please check them out on MSDN 🙂

Use “except” in MDX to remove dimension members

In this post I will explain how the “except” function works in MDX.

I have a basic query that looks like this

SELECT [Measures].[Internet Sales Amount] ON 0,
NON EMPTY [Product].[Subcategory].[Subcategory] ON 1
FROM [Adventure Works]

And the output will look like this

except 1

But the customer wants me to remove bikes from the resultset. To do this I extend my query to use the “filter” function.

	
SELECT [Measures].[Internet Sales Amount] ON 0,
NON EMPTY Except (
  [Product].[Subcategory].[Subcategory],
  { [Product].[Subcategory].&[2], [Product].[Subcategory].&[1], [Product].[Subcategory].&[3] }
) ON 1
FROM [Adventure Works]

except 2

Lets sort the resultset using the order function

SELECT [Measures].[Internet Sales Amount] ON 0,
NON EMPTY Order (
  Except (
    [Product].[Subcategory].[Subcategory],
    { [Product].[Subcategory].&[2], [Product].[Subcategory].&[1], [Product].[Subcategory].&[3] }
  ),
  [Measures].[Internet Sales Amount],
  DESC
) ON 1
FROM [Adventure Works]

except 3

Nice!!

Get first date with sales and last date with sales in MDX

To get the first date with sales and last date with sales we can use the “tail” and the “head” function in MDX. These functions are quite handy if you want to have a default date in your reports.

Here is the code:

WITH MEMBER [Last Date With Sales] AS
  Tail (
    Filter ( [Date].[Calendar].[Date].Members, [Measures].[Internet Sales Amount]  NULL ),
    1
  ).Item ( 0 ) .MEMBER_CAPTION
MEMBER [First Date With Sales] AS
  Head (
    Filter ( [Date].[Calendar].[Date].Members, [Measures].[Internet Sales Amount]  NULL ),
    1
  ).Item ( 0 ) .MEMBER_CAPTION
SELECT
{ [Measures].[First Date With Sales], [Measures].[Last Date With Sales] } ON 0
FROM [Adventure Works]

And the result will look like this

Head and tail