Function App in Azure Data Factory

One thing that I really like with Data Factory is the possibility to execute a Function App. A Function App is really flexible and can be used to extend the available functionality in Data Factory a lot. You could, for example, process your SSAS Tabular models, do advanced file handling or send emails.

Below are some easy steps on how to execute a Function App within Data Factory.

Search for “Function App” in the search box

Function App 0

Continue reading

Using GitHub in Azure Data Factory

Azure Data Factory integrates very well with both GitHub and Azure DevOps. If you have multiple developers working in the same factory, you can even merge changes from different branches. Try doing that on an SSIS package!

Okay, so how do we start?

Setting up integration with GitHub can be done when you create your new Data Factory or you can set it up after it is created.

In this sample, I will show how to integrate with GitHub after the Data Factory is created.

Continue reading

Getting started with Azure Data Factory

In this post, I will show you how to get started with Azure Data Factory. We will use the sample data from the AdventureWorksLT database. Please read this post on how to get access to it.

First, we will start by creating a table. I am creating this table in my AdventureWorksLT database for simplicity


CREATE TABLE dbo.FactSales
(
FactSalesId int NOT NULL IDENTITY (1, 1),
OrderDate date NULL,
DueDate date NULL,
ShipDate date NULL,
OrderQty smallint NULL,
UnitPrice money NULL,
ProductId int NULL,
ProductName nvarchar(50) NULL
) ON [PRIMARY]

Then we will open a browser and navigate to portal.azure.com. In the search box write “Data factories” and click on it

Create Data Factory 1

Continue reading

Import data from AWS S3 to AWS Aurora

This post describes an easy way of importing a CSV-file saved in Amazon S3 to a table in Amazon Aurora.

Technologies used in this post:

  • Amazon Aurora is a cloud-based relation database which is compatible with both MySQL and PostgreSQL
  • Amazon S3 is a cloud-based object storage

To get started we need to download a sample CSV file. I this case I will use this dataset: https://www.stats.govt.nz/assets/Uploads/Births-by-statistical-area-2-and-area-unit-for-comparison.csv

First, you need to upload this file to an S3 Bucket

Continue reading

Print all column names in SSMS (SSMS trick)

Every once in a while I come across some “hidden” and nice features in SSMS. One of them is really nice if you want to print all columns names of a table.

So instead of writing all the column names by hand you could follow these easy steps.

– Open SSMS
– Expand you database
– Expand your table

expand-ssms

And then drag the folder called “Columns” to your query window

all-columns

Using filters in DAX

In this post, I will show some samples on different ways to use filtering in DAX. If you write ineffective filters you will most likely always run into performance issues. We will also see the difference between using SUMMARIZE() and ROW() in queries that only return one row.

Let’s pretend one of your customers wants to know the sum of sales for the product “AWC Logo Cap”. And the minimum sales.

The benchmarking is done by running the different queries in DAX Studio multiple times. I have cleared the cache between the executions.

Sample query 1.

DEFINE
    MEASURE 'FactInternetSales'[MyFilteredNumber] =
        CALCULATE (
            SUM ( FactInternetSales[UnitPrice] ),
            FILTER ( DimProduct, [EnglishProductName] = "AWC Logo Cap" )
        )
    MEASURE 'FactInternetSales'[MyFilteredNumberMin] =
        CALCULATE (
            MIN ( FactInternetSales[UnitPrice] ),
            FILTER ( DimProduct, [EnglishProductName] = "AWC Logo Cap" )
        )
EVALUATE
 (
    FILTER (
        SUMMARIZE (
            FactInternetSales,
            DimProduct[EnglishProductName],
            "Filtered result", 'FactInternetSales'[MyFilteredNumber],
            "Filtered result min", 'FactInternetSales'[MyFilteredNumberMin]
        ),
        'FactInternetSales'[MyFilteredNumber] > 0
    )
)

This query produces the correct answer. But is does not perform very well.

It took 32 MS.

Continue reading