
Simple Performance Test on Azure SQL Data Warehouse

Published 2020-02-29 07:19

Question:

We are working to port existing applications to Azure SQL Data Warehouse. In order to better understand the performance/workload management characteristics/capabilities of Azure SQL Data Warehouse, I have set up what I consider a very simple test.

I loaded a static table, our business's calendar, containing about 20k rows (i.e., very small for a parallel data warehouse). I then generated all the possible queries of that single table using a pattern like:

SELECT current_timestamp, COUNT(1)
FROM ( SELECT C1, ..., Cn, COUNT(1) AS _A_ROW_COUNT
       FROM schema.view_to_table
       GROUP BY C1, ..., Cn ) DER

Givens:

  • DWU set to 1000.
  • 35 concurrent threads launched.
  • All threads running in small_rc. (I.e., using 1 slot per query.)
  • Using sqlcmd with an initial connection, then committing after every SELECT.
  • Running on a non-Azure VM connecting via ExpressRoute. Chose the outer SELECT COUNT(1) construct to ensure network traffic is minimal.
  • A heap table provided better results than the default clustered columnstore (as expected). (Still need to test with a clustered index.)
  • Table is distributed by the primary key column.

Background/Biases - I have worked with many other MPP databases.

Results

  • Queries are running in 10-20 seconds which is much longer than I would expect for such simple work.
  • I slept briefly between launching each new thread. The initial queries ran much faster, and the average run times deteriorated dramatically as the thread count climbed to 35.

How do I understand what bottleneck exists?

Of course I will re-run the tests at other DWU settings to see whether it affects an exclusively small_rc workload.
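The service level can be changed between test runs with a statement like the following (the database name here is a placeholder):

```sql
-- Scale the warehouse to a different DWU setting between runs
ALTER DATABASE MyDwDatabase
MODIFY (SERVICE_OBJECTIVE = 'DW2000');
```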

Appendix - Example Query Plan

<?xml version="1.0" encoding="utf-8"?>
<dsql_query number_nodes="10" number_distributions="60" number_distributions_per_node="6">
  <sql>SELECT current_timestamp,COUNT(1) FROM ( SELECT GREGORIAN_DATE, WM_MONTH, MON_MULT, FRI_MULT , COUNT(1) AS _A_ROW_COUNT FROM AR_WM_VM.CALENDAR_DAY GROUP BY GREGORIAN_DATE, WM_MONTH, MON_MULT, FRI_MULT) DER</sql>
  <dsql_operations total_cost="0.260568" total_number_operations="8">
    <dsql_operation operation_type="RND_ID">
      <identifier>TEMP_ID_21523</identifier>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="AllDistributions" />
      <sql_operations>
        <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_21523] ([col] DATE ) WITH(DATA_COMPRESSION=PAGE);</sql_operation>
      </sql_operations>
    </dsql_operation>
    <dsql_operation operation_type="SHUFFLE_MOVE">
      <operation_cost cost="0.258648" accumulative_cost="0.258648" average_rowsize="3" output_rows="2155.4" GroupNumber="76" />
      <source_statement>SELECT [T1_1].[col] AS [col]
FROM   (SELECT dateadd(dd, CAST ((364) AS INT), [T2_1].[calendar_date]) AS [col]
        FROM   [db_ARdev1].[AR_CORE_DIM_TABLES].[calendar_dim] AS T2_1) AS T1_1</source_statement>
      <destination_table>[TEMP_ID_21523]</destination_table>
      <shuffle_columns>col;</shuffle_columns>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="Control" />
      <sql_operations>
        <sql_operation type="statement">CREATE TABLE [tempdb].[QTables].[QTable_3ff26b5253004eec9d9ca50492bab1e2] ([col] BIGINT ) WITH(DATA_COMPRESSION=PAGE);</sql_operation>
      </sql_operations>
    </dsql_operation>
    <dsql_operation operation_type="PARTITION_MOVE">
      <operation_cost cost="0.00192" accumulative_cost="0.260568" average_rowsize="8" output_rows="1" GroupNumber="93" />
      <location distribution="AllDistributions" />
      <source_statement>SELECT [T1_1].[col] AS [col]
FROM   (SELECT   COUNT_BIG(CAST ((1) AS INT)) AS [col]
        FROM     (SELECT   0 AS [col]
                  FROM     [tempdb].[dbo].[TEMP_ID_21523] AS T3_1
                           INNER JOIN
                           (SELECT CASE
                                    WHEN ([T4_1].[wm_week_day_nbr] = CAST ((3) AS SMALLINT)) THEN CAST ((1) AS INT)
                                    ELSE CAST ((0) AS INT)
                                   END AS [col],
                                   CASE
                                    WHEN ([T4_1].[wm_week_day_nbr] = CAST ((7) AS SMALLINT)) THEN CAST ((1) AS INT)
                                    ELSE CAST ((0) AS INT)
                                   END AS [col1],
                                   [T4_1].[calendar_date] AS [calendar_date],
                                   [T4_1].[fiscal_month_nbr] AS [fiscal_month_nbr]
                            FROM   [db_ARdev1].[AR_CORE_DIM_TABLES].[calendar_dim] AS T4_1) AS T3_2
                           ON ([T3_2].[calendar_date] = [T3_1].[col])
                  GROUP BY [T3_2].[calendar_date], [T3_2].[fiscal_month_nbr], [T3_2].[col], [T3_2].[col1]) AS T2_1
        GROUP BY [T2_1].[col]) AS T1_1</source_statement>
      <destination>Control</destination>
      <destination_table>[QTable_3ff26b5253004eec9d9ca50492bab1e2]</destination_table>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="AllDistributions" />
      <sql_operations>
        <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_21523]</sql_operation>
      </sql_operations>
    </dsql_operation>
    <dsql_operation operation_type="RETURN">
      <location distribution="Control" />
      <select>SELECT [T1_1].[col1] AS [col],
       [T1_1].[col] AS [col1]
FROM   (SELECT CONVERT (INT, [T2_1].[col], 0) AS [col],
               isnull(CONVERT (DATETIME, N'2016-10-03 13:04:34.203', 0), CONVERT (DATETIME, N'2016-10-03 13:04:34.203', 0)) AS [col1]
        FROM   (SELECT ISNULL([T3_1].[col], CONVERT (BIGINT, 0, 0)) AS [col]
                FROM   (SELECT SUM([T4_1].[col]) AS [col]
                        FROM   [tempdb].[QTables].[QTable_3ff26b5253004eec9d9ca50492bab1e2] AS T4_1) AS T3_1) AS T2_1) AS T1_1</select>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="Control" />
      <sql_operations>
        <sql_operation type="statement">DROP TABLE [tempdb].[QTables].[QTable_3ff26b5253004eec9d9ca50492bab1e2]</sql_operation>
      </sql_operations>
    </dsql_operation>
  </dsql_operations>
</dsql_query>

Answer 1:

At DWU1000 you get 32 max concurrent queries and 40 concurrency slots, so with 35 threads some of your queries are going to have to queue.

What indexing and distribution choices have you made? This table is small so it sounds like a better candidate for a clustered index and not clustered columnstore (the default). Also make sure you have created your statistics.
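As a sketch of that suggestion, a small dimension like this could be rebuilt with a clustered index via CTAS, with statistics created explicitly on the grouping columns (table and statistics names here are illustrative):

```sql
-- Rebuild the small dimension with a clustered index instead of
-- the default clustered columnstore (illustrative names)
CREATE TABLE AR_CORE_DIM_TABLES.calendar_dim_ci
WITH
(
    DISTRIBUTION = HASH(calendar_date),
    CLUSTERED INDEX (calendar_date)
)
AS SELECT * FROM AR_CORE_DIM_TABLES.calendar_dim;

-- Azure SQL DW does not create statistics automatically,
-- so create them on the columns used for joins and grouping
CREATE STATISTICS st_calendar_date
    ON AR_CORE_DIM_TABLES.calendar_dim_ci (calendar_date);
CREATE STATISTICS st_fiscal_month
    ON AR_CORE_DIM_TABLES.calendar_dim_ci (fiscal_month_nbr);
```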

Where are you calling sqlcmd from? E.g. an Azure VM, so it's "closer" to the DW, or your laptop, in which case you may be waiting on network round trips.

Review the concurrency DMV: sys.dm_pdw_exec_requests
Review the waits DMV: sys.dm_pdw_waits
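For example, these two queries show active requests with their resource class, and what each request is currently waiting on (such as a concurrency slot):

```sql
-- Queued and running requests with their resource class
SELECT r.request_id, r.status, r.resource_class,
       r.submit_time, r.start_time, r.total_elapsed_time
FROM sys.dm_pdw_exec_requests AS r
WHERE r.status NOT IN ('Completed', 'Failed', 'Cancelled')
ORDER BY r.submit_time;

-- What each request is waiting on (e.g. concurrency slots)
SELECT w.request_id, w.type, w.state, w.object_name
FROM sys.dm_pdw_waits AS w
ORDER BY w.request_time;
```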

This recent answer looks useful too.

I have annotated your sample EXPLAIN plan. Turn on line numbers in SSMS, or view it in something like Sublime Text, for best effect:

  • Line 3 is the query being analyzed.
  • Line 4 lists the total number of operations or steps in the plan as 8. Each operation is held in a dsql_operation element within the XML.
  • Line 5 starts operation 1, RND_ID, or RandomIdOperation. This operation simply creates a unique name for temporary objects used in the query plan. The identifier is TEMP_ID_21523.
  • Line 8 starts operation 2, ON, or OnOperation. This performs an action on a database or object. This particular step creates a temp table [TEMP_ID_21523] on all nodes as specified in line 9. The DDL to create the temp table on all nodes is on line 11. This temp table only has one column, called 'col' of datatype DATE.
  • Line 14 is operation 3, a Data Movement Service (DMS) Operation called SHUFFLE_MOVE, or ShuffleMoveOperation. SHUFFLE_MOVE redistributes a distributed table.
  • Line 16 gives the statement used in the SHUFFLE_MOVE. It's moving data from a calculated column from table [AR_CORE_DIM_TABLES].[calendar_dim] into the temp table [TEMP_ID_21523], which we know exists on all nodes.
  • Line 22 starts the next operation 4, another ON, or OnOperation. This operation creates another temp table on the control node, with one BIGINT column. The DDL for this table is provided on line 25.
  • Line 28 starts operation 5, a PARTITION_MOVE, or PartitionMoveOperation. This DMS operation moves data from a distributed table to a single table on the Control node. This operation is used for aggregation operations on the Control node. This particular step moves data from temp table [TEMP_ID_21523] which exists on all nodes to destination temp table [QTable_3ff2...] which is on the control node.
  • Lines 31 to 49 list the SQL used to do this.
  • Line 53 starts operation 6, another ON, or OnOperation. This step drops the temp table [TEMP_ID_21523] which exists on all nodes.
  • Line 59 starts operation 7 of 8, a RETURN, or ReturnOperation. This operation, which occurs on the control node, sends query results from the control node to the user who submitted the query. The SQL returned is shown in lines 61-67.
  • Line 69 starts the last operation 8 of 8, another ON, or OnOperation. This particular step drops the temp table [QTable_3ff2...] which exists on the control node.

For your query, the PARTITION_MOVE and SHUFFLE_MOVE steps are the most likely causes of the performance issue; improving performance would involve removing or improving them.
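You can confirm which step dominates the runtime with the request-steps DMV (the request id below is a placeholder; get the real one from sys.dm_pdw_exec_requests):

```sql
-- Per-step timings for a given request show whether the
-- SHUFFLE_MOVE or PARTITION_MOVE step dominates the runtime
SELECT step_index, operation_type, location_type,
       status, total_elapsed_time, row_count
FROM sys.dm_pdw_request_steps
WHERE request_id = 'QID12345'   -- placeholder request id
ORDER BY step_index;
```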

To go any further I would need to know the DDL for the table [AR_CORE_DIM_TABLES].[calendar_dim] and the view [AR_WM_VM].[CALENDAR_DAY] so I can work out the distribution and whether any calculated columns are being used.

This annotation is based on a similar one in the APS help file sections on EXPLAIN plans and Understanding Query Plans where some of the text is copied from. I have adapted it for your plan.



Tags: azure-sqldw