PASS Summit 2016 – Day 2 Keynote

2016-10-27-07-54-38

PASS Summit Day 2 started with a keynote by Dr. DeWitt from MIT. I am personally so excited to hear this keynote. I am sitting at the bloggers’ table again and will write highlights of today’s keynote here in this post. Please refresh this post to get updated information.

8:18 – Grant Fritchey stepped up on stage

8:21 – Over 36K members of PASS Virtual Chapters

8:23 – Denise McInerney stepped up on stage

8:26 – New Logo for SQL PASS

2016-10-27-08-26-04

8:27 – Demo of the new PASS website shown; it will be launching next year (SOON!)

8:29 – Summit 2017, Oct 31 to Nov 3, announced and live now

8:30 – Dr. DeWitt from MIT stepped up on stage, talking about Data Warehousing in the Cloud

2016-10-27-08-31-33

8:32 – Why DW in Cloud?

2016-10-27-08-34-04

8:34 – Why? Reduce Time to insights

8:35 – Why? Dynamically adjust capacity

8:36 – Scalable DW Fundamentals:

  • Alternative architectures
  • Partitioned tables
    • The basis for scalable execution
  • Partitioned Parallelism
    • Software building blocks for scalable database systems
  • Handling hardware failures

8:38 – Two alternative scalable DW designs

  • Shared-Nothing
    • Microsoft APS, Teradata, Netezza
  • Shared-Storage
    • Microsoft SQL DW, Snowflake, DataBricks…

8:40 – Shared-Nothing architecture diagram

2016-10-27-08-40-37

8:41 – Shared-Storage architecture

2016-10-27-08-42-07

8:43 – Partitioned Tables

2016-10-27-08-44-45

8:46 – Round-Robin Partitioning

2016-10-27-08-46-53

8:47 – Hash (Key) Partitioning

2016-10-27-08-47-50

8:48 – Table Replication
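
To make the three table layouts above a bit more concrete, here is a minimal Python sketch of round-robin partitioning, hash partitioning, and replication. This is my own illustration, not code from the keynote: the function names, the sample `orders` rows, and the use of `zlib.crc32` as the hash function are all assumptions, and real systems use their own hash functions and storage formats.

```python
import zlib


def round_robin_partition(rows, num_nodes):
    """Deal rows out to nodes in turn, so every node gets an even share."""
    nodes = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        nodes[i % num_nodes].append(row)
    return nodes


def hash_partition(rows, key, num_nodes):
    """Send each row to the node chosen by a hash of its partitioning key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        # crc32 is used only because it is stable across runs (an assumption
        # for this sketch, not what any particular product uses).
        h = zlib.crc32(str(row[key]).encode("utf-8"))
        nodes[h % num_nodes].append(row)
    return nodes


def replicate(rows, num_nodes):
    """Keep a full copy of a (typically small) table on every node."""
    return [list(rows) for _ in range(num_nodes)]


# Hypothetical sample data: six orders spread over three customers.
orders = [{"order_id": i, "customer_id": i % 3} for i in range(6)]

rr = round_robin_partition(orders, 3)         # even spread, no locality
hp = hash_partition(orders, "customer_id", 3)  # same key -> same node
rep = replicate(orders, 3)                     # every node has all rows
```

Round-robin gives an even spread but says nothing about where a given key lives. Hash partitioning puts all rows with the same key on the same node, which is what lets joins and aggregates on that key run locally.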

8:50 – Partitioned Parallelism

Used to parallelize the execution of relational operators (selects, joins, aggregates,…)

Used by both shared-storage and shared-nothing systems.

Pipelining is used between operators to avoid unnecessary disk I/Os.

An example:

2016-10-27-08-51-59

8:53 – Turning to joins; this was the hardest part in the early days of parallel data processing

2016-10-27-08-55-23
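
One common way to apply partitioned parallelism to a join is to repartition both tables on the join key and then let every node join only its own partitions. The sketch below is my own simplified illustration of that idea, not the exact plan shown on the slide: the function names and sample tables are hypothetical, `zlib.crc32` stands in for the system's hash function, and the per-node work runs in a plain loop here where a real system would run it in parallel.

```python
import zlib


def node_for(value, num_nodes):
    """Pick a node for a join-key value (crc32 is an assumption of this sketch)."""
    return zlib.crc32(str(value).encode("utf-8")) % num_nodes


def repartition(rows, key, num_nodes):
    """Shuffle rows so that equal join keys end up on the same node."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key], num_nodes)].append(row)
    return nodes


def local_hash_join(left, right, key):
    """Ordinary single-node hash join: build on left, probe with right."""
    table = {}
    for l_row in left:
        table.setdefault(l_row[key], []).append(l_row)
    out = []
    for r_row in right:
        for l_row in table.get(r_row[key], []):
            out.append({**l_row, **r_row})
    return out


def partitioned_join(left, right, key, num_nodes):
    """Repartition both inputs on the key, then join partition by partition."""
    left_parts = repartition(left, key, num_nodes)
    right_parts = repartition(right, key, num_nodes)
    result = []
    # Each (left, right) pair is independent, so a real system would hand
    # each pair to a different node and run them at the same time.
    for l_part, r_part in zip(left_parts, right_parts):
        result.extend(local_hash_join(l_part, r_part, key))
    return result


# Hypothetical sample tables.
customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"customer_id": 1, "order_id": 10},
    {"customer_id": 2, "order_id": 11},
    {"customer_id": 1, "order_id": 12},
    {"customer_id": 3, "order_id": 13},  # no matching customer
]

joined = partitioned_join(customers, orders, "customer_id", 4)
```

If both tables are already hash partitioned on the join key, the shuffle step is unnecessary, and if one table is replicated to every node, only the local join is needed.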

9:05 – Node Failures with Shared Storage

2016-10-27-09-05-43

9:07 – A look at competitors: Amazon Redshift, Snowflake, Microsoft SQL DW

9:07 – Redshift: classic shared-nothing design

2016-10-27-09-08-33

9:09 – Within a slice, columns are stored in 1MB blocks. The min and max value of each block are retained in a “zone” map. Rich collection of compression options (RLE, Dictionary…)

Two sort options: Compound sort key, and “interleaved” sort key
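
To show why the zone map and the sort order matter together, here is a small Python sketch of block skipping. It is only an illustration under my own assumptions, not Redshift's implementation: blocks hold four values instead of 1MB of data, the column values are made up, and a plain single-column sort stands in for a sort key.

```python
def build_zone_map(values, block_size):
    """Split a column into blocks and record (min, max) for each block."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    zone_map = [(min(b), max(b)) for b in blocks]
    return blocks, zone_map


def scan_range(blocks, zone_map, lo, hi):
    """Return values in [lo, hi] and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for block, (bmin, bmax) in zip(blocks, zone_map):
        if bmax < lo or bmin > hi:
            continue  # the whole block is outside the predicate, so skip it
        blocks_read += 1
        matches.extend(v for v in block if lo <= v <= hi)
    return matches, blocks_read


# Hypothetical column values, stored first in arrival order, then sorted.
column = [7, 42, 3, 19, 88, 1, 56, 23, 64, 12, 95, 30]

unsorted_blocks, unsorted_zm = build_zone_map(column, 4)
sorted_blocks, sorted_zm = build_zone_map(sorted(column), 4)

# Same predicate (value BETWEEN 50 AND 70) against both layouts.
found_unsorted, read_unsorted = scan_range(unsorted_blocks, unsorted_zm, 50, 70)
found_sorted, read_sorted = scan_range(sorted_blocks, sorted_zm, 50, 70)

print(read_unsorted, read_sorted)  # 2 blocks read unsorted, 1 block read sorted
```

With unsorted data the min/max ranges of the blocks overlap, so fewer blocks can be skipped. Sorting narrows each block's range, which is the reason the choice of sort key affects how much data a query has to read.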

9:13 – Handling node failures in Redshift

2016-10-27-09-13-04

Redshift summary

2016-10-27-09-13-52

9:14 – Second comparison: Snowflake, a shared-storage design

Compute decoupled from storage

Highly elastic

Leverages AWS

9:18 – Table Storage in Snowflake

2016-10-27-09-18-55

9:20 – Virtual Warehouses

2016-10-27-09-21-06

9:26 – Snowflake Summary

2016-10-27-09-25-58

9:27 – Microsoft SQL DW

2016-10-27-09-28-01

DWU Performance Metric

2016-10-27-09-29-54

9:32 – Scaling up in SQL DW

2016-10-27-09-32-32

9:32 – SQL DW Summary

2016-10-27-09-33-03

9:37 – Wrap up of comparison

9:38 – Azure SQL DW is by far the best query engine on the planet!

2016-10-27-09-40-23

9:41 – Dr. DeWitt thanked everyone.

Thank you, Dr. DeWitt. This was the best session of my life. Now I have to take a bit of time off to digest this awesome presentation 🙂


Reza Rad
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer Engineering and more than 20 years’ experience in data analysis, BI, databases, programming, and development, mostly on Microsoft technologies. He has been a Microsoft Data Platform MVP for 12 consecutive years (from 2011 till now) for his dedication to Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand, Power BI Summit, and Data Insight Summit.
Reza is the author of more than 14 books on Microsoft Business Intelligence; most of these books are published under the Power BI category. Among these are books such as Power BI DAX Simplified, Pro Power BI Architecture, Power BI from Rookie to Rock Star, the Power Query books series, Row-Level Security in Power BI, and more.
He is an International Speaker at Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. He is also a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution; he is a data enthusiast.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.
