I’m Speaking at Virtual PASS Summit 2020

Published On: 2020-08-06By:

PASS Summit has gone virtual this year, but that isn’t keeping PASS from delivering a good lineup of speakers and activities. I’m excited to be presenting a pre-con and two regular sessions this year. I know virtual delivery changes the interaction between audience and speaker, and I’m going to do everything I can to make my sessions more than just standard lecture and demo to keep things interesting.

Building Power BI Reports that Communicate Insights and Engage People (Pre-Con)

If you are into Power BI or data visualization, check out my pre-con session. It’s called Building Power BI Reports that Communicate Insights and Engage People. Unless we’ve had data visualization training, the way we learn to make reports is by copying reports that others have made. But that assumes other people were designing intentionally for human consumption. Another issue is that we often mimic example reports from tool vendors. That can be very helpful with the technical aspects of getting content on the page, but we often overlook the design aspects of reports that can make or break their usability and effectiveness in communicating information. My pre-con will begin with discussion on how humans interpret data visualizations and how you can use that to your advantage to make better, more consumable visualizations. We’ll take those lessons and apply them specifically to Power BI and then add on some tips and tricks. Throughout the day, there will be hands-on exercises and opportunities for group conversation. And you’ll receive some resources to take with you to help you continue to improve your report designs.

Agenda slide from my pre-con session: 1) Defining Success, 2) Message & Story, 3) Designing a Visual, 4) Refine Your Report 5) Applied Power BI 6) Power BI Tricks 7) Wrap-Up
Agenda for my PASS Summit pre-con titled Building Power BI Reports that Communicate Insights and Engage People

This session is geared toward people that have at least basic familiarity with Power BI Desktop (if you can populate a bar chart on a report page, that’s good enough). If you have never opened Power BI Desktop, we might move a little fast, but you are welcome to join us and give it a try. If you are pretty good with Power BI Desktop, but you want to improve your data visualization skills, this session could also be a good fit for you. I hope you’ll register and join my pre-con.

Implementing Data-Driven Storytelling Techniques in Power BI

Data storytelling is a popular concept, but the techniques to implement storytelling in Power BI can be a bit elusive, especially when you have data values that change as the data is refreshed. In this session, we’ll talk about what is meant by story. Then I’ll introduce you to tool-agnostic techniques for data storytelling and show you how you can use them in Power BI. We’ll also discuss the visual hierarchy within a page and how that affects your story. You can view my session description here.

Inclusive Presentation Design

I’m also delivering a professional development session for those of us that give presentations. Most speakers have good intentions and are excited to share their knowledge and perspective, but we often exclude audience members with our presentation design. Join me in this session to discuss how to design your presentation materials with appropriate content formatted to maximize learning for your whole audience. You’ll gain a better understanding of how to enhance your delivery to make an impact on those with varying abilities to see, hear, and understand your presentation. You can view my presentation description here.

Other Pre-Cons from My Brilliant Co-Workers

If you aren’t into report design, my DCAC coworkers are delivering pre-cons that may interest you.

Denny Cherry is doing a pre-con session on Microsoft Azure Platform Infrastructure.

John Morehouse is talking about Avoiding the Storms When Migrating to Azure.

I hope you’ll join one of us for a pre-con as well as our regular sessions. With PASS Summit being virtual, the lower price and removal of travel requirements may make this conference more accessible to some who haven’t been able to attend in past years. Be sure to get yourself registered and spread the word to colleagues.

Contact the Author | Contact DCAC

Refreshing a Power BI Dataset in Azure Data Factory

Published On: 2020-07-09By:

I recently needed to ensure that a Power BI imported dataset would be refreshed after populating data in my data mart. I was already using Azure Data Factory to populate the data mart, so the most efficient thing to do was to call a pipeline at the end of my data load process to refresh the Power BI dataset.

Power BI offers REST APIs to programmatically refresh your data. For Data Factory to use them, you need to register an app (service principal) in AAD and give it the appropriate permissions in Power BI and to an Azure key vault.

I’m not the first to tackle this subject. Dave Ruijter has a great blog post with code and a step-by-step explanation of how to use Data Factory to refresh a Power BI dataset. I started with his code and added onto it. Before I jump into explaining my additions, let’s walk through the initial activities in the pipeline.

ADF pipeline that uses web activities to gets secrets from AKV, get an AAD auth token, and call the Power BI API to refresh a dataset. Then and Until activity and an If activity are executed.
Refresh Power BI Dataset Pipeline in Data Factory

Before you can use this pipeline, you must have:

  • an app registration in Azure AD with a secret
  • a key vault that contains the Tenant ID, Client ID of your app registration, and the secret from your app registration as separate secrets.
  • granted the data factory managed identity access to the keys in the key vault
  • allowed service principals to use the Power BI REST APIs in in the Power BI tenant settings
  • granted the service principal admin access to the workspace containing your dataset

For more information on these setup steps, read Dave’s post.

The pipeline contains several parameters that need to be populated for execution.

ADF pipeline parameters

The first seven parameters are related to the key vault. The last two are related to Power BI. You need to provide the name and version of each of the three secrets in the key vault. The KeyVaultDNSName should be https://mykeyvaultname.vault.azure.net/ (replace mykeyvaultname with the actual name of your key vault). You can get your Power BI workspace ID and dataset ID from the url when you navigate to your dataset settings.

The “Get TenantId from AKV” activity retrieves the tenant ID from the key vault. The “Get ClientId from AKV” retrieves the Client ID from the key vault. The “Get Secret from AKV” activity retrieves the app registration secret from the key vault. Once all three of these activities have completed, Data Factory executes the “Get AAD Token” activity, which retrieves an auth token so we can make a call to the Power BI API.

One thing to note is that this pipeline relies on a specified version of each key vault secret. If you always want to use the current version, you can delete the SecretVersion_TenantID, SecretVersion_SPClientID, and SecretVersion_SPSecret parameters. Then change the expression used in the URL property in each of the three web activities .

For example, the URL to get the tenant ID is currently:

@concat(pipeline().parameters.KeyVaultDNSName,'secrets/',pipeline().parameters.SecretName_TenantId,'/',pipeline().parameters.SecretVersion_TenantId,'?api-version=7.0')

To always refer to the current version, remove the slash and the reference to the SecretVersion_TenantID parameter so it looks like this:

@concat(pipeline().parameters.KeyVaultDNSName,'secrets/',pipeline().parameters.SecretName_TenantId,'?api-version=7.0')

The “Call Dataset Refresh” activity is where we make the call to the Power BI API. It is doing a POST to https://api.powerbi.com/v1.0/myorg/groups/{groupId}/datasets/{datasetId}/refreshes and passes the previously obtained auth token in the header.

This is where the original pipeline ends and my additions begin.

Getting the Refresh Status

When you call the Power BI API to execute the data refresh, it is an asynchronous call. This means that the ADF activity will show success if the call is made successfully rather than waiting for the refresh to complete successfully.

We have to add a polling pattern to periodically check on the status of the refresh until it is complete.

We start with an until activity. In the settings of the until loop, we set the expression so that the loop executes until the RefreshStatus variable is not equal to “Unknown”. (I added the RefreshStatus variable in my version of the pipeline with a default value of “Unknown”.) When a dataset is refreshing, “Unknown” is the status returned until it completes or fails.

ADF Until activity settings

Inside of the “Until Refresh Complete” activity are three inner activities.

ADF Until activity contents

The “Wait1” activity gives the dataset refresh a chance to execute before we check the status. I have it configured to 30 seconds, but you can change that to suit your needs. Next we get the status of the refresh.

This web activity does a GET to the same url we used to start the dataset refresh, but it adds a parameter on the end.

https://docs.microsoft.com/en-us/resGET https://api.powerbi.com/v1.0/myorg/groups/{groupId}/datasets/{datasetId}/refreshes?$top={$top}

The API doesn’t accept a request ID for the newly initiated refresh, so we get the last initiated refresh by setting top equal to 1 and assume that is the refresh for which we want the status.

The API provides a JSON response containing an array called value with a property called status.

In the “Set RefreshStatus” activity, we retrieve the status value from the previous activity and set the value of the RefreshStatus variable to that value.

Setting the value of the RefreshStatus variable in the ADF pipeline

We want the status value in the first object in the value array.

The until activity then checks the value of the RefreshStatus variable. If your dataset refresh is complete, it will have a status of “Completed”. If it failed, the status returned will be “Failed”.

The If activity checks the refresh status.

If activity expression in the ADF pipeline

If the refresh status is “Completed”, the pipeline execution is finished. If the pipeline activity isn’t “Completed”, then we can assume the refresh has failed. If the dataset refresh fails, we want the pipeline to fail.

There isn’t a built-in way to cause the pipeline to fail so we use a web activity to throw a bad request.

We do a POST to an invalid URL. This causes the activity to fail, which then causes the pipeline to fail.

Since this pipeline has no dependencies on datasets or linked services, you can just grab my code from GitHub and use it in your data factory.

Contact the Author | Contact DCAC

Power BI Data Viz Makeover: From Drab to Fab

Published On: 2020-06-18By:

On July 11 at 3pm MDT, Rob Farley and I will be hosting a webinar on report design in Power BI. We will take a report that does not deliver insights, discuss what we think is missing from the report and how we would change it, and then share some tips from our report redesign.

Rob and I approach data visualization a bit differently, but we share a common goal of producing reports that are clear, usable, and useful. It’s easy to get caught up building shiny, useless things that show off tech at the expense of information. We want to give you real examples of how to improve your reports to provide the right information as well as a good user experience.

We’ll reserve some time to answer your questions and comments at the end. Come chat Power BI data viz with us.

You can register for the webinar at https://www.powerbidays.com/virtualevent/colorado-power-bi-days-2020-07-11/.

Come for the data viz tips, stay for the witty banter!

Contact the Author | Contact DCAC

Data Visualization, Context, and Domain Expertise

Published On: 2020-06-04By:

I recently posted a graph to twitter and asked people to explain it.

Let’s look at the graph.

Bar chart showing low levels of steps in April until April 25th, when they increase about 3x and remain at that level through May.
Chart from Fitbit showing my step count from April 1 through May 23.

The graph is from Fitbit. It shows the number of steps I took each day between April 1 and May 23. We can see that I had a very low number of daily steps between April 1 and April 24. Then there is a spike where my steps almost quadruple on April 25. They decrease a bit for a couple of days while remaining well above the previous average. Then my steps increase again, staying up around 10,000 steps.

The Responses

The responses I received to my tweet largely fell into 3 categories:

  1. Complaints about the x-axis label
  2. Simple interpretations of the graph saying that the steps increased on April 25 and remained higher, often accompanied by statements that there isn’t enough data to explain why that happened.
  3. Guesses as to why the steps increased and then remained higher.

The X-Axis Label

Many of my twitter friends create data visualizations for fun and profit. It didn’t surprise me that they weren’t happy with the x-axis.

There are multiple x-axis labels that show the month and year, but the bars are at the day level. It’s unusual to see the Apr ’20 label repeated 4 times as we see in this graph. It’s not necessarily inaccurate, but its imprecision goes against convention.

The fact that multiple people commented on it demonstrates to me that it is more distracting than helpful. The way you format your data visualizations can be distracting. This is why I tweet and talk about bad charts and how to improve them for human consumption.

Literal Interpretation

Some people were only comfortable sticking with the information available in the chart. They acknowledged that the steps went up. Some said there were too many possible explanations to narrow it down to a certain reason why.

Speculative Explanations

I enjoyed the many guesses as to why my steps increased:

  • I suddenly got motivated to exercise more
  • I moved my office further from my bedroom
  • I’m building a really big staircase
  • The device used to track my steps changed
  • I started playing Just Dance every day
  • Covid-19 lockdown ended

A few people who know me (or at least pay attention to my twitter feed) had some insight.

I did get a new dog during the timeframe, but I got her on April 28th, not April 25th.

Also, the weather did warm up about 12 degrees Fahrenheit over the timeframe.

The Necessary Context

For the curious, here’s the real explanation.

I lost my dog Buster on April 4. He was with me for over 9 years, and he was my best friend. He was suddenly not feeling well at the end of March, and he was diagnosed with cancer. He declined rapidly, and I stayed with him on the living room floor until it was time to say goodbye. During those first 4 days of April, I really only left the living room to take Buster outside. I slept a lot that weekend and didn’t move much because I was sad.

With losing Buster, everything associated with Covid-19, and some other personal issues, I was depressed for the next few weeks. But I was also very busy with work. I had no energy to do anything else after work. And there wasn’t much to do since my city and state were on lockdown for Covid-19.

On April 25, I decided that the only way to get out of the emotional hole I was in was to get up and do something, so I walked a few miles around a nearby park. I came home and looked on PetFinder.com to see if there was a dog I’d like to adopt, and I came across a bulldog mix at Foothills Animal Shelter. I hadn’t cleaned my house since Buster died (see: depression). So I spent the rest of the weekend cleaning and dog-proofing just in case I brought the dog home.

On April 28, I adopted Izzy, a bulldog/boxer mix.

Izzy likes to walk. We walk between 2 and 4 miles each day. She is most of the reason the step count remained high throughout May.

Nice Dog. So What?

I hope what you’ll take away from this story is that to really deliver insights, you need to know the subject of your data visualizations. You need domain expertise. And it helps to have your own observations or other datasets to support the events you are visualizing.

If you don’t know me, any of the speculations could be the right answer. And the most you could do with my Fitbit data is to provide descriptive analysis, simply saying what happened without going into why. Many people who follow me on Twitter knew I recently got a dog. That explains the increase in step count from May 28 going forward. But it doesn’t address May 25th. Without the additional context of my step count in other months, you don’t know what my average step count is outside of this view. You wouldn’t know if my average count is normally closer to 3,000 or 10,000 because you don’t have that data. This is a perfect example of where you would need more data, more months of this data as well as additional datasets, to understand what is really going on. Sometimes there are actual datasets we can acquire, like weather data or Covid-19 lockdown dates. But there is no dataset for me losing Buster or struggling with depression.

This is part of why I prefer the term “data-informed decisions” over “data-driven decisions”. We often don’t have all the data to really understand what is going on. Technology has improved (see: Power BI) to make it quicker and easier to mash up data to provide a more complete picture. But we’ll still have to make decisions based upon incomplete data. If we have domain expertise, we may need to review data and ask questions to get better insights, and then rely on our knowledge and experiences to complete the picture.

This chart is also a good representation of a common issue in business intelligence: we often settle for only descriptive analytics. It may even have been a struggle just to get there. Let’s say I’m trying to become more active and using step count as a metric. You see this chart and see the increase in steps and say “That’s great. Do whatever you did last month to increase your steps even more.” Am I supposed to get another dog? Get less depressed?

Let’s pretend that my chart is not about my step count but is an operational report for an organization. That one chart tells you a trend of a single measure. We need more data, more visuals for this information to be impactful. The additional data adds necessary context. If this were a Power BI report, we might use interactivity to provide navigation paths to explore common questions about the data and to help identify what is influencing the current trend. Then you could use the report to facilitate a more productive conversation. I’m not addressing AI here, but after understanding the data and decisions made from it, you might introduce some machine learning to automate the analysis process.

Just having a report on something is not enough. The goal of data visualization is not to show off your data (if your service/product is data, that’s a different thing). It’s to help provide meaningful information to people so they can make decisions and take action. In order to do that, we need to understand our audience, the domain in which they are operating, and the questions they are trying to answer. Data visualization tools make it easy to get things on the page, but I hope you are designing your visualizations purposefully to facilitate data-informed decisions.

Contact the Author | Contact DCAC
1 2 3 9

Video

Globally Recognized Expertise

As Microsoft MVP’s and Partners as well as VMware experts, we are summoned by companies all over the world to fine-tune and problem-solve the most difficult architecture, infrastructure and network challenges.

And sometimes we’re asked to share what we did, at events like Microsoft’s PASS Summit 2015.

Awards & Certifications

Microsoft Partner   Denny Cherry & Associates Consulting LLC BBB Business Review    Microsoft MVP    Microsoft Certified Master VMWare vExpert
American Business Awards People's Choice American Business Awards Gold Award    American Business Awards Silver Award    FT Americas’ Fastest Growing Companies 2020       Best Full-Service Cloud Technology Consulting Company
   Insights Sccess Award    Technology Headlines Award    Golden Bridge Gold Award    CIO Review Top 20 Azure Solutions Providers