If you have ever rebooted a virtual machine and felt like you were in the dark about its current state, you aren’t alone. Thankfully, Azure provides a number of tools to help with that “reboot darkness”. Boot diagnostics allows you to see the state of your virtual machine as it boots up. This is very similar to looking at the console in VMware vCenter: you can see whether the box is at least up and functional. Essentially, it collects serial log information from the virtual machine as well as screenshots, which helps to diagnose startup issues.
When you create a virtual machine in Azure, boot diagnostics is enabled by default. You have to explicitly disable it if you desire to do so, which I recommend not doing. You will find that boot diagnostics can be handy to have available, especially if you run into issues on startup. The feature itself is free, but because it utilizes a storage account, keep in mind there is a storage cost involved. Thankfully, boot diagnostics is only available on standard storage, so you won’t be paying for premium storage, keeping costs minimal. You can use an existing storage account if you desire, or you can create a new one. I tend to just let it automatically create a new one for me, as that’s easiest.
If your virtual machine is already up and running, you can find the boot diagnostics blade under “Support + troubleshooting”. You can see below where it’s located. This example shows a SQL Server 2019 virtual machine that is running in my Azure subscription.
The next blade will show you an active console of the virtual machine. From here you are able to determine the current status of the virtual machine. You will also notice that you can gain access to the serial log (shown below), which will give you more detailed information about the boot process.
Once we click on Boot Diagnostics, we will then see the initial startup screens of the server:
If all goes according to plan and everything boots correctly, we will see something like this:
Granted, if your VM is a different operating system, like Linux, the image shown will be different. The virtual machine is now up and ready to be logged into with proper credentials.
When you don’t have control over the underlying hardware, as with Azure virtual machines, it can be frustrating to reboot your servers and not have any clue whether or not they are up and functional. Using boot diagnostics will help ease that frustration so that you can focus on any issues that might arise.
It’s not always obvious when you need a data gateway in Azure, and not all gateways are labeled as such. So I thought I would walk through various applications that act as a data gateway and discuss when, where, and how many are needed.
Note: I’m ignoring VPN gateways and application gateways for the rest of this post. You could set up a VPN gateway to get rid of some of the data gateways, but I’m assuming your networking/VPN situation is fixed at this point and working from there. This is a common case in my experience, as many organizations are not ready to jump into ExpressRoute when first planning their BI architecture.
Let’s start with what services may require you to use a data gateway.
You will need a data gateway when you are using Power BI, Azure Analysis Services, PowerApps, Microsoft Flow, Azure Logic Apps, Azure Data Factory, or Azure ML with a data source/destination that is in a private network that isn’t connected to your Azure subscription with a VPN gateway. Note that a private network includes on-premises data sources and Azure Virtual Machines as well as Azure SQL Databases and Azure SQL Data Warehouses that require use of VNet service endpoints rather than public endpoints.
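The decision rule above can be sketched as a small predicate. This is purely an illustration of the logic, not an Azure API; the function and parameter names are mine:

```python
def needs_data_gateway(source_in_private_network: bool,
                       connected_via_vpn_gateway: bool) -> bool:
    """Return True when a data gateway (or self-hosted IR) is required.

    A gateway is needed when the data source/destination lives in a
    private network (on-premises, an Azure VM on a VNet, or a database
    behind VNet service endpoints) that is NOT already reachable through
    a VPN gateway connected to your Azure subscription.
    """
    return source_in_private_network and not connected_via_vpn_gateway


# A public-endpoint Azure SQL DB needs no gateway:
print(needs_data_gateway(False, False))  # False
# An on-premises SQL Server with no VPN connectivity does:
print(needs_data_gateway(True, False))   # True
```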
If you are using Power BI, Azure Analysis Services, PowerApps, Microsoft Flow, or Azure Logic Apps and you have a data source in a private network, you need at least one On-Premises Data Gateway. But there are a few considerations that might cause you to set up more gateways.
Your services must be in the same region to use the same gateway. This means that your Power BI/Office 365 region and Azure region for your Azure Analysis Services resource must match for them to all use one gateway. If you have resources in different regions, you will need one gateway per region.
You may want high availability for your gateway. You can create high availability clusters so when one gateway is down, traffic is rerouted to another available gateway in the cluster.
You may want to segment traffic to ensure the necessary resources for certain ad hoc live/direct queries or scheduled refreshes. If your usage and refresh patterns warrant it, you may want to set up one gateway for scheduled refreshes and one gateway for live/direct queries back to any on-premises data sources. Or you might make sure live/direct queries for two different high-traffic models go through different gateways so as not to block each other. This isn’t always warranted, but it can be a good strategy.
Data Factory Self-hosted Integration Runtime
If you are using Azure Data Factory (V1 or V2) or Azure ML with a data source in a private network, you will need at least one gateway. But that gateway is called a Self-hosted Integration Runtime (IR).
Self-hosted IRs can be shared across data factories in the same Azure Active Directory tenant. They can be associated with up to four machines to scale out or provide higher availability. So while you may only need one node, you might want a second so that your IR is not the single point of failure.
Or you may want multiple IRs to boost throughput of copy activities. For instance, copying from an on-premises file server with one IR node is about 195 Megabytes per second (MB/s). But with 4 IR nodes, it can be as fast as 505 MB/s.
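Using the rough throughput numbers above, you can estimate how much a larger IR cluster shortens a copy. This is a back-of-the-envelope sketch using the published example rates; actual throughput will vary with your data and network:

```python
def copy_time_hours(data_gb: float, throughput_mbps: float) -> float:
    """Estimate copy duration given a size in GB and a throughput in MB/s."""
    seconds = (data_gb * 1024) / throughput_mbps
    return seconds / 3600


# Copying 1 TB from an on-premises file server:
print(round(copy_time_hours(1024, 195), 1))  # 1.5 hours with 1 IR node
print(round(copy_time_hours(1024, 505), 1))  # 0.6 hours with 4 IR nodes
```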
Factors that Affect the Number of Data Gateways Needed
The main factors determining the number of gateways you need are:
- Number of data sources in private networks (including Azure VNets)
- Location of services in Azure and O365 (number of regions and tenants)
- Desire for high availability
- Desire for increased throughput or segmented traffic
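Those factors can be combined into a rough sizing sketch. This formula and its parameter names are purely illustrative, not an official sizing guide:

```python
def min_gateways(regions_with_sources: int,
                 high_availability: bool,
                 segmented_workloads: int = 1) -> int:
    """Rough lower bound on On-Premises Data Gateway installs.

    One gateway per region with private data sources, doubled when you
    want an HA cluster, times the number of workload segments (e.g.
    separate gateways for scheduled refresh vs. live/direct queries).
    """
    per_region = 2 if high_availability else 1
    return regions_with_sources * per_region * segmented_workloads


# One region, HA clusters, refresh and live traffic separated:
print(min_gateways(1, True, segmented_workloads=2))  # 4
```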
If you are importing your data to Azure and using an Azure SQL DB with no VNet as the source for your Power BI model, you won’t need an On-Premises Data Gateway. If you used Data Factory to copy your data from an on-premises SQL Server to Azure Data Lake and then Azure SQL DB, you need a Self-Hosted Integration Runtime.
If all your source data is already in Azure, and your source for Power BI or Azure Analysis Services is Azure SQL DW on a VNet, you will need at least one On-Premises Data Gateway.
If you import a lot of data to Azure every day using Data Factory, and you land that data to Azure SQL DW on a VNet, then use Azure Analysis Services as the data source for Power BI reports, you might want a self-hosted integration runtime with a few nodes and a couple of on-premises gateways clustered for high availability.
Have a Plan For Your Gateways
The gateways/integration runtimes are not hard to install. They are just often not considered, and projects get stalled waiting until a machine is provisioned to install them on. And many people forget to plan for high availability in their gateways. Make sure you have the right number of gateways and IR nodes to get your desired features and connectivity. You can add gateways/nodes later, but you don’t want to get caught with no high availability when it really matters.
Keys and secrets (AKA passwords) are an essential part of data protection management, not only on-premises but in the cloud as well. One of the many advantages of the cloud is the ability to have a secure, persisted key store. If you have used a password manager like KeePass or 1Password, you can consider Azure Key Vault to be an enterprise-level password manager, and a lot more. One of the functions Azure Key Vault supports is keeping small secrets such as passwords, tokens, connection strings, and API keys, as well as encryption keys and certificates, in a tightly controlled, secure location in the cloud. It is a centralized location for storing all your management keys, removing the need for application owners to store and manage keys themselves. This in turn reduces the risk of keys being accidentally disclosed or lost.
This service allows you to manage not only your keys but also those who have access to them. You can grant granular permissions to each key to only the users and applications who need access. It also allows for separation of duties as shown in the diagram below.
Monitoring for compliance and audit is another crucial component of key management. Azure Key Vault also provides logging of who accesses what is in your vault. When you enable logging for Key Vault, it saves data in an Azure storage account you create and stores all the information it needs for reporting within a retention range you set. My next blog in this series will show you step by step how to set up and configure logging using Azure Log Analytics.
As with any critical component of your infrastructure, your keys and secrets should be safeguarded against failures. Thankfully, Azure gives us the ability to store these keys with geo-redundancy in case of a disaster. You no longer have to worry about where those keys are stored or about backing them up off site. However, one large caveat to storing your keys in the cloud is that you must always have internet access: because the application layer retrieves those keys at runtime, redundant, strong internet access is essential to any cloud operation.
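That application-layer pattern is simple in outline: code asks for a secret at runtime instead of embedding it. Here is a minimal, runnable sketch of the pattern using environment variables as a stand-in for the real vault; in production the lookup would call Azure Key Vault (e.g. via its SDK), and the function names here are illustrative, not part of any Azure library:

```python
import os
from typing import Optional


def get_secret(name: str, fallback: Optional[str] = None) -> str:
    """Fetch a secret at runtime rather than hard-coding it.

    An environment variable stands in for a Key Vault lookup so this
    sketch is runnable anywhere; the pattern is the same either way.
    """
    value = os.environ.get(name, fallback)
    if value is None:
        raise KeyError(f"Secret {name!r} not available")
    return value


os.environ["SQL_CONN"] = "Server=tcp:myserver;..."  # simulate a stored secret
print(get_secret("SQL_CONN"))
```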
Key Vault is also great for creating a secure login to SQL DB. My co-worker Joey D’Antoni (B|T) blogged about it recently here. In that post he also dives into automating this secured method and gives you a great PowerShell script in which he defines a variable called password, retrieves its value from Key Vault, and then passes it into the –SQLAdministratorCredentials parameter of New-AzureRMSQLServer.
Lastly, part of key management is key rotation. Every company has a different rotation strategy, however, most of the time changing out these keys is a manual time-consuming process. Azure Automation can help you with this in conjunction with Azure Key Vault. This link gives you all the steps you need to set this up.
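The core of any rotation strategy is a policy check like the one below. The real automation (an Azure Automation runbook working against Key Vault) would wrap logic of this shape; the 90-day period and dates here are illustrative:

```python
from datetime import date, timedelta
from typing import Optional


def rotation_due(last_rotated: date,
                 max_age_days: int = 90,
                 today: Optional[date] = None) -> bool:
    """Return True when a secret is older than the rotation policy allows."""
    today = today or date.today()
    return (today - last_rotated) > timedelta(days=max_age_days)


print(rotation_due(date(2019, 1, 1), today=date(2019, 6, 1)))  # True: >90 days old
print(rotation_due(date(2019, 5, 1), today=date(2019, 6, 1)))  # False: 31 days old
```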
Azure Key Vault is definitely a service worth looking into. It is a relatively low-cost way to manage and store your company’s passwords, tokens, connection strings, and API keys, as well as encryption keys and certificates. Plus, it is a great way to get your company’s footprint into the cloud.
I’m honored to have one of my PASS Summit sessions chosen to be part of the PASS Data Expert Series on February 7. PASS has curated the top-rated, most impactful sessions from PASS Summit 2018 for a day of solutions and best practices to help keep you at the top of your field. There are three tracks: Analytics, Data Management, and Architecture. My session is in the Analytics track along with some other great sessions from Alberto Ferrari, Jen Underwood, Carlos Bossy, Matt How, and Richard Campbell.
The video for my session, titled “Do Your Data Visualizations Need a Makeover?”, starts at 16:00 UTC (9 AM MT). I’ll be online in the webinar for live Q&A and chat related to the session.
I hope you’ll register and chat with me about data visualizations in need of a makeover on February 7.
As Microsoft MVPs and Partners, as well as VMware experts, we are summoned by companies all over the world to fine-tune and problem-solve the most difficult architecture, infrastructure, and network challenges.
And sometimes we’re asked to share what we did, at events like Microsoft’s PASS Summit 2015.