One of the benefits of cloud computing is flexibility and scale: you don't need to procure hardware or licenses as you bring on new customers. That flexibility, combined with platform as a service offerings like Azure SQL Database, gives independent software vendors and companies selling access to their software a lot of room in what they can provide to their customers. However, a lot of work and thought goes into these solutions. We have had success building them out with customers at DCAC, so in this post I'll cover, at a high level, some of the architectural tenets we have implemented.
Authentication and Costing
The cloud has the benefit of providing detailed billing information, so you know exactly what everything costs. The downside is that the data provided is very granular and detailed, which can make it challenging to break down. There are a couple of options here: you can create a new subscription for each of your customers, which means you get a single bill per customer, or you can place each customer into their own resource group and use tags to identify which customer is associated with that resource group. The tags appear in your Azure bill, which allows you to break the bill down by customer. While the subscription model is cleaner in terms of billing, it adds additional complexity to the deployment model and ultimately doesn't scale.
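As a quick illustration, here is how you might create a tagged resource group for a new customer with the Az PowerShell module (the names and tag values here are placeholders, not our actual convention):

```powershell
# Create a per-customer resource group, tagged so the customer
# shows up as a line item when you filter the Azure bill by tag.
New-AzResourceGroup -Name "rg-customer-fabrikam" `
    -Location "eastus" `
    -Tag @{ Customer = "Fabrikam"; Environment = "Production" }
```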
The other thing you need to think about is authenticating users and security. Fortunately, Microsoft has built a solution for this with Azure Active Directory, but you still need to think through the design. Let's assume your company is called Contoso, and your AAD domain is contoso.com. Assuming you are using AAD for your own business's users, you don't want to include your customers in that same AAD. The best approach is to create a new Azure Active Directory tenant for your customer-facing resources, in this case called cust.contoso.com. You would then add the required accounts from contoso.com as guests in cust.contoso.com in order to manage the customer tenant. You may also need to create a few accounts natively in the target tenant, as there are a couple of Azure operations that require an admin from the home tenant.
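As a rough sketch of that guest-account step, using the (older) AzureAD PowerShell module and placeholder values, a B2B invitation into the customer-facing tenant looks something like this:

```powershell
# Connect to the customer-facing tenant (cust.contoso.com), not the home tenant
Connect-AzureAD -TenantId "00000000-0000-0000-0000-000000000000"

# Invite an admin account from the home tenant as a B2B guest
New-AzureADMSInvitation -InvitedUserEmailAddress "admin@contoso.com" `
    -InviteRedirectUrl "https://portal.azure.com" `
    -SendInvitationMessage $true
```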
Deployment of Resources
One of the things you need to think about is what happens when you onboard a new customer. This can mean creating a new resource group, a logical SQL Server, and a database. In our case, it also means enabling a firewall rule, enabling performance data collection for the database, and a number of other configuration items. There are a few ways you can do this. You can use an Azure Resource Manager (ARM) template, which contains all of your resource information; that is a good approach that I would typically recommend. In my case, there were some things that I couldn't do in the ARM template, so I resorted to using PowerShell and Azure Automation to perform deployments. Currently our deployment is semi-manual, as someone manually enters the parameters into the Azure Automation runbook, but it could easily be converted to be driven by an Azure Logic App or a function.
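To make that concrete, a minimal sketch of the onboarding flow in Azure PowerShell might look like the following. The names, credential handling, service objective, and IP range are all placeholders, and our actual runbook does quite a bit more:

```powershell
$customer = "fabrikam"
$location = "eastus"
$rgName   = "rg-customer-$customer"

# Resource group, tagged for per-customer billing as described earlier
New-AzResourceGroup -Name $rgName -Location $location -Tag @{ Customer = $customer }

# Logical SQL Server; in production the credential would come from Key Vault
$cred = Get-Credential -Message "SQL admin credential for $customer"
New-AzSqlServer -ResourceGroupName $rgName -ServerName "sql-$customer" `
    -Location $location -SqlAdministratorCredentials $cred

# The customer database itself
New-AzSqlDatabase -ResourceGroupName $rgName -ServerName "sql-$customer" `
    -DatabaseName "db-$customer" -RequestedServiceObjectiveName "S0"

# Firewall rule for the customer's address range
New-AzSqlServerFirewallRule -ResourceGroupName $rgName -ServerName "sql-$customer" `
    -FirewallRuleName "$customer-office" `
    -StartIpAddress "203.0.113.0" -EndIpAddress "203.0.113.255"
```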
Deployment of Data and Data Structures
When you are dealing with multiple databases across many customers, you desperately want to avoid the schema drift that can happen. This means having a single database project for all of your databases. If you have to add a one-off table for a customer, you should still include it in all of your databases. If you are pushing data into your tables (as opposed to the data being entered by the application or users), you should drive that process from a central table (more on this later).
Where this gets dicey is with indexes, as you may have some indexes that are needed for specific customer queries. In general, I'd say the write overhead of carrying some additional indexes is worth the potential benefit on reads. How you manage this is going to depend on the number of customer databases you are managing. If you have ten databases, you might be able to manage each database's indexes individually. However, as you scale to a larger number of databases, you aren't going to be able to manage this by hand. Azure SQL can add and drop indexes as it sees fit, which can help, but it isn't a complete solution.
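If you do lean on Azure SQL's automatic tuning for this, it can be enabled per database from PowerShell, which makes it easy to apply across a fleet. A hedged example, with placeholder names:

```powershell
# Let Azure SQL apply its CREATE INDEX recommendations automatically
# for one customer database; loop this over every database you manage.
Set-AzSqlDatabaseAdvisorAutoExecuteStatus -ResourceGroupName "rg-customer-fabrikam" `
    -ServerName "sql-fabrikam" -DatabaseName "db-fabrikam" `
    -AdvisorName "CreateIndex" -AutoExecuteStatus "Enabled"
```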
Hub Database and Performance Data Warehouse
Even if you aren't using a hub and spoke model for deploying your data, having a centralized repository for metadata about your client databases is extremely valuable. One common task is collecting performance data across your entire environment. While you can use Azure SQL Diagnostics to capture a whole lot of performance information, with one of our clients we've taken a more comprehensive approach, combining the performance data from Log Analytics, audit data that also goes to Log Analytics, and the Query Store data from each database. While Log Analytics contains data from the Query Store, there was some additional metadata we wanted to capture that we could only get from the Query Store directly. We use Azure Data Factory pipelines (built by my co-worker Meagan Longoria (b|t)) to load the data into a SQL Database that serves as a data warehouse. I've even built some XQuery to parse execution plans and identify which tables are most frequently queried. You may not need this level of performance granularity, but it is a conversation you should have very early in your design phase. You can also use a third-party vendor tool for this, but the costs may not scale if your environment grows to be very large. I'm going to do a webinar on this in a month or so; I still need to work out the details, but stay tuned.
You want the ability to quickly do something across your environment, so having some PowerShell that can loop through all of your databases is really powerful. That lets you make configuration changes across your environment, or, with dbatools or Invoke-Sqlcmd, run a query everywhere. You also probably need to get pretty comfortable with Azure PowerShell, as you don't want to have to change something in the Azure Portal across 30+ databases.
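The pattern is simple: enumerate every logical server and database in the subscription, then run the same statement against each one. A sketch, assuming the Az.Sql and SqlServer modules and a placeholder credential and query:

```powershell
$cred = Get-Credential -Message "SQL admin credential"

foreach ($server in Get-AzSqlServer) {
    $databases = Get-AzSqlDatabase -ResourceGroupName $server.ResourceGroupName `
        -ServerName $server.ServerName |
        Where-Object { $_.DatabaseName -ne "master" }

    foreach ($db in $databases) {
        # Run the same query in every customer database
        Invoke-Sqlcmd -ServerInstance "$($server.ServerName).database.windows.net" `
            -Database $db.DatabaseName `
            -Username $cred.UserName `
            -Password $cred.GetNetworkCredential().Password `
            -Query "SELECT DB_NAME() AS database_name, COUNT(*) AS session_count FROM sys.dm_exec_sessions;"
    }
}
```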
I have received feedback that some folks think I just want to burn PASS down, or that I don't want a for-profit company involved with a community organization. Neither of those things is remotely what I'm thinking. I've only been loud and writing about it here because I want PASS to survive, which is going to be near impossible given the loss of its main revenue source (the in-person PASS Summit) and its expenses (C&C), which haven't dropped nearly enough in the face of that revenue loss. What do I see as a future for PASS?
Virtual Summit is going to happen in 2020, and it's probably going to lose money. It's effectively a sunk cost at this point, so I'm not going to waste any time talking about it. In 2021, PASS has a tough decision to make: large international conferences are unlikely to be a thing until 2022, when the COVID vaccine has been broadly distributed. Planning a virtual conference in 2021 is risky as well, given that most of the competition is free. I think doing a low-cost (and lower-overhead) smaller-scale event using a much cheaper platform like Microsoft Teams or even GoToWebinar would be a good small bet, without much risk. I also think a small conference the size of the old SQL Rally (a few hundred people, run in a hotel rather than a conference center) could be viable for Q4 of 2021.
The reason for doing this would be to try to keep the fundamental networking aspect of PASS going while reducing financial risk. The original SQL Rally was a community-organized event; by keeping it small, you not only reduce costs, you also reduce the planning lead time, which allows for a better assessment of the pandemic situation. PASS could also think about leveraging larger SQL Saturdays like Atlanta and Dallas, among others, as candidates for a Rally, as these events have community organizers who are very experienced at running larger-scale events.
The Managing Organization
I've said what I'm going to say about C&C, but it's very clear that PASS as an organization is untenable with its current cash flow situation. This means costs need to be drastically cut wherever possible. PASS won't be in the business of planning large in-person conferences until 2022, and therefore doesn't require a large event management firm dedicated to its management. I would recommend hiring a full-time executive director (yes, I know I said we need to reduce spending) to manage the organization and its vendor relationships. C&C currently has a seat on the PASS exec board with the title of Executive Director, which is a conflict of interest, and I would propose ending that immediately. The Executive Director role needs to be someone who understands both data and analytics and building communities. Finding this person will be a challenge, but I believe they are out there. I would also move away from the custom-developed platforms PASS is using and adopt Software as a Service platforms where possible. Sessionize is probably the most obvious solution here, but there are others.
The Role of the Board
As I was reading the by-laws and guidance for the PASS BoD, I came across this paragraph.
Role of PASS Board Members
"What PASS does not need from the Board is tactical execution or day to day management of organizational activities". I can't imagine running a SQL Saturday and completely outsourcing everything to a third party, and I feel the same way about our community organization, especially in this time of crisis. I think this stance is completely wrong, and it is the main reason why PASS is in the situation it is in right now. The Board of Directors needs to take an active role in managing the organization, period. We, as a community organization, are in a situation where the organization might go bankrupt and die, and while this is largely due to a black swan event (the pandemic), it is also due to decisions made in the interest of the managing firm, not the community. When I was heavily involved in running my PASS chapter, I had a board member whose portfolio was chapters, who took an active interest in the chapters and their needs, and who worked his tail off to make things better. Unfortunately, he was not re-elected, and things never got better from there.
The board needs to take an active role. While the day-to-day operations of the org would be managed by the executive director, and eventually some administrative staff, in a time where the organization needs to be austere with its spending, being on the board should require you to get your hands dirty. I would also try to involve the community; there are lots of projects that over time could have been open sourced, but there has always been pushback from the board. Given the success of community-managed projects like dbatools, I see no reason not to engage volunteers who are willing to help, especially on community-facing projects.
I’ve been involved in PASS for nearly 15 years now—I want it to survive, because having a centralized community organization is a good thing and makes the community stronger. The central organization also provides governance and helps with sponsors. PASS cannot survive financially in its current state, and we as a community must band together to help it survive and foster the changes to make it a sustainable organization. While we are doing that, we can make it a better community org.
Sorry for the spammy SEO title; we've got to pay the bills. Sometimes it's fun to just write some code to solve problems, and not think about the world's larger problems for a few hours. Last week, I learned something new from a client: you can change managed disks in Azure from Premium Storage to Standard Storage if the VM attached to those disks is powered off. This is a cost savings of nearly $100 per month per disk (assuming 1 TB disks), and since the SQL Server image in the marketplace uses two 1 TB disks, this can save you a good amount of money on your Azure spend.
This code will loop through each resource group in your subscription and look for resource groups with the tag "Use:Demo". If you aren't familiar with tags in Azure (or AWS), they are a metadata layer that allows you to more easily identify and filter resources. The most common use case is to make your Azure bill easier to navigate. However, you can also incorporate tagging into your management operations, as you see in this example.
After it identifies each resource group with that tag, it will then look for VMs in those resource groups, power them down if they are running, and then migrate each premium disk on the VM to Standard. I have similar code on GitHub to do the opposite; however, I haven't glammed it up to support the tagging functionality yet.
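Condensed, the logic looks something like this. This is a sketch rather than the exact script from our GitHub; the tag name matches the description above, and error handling is omitted:

```powershell
# Find every resource group tagged Use:Demo
$groups = Get-AzResourceGroup | Where-Object { $_.Tags -and $_.Tags["Use"] -eq "Demo" }

foreach ($rg in $groups) {
    foreach ($vm in Get-AzVM -ResourceGroupName $rg.ResourceGroupName) {
        # Disks can only change tiers while the VM is deallocated
        Stop-AzVM -ResourceGroupName $rg.ResourceGroupName -Name $vm.Name -Force

        # Convert each premium managed disk attached to this VM to Standard
        Get-AzDisk -ResourceGroupName $rg.ResourceGroupName |
            Where-Object { $_.ManagedBy -eq $vm.Id -and $_.Sku.Name -like "Premium*" } |
            ForEach-Object {
                $update = New-AzDiskUpdateConfig -SkuName "Standard_LRS"
                Update-AzDisk -ResourceGroupName $rg.ResourceGroupName `
                    -DiskName $_.Name -DiskUpdate $update
            }
    }
}
```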
This code is available at DCAC’s GitHub here. To take this a step further you could create an Azure Automation runbook to deploy this code. In order to do that you would need to import the modules Az.Resources and Az.Compute into your automation account.
If you saw any of my angry tweets last night, it's not just because the Saints weren't good. I've been writing a lot about PASS and C&C, the for-profit event management firm that runs virtually all of PASS' operations. I personally think C&C imposes a financial burden on the Microsoft Data Platform community that will ultimately kill PASS. I want to run for the board of directors (once you agree to run for the board, you have to agree not to speak or write poorly of PASS, but it doesn't say anything about C&C) to try to return PASS to being a community-oriented organization. PASS has been a great organization, and the connections I have made have been a great foundation for the career success that I and many others have achieved. The reason I agreed to speak at PASS Summit this year was to help enable the organization's survival, despite my lasting frustrations with C&C.
PASS had a couple of options for doing PASS Summit virtually, and they've failed at every turn. The best option would have been to do a super low-cost virtual Summit, using Microsoft Teams, and try to keep the pricing at a level the average DBA could pay out of pocket. That big reduction in revenue is bad for C&C's business, but frankly, given that there likely won't be a big conference until 2022, C&C should be operating on an austerity budget, since PASS' main income source has been severely constrained.
The Burden on Speakers
I've lost count of how many webinars I've done this year; it's been a lot. 98% have been live, in some cases with some really dicey demos, like I did at EightKB. Doing a webinar or a user group meeting is a decent amount of effort, but no more than an in-person session. However, PASS Summit has asked speakers to record their sessions, and recording a session takes me at a minimum 2-3x the time of simply delivering it. Setting up cameras and lighting and doing small amounts of editing all add up to considerable amounts of time. Additionally, you have to render the video and then upload it to the site. I say this from experience, because I just recorded three sessions for SQLBits.
You might ask why I was willing to record sessions for Bits, but not PASS Summit. That's a good question. SQLBits is truly a community-run event, for the community, by the community. Sure, it can be rough around the edges, but it's a great event, and in general the conference is great to work with. Additionally, SQLBits always pays for speakers' hotel rooms; it's nominal in the cost of an international trip, but it's something that makes you feel wanted as a speaker, and I remember it. PASS Summit, unless you have a preconference session (precon), doesn't offer any remuneration at all to speakers, nor has it ever. All that being said, after recording my Bits sessions, I said, "I'm never doing that for free again." And on top of doing the work of recording your session, you still have to show up and do Q&A for it.
Why You Shouldn't Speak at PASS Summit (and Time Zones Are Hard)
PASS has asked speakers to record their sessions just six weeks before the conference. These recordings will only ever be seen by paid attendees of the conference, and possibly PASS Pro members. Speakers received a highly confusing email informing them of this late last night, which included the time and date of their sessions. It wasn't clear whether "live sessions" still needed to be recorded, which is even more confusing to speakers. Speakers weren't consulted about the need to record their sessions when the revised speaker agreement went out; this burden has been imposed at the last minute. I haven't gotten any official communications from PASS about Summit since July, when I received my speaker code. It's not fair to impose this on speakers this late in the process, especially when you aren't compensating them for their time. Also, this is insignificant, but we were supposed to get the slide template in July, and it's still not in my inbox.
Precons are all starting in the speaker's native time zone, which will limit the audience for many precon speakers. European speakers are starting as early as 3 AM EST, which means basically no one in North America (PASS' main market) will attend. Most regular conference sessions run from 8 AM to 5 PM EST, which is probably a decent compromise, but it still greatly limits the west coast in the morning and other regions of the world like Asia. There are some evening and overnight sessions, but those are extremely limited compared to EST business hours. All schedules for a worldwide event are going to be a compromise, but I feel like some creativity could have been used to better support a virtual audience. For example, Ignite has replays of all its sessions available for broader time zone coverage. As far as I know, no speakers were consulted during the making of this schedule.
Doesn’t This Hurt the Community?
A successful PASS Summit is a good thing for the community. However, with the poor management of C&C, the marketing for the event has been weak, and while most other events are moving to free or freemium models, PASS continues to charge a premium. The platform PASS is using hasn't been demoed to speakers or attendees to show how it would have value over a free conference like EightKB or Ignite.
I’m not going to speak at PASS Summit. I’m going to record my session, and put it on YouTube, so everyone can watch the session. And I’ll do a live Q&A to talk about it—it’s a really cool session about a project I’ve worked on to aggregate query store data across multiple databases. I challenge other speakers to follow me—the conference is so bad and so expensive, because C&C is trying to prop itself up on the back of the community. C&C needs to go away before we can move forward. I was frustrated before, but this Summit fiasco has really pushed me over the top.
Are you new to Azure Data Factory and wondering what you don't know you don't know? The learning curve with new technologies can sometimes lead to some major refactoring down the line once we realize our mistakes. Join Meagan Longoria and Kerry Tyler to learn how to set up your data factory for success. They will start by discussing naming conventions, parameterization, Key Vault usage, and deployment with Azure DevOps. Then they'll share their recommendations on pipeline hierarchies, activity dependencies, error handling, and monitoring. Watch this webinar to help your organization avoid Data Factory regrets!
Watch Denny and Joey from DCAC, and Rob Krug from Avast as they talk about enterprise security, where companies fail from a security perspective, and what small / medium companies can do to get enterprise-grade security features without breaking the bank.
As Microsoft MVPs and Partners, as well as VMware experts, we are summoned by companies all over the world to fine-tune and problem-solve the most difficult architecture, infrastructure, and network challenges.
And sometimes we’re asked to share what we did, at events like Microsoft’s PASS Summit 2015.