Only use CLR when you need to

Published On: 2019-07-23

There are people out there who love SQL CLR, and there are people who hate it. I have a slightly different opinion: I hate how most people use SQL CLR. The trick to SQL CLR is that you should only use it for things that CLR does better than T-SQL, which mainly means text manipulation such as RegEx. T-SQL can’t do RegEx, so SQL CLR is the way to do it. If you have SQL CLR calling out to a web server, running SSIS packages, or pulling in files from a file server, all from T-SQL, then you probably need to take a step back, look at what you’re trying to do, and possibly redesign it.
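As a sketch of what a good use looks like: assuming you have already compiled and deployed a .NET assembly exposing a RegEx match function (the assembly, class, table, and file path names here are all hypothetical), wiring it up to T-SQL would look roughly like this:

-- Register the hypothetical assembly; SAFE is enough for pure string work
CREATE ASSEMBLY RegexUtils FROM 'C:\CLR\RegexUtils.dll' WITH PERMISSION_SET = SAFE;
GO
-- Expose the CLR method as a T-SQL scalar function
CREATE FUNCTION dbo.RegexIsMatch (@input NVARCHAR(MAX), @pattern NVARCHAR(4000))
RETURNS BIT
AS EXTERNAL NAME RegexUtils.[UserDefinedFunctions].RegexIsMatch;
GO
-- Usage: find rows whose Email column matches a simple pattern
SELECT * FROM dbo.Customers
WHERE dbo.RegexIsMatch(Email, N'^[^@]+@[^@]+\.[A-Za-z]{2,}$') = 1;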

Denny


Azure SQL DB Failover Group Orphaned Users

Published On: 2019-07-19

I have a client that resides in the Azure data platform arena, focusing mainly on Platform as a Service (PaaS) and Azure SQL DB. They have a decent landscape of Azure resources and currently utilize failover groups within SQL DB to provide high availability. Under the covers, failover groups are essentially availability groups, and they come with issues similar to the ones you might encounter with the on-premises version.

A common issue that you might encounter, and one my client hit recently, revolves around orphaned users. Orphaned users occur when the user object in the database has a different SID (security identifier) than the one the login says it should have. It is also possible that the login does not exist at the server level at all. Orphaned users are specific to SQL logins; they do not occur with Active Directory authentication. With on-premises databases, this was commonly seen when restoring a database from one server to another, and you had to manually adjust the database user to fix the incorrect SID. Orphaned users can also occur with Azure SQL DB and failover groups. The login is first created on the primary server, and then the database user is created in the user database. The syntax would look like this:

-- While in Master
CREATE LOGIN [User2] WITH PASSWORD = 'ThisIsNotThePassword1234'
GO
-- While in the user database
CREATE USER [User2] FROM LOGIN [User2]
GO

As soon as the database user is created, the command is sent to the secondary replicas. However, the login is not sent, which causes the SID of the database user to not match any corresponding login on the secondary server. In addition, just like availability group secondaries, the secondary databases in a failover group are read-only, so you cannot modify the data or the database, including user objects. You can, however, modify the logins in the Master database on the secondary, which is what allows you to resolve the issue.

Let’s take a look.

Setup

I’ve got two servers in my Azure subscription, one is located in the East US region and the other in the West US region, each hosting a SQL DB named “Demo1”.

I have configured a failover group between these two servers such that sqldbdemo-east is the primary and sqldbdemo-west is the secondary.

Using SSMS, we can connect to both servers and see that User1 is a login on both servers. User1 is also a database user for Demo1 on both servers (it’s not shown, but trust me). Since sqldbdemo-east is the primary, we can add a new user.

Before we get started, remember that with SQL DB you can’t use a USE statement, so you must be connected to Master to create the login.

CREATE LOGIN [User2] WITH PASSWORD = 'ThisIsNotThePassword1234';

Once the login has been created, we can create the corresponding user within the database Demo1. Make sure to change the context of the query to the database in question.

CREATE USER [User2] FROM LOGIN [User2];

We can verify that the new user exists in Demo1 on sqldbdemo-east.

If we check the secondary database on sqldbdemo-west, we will also see that the user was created there. Remember that this user was created by the failover group automatically; you do not have to create it manually.

We can also compare the SIDs for both users to ensure that they are the same.
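A minimal way to do that comparison is to query sys.database_principals in Demo1 on each server; the sid values returned should match:

-- Run in the Demo1 database on both primary and secondary
SELECT name, sid
FROM sys.database_principals
WHERE name = N'User2';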

However, the issue comes into play because the login for User2 does not yet exist on sqldbdemo-west.

Since there isn’t a login associated with the user, someone using the credentials for User2 will not be able to authenticate properly and will most likely get a login failure error.

This can be fixed by adding the login User2 to sqldbdemo-west.  We will use the same password that was used to create User2 on sqldbdemo-east.
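The command is the same one we ran on the primary:

-- While in Master on sqldbdemo-west
CREATE LOGIN [User2] WITH PASSWORD = 'ThisIsNotThePassword1234';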

Once the login is created and granted access to Master, we can then log into sqldbdemo-west as shown below.

Here we can see that User2 has been authenticated and can see the databases on the server. However, if I attempt to access Demo1, I get an error.

The SIDs Did It

This issue occurs because the SIDs for the two logins are not the same. Since they differ, the user object in Demo1 on sqldbdemo-west is unable to authenticate against its login. Remember that because it is a secondary replica, the database is read-only, so you will not be able to do anything with the user object. Instead, you will have to drop and recreate the User2 login on the secondary server with the appropriate SID. You can find the correct SID to use by looking in these places on the primary server:

  • Database level – sys.database_principals
  • Database level – sys.sysusers
  • Master level – sys.sql_logins (see the example query below)
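For example, a minimal query in Master on the primary server pulls the SID to copy:

-- While in Master on the primary (sqldbdemo-east)
SELECT name, sid
FROM sys.sql_logins
WHERE name = N'User2';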

Once you have obtained the correct SID value, while in the context of Master on the secondary server, do the following:

-- While in Master
DROP LOGIN [User2]
GO
CREATE LOGIN [User2] WITH PASSWORD = 'ThisIsNotThePassword1234', SID = 0x01060000000000640000000000000000CA6EAC7C69DC024DBB850F80F8E595E6
GO

Now that the database user’s SID matches the login’s SID, User2 can authenticate to the server as well as access the database.

Summary

The cloud offers up a lot of new things; however, orphaned users can happen there just like they do with on-premises instances. This only happens with SQL logins, but the resolution is straightforward once you see where the issue lies. The next time you have to create user accounts in a failover group, make sure to use the right SID from the start and you will save yourself some headaches.

© 2019, John Morehouse. All rights reserved.



Understanding Columnstore Indexes in SQL Server Part 3

Published On: 2019-07-17

My last two blogs were on understanding columnstore and creating columnstore indexes. In this post, I will touch on one last topic: when to use columnstore indexes. As noted in my prior posts, these indexes are designed for large data warehouse workloads, not normal OLTP tables. As with any index design, it is important to know your data and what you will be using it for. Be sure to look at what types of queries you will be running against it, as well as how the data is loaded and maintained. Let’s look at a few questions you should ask yourself before deciding to implement a columnstore index. Just because your table consists of millions of rows doesn’t mean columnstore is the right way to go.

First you need to know the data you are designing the index for.

Is your table large enough to benefit? As we reviewed in my first post, this usually means tables in the millions of rows, because the rows are divided into groups of rows called rowgroups. A rowgroup holds a minimum of 102,400 rows and a maximum of approximately 1 million rows, and each rowgroup is compressed into column segments. Therefore, a columnstore index on a table with under a million rows doesn’t make sense: if the table is too small, you don’t get the compression benefits that come with the column segments. A general recommendation is to use columnstore indexes on the fact tables in your data warehouse, and on very large dimension tables containing over 5 million rows.
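If you want to check how the rows of an existing columnstore index landed in rowgroups, SQL Server 2016 and later expose a DMV for it. A minimal sketch, assuming a hypothetical fact table named dbo.FactSales:

-- One row per rowgroup, with its state, size, and deleted-row count
SELECT OBJECT_NAME(object_id) AS table_name,
       row_group_id, state_desc, total_rows, deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactSales');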

Is your data volatile, meaning it changes frequently? The rule of thumb says you want tables that rarely have data modifications; more specifically, tables where less than 10% of the rows are modified. Large numbers of deletes can cause fragmentation, which adversely affects compression rates and thus reduces the efficiency of the index. Updates in particular are expensive, as they are processed as deletes followed by inserts, which will hurt the performance of your loading process.

What data types are in your table? There are several data types that are not supported within a columnstore index. Types like varchar(max), nvarchar(max), and varbinary(max) were not supported until SQL Server 2017, and typically aren’t the best fit for this type of workload. Uniqueidentifier (GUID) columns were in the same boat for a while: they were not supported until SQL Server 2016, so on older versions they will prevent you from creating the index.

Next, what are you doing in your queries?

Are you doing aggregations or performing analytics on the data, or are you looking for specific values? Standard B-tree rowstore indexes are best for singleton lookups (single values) and are sometimes used in tandem with a columnstore index. If you’re using an index to cover a WHERE clause that does not look up a range of values and is just filtering on predicates, then columnstore does not benefit you. This is especially true if you need to “cover” that query with included columns, since columnstore does not allow included columns. However, columnstore is designed to quickly process aggregations, especially on a grouped range of values. So, if you are performing aggregations or analytics, columnstore can usually give you large performance gains, as it can do full table scans to perform aggregations very fast.
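As a sketch, assuming the same hypothetical dbo.FactSales table with a columnstore index (column names assumed), this is the kind of grouped aggregation that benefits:

-- Full-scan aggregation over every row, grouped by product
SELECT ProductKey,
       SUM(Quantity)  AS TotalQuantity,
       AVG(SalePrice) AS AvgSalePrice
FROM dbo.FactSales
GROUP BY ProductKey;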

Now there are times where you want to seek a specific value and also perform aggregations (think average sale price in the last quarter for a specific product). In these cases you may benefit from a combination of a rowstore and a columnstore index: create the columnstore to handle the grouping and aggregations, and cover the index seek requirement with the rowstore. Adding these b-tree indexes will help your query performance, but they can dramatically impact the loading process; if your data loads are large enough, it may be more efficient to drop the b-tree index and rebuild it after loading the data into the columnstore.
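A sketch of that combination, again on the hypothetical dbo.FactSales with assumed column names:

-- Columnstore handles the scans and aggregations
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales ON dbo.FactSales;
GO
-- A b-tree index covers the selective seek on product and date
CREATE NONCLUSTERED INDEX IX_FactSales_Product_Date
    ON dbo.FactSales (ProductKey, SaleDate);
GO
-- Average sale price for one product over the last quarter
SELECT AVG(SalePrice) AS AvgSalePrice
FROM dbo.FactSales
WHERE ProductKey = 42
  AND SaleDate >= '2019-04-01' AND SaleDate < '2019-07-01';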

Is this a data warehouse fact or dimension table? As we know, a dimension table is usually used to find specific values, mostly lookups to match with an aggregated value from a fact table. If it’s a dimension table, you will typically use a b-tree based model, with the exception of very large dimensions. We tend to use columnstore on fact tables in the data warehouse, as these tables are normally the source of aggregations.

Summary

Knowing your data is pivotal in deciding whether or not a columnstore index is best suited to your query needs. Don’t jump to columnstore indexes just because you now have an understanding of them and your tables are large. Take the time to choose the right indexing option for your usage patterns, the data, and the overall environment.

