Back To Basics: The SELECT Statement

There are four basic commands in databases.  They are SELECT, INSERT, UPDATE and DELETE.  Probably the most important of these is the SELECT command.  The SELECT command is how the data in the database is retrieved and displayed.

(All these code samples can be run on all versions of Microsoft SQL Server from 7.0 up.)

Like a regular sentence there are a few basic parts of the SELECT statement.  First there is a list of columns that you want to see.  Then there is the FROM portion of the statement.  This is the table or tables which you will be getting the data from.  Then is the WHERE portion of the statement which is the data filter.

When you put all this together the SELECT statement will look something like this.

SELECT id, name
FROM sysobjects
WHERE type = 'U'

If you were going to read this as a normal sentence it would read something like:
I want to see the "id" and "name" columns, from the "sysobjects" table, where the "type" column has a value of "U".

Now if we need to look at data from two different tables which is combined together we add a JOIN statement between the FROM and WHERE portions of the SELECT statement.

SELECT sysobjects.name, syscolumns.name
FROM sysobjects
JOIN syscolumns on sysobjects.id = syscolumns.id
WHERE sysobjects.type = 'U'

You can see from the above statement that we are getting the table names and column names for all the user tables. When you look at the JOIN command you see that we are matching up the id column from both of the tables. Now in this example the column name is the same in both tables. This isn’t always the case. If you don’t know how the tables relate to each other, you can usually look at the foreign key constraints to see how the tables relate to each other.

If you want to get counts of data you can use the GROUP BY clause along with the COUNT() function. In this next statement we will count the number of columns in each table.

SELECT sysobjects.name, count(*)
FROM sysobjects
JOIN syscolumns on sysobjects.id = syscolumns.id
WHERE sysobjects.type = 'U'
GROUP BY sysobjects.name

When ever you use a mathematical function such as MIN(), MAX(), AVG(), COUNT(), etc you have to add the GROUP BY clause to the SELECT statement so that SQL Server knows how to roll up the data.

Now as a last part of the SELECT statement to look at today, we are going to filter our GROUP BY query further, by showing all the tables that have over 5 columns. To do this we use the HAVING clause. Without the HAVING clause we would have to do a very complex query as a sub-query in the WHERE clause which would be very inefficient.

SELECT sysobjects.name, count(*)
FROM sysobjects
JOIN syscolumns on sysobjects.id = syscolumns.id
WHERE sysobjects.type = 'U'
GROUP BY sysobjects.name
HAVING count(*) > 5

The HAVING phrase is very simple in it’s syntax. You put the math function that you want to use (in this case COUNT(*)) then <, >, <> or = and what ever you want to compare it to.

I hope that you find this post useful. I’ve barely scratched the surface of the SELECT statement, and what it can do. I encourage everyone to open up Books OnLine and read through the information on the SELECT statement. It includes more examples, and some of the other options which are available to you.

Denny

Share

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Trust DCAC with your data

Your data systems may be treading water today, but are they prepared for the next phase of your business growth?