back to t-sql basics (1 of 2)

28 Nov

The first 236 pages don’t get into IUDs or Insert, Updates and Deletes. No data modification in the first 7 chapters which is how DBAs should start out. I am about 1/2 way through the book so I decided I would pause and type up some of my notes and thoughts.

T-SQL Fundamentals is one of the first technical books I have decided to actually read. All of the other technical books I have read I used as a reference manual and maybe focused on a chapter that relates to something that had me stumped.

Taking a pen to paper really helps me stay focused when reading. I use blank sheets of paper without lines so if I feel the urge to draw a picture or change a font size I can do so without limitation. The author stresses several times to do the exercises to really understand what is being covered. I agree, and that is part of my Learn One, Do One, Teach One path to mastery.

The first notable item that I have come across in the book is a definition of the resource database. This appears to be a behind the scenes database that stores system objects. Short explanaions of all the systems databases are as follows:

(1, p15)
1. master – stores meta information and configs
2. model – new db template
3. tempdb – work tables, sort space, row versions (rebuilt on sql restart)
4. msdb – sql agent data
5. Resource – system objects

Unique constraints – first thing that comes to mind is a primary key, however these can be things like SSN or phone number that are not the primary key.
(1, p21)
A foreign key is a link to another table.

Alter table Orders
add constraint 'FK_Blah'
foreign key (empid)
references dbo.employees(empid)

The order of which queries are processed is important when trying to understand the results.
(1, p26)
1. FROM clause – code should always schema qualify tables in the from clause “dbo.table” instead of “table”
2. WHERE clause – the filter
3. GROUP BY clause – SUM() AVG() agg functions are required for selected columns when grouping so only one value is returned
4. HAVING clause – another filter
5. SELECT – column or fields to return
6. ORDER BY clause – It is stressed several times that order is not guaranteed unless this is added

Agg functions ignore NULLs except for count(*)
(1, p42)
TOP – an option to limit results. processed as part of the select which is similar the time DISTINCT

(1, p45)
OVER clause – exposes a window of rows to an aggregate function

select orderid, custid, val,
100. * val/sum(val) OVER() as pct_all
100. * val / sum(val) OVER(PARTITION BY custid) as pct_cust
from sales.ordervalues;

note 100. causes an implicit conversion to decimal where as 100 would be implicit to int

This sample OVER query returns two percentages for each orderid so you can tell how big the order was compared to that customers orders(the partitioned one) and all orders.

OVER supports 4 ranking functions

DENSE_RANK – 5,6,6,8
RANK – 5,6,6,7
ROW_NUMBER – 5,6,7,8
NTILE – 1,1,1,1 (I assume this is like a percentile. if you pass in 100 you get 1 for top 1 percentile)


(1, p53)
*(multiply) / (divide) % (modulo)
+ –
= > < !=

(1, p54)
CASE statements

select prod, prodname, cat_id
CASE cat_id
  WHEN 1 then 'Bevs'
  WHEN 2 then 'produce'
END as category name
from products

(1, p58)
An important thing to note about nulls:

select * from blah where col  'MI'

Although NULL does not = MI this query will not return records where col is NULL

col = NULL <—-returns no rows
col IS null <—- returns proper results

NULLS are sorted before values in col such as 'MI' or 'AL'

Also, sql doesn't process left to right in where clauses. EX:

Where col1 0 AND col2/col1 > 2

(1, p69)
Character Data (just the highlights)
STUFF -> remove substring and insert new string
UPPER -> convert to uppercase
SELECT RIGHT(‘12345’, 3) -> 345
wildcards -> %, _, [^A-E]

(1, p79)

SQL Server recognizes string literals such as ‘20070212’ and will convert. Its suggested to always use YYYYMMDD so language options don’t get confused.

This query is bad for indexes

select blah from orders
WHERE YEAR(orderdate) = 2007

Change the where clause to orderdate = ‘20070101’ so the function doesn’t have to complete for every row.

select DATEADD(YEAR, 1, ‘20091010’) -> 2010-10-10 00:00:00.000
DATEDIFF(day, ‘20080101’, ‘20090101’) -> 365
DATEPART(month, ‘20080210’) –> int of 2
ISDATE(‘20090101’) 1=TRUE 0=FALSE

(1, p101)
The inner workings of these escaped me for quite a while. As soon as I thought I knew what was going on I would be challenged to write a multi-table query and struggle.

The different types of joins are Cross, Inner(default), Outer
Apply, Piviot and Unpiviot are other types of joins

A cross join creates a Cartesian product where you simply get a result = (tableA # of rows) * (tableB # of rows)

from sales.customers as C
cross join hr.employees as E


1 |1
1 |2
2 |1
2 |2

The book recommends using ANSI-92 format instead of ANSI-89.


select a.blah, b.blah
from table1 as a, table2 as b
where a.blah = b.blah

note: without the where clause it would be a cross join


select c.custid, count(*) as numorders
from sales.customers as c
left outer join sales.orders as o
on c.custid = o.custid
group by c.custid;

This will get a table with rows for every order and then all matching customer id’s will be grouped together and the second column will be a number of rows that were grouped.

Sub Queries (1, p133)
These can solve a lot of the same problems as joins. There are two flavors, self contained and correlated. Self contained are much easier, especially when debugging because the inner query can be ran separately from the outer query.

Most self contained inner queries return a scalar, or a single value.

select orderid, orderdate
from sales.orders
where orderid = (select max(o.orderid) from sales.orders as o)

Some self contained inner queries return multiple values by using the IN operator instead of =.

select orderid, orderdate
from sales.orders
where orderid IN (select o.orderid from sales.orders as o where o.lastname like 'D%')

Another example of a subquery with exists in a where clause. Most of the time we should avoid where clause function type processing but the EXISTS clause is performent.

where blah = blah
and exists (select blah)

Table expressions and derived tables

select orderyear, count(distinct custid) as numcusts
from (select year(orderdate) as order year, custid
      from sales.orders) as D
group by orderyear

(1, p167)
Common table expressions (CTEs) are better than derived tables because they have recursive capabilities

WITH USACusts as
select custid, companyname
from sales.customers
where country='usa'

Set operations

Operations performed on two query results. ORDER BY returns a cursor so the sets cannot use this.

(1, p197)
UNION (distinct is implicit)
-union all

INTERSECT – return rows that are in both query and query

EXCEPT – exclude results that are in query B

1. Ben-Gan, I. (2009). Microsoft SQL Server 2008 T-SQL Fundamentals. Redmond, Washington: Microsoft Press.

Leave a comment

Posted by on November 28, 2011 in SQL Dev


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: