Monthly Archives: September 2010

database throughput through the virtual layer

VMs or virtual machines are all the craze, and for good reason. When you RDP to a VM you are connecting to the guest. This guest OS looks just as it would if it were installed directly on the hardware without a virtual layer. Many guest VMs can reside on a single physical host. Some new large servers advertise hosting 100 VMs. So instead of one gigantic powerful instance you can now have 100 instances of all different sizes.

Throughput is the most important measurement for data warehousing or OLAP type environments. Large amounts of data is written to and read from these environments in a sequential fashion. When considering throughput make sure you know the conversions of bits and bytes such as Gbps and MBps. A stereotypical SAN guy only thinks in IOPS but make sure you tell him NO, we’re looking at bandwidth. Tell him (or her…) a 1 MB IO will make MBps = IOps. Hardware bottlenecks that can come into play are:

1. Virtual host layer: the hypervisor can give you a ton of metrics and configuration options, be thorough with your driver updates and research.
2. Processors: each core generally pushes 200MBps, VMWare encourages scaling out with their 8 core limitation.
3. PCIe: 1 lane can push 500MBps so make sure you are getting a few 8x or one 16x slot, then you can be sure its not your bus.
4. HBAs: these cards plug into the PCIe slot and are measured in Gbps, make sure you find the “negotiated speed”. Blade servers generally aren’t chosen for datawarehouses because of this bottleneck.
5. ISL: inner switch link, watch this for too much % utilization
6. SAN switch: the last line before your SAN
7. Storage processor: Comes into play when hitting SAN cache
8. San inter-module switches: may come into play if striping across modules in your SAN
9. Disk: iSCSI, FC etc. Each disk has its limits so make sure you have enough disks striped and mirrored to support the previous pieces

Now that we’ve walked the lines to your data let me touch on two software technologies. Windows MPIO and VMware round robining. The SAN Fabric has many paths it can take to get to the end. These two software technologies offer a way to utilize all the available paths. Windows multipathing is more advanced than vmware’s option and can work better with some SANs. You may want to consider putting your large objects in files that are striped across several LUNs. This will allow you to utilize several mirrored pairs of disk #9 for more throughput.

It is rare that anything is ever as good as advertised. This is true for throughput measurements. To figure out if you are getting what you are told use a handy tool called SQLIO from Microsoft. It is a simple .exe that has a handful of important parameters:

1. IO size in kB with -b: knowing what your SAN is good at can be helpful. For a SQL dw test use 512 for this parameter
2. param.txt with -F: path to a file that has a list of files to do your testing. Do a test with a lot of small files to test your SAN cache and do a test with a lot of large (2x your cache) files to test your disk.
3. io style: random or sequential, choose sequential for a SQL dw test
4. -k reads or writes: use W to test your SAN’s write caching ability, some SANs also have advanced read ahead features for large sequential reads
5. -s seconds to test: large test files take a while to build (hours). small tests over and over again are good but you might want to do a test for a few days to uncover any contention issues.

Most of my information is coming from the fast track program.

The idea behind fast track is not to over buy any one component, keep costs down and throughput through the roof for data warehouses. I haven’t talked much about the virtual layer but there are a few things to keep in mind:

1. correct # of paths and round robining: ask your SAN vendor about the # of paths
2. queue depth for vmware, also check queue depth of your HBA in the bios
3. IO drivers: make sure your SAN is VMware friendly
4. contention/ contex switching: some SANs are good at large sequential reads so don’t have 2 VMs interrupting each other
5. if you need more than 8 cores you are SOL for a while with VMWare, most fasttrack systems start at 16 cores.

If you checked out the fasttrack link you will notice some of the pre-tested architectures and their throughput measurements. Check the windows (guest) perfmon counter Disk Bytes/sec to line up with this value. I have witnessed that counter reach 3.7GBps so I know its possible. And yes that is a big B.

Leave a comment

Posted by on September 26, 2010 in Network Admin, SQL Admin, Virtual



Under the covers of MSSQL are these roles: sysadmin, datareader, datawriter, security admin, backupadmin etc. In larger organizations there should be three different people with these jobs

1. sysadmin – all access vip, often has windows admin and disk subsystem access as well
2. dbo – vip but only for a specific databases and not system level
3. developer/user – read and/or write to specific objects

Only level 1 qualifies you as a dba. Some dbas don’t know how to function at level 2 and 3. This is ok but not advised. DBAs need some level of automation skills and should be able to relate to developers. They also need to understand why the database structure is the number reason for performance problems.

Regular MSSQL DBA task include:

1. making sure backups are happening
2. monitoring performance
3. monitoring disk space
4. making sure index maintenance is happening
5. setting up new instances
6. consolidating instances

Two more that could fall under either dba or dbo job are

7. making security changes
8. making schema changes

Internally written databases cause more security and schema changes. With internally developed databases also comes maintaining test and qa instances.

Active directory integration will make security updates less frequent. Its a good idea to have a network login decide weather or not a user has access to the data. The network credentials are securely stored and passed to the database that way to avoid having to store plain text passwords. Its called windows authentication or integrated security. SQL authentication is sometimes easier to configure less secure. In the server properties you can turn off SQL authentication entirely.

After security and schema changes, users generally request changes to increase performance. These actions can be taken to increase performance at the system level.

1. more tempdb data files (1 per core)
2. set SQL max memory so the OS does not have to compete for RAM
3. put data files and log files on separate disks
4. put tempdb files on its own disks
5. set initial file sizes large enough so they do not autogrow
6. -E startup param for larger pages (4MB)
7. 1444 for equal file growth and fill
8. eliminate disk communication bottlenecks (upgrade HBAs and SAN switches)
9. turn on data compression and backup compression (SQL 2008+)

At the database level these actions can produce excellent results

1. decrease lock escalation
2. normalize tables
3. remove unused indexes
4. add needed indexes
5. rebuild or reorganize indexes
6. switch to simple recovery model
7. do not auto-close
8. backup->restore->and grow dbs that might have become fragmented at disk level

Leave a comment

Posted by on September 8, 2010 in SQL Admin