
Monthly Archives: June 2011

SQL Backup striping and mirroring to get rid of tape

I’ve been trying to get a handle on all of the SQL Servers in our enterprise environment. The first task was identification, which came with a host of challenges: no naming convention, developer machines, applications that install SQL Server that I didn’t care about, strange firewall software blocking remote management, and multiple-instance boxes.

Once I got a list of servers, the thing I wanted to check most was, “Are backups happening?” I didn’t really care at this point whether the backups were actually good; I just wanted to make sure they were happening at all. Sure, there were a couple of servers without backups, but percentage-wise we were doing really well. You can check the last backup date with the SSMS GUI: go to Properties -> General. Or run a query that joins the backup sets to sysdatabases.

-- Most recent backup finish date per database (NULL means no backup recorded)
SELECT dbs.name,
       MAX(baks.backup_finish_date) AS last_backup_finish_date
FROM sys.sysdatabases AS dbs
LEFT JOIN msdb.dbo.backupset AS baks
    ON baks.database_name = dbs.name
GROUP BY dbs.name
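
To flag the stragglers directly, the same join can be filtered down to databases with no backup at all, or none in the last day. A minimal sketch along those lines:

-- Databases with no recorded backup, or none in the last 24 hours
SELECT dbs.name,
       MAX(baks.backup_finish_date) AS last_backup_finish_date
FROM sys.sysdatabases AS dbs
LEFT JOIN msdb.dbo.backupset AS baks
    ON baks.database_name = dbs.name
GROUP BY dbs.name
HAVING MAX(baks.backup_finish_date) IS NULL
    OR MAX(baks.backup_finish_date) < DATEADD(day, -1, GETDATE())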

So backups were happening; then fast forward to our certain demise. The disk backup and dedupe appliance we were using didn’t support daytime activity. That was fine for most servers and databases, which were in simple recovery mode. Another problem was that the appliance didn’t support very large databases. Next, it was out of both space and time to run the daytime processes.

We started attempting to squeeze our VLDBs into a tape solution we already had. The switchover went very poorly and restores took days… yes, over 24 hours. So we’re stuck with some backups over here, mirrored offsite, and other backups over there, not mirrored. The replication of the tape storage dies every time we need to do a restore, so it is quite frequently behind.

It’s a mix-n-smash of technologies that seem to be changing faster than I can document. So I have charged myself with coming up with a solution that will at least work for SQL. One of the ideas I have is to find a piece of disk storage that simply does replication offsite. Then just send the compressed backups to it and, boom, the files are offsite.

Offsite replication is an extremely expensive feature. I just need some cheap storage for backups, but as soon as replication is added the price goes through the roof.

There are some relatively inexpensive rack-mount NAS solutions. I don’t really want to get into the game of rolling my own Linux/FreeNAS solution, and I don’t want this device to be dependent on any existing disk solutions. I would like hot-swappable drives, but they do not have to rebuild fast. It also has to integrate with AD.

If I got two separate targets I could run the backups locally and then robocopy them offsite. Or, I could use the MIRROR TO option that is built into SQL Server 2005 and up. I tried testing this on my Express instance and got the dreaded error: “Backup mirroring is not available in this edition of SQL Server.” So I checked it out on another instance and, sure enough, two exact copies. Delete the first file and you can restore from the second file without any problems.

BACKUP DATABASE [dba_maint]
TO DISK = N'C:\db\dba_maint_principal.bak'
MIRROR TO DISK = N'C:\db\dba_maint_mirror.bak'
WITH FORMAT -- a mirrored backup has to start a new media set
GO

If one or the other file location is inaccessible the whole backup will fail. This could be problematic if your offsite server is… offline constantly. I have not tested this, but I also believe the whole backup may be slower if the connection offsite is slow.

Not only can you mirror the files, but you can also stripe the backup, writing to multiple files at the same time. The benefit is hitting more disk spindles and removing other bottlenecks in the backup.

BACKUP DATABASE [dba_maint]
TO DISK = N'C:\db\dba_maint_principal.bak',
   DISK = N'C:\db\dba_maint_mirror.bak'
GO
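
The two can also be combined: stripe across multiple files and mirror the whole set. A sketch, untested, using the same made-up C:\db paths:

BACKUP DATABASE [dba_maint]
TO DISK = N'C:\db\dba_maint_stripe1.bak',
   DISK = N'C:\db\dba_maint_stripe2.bak'
MIRROR TO DISK = N'C:\db\dba_maint_stripe1_mirror.bak',
          DISK = N'C:\db\dba_maint_stripe2_mirror.bak'
WITH FORMAT
GO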

Putting both of these files on the C: drive wouldn’t give you any boost. So RAID 0 or RAID 1… what will it be? Before you go striping, you might be better off creating a big RAID 0 array and storing the backup in one file. If you stripe, you run the risk of not having all the files at restore time. Adding moving parts to your backups will complicate things. Remember that you are not always around when databases need to be restored.
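For the record, restoring a striped backup means listing every file in the set; a sketch using the two files from the striping example above:

-- Both files are required; lose either one and the set is unrestorable
RESTORE DATABASE [dba_maint]
FROM DISK = N'C:\db\dba_maint_principal.bak',
     DISK = N'C:\db\dba_maint_mirror.bak'
GO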

If I get a local NAS and an offsite NAS, I could push the files in a MIRROR fashion as long as Kerberos works. Long story short, there are options for me to get rid of the stacks of 8-tracks.
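
If that pans out, the backup could point straight at both boxes over UNC paths. A sketch with hypothetical share names, assuming the SQL Server service account has rights to both shares:

BACKUP DATABASE [dba_maint]
TO DISK = N'\\localnas\sqlbackups\dba_maint.bak'
MIRROR TO DISK = N'\\offsitenas\sqlbackups\dba_maint.bak'
WITH FORMAT
GO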

 

Posted on June 29, 2011 in Network Admin, SQL Admin

 

A good key exchange

Last weekend I needed someone to let the dog out. I have some good friends who live close by, and I actually had a key made for them, but it worked out that it would be best if I just left a key out because they wouldn’t be home to pick up the spare. So I left yet another spare on the porch and texted them a picture of where it was. No problems.

File server security is a bit different. Mostly there are key pairs: public/private… private/private… and so on. Just like in the movies, you need two keys to launch the missiles. If the president had his key sitting under a rock and texted pictures around, Bobby G. would probably be able to find it and fire away. If you are an IT person, any time you are storing secure information, just assume it’s nukes.

With SFTP (FTP over SSH) you can secure file transmissions with a public and private key very easily. I have instructions on setting up an SFTP server in an earlier post. Once you set up a server you can assign each user a key pair. All you need is the public key to do this, so it is best if the user creates the key pair. PuTTYgen (puttygen.exe) will use randomness from your mouse movements to create a good key pair ( http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html ). This way the private key never has to change hands.

If you add a protecting passphrase to this key, you won’t be able to easily script SFTP transfers. As long as nobody has access to the private key, no passphrase is OK. On top of the keys, access should only be granted if the proper password is supplied. Force clients to supply a password, and give it to them on a different medium than the one the key was exchanged on. Expiration is a pain if everything is automated, but it is required to meet most security standards, so expire your keys and passwords after a certain period of time. So we have a key pair and a password; that is secure, right? Well, yes and no. The transport layer is secure, but the file itself is not secure just yet.

There are a few things we can do to secure a file. PGP would be a top-notch security mechanism: it creates yet another key pair and encrypts the file itself. PGP is great, but don’t discount the value of good file system security. Do not let users browse around the computer they are connecting to, especially not other users’ directories. Control local and network users’ privileges as tightly as possible.

To wrap things up: in a good key exchange, the client creates the key pair and never lets anyone touch the private key(s). And once the exchange is complete, make sure you lock things up on the server side as well.

 

Posted on June 27, 2011 in Network Admin

 

A couple SQL Speedbumps

We’ve run across an issue in Windows 2008 when formatting drives. Our procedure for spinning up new SQL VMs is to format the drives and then join the domain, so the local administrator account is the one that formats the drives. When this is done, domain admins do not have access to the drives when logged into the server locally. They just appear as NTFS drives, and you can’t even see how much free space they have without using WMI, perfmon, or logging in as local administrator. SQL Server ran fine, so this is more of a security nuisance.

You can get access from another machine using the admin shares (e$, f$, etc.). We realized this was easiest to resolve by moving the files off, formatting the drives (64K allocation units, of course) with a domain admin account, and moving the files back onto the drives. Now when you have to remote in, you can actually see free space and browse to see what is on the drives. Rather helpful when troubleshooting.

I came across another server with this issue, but I knew there wasn’t anything on the drives except tempdb. I know tempdb gets cleared out every time the instance starts, so I chose the lazy method and just formatted the drives instead of wasting time moving tempdb around. When I tried to start the instance, it failed. I checked the event log first and found:

Unable to open the physical file “G:\tempdb.mdf”. Operating system error 3: “3(The system cannot find the path specified.)”.

Interesting, I say. I thought SQL Server creates this file every startup, but maybe it just clears it out. I’ll have to do some research to figure that out later.
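
For what it’s worth, the paths SQL Server goes looking for are whatever is recorded in master; once an instance is up, you can see them with a quick query (database_id 2 is tempdb):

SELECT name, physical_name
FROM sys.master_files
WHERE database_id = 2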

I know that restoring master can be a bit tricky and that you have to use trace flags to start up the instance in single-user mode. I thought this might be the answer for fixing tempdb as well. This KB proved helpful: http://support.microsoft.com/kb/288809

To start SQL Server in single user mode, start an administrator command prompt and run this:

%installdir%\sqlservr.exe -c -f -T3608 -T4022

Now you can open SSMS, but another speed bump will stop you from doing much. When you connect to the instance as normal, the Object Explorer on the left uses up your only available connection. If you try to right-click on the server and choose New Query, you will get a connection failed message. Disconnect the Object Explorer and connect the new query window instead. The KB will reset a bad tempdb, but what I tried first was “USE tempdb”, which completed successfully. I browsed to the directory and saw everything was in order, or so I thought.
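
The reset itself mostly comes down to repointing tempdb’s files with ALTER DATABASE. Roughly something like this, assuming the default tempdev/templog logical names and the G: drive from the error above (the KB has the authoritative steps):

-- Changes take effect the next time the instance starts
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'G:\tempdb.mdf')
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'G:\templog.ldf')
GO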

When I tried to restart the instance without the startup flags, I hit another speed bump. The event log this time said access denied. We’ve made long strides in SQL Server security and now have a domain user account with minimal privileges running SQL Server. It turns out it did not have access to this newly created tempdb file.

This unplanned DR practice was good for my experience. To sum things up… shortcuts are good, just don’t be lazy.

 

Posted on June 17, 2011 in SQL Admin

 

Load Testing with JMeter

Getting an application to its peak performance requires a load testing phase. When the web servers and database servers are all set up and ready for production, you should run a test that simulates the number of users you expect. You should also run more tests to determine the system’s maximum capacity.

Max capacity testing is more fun. JMeter has tons of features to perform these tests. It’s written in Java and isn’t the easiest tool to use, so I will walk through a basic website test.

After you install Java and download JMeter (http://jakarta.apache.org/jmeter/), open up bin/jmeter.bat. The first thing to do is add a couple of items to our test plan. The first is a thread group, which lets you group together the actual tests, or “samples”. We’ll also want to view the results, so add a “Summary Report” item to your test plan as well.

Now we can start adding HTTP Request samplers. The only things you have to change are the server name and the path. For my blog those are dustinandmandy.dyndns-home.com and wordpress.

Once you add a few different HTTP requests with different URLs, select the thread group. Check the Forever box next to Loop Count. Here you can also edit the number of threads. Threads is a setting you will keep coming back to later, but for now 1 thread, or “user”, is good.

The test is set, but we want to save off the results in a readable format. Go to the Summary Report, click Browse, and type in a .csv path and filename. Click Configure and check these items; I have found them useful in website load testing:

1. Save Response code (200, 404, 500 etc.)
2. Save Time Stamp (when the test started)
3. Save Field Names (to help sort and filter the .csv)
4. Save Latency (perf data)
5. Save Elapsed Time (perf data)
6. Save byte count (useful to compare size to time)

When I started out I always saved XML output. You have a couple more options there, such as saving all of the HTML response data, but it adds a couple of steps to analyzing the output. I actually wrote a quick program to parse the latency timings and response codes out of that XML into a .csv. That proved a waste of time once I figured out JMeter could do that 🙂

The Summary Report page is where you can watch the results pour in as you run the test. Ctrl+E clears the results (but not the results file). Ctrl+R runs the test, and Ctrl+”,” gracefully shuts it down. Ctrl+”.” sends a reset, which stops the tests instantly but will give you errors in your results for any open threads.

Run a few tests and take a look at the results. Once you make sure you are getting all 200s, you can tweak the number of threads. What JMeter does is open “n” HTTP connections and send requests; as soon as a response is received, another HTTP request is sent. This isn’t a normal user’s pattern. Normal users will have a delay while they read each page. Depending on your content, you may assume that users spend about 10 seconds on each page, so to simulate 100 concurrent users you only need about 10 concurrent threads (roughly, threads = users / time spent on page, which works out if each request itself completes in about a second, so one non-stop thread generates the load of roughly ten real users).

What you want to do is keep increasing the threads until you start receiving server errors or see diminishing returns in the number of samples you can get through in a given amount of time. If you check the Scheduler box, you can set a test duration. This would be a full capacity test. Keep an eye on the servers’ CPU %, network utilization, memory, and disk metrics. Also, watch your test computer’s metrics. What you might find is that a desktop with a 100Mb NIC is actually the bottleneck in the test. If this is the case, you can run the test simultaneously from different computers. You may be able to have 100 active threads, but with only 1 core you are not really testing concurrency to the best of JMeter’s abilities. You can definitely max out servers with one testing box and 100 threads, but you will get a more accurate capacity number if you have 100 testing boxes with 1 thread each.

JMeter is a very useful tool to have in your tool belt. Use it to help work out all the kinks before you go live.

 

Posted on June 12, 2011 in Network Admin