
SANs are different

Simple title...but to me, it makes sense, especially when we're talking about disk performance.

My understanding of disk performance has changed dramatically these last few days. I went into it believing that it was all about disk speed and max throughput, so an array of six 15k SAS disks in RAID5 was fast to me, and there was no way anything SATA could equal it.

However, when you speak of IOPS (I/O operations per second) - a term thrown around like jellybeans by SAN sales people - it really comes down to spindles, as in, the number of disks you have.

A good metaphor someone told me was to think of a library. It's a small thing for one person (one disk) to get six books that are on the same shelf. It's even easier for six people (the six SAS disks) to fetch six books, especially if they are all together. However, in an SQL environment, requests ask for data scattered all over the place.

So, if you then think of one person trying to get six books from all corners of the library, it makes sense for you to have one person for each book - in the time it takes one person to get one book, you have actually gotten six.

But...then we move from our six-disk RAID5 array to a 16-disk array...sure, the 'people' getting the books are a little slower, but you have an extra 10 people! With the EqualLogic option we're looking at, we'll be doubling the number of disks next year, meaning we would go from 6 spindles on our heaviest-duty SQL DB disk array to 32 spindles... Yeesh!

The SATA option we're looking at is probably 25% slower than the 15k SAS unit we're testing this week, but I'll have concrete numbers next week when we test out the 16x250GB disk SATA array. The 500GB option may even come into play, as the price difference is not that big.
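To put rough numbers on the spindle argument, here is a back-of-the-envelope sketch. The 180 IOPS per 15k SAS disk is an assumed rule-of-thumb figure, not a measurement from our hardware, and the 25% SATA penalty is just my guess from above. It also ignores RAID write penalties and controller cache, so treat the output as relative, not absolute.

```python
# Rough spindle-count arithmetic; every figure here is an illustrative assumption.
SAS_15K_IOPS = 180      # assumed rule-of-thumb random IOPS for one 15k SAS disk
SATA_PENALTY = 0.25     # the guess from the post: SATA is ~25% slower per disk


def array_iops(disks, per_disk_iops):
    """Aggregate random IOPS, ignoring RAID write penalty and controller cache."""
    return disks * per_disk_iops


sata_iops = SAS_15K_IOPS * (1 - SATA_PENALTY)

configs = {
    "6 x 15k SAS (today)": (6, SAS_15K_IOPS),
    "16 x SATA (one SAN array)": (16, sata_iops),
    "32 x SATA (after doubling)": (32, sata_iops),
}

for name, (disks, per_disk) in configs.items():
    print(f"{name}: ~{array_iops(disks, per_disk):.0f} IOPS")
```

Even with slower 'people', the 16-disk array comes out at roughly double the six-disk one under these assumptions, simply because there are more of them.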

Let's now look at throughput.

You can extend the analogy to the size of the door you're trying to fit the books through. Say you've retrieved 100 books, and you can just fit them through the door. But what if you need more than 100 at a time? Throughput is the number of books you can carry through the door at one time.

It's all well and good to boast 150MB/s max throughput for this RAID5 array, especially for sequential writes, where it monsters the SAN speeds. However, it still comes down to what the SAN does best - allowing fast access for everything that needs disk.
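A quick sketch of why the max throughput figure matters less for scattered SQL reads: throughput is roughly IOPS multiplied by the size of each I/O. The 1080 IOPS below is an assumed figure (six disks at an assumed 180 IOPS each), and 8 KB is the SQL Server page size; real workloads mix I/O sizes, so this is only indicative.

```python
def random_throughput_mb_s(iops, io_size_kb):
    """Throughput implied by a given IOPS rate and I/O size, in MB/s."""
    return iops * io_size_kb / 1024


# Assumed: 6 disks x 180 IOPS each, reading 8 KB pages at random.
print(random_throughput_mb_s(1080, 8))
```

Under these assumptions the array moves well under 10MB/s on random 8 KB reads, nowhere near the 150MB/s sequential figure, so the door is rarely the bottleneck; the number of people fetching books is.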

More to come in Part 2.
