Archive for category Databases

Recovering SQL Server databases without a log file

I use two drives in RAID0 for my database server at home. This is pretty bad, for sure. The RAID0 log array has failed twice in six years.

I probably should just drop the extra $80 to add a third drive and make RAID5, or buy a couple more and make RAID10. When the failure happens, I don’t lose any data; all the data could be reloaded, but not from a backup — from the original data source. That’s quite time consuming.

I’ve found that I can have SQL Server create a log file if the volume is lost, and that allows me to again mount the related database. This still might cause trouble; I might lose data, I might end up with uncommitted or (half-committed, really) transactions, and so on. Since the server is usually quiescent, it just doesn’t matter to me.

Since I’m often doing this, and since the script is a bit elusive, here’s a note to myself that explains what to do.

First, I need to put the database into EMERGENCY recovery mode:

ALTER DATABASE DatabaseName SET EMERGENCY;

Then, I’ll issue a CHECKDB statement to make sure the database is OK.

DBCC CHECKDB('database');

I can then issue an ALTER DATABASE command to rebuild the log:


DBCC TRACEON(3604);

ALTER DATABASE DatabaseName REBUILD LOG ON (NAME=Database_Log,FILENAME='F:\SQLLogs\DatabaseName_log.Ldf');

DBCC TRACEOOF(3604);

Then reset the database to Multi-User:

ALTER DATABASE DatabaseName SET MULTI_USER;

before doing another CHECKDB to make sure everything is OK.

This isn’t right, and isn’t something I’d do in production, but it is quite fine for my lame little test server.

Crystal Disk Mark and SQL Server

DatabaseWeekly.com recommended some benchmarking tools for SQL Server users via Glenn Berry’s SQL Server Performance Blog. Problem is, one of the tools is pretty badly flawed.

Berry recommended Crystal Disk Mark, which is a cute tool that ostensibly measures disk performance. Thing is, it does a pretty shoddy job. If you run the test, you’ll notice that the disk queue for the drive its testing is never deeper than one. That is, it doesn’t do multi-threaded or overlapped I/O requests. Instead, it issues a single blocking requests. It doesn’t take long to download the source code and confirm this fact.

For desktop machines, doing only one I/O at a time isn’t so bad because that matches the workload desktop machines typically show their drives during normal use.

It seems a bit irresponsible to recommend such a tool for servers, though–particularly machines which will be running SQL Server, where multiple outstanding I/O operations are key to achieving any sensible level of performance. I like to have at least a couple outstanding I/O operations per logical spindle on the servers I run. This lets the controller and the drive have a shot at elevator seeking across multiple requests, combining redundant requests in cache, and batching as best as possible.

Since none of Crystal Disk Mark’s test modes provide any I/O request depth, it won’t provide an accurate measure of performance for the disk arrays and storage subsystems that are typically used with SQL Server installations and it isn’t an appropriate tool for capacity planning or evaluating hardware for such installations.

Thinking about Large Data an Scale-Out

Adam Jacobs wrote an article, which I believe was first published in Queue, paper called The Pathologies of Big Data, which looks at — you know — big data.

The title begs the question of what “big data” really is, and the author doesn’t ignore that. Without saying it explicitly, he points out that the definition of “big data” is relative to time–based on what storage technologies are available at the time, what they cost, and how maintainable they are.

The author then makes an interesting posit: he builds some sample data, carefully packing the record so it stores as much information as possible. The list represents a set of people, who have ages. Since people don’t live past 127 years old very often, he uses only seven bits of a byte. Most database systems don’t allow this level of control over the storage type and packing, so it should be no surprise that the data he stores in his commercial database package takes a lot more space than his hand-coded representation.

What is suprising, however, is that he stores one billion rows of three bytes and the data ends up taking more than 40 gigs on disk. Ideally, the storage would take three gigs; 3 bytes times one billion. The author offers no explanation about why his database system causes more than 1300% overhead in the storage of the data, which seems rather negligent, particularly since he calls this “sort of inflation … typical of a traditional RDBMS”.

The author describes his eight-core Macintosh as “hefty hardware”, but ironically gives no description of the disk subsystem. He says that it has two terabytes of RAID-0 disk, but doesn’t mention if he’s using two commodity one-terabyte drives (which are slow — 12 or 15 milliseconds of latency), or fourteen enterprise class, 15 KRPM 147-gig SAS drives (which are about as fast as you can buy these days, if you’re shopping for mechanical storage).

In investigating the query performance, the author does examine some interesting facts, though he stops just short of directly indicting the PostgreSQL query optimizer or execution engine as the performance issue.

The experiment is a good idea, and I intend to perform a similar test on my own hardware when I have a few moments. What piques me about the article, though, is that the author claims that this behaviour isn’t pathological, and that it happens all the time. Indeed, it does: what’s happening is that the author is using the wrong tool for the wrong job. Scanning large tables and performing aggregation summaries overt the contained data is a pattern that database systems do exercise when supporting data warehousing or business decision systems. But it is, pretty plainly, a misapplication of such a system and isn’t suprising that it’s slow.

I think the author is further misguided in stating that businesses don’t produce such large amounts of data. They do: website logs, transactions, network monitoring, and so on–they’re all applications that aren’t uncommon in today’s businesses, and they do (or have the potential to) generate massive amounts of data. Oddly, the author cites several such applications–and the multipliers, like time that make them big–later in the article.

Anyway, the author does acknowledge that data warehousing is a solution to the problem he’s working, but does not entertain it. Indeed, “merely saying we will build a data warehouse” doesn’t get the job done. Just saying anything does not get the job done–someone has to do actual work.

The article makes one very interesting point, though defeats it itself. The author points out that randomly scanning memory is slightly slower (36.7 million/second) than sequentially reading from disk (53.2 million/second). Reading from disk is probably the slowest thing your computer can do, and we know that cache misses in memory reads are painful–but it is a bit surprising to realize that cache misses total something far slower than a sequential, physical disk read.

The author makes the point that denormalizing tables can help. This is certainly true, particularly when they avoid the sort of joins that the author is describing. However, I think it’s debatable how often such scans are interesting. The suggested pattern, joining transaction data back to the user table to show who performed these transactions isn’t typical because the aggregation isn’t common. When it is, it’s easy enough to cache or hash the lookups so they’re far more efficient than randomly probing the data. It’s also possible to realize this access pattern in the query optimizer and fully scan both tables rather than randomly probing. This access pattern is a bit counter-intutivie, but it turns out the sequential read is so much faster that its benefit easily overwhelms the act of reading data that isn’t actually required to drive the query.

I like the author’s treatment of scale-out solutions for handling larger data sets. Scaling to multiple computers can be an inexpensive way to share the load and requires less specialty hardware. The problem is that it requires lots of special engineering. These days, engineering is lots more expensive than the hardware; it’s easier to spend the money for an exorbitant server just once than it is to spend the money for a team of engineers to make cheaper, lesser hardware do the same job.

The usual solution of commoditization applies here, I think. That is, we have to rely on vendors to absorb the difficult engineering problems and make products which we can purchase at a fraction of the original engineering cost that address the problems we have with large data. No sane organization would ever build their own version of a product as complicated as SQL Server, for example; we shouldn’t expect those same organizations to build even more complicated software.

While I think some parts of this article are poorly written or poorly supported, I like the author’s definition of “big data”, which involves stepping past the “tried-and-true techniques” that we’re used to. I don’t think large data will be successfully utilized as an asset in most organizations until the tools are really there.