SQL Server Data Storage

All db’s have a primary data file: the .mdf file. But secondary data files are (.ndf) are also allowed. The default location for all of these is:
C:\Program Files\Microsoft SQL Server\MSSQL.n\MSSQL\Data
where ‘n’ is a unique number identifying an installed instance (I’ve got \MSSQL.1\MSSQL\Data).
This changes in SQL Server 2008 (Version 10):  here’s where an install on Vista is:
c:\Program Files\Microsoft SQL Server\MSSQL10.SQLEXPRESS\MSSQL\DATA\
Secondary data files are used to split up extremely large databases or to spread data across multiple disks (usually for performance reasons).
Data is written in 8K pages and pages are usually organized into 8 contiguous page ‘extents’ which make up the .mdf or .ndf file (16 extents per Mb). However it is possible for small tables to share a single extent (a mixed extent).

Transaction Log and how it works
At least one transaction log (.ldf) is necessary to run SQL Server. For small databases there is only one LDF. Only one can be used at a time.
Moving transaction logs to a separate drive from the data files is considered a ‘best practice’ since it allows for faster throughput (allows disk heads to remain in place for the serial writing of the transactions and not interrupted by data writes). Typical size for an .ldf file is 25% of the entire database size.
The log is recording more than just the SQL code we put in explicit ‘transactions’ (with a begin tran and commit tran). Implicit transactions are also carried out whenever any of the following occur: alter table, create, delete, drop, fetch, grant, insert, open, revoke, select, truncate table and update. In other words, nearly everything is logged. This may seem excessive but bear in mind that the .ldf is not only there to record your explicit transactions, it is also there so you can recover from a disaster.
Here’s what happens behind the scenes (pay no attention to that curtain, Dorothy) when data modifications occur:

  1. data modification is sent by an app
  2. affected data pages are found in or read into the buffer cache and modified (dirty bit is set for page)
  3. before and after image of the changes recorded in the .ldf
  4. acknowledgement is sent to app that change is committed
  5. checkpoint writes the ‘dirty’ pages to disk, causing the dirty bit to flip. Now the page in cache is ‘clean’.

So the .mdf or .ndf file is only changed after a checkpoint. So that begs the question, ‘When does a checkpoint happen?’ It depends. Usually this process is controlled by SQL Server (usually better off just letting it handle this) but it is possible to force if necessary.
Another way of thinking about a Transaction Log is to visualize a snake eating its own tail example (don’t try this at home with a pet). Assuming a Full Recovery Model, if the head reaches the tail then the log is full and the db becomes read-only (selects work but changes don’t). A log backup is required to clean up (truncate) the log. It will work as long as transactions are committed, checkpointed and backed up. The log backup frees up space by deleting committed, checkpointed and backed up log entries and leaving the rest behind.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s