SQL Server database file size

Keeping track of database size and growth is a critical part of a DBA's job, which I wrote about in an earlier blog post. But I have since improved my scripts that reveal the status of all my db files (I say 'files' because each SQL Server database is made up of at least two files).
Many of my db's are set to autogrow, but I also set maximum sizes that I allow them to grow to. This matters to me because I want to catch out-of-control database growth rather than just let it fill up my storage, so I need to check periodically where my db's are at. The 25% threshold is arbitrary; pick any number that works for you. Also, if you have HUGE db's you may need to adjust the decimal precision of the output.

Here’s my script:

--all db files over 25% full with sizes too
select db.name as [db_name], mf.name,
'dbSizeinMB' =
cast (cast (mf.size*1.0/128 as decimal(9,2)) as nvarchar(30)),
'MaxSizeinMB' =
case mf.max_size
when 0 then 'no growth is allowed.'
when -1 then 'autogrowth is on.'
when 268435456
then 'log file will grow to a maximum size of 2 TB.'
else cast (cast (mf.max_size*1.0/128 as decimal(9,2)) as nvarchar(30))
end,
'PercentageSize' =
case mf.max_size
when 0 then 'no growth is allowed.'
when -1 then 'autogrowth is on.'
else cast (cast(((mf.size*1.0)/(mf.max_size*1.0))*100 as decimal(9,2)) as nvarchar(30))
end
from sys.master_files mf
inner join sys.databases db on mf.database_id = db.database_id
where mf.max_size > 0 --skip no-growth (0) and unlimited (-1) files, and avoid dividing by zero
and ((mf.size*1.0)/(mf.max_size*1.0))*100 > 25.0
order by db.name

Of course you don't always have time to run a script to check up on this, so I have a maintenance plan set up with this content that e-mails me the report every day at 6 AM:

--all db files over 25% full with sizes too
DECLARE @tableHTML NVARCHAR(MAX) ;

SET @tableHTML =
N'<H1>db file Percentage Size Check</H1>' +
N'<table border="1">' +
N'<tr><th>db_name</th><th>name</th>' +
N'<th>dbSizeinMB</th><th>MaxSizeinMB</th><th>PercentageSize</th></tr>' +
CAST ( ( SELECT td = db.name, '',
td = mf.name, '',
td = cast (cast (mf.size*1.0/128 as decimal(9,2)) as nvarchar(30)), '',
td = case mf.max_size
when 0 then 'no growth is allowed.'
when -1 then 'autogrowth is on.'
when 268435456
then 'log file will grow to a maximum size of 2 TB.'
else cast (cast (mf.max_size*1.0/128 as decimal(9,2)) as nvarchar(30))
end, '',
td = case mf.max_size
when 0 then 'no growth is allowed.'
when -1 then 'autogrowth is on.'
else cast (cast(((mf.size*1.0)/(mf.max_size*1.0))*100 as decimal(9,2)) as nvarchar(30))
end
FROM sys.master_files mf
INNER JOIN sys.databases db ON mf.database_id = db.database_id
where mf.max_size > 0 --skip no-growth (0) and unlimited (-1) files, and avoid dividing by zero
and ((mf.size*1.0)/(mf.max_size*1.0))*100 > 25.0
order by db.name
FOR XML PATH('tr'), TYPE
) AS NVARCHAR(MAX) ) +
N'</table>' ;

EXEC msdb.dbo.sp_send_dbmail
@profile_name ='My Profile',
@recipients=N'me@provider.com;you@provider.com',
@subject = 'MyServer db files over 25% max size',
@body = @tableHTML,
@body_format = 'HTML' ;

This applies to SQL Server 2005 and newer.

Coding NATO International Radiotelephony Alphabet in SQL

Joe Celko and Ken Henderson are two of my favourite SQL gurus, and I very much enjoyed reading one of Joe's articles this morning. It made me think, and playing with the code was fun.

Try this on for size:

--series create
create table [dbo].[Series] (
[i] [int] identity (1, 1) not null,
constraint Series_i primary key clustered(i)
)
go
begin tran
declare @loop int;
set identity_insert Series on
set @loop = 1;
while (@loop <= 1000) begin
insert into Series (i) values (@loop)
set @loop = @loop + 1
end
set identity_insert Series off
commit tran
select * from Series

go
--create SpellingAlphabet view
CREATE VIEW SpellingAlphabet (Telephony, character)
AS
SELECT X.Telephony, X.character
FROM (VALUES ('Alpha', 'a'), ('Bravo', 'b'), ('Charlie', 'c'),
('Delta', 'd'), ('Echo', 'e'), ('Foxtrot', 'f'), ('Golf', 'g'),
('Hotel', 'h'), ('India', 'i'), ('Juliet', 'j'), ('Kilo', 'k'),
('Lima', 'l'), ('Mike', 'm'), ('November', 'n'), ('Oscar', 'o'),
('Papa', 'p'), ('Quebec', 'q'), ('Romeo', 'r'), ('Sierra', 's'),
('Tango', 't'), ('Uniform', 'u'), ('Victor', 'v'), ('Whiskey', 'w'),
('X-Ray', 'x'), ('Yankee', 'y'), ('Zulu', 'z')
) AS X (Telephony, character);
go
select * from SpellingAlphabet

--create the NATO (international radiotelephony) spelling alphabet code
declare @string nvarchar(300);
set @string = N'Joe Celko knows how to make SQL Server say uncle';
SELECT telephony
FROM Series
INNER JOIN SpellingAlphabet
ON character = SUBSTRING(@string, i, 1)
WHERE i <= LEN(@string)
ORDER BY i

Here are the Results you’d get:

Juliet
Oscar
Echo
Charlie
Echo
Lima
Kilo
Oscar
Kilo
November
Oscar
Whiskey
Sierra
Hotel
Oscar
Whiskey
Tango
Oscar
Mike
Alpha
Kilo
Echo
Sierra
Quebec
Lima
Sierra
Echo
Romeo
Victor
Echo
Romeo
Sierra
Alpha
Yankee
Uniform
November
Charlie
Lima
Echo

Selecting from a table in ALL your databases in SQL Server

I knew about the system stored procedure sp_MSforeachdb but had never really used it until recently. Books Online wasn't much help, so I turned to Google and found this link.
If you fully identify the database you can easily do something like:

select * from mydb1.dbo.tblwhatever
select * from mydb2.dbo.tblwhatever
select * from mydb3.dbo.tblwhatever

But what if you don’t know which of your myriad databases has the table you’re looking for?  You can use

Exec sp_MSforeachdb 'select * from [?].dbo.tblwhatever';

The '[?]' takes the place of each individual db the stored procedure is parsing through: very much like a cursor. If you don't use the '[?].dbo' syntax you will simply be issuing the command against your current database. The square brackets can be useful if you have funky database names with abnormal characters (I find it easier to always leave them in). Of course, you're going to run into errors if any of your databases doesn't contain that table. You could wrap the statement in an IF EXISTS check, or be lazy like I generally am and just ignore the errors in Management Studio while simply looking at the results. You could also check which databases actually contain the (user) table with something fancy like this:

Exec sp_MSforeachdb 'select so.* from [?]..sysobjects so where xtype = ''U'' and name = ''tblwhatever''';
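And the IF EXISTS guard I mentioned above can be sketched like this (tblwhatever is still a stand-in name); it only runs the select in databases that actually contain the table:

Exec sp_MSforeachdb 'if exists (select * from [?]..sysobjects where xtype = ''U'' and name = ''tblwhatever'')
select * from [?].dbo.tblwhatever';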

One problem with this arrangement is that the results don't tell you which rows came from which database. A way around this is to chain two statements together, like:

Exec sp_MSforeachdb 'select ''?'' select * from [?].dbo.tblwhatever';

and then you'll get two result tables for each database, the first of which is the name of the database. You do need to be careful with the quoting though: it's two single quotes, then the ?, and then another two single quotes, to indicate that you really want a literal single quote rather than the end of the parameter string for sp_MSforeachdb.

Deciding against Code Collapse in SQL Server Management Studio 2008

A very annoying feature, which is on by default, is Code Collapse (outlining) for editing T-SQL code. I have complex code for which it rarely adds anything valuable. Also, I like to select rows by clicking in the left margin next to the line numbers, and I was forever getting fouled up in the collapse boxes.
I looked many times in Options without discovering how to turn it off. Thank God I found this link, where Buck Woody explains how to do it.
Just in case Buck's post goes away, I provide the magical invocation here:

Tools > Options > Text Editor > Transact-SQL > Intellisense > uncheck Outline statements

It takes effect without even having to reopen the tabs already opened.

Identifying High Row count tables in SQL Server – New and Improved

A big part of good database maintenance is having a fair idea of where your tables are at in terms of row counts and identity values. I took a stab at writing a generic script a few years back to return all the tables with their row counts, with or without a primary key (PK). I revisited that script and made some improvements. One of the features I added was a middle select that identifies the tables with high turnover. These are identified, admittedly arbitrarily, by filtering for those whose row count is less than 10% of the maximum identity value. In this way I can be alerted to tables I might want to reseed so I don't reach the 2,147,483,647 maximum for integer identities. It unerringly points to my 'scratch' tables that are generally wiped out after each use.


declare @tbltmpStats table (
Identity_Tables int identity (1, 1) not null ,
Table_Name nvarchar(200),
Rows int,
Identity_Max int)
declare @objectid int;
declare @objectname nvarchar(200);
declare @sql1 nvarchar(4000)
declare @objectcursor cursor

set @objectcursor=cursor for select o.object_id from sys.objects o
where type='u'
and objectproperty(o.object_id, N'TableHasIdentity')=1
and o.schema_id = 1
order by o.name
open @objectcursor
fetch next from @objectcursor into @objectid
while (@@fetch_status=0) begin
select @objectname=name from sys.objects where object_id=@objectid
set @sql1 = 'select ''' + cast(@objectname as nvarchar(200)) + ''', count(*), max(identitycol) from ' + quotename(@objectname)
insert into @tbltmpstats (Table_Name, Rows, Identity_Max)
exec (@sql1)
fetch next from @objectcursor into @objectid
end
close @objectcursor
update @tbltmpstats set Identity_Max = 0 where Rows = 0

select Table_Name, Rows, Identity_Max,(case (Identity_Max) when 0 then Null else (cast(Rows as decimal (10,2)) / cast(Identity_Max as decimal (10,2))) end) as [Rows/Identity_Max]
from @tbltmpstats
order by Rows desc,Identity_Max desc

--(Rows / Identity_Max) less than 10%
select 'Candidate for resyncing Primary Key: ' as Watch, Table_Name from @tbltmpstats where Identity_Max > 0 and (cast(Rows as decimal (10,2)) / cast(Identity_Max as decimal (10,2)))<0.1

declare @tbltmpStats2 table (
NonIdentity_Tables int identity (1, 1) not null ,
Table_Name nvarchar(200),
Rows int
)
set @objectcursor=cursor for select o.object_id from sys.objects o
where type='u'
and objectproperty(o.object_id, N'TableHasIdentity')=0
and o.schema_id = 1
order by o.name
open @objectcursor
fetch next from @objectcursor into @objectid
while (@@fetch_status=0) begin
select @objectname=name from sys.objects where object_id=@objectid
set @sql1 = 'select ''' + cast(@objectname as nvarchar(200)) + ''', count(*) from ' + quotename(@objectname)
insert into @tbltmpstats2 (Table_Name, Rows)
exec (@sql1)
fetch next from @objectcursor into @objectid
end
close @objectcursor
select Table_Name, Rows from @tbltmpstats2 order by Rows desc
deallocate @objectcursor
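If one of the flagged candidates really is a scratch table whose old rows are gone, reseeding its identity is a one-liner; the table name here is hypothetical:

--reset the identity counter; the next inserted row gets identity 1
DBCC CHECKIDENT ('dbo.tblScratch', RESEED, 0);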

That, of course, raises the question of why I'm not using temporary tables. An experienced DBA will tell you that there are individual answers to that question. I could say something like "Because!" or "It depends!" but if that's still not clear, then the following might give you a better insight into proper DBA attitude:
Dilbert.com

SQL: taking a step back

Joe Celko has written a very interesting and enjoyable article at the Simple Talk site that I recommend to SQL Server database developers everywhere. We all approach our jobs with different mindsets, and it's always a good idea to step back at times and take a breather. Take a look at how your assumptions and habits could be slowing you down. To whet your appetite, his three headings are: "Errors in complex technology", "Getting Rid of Procedural Code" and "Getting Rid of Non-Procedural Code".

Here’s a quote I just love from it:

People prefer static mental models, rather than a dynamic model. This was also the model for traditional programming languages – when you compile a program, you get the same executable code every time. That does not work in SQL; this is why we have query optimizers rather than procedural code in RDBMS.

SQL Server copy_only backup and restore

Previously I wrote an article listing information on SQL Server backups. I want to expand a little on the practical use of the new and useful copy_only backup available in SQL Server 2005 and 2008. You can't get to copy_only backups/restores through Management Studio in SQL Server 2005 (you have to use T-SQL), but you can in the new Management Studio for SQL Server 2008.

This is an ideal way to test production db’s on a development server without disrupting your normal backup schedule. In the following example I am doing a full db backup to the production backup drive:

BACKUP DATABASE [my_db]
TO DISK = N'Z:\backup\my_db_copyonly.bak'
WITH
NOFORMAT,
INIT,
copy_only,
NAME = N'my_db-Full Database CopyOnly Backup',
SKIP,
NOREWIND,
NOUNLOAD,
STATS = 10
GO

But when I restore it I want to be sure that I'm moving the db files to the correct location on my development SQL Server, hence the WITH MOVE clauses below. I also always want it to have a completely new name:

RESTORE DATABASE [my_db_copy]
FROM DISK = N'Z:\backup\my_db_copyonly.bak'
WITH
MOVE 'my_db_dat' TO 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\my_db_copydat.mdf',
MOVE 'my_db_log' TO 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\my_db_copylog.ldf',
FILE = 1,
NOUNLOAD,
REPLACE,
STATS = 10
GO

So now I have an up-to-date version of the production db on my system. Bear in mind that any logins for applications or users would need to be recreated or re-mapped to be able to work with this new copy. I keep a special login/user script handy and run it to recreate these.
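As a sketch of what that login cleanup involves (the user and login names here are hypothetical), you can list the orphaned users in the restored copy and re-map one to its server login:

USE [my_db_copy];
GO
--list database users that no longer map to a server login
EXEC sp_change_users_login 'Report';
--re-map an orphaned user to the login of the same name
--(ALTER USER ... WITH LOGIN works on SQL Server 2005 SP2 and later)
ALTER USER [my_app_user] WITH LOGIN = [my_app_user];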

When a decimal isn’t in TSQL

I saw a posting in SimonS Blog on SQL Server Stuff that intrigued me, concerning decimals and SQL Server. I wanted to leave a comment, but I don't like having to register, so I'll do it here.

Simon's point is that you should beware of making assumptions about how code will treat expressions. An expression like 10/4 is simple for us to interpret in our heads, but a strongly typed language like T-SQL sometimes needs a little help when you 'say' integer but really mean 'decimal'. Simon is a little unclear because he has a typo in the following statement: 'In TSQL however you get 4'. I think he meant '2'.

But what he’s saying is important. I’ve fleshed out his concern in some code so we can easily see the problem:

declare @decnumb decimal(10,4);
set @decnumb = 10/4;
select 'set @decnumb = 10/4';
select @decnumb;

set @decnumb = cast ((10/4) as decimal);
select 'set @decnumb = cast ((10/4) as decimal)';
select @decnumb;

set @decnumb = (cast (10 as decimal))/(cast (4 as decimal));
select 'set @decnumb = (cast (10 as decimal))/(cast (4 as decimal))';
select @decnumb;

set @decnumb = (cast (10 as decimal))/(4);
select 'set @decnumb = (cast (10 as decimal))/(4)';
select @decnumb;

set @decnumb = (10)/(cast (4 as decimal));
select 'set @decnumb = (10)/(cast (4 as decimal))';
select @decnumb;

set @decnumb = 10/4.0;
select 'set @decnumb = 10/4.0';
select @decnumb;

set @decnumb = 10.0/4;
select 'set @decnumb = 10.0/4';
select @decnumb;

And here are the results:

set @decnumb = 10/4
2.0000

set @decnumb = cast ((10/4) as decimal)
2.0000

set @decnumb = (cast (10 as decimal))/(cast (4 as decimal))
2.5000

set @decnumb = (cast (10 as decimal))/(4)
2.5000

set @decnumb = (10)/(cast (4 as decimal))
2.5000

set @decnumb = 10/4.0
2.5000

set @decnumb = 10.0/4
2.5000

SQL Server Data Storage

The MDF
All db's have a primary data file: the .mdf file. But secondary data files (.ndf) are also allowed. The default location for all of these is:
C:\Program Files\Microsoft SQL Server\MSSQL.n\MSSQL\Data
where ‘n’ is a unique number identifying an installed instance (I’ve got \MSSQL.1\MSSQL\Data).
This changes in SQL Server 2008 (version 10); here's where an install on Vista puts it:
c:\Program Files\Microsoft SQL Server\MSSQL10.SQLEXPRESS\MSSQL\DATA\
Secondary data files are used to split up extremely large databases or to spread data across multiple disks (usually for performance reasons).
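Rather than memorizing the default paths, you can ask the server where every database file actually lives with a quick query against sys.master_files:

--one row per data/log file across all databases
SELECT db_name(database_id) AS [db_name],
name AS logical_name,
physical_name,
type_desc --ROWS or LOG
FROM sys.master_files
ORDER BY [db_name], file_id;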
Data is written in 8 KB pages, and pages are usually organized into 'extents' of 8 contiguous pages, which make up the .mdf or .ndf file (so 16 extents per MB). However, it is possible for small tables to share a single extent (a mixed extent).
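You can see that page arithmetic at work (pages/128 = MB) by totalling the pages allocated to each user table; this is a sketch using the common recipe of joining sys.allocation_units.container_id to sys.partitions.partition_id:

--total 8K pages (and MB: pages/128) allocated per user table
SELECT o.name AS table_name,
SUM(au.total_pages) AS total_pages,
SUM(au.total_pages) / 128.0 AS size_mb
FROM sys.objects o
JOIN sys.partitions p ON p.object_id = o.object_id
JOIN sys.allocation_units au ON au.container_id = p.partition_id
WHERE o.type = 'U'
GROUP BY o.name
ORDER BY total_pages DESC;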

Transaction Log and how it works
At least one transaction log file (.ldf) is necessary to run SQL Server. For small databases there is only one .ldf, and even when there are several, only one is written to at a time.
Moving transaction logs to a drive separate from the data files is considered a 'best practice' since it allows for faster throughput: the disk heads can stay in place for the serial writing of transactions without being interrupted by data writes. A typical size for an .ldf file is about 25% of the entire database size.
The log records more than just the SQL we put in explicit transactions (with a begin tran and commit tran). Implicit transactions are also logged whenever any of the following occur: alter table, create, delete, drop, fetch, grant, insert, open, revoke, select, truncate table and update. In other words, nearly everything is logged. This may seem excessive, but bear in mind that the .ldf is not only there to record your explicit transactions; it is also there so you can recover from a disaster.
Here’s what happens behind the scenes (pay no attention to that curtain, Dorothy) when data modifications occur:

  1. data modification is sent by an app
  2. affected data pages are found in or read into the buffer cache and modified (dirty bit is set for page)
  3. before-and-after images of the changes are recorded in the .ldf
  4. acknowledgement is sent to the app that the change is committed
  5. a checkpoint writes the 'dirty' pages to disk, causing the dirty bit to flip; now the page in cache is 'clean'

So the .mdf or .ndf file is only changed after a checkpoint, which raises the question, 'When does a checkpoint happen?' It depends. Usually this process is controlled by SQL Server (and you're usually better off letting it handle this), but it is possible to force one with the CHECKPOINT command if necessary.
Another way of thinking about a transaction log is to visualize a snake eating its own tail (don't try this at home with a pet). Assuming the Full Recovery Model, if the head reaches the tail then the log is full and the db becomes effectively read-only (selects work but changes don't). A log backup is required to clean up (truncate) the log. This arrangement works as long as transactions are being committed, checkpointed and backed up: the log backup frees up space by clearing committed, checkpointed and backed-up log entries and leaving the rest behind.
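That truncating log backup is just the standard BACKUP LOG command; here's a minimal sketch with placeholder database and path names:

--back up the transaction log; under the Full Recovery Model this
--marks the inactive portion of the log as reusable (truncation)
BACKUP LOG [my_db]
TO DISK = N'Z:\backup\my_db_log.trn'
WITH NOFORMAT, NOINIT,
NAME = N'my_db-Transaction Log Backup',
STATS = 10
GO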

Using SQLCMD with SQL Server 2005

Intro
You can't always use a GUI (batch scripts and the DAC are examples), so MS provides sqlcmd. This utility replaces the older osql utility, which won't be supported in future releases. With sqlcmd you have a host of command-line parameters to allow for a line-by-line interface with the database engine. You can find sqlcmd in this directory for SQL Server 2005:

C:\Program Files\Microsoft SQL Server\90\Tools\Binn\

Using with Variables
One of the more powerful features is the ability to run in 'variable mode' so you can make simple (or complex) boilerplate scripts. An example: you are always checking the last rows of particular tables, but you want to automate it. Using the sql file 'latest.sql' with this content:

select * from $(tablename) order by $(primarykey) desc
GO

you could do this with tblmytable (which has the primary key mytableid_pk) using this on the command line:

sqlcmd -i latest.sql -v primarykey="mytableid_pk" tablename="tblmytable" -o latest.output

The table contents would be contained in reverse primary key order in the file ‘latest.output’.

DAC
The DAC is something different, and more important for administrators. It is the Dedicated Administrator Connection and is accessed using:

sqlcmd -A

This is a very special feature, as it allows access even when the server hangs or is otherwise unavailable. It has its own private SQL Server scheduler, so the only way you can be locked out is if the SQL Server service has been stopped or paused. You could go into an unresponsive server and attempt to diagnose the problem, end connections, or just shut it down properly:

sqlcmd -A
1> shutdown with nowait
2> go