WIDBA - The busiest DBA in Wisconsin, its all I do and I do it well

Tuesday, March 29, 2011

Deadlocks: There are too many of them!

Back in the day, deadlock tracing was a bit more painful than it is now.  Relegated to dumping them to the SQL Error log and parsing typically.  A bunch of years ago, Bart Duncan wrote a series on troubleshooting deadlocks that is fantastic.  I couldn't possibly add anything to what he produced, but I did create something to help, especially if you have several deadlocks and are searching for specific ones.  Now that the deadlock is stored in the ring buffer, you can grab out the data and use XQuery to get a bit more out the deadlock graph so you do not have to open each deadlock XML individually.  Local MVP Jason Strate put together a great series of blog posts on the subject of XQuery, in case that is not your strong suit.

So, if you have lots of inconsistent data access patterns, you too might need a script to mine out a few of the key attributes of the deadlock to find the one you are looking for. 

This works on SQL 2008 and specifically have SP1 CU #6 installed.  (There is a workaround to prior installs, I think it is on SQLServerCentral.com.)

This will pull several columns out of the deadlock graph, the datetime of the deadlock, the logins involved and the types of locks involved (Page, Object, Key, etc). 

WITH deadlocks(DeadLockGraph)
    SELECT CAST(XEventData.XEvent.value('(data/value)[1]', 'varchar(max)') as xml) as DeadlockGraph
     (select CAST(target_data as xml) as TargetData
      from sys.dm_xe_session_targets st
        join sys.dm_xe_sessions s on s.address = st.event_session_address
      where name = 'system_health') AS Data
    CROSS APPLY TargetData.nodes ('//RingBufferTarget/event') AS XEventData (XEvent)
    where XEventData.XEvent.value('@name', 'varchar(4000)') = 'xml_deadlock_report'
SELECT DeadLockGraph.query('data(deadlock/process-list/process[1]/@lasttranstarted)').value('.', 'datetime') as DDate
    , DeadLockGraph.query('data(deadlock/victim-list/victimProcess/@id)').value('.', 'varchar(128)') VictimPID
    , DeadLockGraph.query('data(deadlock/process-list/process/@id)').value('.', 'varchar(128)') as PIDS
    , DeadLockGraph.query('data(deadlock/process-list/process/@waitresource)').value('.', 'varchar(128)') as Waits
    , DeadLockGraph.query('data(deadlock/process-list/process/@lockMode)').value('.', 'varchar(128)') as Mode
    , DeadLockGraph.query('data(deadlock/process-list/process/@loginname)').value('.', 'varchar(128)') as Logins
    , DeadLockGraph.query('count(deadlock/process-list/process)').value('.','int') as ProcNum
    , DeadLockGraph.query('deadlock/resource-list/child::node()') as ResourceList
FROM deadlocks 
WHERE DeadLockGraph.query('data(deadlock/process-list/process[1]/@lasttranstarted)').value('.', 'datetime') > '2010-12-03 9:00'

Hope this helps you find the deadlocks you are looking for a bit quicker, when there are many of them, parsing out a few pieces can make identification easier.

Friday, March 18, 2011

Speaking at Twin Cities Code Camp 4/9/2011

On April 9th and 10th, the Twin Cities Code Camp will be held at the University of Minnesota.  The sessions lean towards various programming languages and patterns.  If you are a DBA, stop by and break out of your shell.  Or as I say in a tongue in cheek way "Get to know your enemy".  I will be there on Saturday, hoping to catch a few sessions before I speak after lunch on Powershell monitoring.  Our local SQL rockstar, MVP Jason Strate will also be speaking on Extended Events

My session link is here if you want to attend.  I will skip some of my typical SQL stuff and cover more general server monitoring, jobs and perfmon collection strategies.

Powershell Eye for the Monitoring Guy

Hope to see you there!

Tuesday, March 15, 2011

Shrinkage: How low do I need to go to fix VLF Fragmentation?

VLF(virtual log file) fragmentation has been written about quite a bit in the past few years, in which I learned about it from SQLSkills .  One thing I did was to add a field on the end of the result set that shows you a running total of megabytes for each VLF.  This features is nice if you need to know how far to shrink your log to achieve a desired number of VLFs.  Note: There is a great deal out there about why shrinking is bad, etc please be sure to tackle the root issue before spending lots of time fixing this.  Small growth rates on the log file, bad backup schedules, etc are usually to blame.

DECLARE @beginlsn numeric(22,0);
DECLARE @increment bigINT;
    SET @increment = 1;

IF (OBJECT_ID('tempdb..#vlf') IS NOT NULL)

CREATE TABLE #vlf(fileid int, filesize bigint, startoffset numeric(13,0), fseqno int, status smallint, parity int, createlsn numeric(22,0));

DECLARE @db varchar(128);
SET @db =  DB_NAME()

EXEC ('DBCC LOGINFO('''+ @db +''')');

    ADD growIncrement INT,row_id INT IDENTITY(1,1);

SET @beginlsn = (SELECT TOP 1 createlsn
                 FROM #vlf
                 WHERE growIncrement is null
                 ORDER BY startoffset);

-- Set growth increment (shows how many vlfs were created in a single growth)
WHILE EXISTS(SELECT 1 FROM #vlf WHERE growIncrement is null)
    UPDATE #vlf SET growIncrement = @increment WHERE createlsn = @beginlsn;
    SET @beginlsn = (SELECT TOP 1 createlsn FROM #vlf WHERE growIncrement is null order by startoffset)
    SET @increment = @increment + 1;

SELECT  fileid,fseqno,status,parity,growincrement,row_id as VLF_Number
    , (SELECT (SUM(filesize) / 1048576) FROM #vlf v2 WHERE v2.row_id <= v.row_id) as RunningTotalInMB
FROM #vlf v
ORDER BY row_id desc

Wednesday, March 9, 2011

Putting on your SOX - Services

I had originally set out to publish a few scripts for other DBAs to help make their SOX audits a bit easier.  Currently, my team is going through the process with a third party and they are asking for the services running on a server, and what do they run as.  If you are experienced in Powershell, this is pretty easy.  The call comes down to Get-WMIObject and the win32_service class.  The key thing was adding the "StartName" property which translates to "Log On As", thanks to this site for the help.  You may need to filter on just SQL Services, or a list and filter on that using  -contains or -like.  Hope this helps take the screen shots out of your audit.

$OutputFile = "C:\Powershell\ServerServices.htm"
$title = "<h1> Services SOX2011</h1>"
$title | Out-File $OutputFile

# server list
$arr =  @("Server1"),("Server2")

# Loop for each Path
foreach($Server in $arr)
    "<h2>$Server</h2>" | out-File  $OutputFile -append
    Get-WmiObject -ComputerName $Server -Class "win32_service" |
        Where-Object {$_.State -eq "Running"} |
        Select-Object Name, StartMode, State, Status,StartName |
        ConvertTo-Html |
        Out-File -FilePath $OutputFile -Append
"<h3>GMT Run Date:" + (Get-Date).ToUniversalTime() + "</h3>" | Out-File $OutputFile -Append

Monday, March 7, 2011

Putting on your SOX - Registry ACLs

In the second installment of the Powershell SOX series, we examine reading registry entries on servers being audited.  This script has some extra code in it because registry paths very between SQL 2000 and SQL 2008 in my environment and even depending on which registry area you are in.  If you are asked to take screenshots by the auditors in your company, consider seeing if they will take this and save yourself some time, especially after the first audit.

$OutputFile = "C:\SOX2011\SQL_SOX_ACL_SOX2011.htm"
$type = [Microsoft.Win32.RegistryHive]::LocalMachine
# Create 2d array of server and the "instancename" (depends on version of SQL)
$Srv = @("Server","MSSQL10.Brewers"),("Server2","MSSQL10.Packers")

$headerString = "-------------------------------------------------------------------------<br/>"
$title = "<h1>SQL Registry ACL SOX 2011</h1>"
$title | Out-File $OutputFile

Foreach($server in $Srv) {
    $serverName = $server[0].ToString();
    $instanceName = $server[1].ToString();
    # Instance Name is different in the System\CurrentControlSet.. for some reason
    # (in System is MSSQL$, in Software its MSSQL10.)
    $altInstancename = $instanceName.Replace("10.","$");
    $header = "<h2>Registry ACL for:" + $serverName  + "</h2>"
    $header   | out-File $OutputFile  -append
    # Note registry paths may be different between versions, etc

    # Loop each registry key path in $arr
    foreach($key in $arr)
        $subHeader = "<h4>" + $key + "</h4>"
        $subHeader | out-File  $OutputFile -append
        $headerString | out-File $OutputFile -append   
        Write-Host "Key:" $key
        # Read Key from registry using .NET
        $regKey = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey($type, $serverName )
        $regKey = $regKey.OpenSubKey($key)

        if(!$regKey) {   
            Write-Host "The key does not exist!";
        } else {       
        $regKey.GetAccessControl().GetAccessRules($true,$true,[System.Security.Principal.NTAccount]) |
            Where-Object -FilterScript { $_.RegistryRights -notlike "[-,0-9][0-9][0-9]*"}   |
            ConvertTo-HTML |
            out-File $OutputFile -append   
    } #Foreach Key
} # Foreach Server   

Next up we will get into accounts and services on the server.  Have a nice audit.

Saturday, March 5, 2011

Putting on your SOX with Get-Acl

It is that time of year, when a few college graduates hired by the auditing company show up at the office.  Their mission is to make a name for themselves find folks who are not following our Sarbanes-Oxley policies.  The general idea is not all that bad, but in practice much of the audit is hokum.  They request permissions on one folder, but not the folder beneath it with the critical data.  They ask about who has access to change a job, but not who can edit financial data.  Anyway, I got my detailed audit request this past week and thus I utilized some powershell scripts that I created last year.  Last year I fought the screenshot vs powershell battle and won.  I would provided the report and the script I used so they could independently verify if they so chose.  This gives us consistency each year despite the auditors being changed out quicker than kegs at the tavern during a Packers game.

This script creates a HTML report of the Access Control Lists (ACL) for specific folders on the database server.  The report is far from pretty but creates something viewable with a little formatting.  It utilizes the Get-Acl cmdlet with some filters to return the users. 

$OutputFile = "C:\SOX2011\SQL_ACL_SOX2011.htm"
$headerString = "<br/>-------------------------------------------------------------------------<br/>"
$title = "<h1>SQL SOX2011</h1>"
$title | Out-File $OutputFile

# Build array of the paths to be audited for each server
$arr =  @("\\server\c$\Program Files\Microsoft SQL Server\")`
        ,("\\server\c$\Program Files\Microsoft SQL Server\MSSQL10.Brewers")`
        ,("\\server2\c$\Program Files\Microsoft SQL Server\")`
        ,("\\server2\c$\Program Files\Microsoft SQL Server\MSSQL10.Packers")

# Loop for each Path
foreach($BasePath in $arr)
    $subHeader = "<h2>" + $BasePath + "</h2>"
    $headerString | out-File  $OutputFile -append

    Get-Acl -path $BasePath | select -expand access |
        Where-Object -FilterScript { $_.FileSystemRights -notlike "[-,0-9][0-9][0-9]*"}   |
        Select-Object IdentityReference,AccessControlType,IsInherited,FileSystemRights  |
        ConvertTo-HTML -body $subHeader | Out-File $OutputFile -append      

This mini-series will have four parts on the scripts used to get through a SOX audit each year.

Wednesday, March 2, 2011

Looking for trouble in all the right places with Powershell - Keeping a Clean Log

Part of monitoring sometimes is making sure the stuff you are checking is not so big it takes a long time to check it.  This function allows you to centrally manage your SQL Logs so they are cycled based on a number of days.  Many folks just implement this with a SQL job, but I am moving towards central management and less localized stuff.  The function below cycles based on the $DaysToCycle number.  An alternative implementation could be to read the current log and have it rollover on a specified number of entries.

Here is the function, taking an SMO SQL object which would already be created and the number of days.

Function Invoke-SQLLogCycler([Microsoft.SqlServer.Management.Smo.Server] $SmoSqlServer, [int] $DaysToCycle)
    # Get the second newest log - the date stamped there is the last reboot date.
    $lastSQLLog = $SmoSqlServer.EnumErrorLogs() `
        | Where-Object {$_.ArchiveNo -eq 1} | Select-Object -Property "CreateDate"

    [DateTime] $lastLogRolloverDate = $lastSQLLog.CreateDate
    if($DaysToCycle -gt 0) {$DaysToCycle = $DaysToCycle * -1};

    # Check that last log cycle was more than specified number of days out (isPriorTo function below)
    if(isPriorTo -days $DaysToCycle -Date $lastLogRolloverDate)
            Write-Host "Error Cycled for " + $SmoSqlServer.Name;
            Write-Host " SQL Log Cycle Failed for " + $SmoSqlServer.Name;
        Write-Host "SQL Log was last cycled on " + $lastLogRolloverDate.ToLongDateString();

# Helper function - makes code a bit more readable
Function isPriorTo([int]$days, [datetime]$Date)
    [DateTime]::Now.AddDays($days).Date -gt $Date.Date

Just a quick diversion from monitoring - but a nice one when you have several servers and no current management method.  

Tuesday, March 1, 2011

Looking for trouble in all the right places with Powershell - Drive Space

In continuing with my series on Powershell monitoring (Post 1), I will move on to one of the more prevalent examples out there, collecting drive space information.  This function looks for any drives that have less free space than a factor I pass in.  When doing monitoring on several servers, you really need to keep the output to the problems, this can easily be changed to create an inventory if needed.  This particular script will not work with mount points as I have read.  Someday I will have some time to create one that does both.  I learned from several others on this by reading and changed up to create my own, particularly so I could change the multiplier for specific servers if I so chose.   

function Get-DriveSpace
    param( [Parameter(Mandatory=$true)]$computer , [int]$thresholdMultiplier=.10 )
    [string]$filter = "DriveType = '3'"

    $wmi = Get-WmiObject -class win32_logicaldisk -filter $filter -comp $computer |
        Where-Object {$_.size -gt 0} |   
        Where-Object {$_.freespace -le ($_.size * $thresholdMultiplier)} |
        Select-Object Name,Size,Freespace
    return $wmi;

Note the line Where-Object {$_.freespace -le ($_.size * $thresholdMultiplier)} .  This is where I want to only return drives that have limited space available.

# Here is a sample call that you can add to your loop for each server to monitor
        Get-DriveSpace -computer $server | FT -AutoSize
        } catch { Write-Host "Drive Space Error:$server-" $_  }

Well, that is it.  Not a particularly complex function and its widely published, but the end game is to provide a complete set of functions you can use to create a useful monitoring script in your environment.  Next up is starting to learn to collect perfmon counters and store them, all from a central script.