Thursday, 7 February 2013

Adventures with Powershell, finally some code

Time to show the code for the SQL Server service state monitoring, there is quite a lot but hopefully it should be easy to follow, if all else fails try it out :
       

 /*
Check State
Create tables 
Chris Page 2013
*/

use [DBAUtility] 
go
-- Table to store the windows server name & SQL instance name
IF OBJECT_ID ( 'SQLInstance', 'U' ) IS null begin
 CREATE TABLE [dbo].[SQLInstance](
  ID    int not null identity (1,1),
  [WindowsServer] varchar(128) not NULL,
  [SQLInstance] varchar(255) not NULL,          -- either same as server or server\instance
  Active   bit               -- 1 = active so checkm 0 = inactive don't check
  constraint PK_SQLInstance primary key clustered (Id) with (fillfactor=100, pad_index=off)
 )
 create unique index Idx_SQLInstance on SQLInstance (WindowsServer,SQLInstance) with (fillfactor=70, pad_index=on)
 
 -- Inserting Initial Data into SQLInstane
 insert into SQLInstance values ('S-UAT-SQL108.TOWERGATE.LOCAL','S-UAT-SQL108',1) -- Default Instance
 insert into SQLInstance values ('L-TPL-3058','L-TPL-3058',1)      -- "server", "server\instance"

end
if OBJECT_ID ('SQLServiceState','U') is null begin
 -- SQL Service Monitoring Output Table
 CREATE TABLE [dbo].[SQLServiceState](
  Id     int identity (1,1) not null,
  [WindowsServerID] int not null,
  [SQLService]  [varchar](128) not NULL,
  [UpdatedOn]   [smalldatetime] not NULL,
  [ServiceStatus]  [varchar](100) not NULL,
  [StartMode]   [varchar](100) not NULL,
  [ServiceAccount] [varchar](1000) NULL,          -- long to allow command to be stored in case of issues
  constraint FK_SQLServiceState_WindowsServer foreign key (WindowsServerID) references SQLInstance (Id)
 )
 alter table SQLServiceState add constraint [DF_SQLServiceState_UpdatedOn] default (getdate()) for UpdatedOn
 create unique index IX_SQLServiceState on SQLServiceState (ID) with (fillfactor=100, pad_index=off)
end
if OBJECT_ID('SQLServiceStateHistory','U') is null begin
 CREATE TABLE [dbo].[SQLServiceStateHistory](
  Id     int not null,            -- not an identity field as data coming from elsewhere
  [WindowsServerID] int not null,
  [SQLService]  [varchar](500) NULL,
  [UpdatedOn]   [smalldatetime] NULL,
  [ServiceStatus]  [varchar](100) NULL,
  [StartMode]   [varchar](100) NULL,
  [ServiceAccount] [varchar](1000) NULL,
  constraint FK_SQLServiceStateHistory_WindowsServer foreign key (WindowsServerID) references SQLInstance (Id)
 )
end

if OBJECT_ID('ProcessLog','U') is null begin           -- *** STANDARD SPEC!
 create table ProcessLog (
  Id     int    not null identity (1,1),     -- UID
  Package    varchar(200) not null,
  Process    varchar(100) not null,
  ActionTaken   varchar(255) not null,
  DateStamp   datetime  not null,   
  Records    int    not null
  constraint PK_ProcessLog primary key clustered (Id) with (fillfactor=90, pad_index=on)
 )

 create index IDX_ProcessLog_Process on ProcessLog (Process) with (fillfactor=60, pad_index=on)
 create index IDX_ProcessLog_DateStamp on ProcessLog (DateStamp) with (fillfactor=60, pad_index=on)
 alter table ProcessLog add constraint [DF_ProcessLog_Records] default (0) for Records
 alter table ProcessLog add constraint [DF_ProcessLog_DateStamp] default (getdate()) for DateStamp
end

if OBJECT_ID('vSQLService','V') is not null begin
 drop view dbo.vSQLService
end 
go
create view vSQLService as 
 select  w.WindowsServer
 ,   w.SQLInstance
 ,   s.SQLService
 ,   s.ServiceAccount
 ,   s.UpdatedOn
 ,   s.StartMode
 ,   s.ServiceStatus
 from  dbo.SQLServiceState s
 inner join dbo.SQLInstance w on w.ID = s.WindowsServerID
       
 
This is creating a table to log servers that need checking (SQLInstance), a table in which to store the latest service state information (SQLServiceState), a table to store all previous service state information (SQLServiceStateHistory) - volumes can be large quickly hence separating the tables, if you want everything up to and including now a union will work nicely and finally a logging table (ProcessLog) which follows a standard layout I have been using to record what happened, when and why. Lastly I have defined a view to put the principal information from SQLInstance and SQLServiceState together to make simple queries against the current data easier (ie I got fed up rekeying while testing). Next is the Check State script that does all the hard work :
       

use [DBAUtility] 
go
/*
Check State
Chris Page 2013
*/
IF OBJECT_ID ( 'usp_CheckStateOfSQLServers', 'P' ) IS NOT NULL 
    DROP PROCEDURE usp_CheckStateOfSQLServers
go
create procedure usp_CheckStateOfSQLServers 
as
begin
 set nocount on
 insert into ProcessLog (Package, Process,ActionTaken) values ('CheckStateOfServers','Process','Start')
 
 declare @chkCMDShell as sql_variant
 ,  @Server   varchar(128)
 ,  @SQLInstance varchar(128)
 ,  @SQLInstanceID int
 ,  @ERROR   bit              -- flag to stop onward processing
 ,  @ErrMsg   varchar(255)
 ,  @ErrState  int
 ,  @ErrSeverity int
 ,  @SQL   varchar(1000)
 
 set @ERROR = 0
 
 begin try
  select @chkCMDShell = value from sys.configurations where name = 'xp_cmdshell'
  if @chkCMDShell = 0
  begin
   EXEC sp_configure 'Show Advanced Options', 1;
   RECONFIGURE with override;
   EXEC sp_configure 'xp_cmdshell', 1
   RECONFIGURE with override;
   EXEC sp_configure 'Show Advanced Options', 0;
   RECONFIGURE with override;
   insert into ProcessLog (Package, Process, ActionTaken) values ('CheckStateOfServers','xp_cmdshell','Allowed')
  end
 end try
 begin catch
  select @ERROR = 1, @ErrMsg = 'Could not enable XP_CMDSHELL '+ERROR_MESSAGE(),@ErrState = ERROR_STATE(),@ErrSeverity = ERROR_SEVERITY()
  raiserror (@ErrMsg,@ErrState, @ErrSeverity)
  insert into ProcessLog (Package, Process, ActionTaken) values ('CheckStateOfServers','xp_cmdshell','Could not be Allowed')
 end catch

 if @ERROR = 0 begin
  begin try                   -- start by archiving results from the last run
   insert into dbo.SQLServiceStateHistory           -- move old entries to history
    select *
    from SQLServiceState
   
   truncate table dbo.SQLServiceState            -- bin the current data having moved to history
  end try
  begin catch
   select @ERROR = 1, @ErrMsg = 'Could not move old ServiceStates to history '+ERROR_MESSAGE(),@ErrState = ERROR_STATE(),@ErrSeverity = ERROR_SEVERITY()
   raiserror (@ErrMsg,@ErrState, @ErrSeverity)
   insert into ProcessLog (Package, Process, ActionTaken) values ('CheckStateOfServers','ServerStateHistory','Could not moved.')
  end catch
 end
 
 if @ERROR = 0 begin
  create table #output (line varchar(max) null)
  
  declare ServerCursor cursor fast_forward for 
   select distinct LTRIM(rtrim(s.WindowsServer))
   ,  LTRIM(RTRIM(s.SQLInstance))
   ,  s.ID
   from dbo.SQLInstance s
   where s.Active = 1
  
  open ServerCursor
  fetch next from ServerCursor into @Server, @SQLInstance, @SQLInstanceID
  while @@FETCH_STATUS = 0 and @ERROR = 0 begin   
   set @sql = 'powershell.exe -c "Get-WmiObject  -ComputerName ' + QUOTENAME(@Server,'''') + ' -Class win32_service |  where {$_.name -like ' + QUOTENAME('*SQL*','''') + '} | select-object  Name,state,systemname,startmode,startname  | foreach{$_.name+''|''+$_.state+''%''+$_.systemname+''*''+$_.startmode+''@''+$_.startname+''!''}"'

   begin try
    insert #output EXEC xp_cmdshell @sql
   end try
   begin catch
     select @ERROR = 1, @ErrMsg = 'xp_cmdshell execution failed '+ERROR_MESSAGE(),@ErrState = ERROR_STATE(),@ErrSeverity = ERROR_SEVERITY()
     raiserror (@ErrMsg,@ErrState, @ErrSeverity)
     insert into ProcessLog (Package, Process, ActionTaken) values ('CheckStateOfServers','xp_cmdshell','Execution failed')
   end catch 

   if @ERROR = 0 begin   
    if (select COUNT(*) from #output o where o.line like 'Get-WmiObject : The RPC server is unavailable%') = 1 begin
     insert into SQLServiceState (SQLService, ServiceStatus, ServiceAccount, StartMode, WindowsServerID) values
      ('All','Uncontactable','Command tried : '+@SQL, 'Uncontactable', @SQLInstanceID)
    end else if (select COUNT(*) from #output o where o.line like 'Get-WmiObject : Access is denied%') = 1 begin
     insert into SQLServiceState (SQLService, ServiceStatus, ServiceAccount, StartMode, WindowsServerID) values
      ('All','Access Denied','Command tried : '+@SQL, 'Access Denied', @SQLInstanceID)
    end else begin
    
     delete from #output  where len(line) < 30 -- Deleting the rows which contains error or has not sufficient data
     update #output set line = line + '!' where line not like '%!%'
     IF (SELECT COUNT(*) FROM #output where line like '%Get-Wmi%') = 0 begin
      insert into SQLServiceState(SQLService,ServiceStatus,StartMode,ServiceAccount, WindowsServerID)
       select  rtrim(ltrim(SUBSTRING(o.line,1,CHARINDEX('|',o.line) -1))) as SQLServiceName 
       ,  (rtrim(ltrim(SUBSTRING(o.line,CHARINDEX('|',o.line)+1, (CHARINDEX('%',o.line) -1)-CHARINDEX('|',o.line)) ))) as ServiceStatus
       --,(rtrim(ltrim(SUBSTRING(line,CHARINDEX('%',line)+1, (CHARINDEX('*',line) -1)-CHARINDEX('%',line)) ))) as WindowsServerName 
       ,  (rtrim(ltrim(SUBSTRING(o.line,CHARINDEX('*',o.line)+1, (CHARINDEX('@',o.line) -1)-CHARINDEX('*',o.line)) ))) as startmode
       ,  (rtrim(ltrim(SUBSTRING(o.line,CHARINDEX('@',o.line)+1, (CHARINDEX('!',o.line) -1)-CHARINDEX('@',o.line)) ))) as startname
       ,  @SQLInstanceID
       from #output o
       where o.line is not null 
       and  LEN(o.line) > 30
     end else begin -- Unknown error
      insert into SQLServiceState (SQLService, ServiceStatus, ServiceAccount, StartMode, WindowsServerID) values
      ('All','Unknown Error','Command tried : '+@SQL, 'Unknown Error', @SQLInstanceID)    
     end
    end
   end
   truncate table #output
   
   fetch next from ServerCursor into @Server, @SQLInstance,  @SQLInstanceID
  end -- WEND
  close ServerCursor
  deallocate ServerCursor
  drop table #output  
 end

 if @chkCMDShell = 0                    -- turn off xp_cmdshell again if we turned it on during execution
 begin try
  EXEC sp_configure 'Show Advanced Options', 1;
  RECONFIGURE;
  EXEC sp_configure 'xp_cmdshell', 0
  RECONFIGURE;
  EXEC sp_configure 'Show Advanced Options', 0;
  RECONFIGURE;
  insert into ProcessLog (Package, Process, ActionTaken) values ('CheckStateOfServers','xp_cmdshell','DisAllowed')
 end try
 begin catch
  set @ERROR = 1
  select @ERROR = 1, @ErrMsg = 'Could not disable XP_CMDSHELL '+ERROR_MESSAGE(),@ErrState = ERROR_STATE(),@ErrSeverity = ERROR_SEVERITY()
  raiserror (@ErrMsg,@ErrState, @ErrSeverity)
  insert into ProcessLog (Package, Process, ActionTaken) values ('CheckStateOfServers','xp_cmdshell','Could not be disabled.')
 end catch
 
 insert into ProcessLog (Package, Process, ActionTaken) values ('CheckStateOfServers','Process','End')
  
end

       
 
this seems like a lot of code but a lot of it is error checking and reporting rather than getting statuses. Note right at the beginning the code turns on xp_cmdshell if it is turned off, if it does turn on xp_cmdshell it does turn it off again at the end to close that security hole again - the implication here is that this task does need to run at sufficiently high security levels to be able to do take this action. My assumption is that the process will be running at the SQL Server Agent level and will probably have the necessary rights anyway - we are after all looking at service statuses and that is a privileged action.

Following turning on xp_cmdshell the powershell command is run once for each server & instance Pfound in the SQLInstance table. Yes I have used a cursor to provide my loop through the SQLInstance table, it is set as fast_forward so works quickly, you could change this to a while loop or something else if you want but the coding is less and tidier with a cursor and for something that is not OLTP I don't see problems on my systems.

The Powershell command is actually pretty simple if you've done any command line scripting (I'll jump back to my Unix comment from a previous post here, it's remarkably Unix like) :
       
powershell.exe -c "Get-WmiObject  -ComputerName ' + QUOTENAME(@Server,'''') + ' -Class win32_service |  where {$_.name -like ' + QUOTENAME('*SQL*','''') + '} | select-object  Name,state,systemname,startmode,startname  | foreach{$_.name+''|''+$_.state+''%''+$_.systemname+''*''+$_.startmode+''@''+$_.startname+''!''}"'
 
 
This can be run independently of the SQL scripts and simply runs Powershell and calls "get-wmiobject" for the given server and lists out the service status names for any service that is SQL related. This is the only problematic hardcoded part (yes could code around it, necessary? well we'll find out in a future version of SQL server I imagine) because it is looking for any services with SQL in the registered service name - so it might also generate false positives on a box with non SQL server SQL services on there (MySQl etc) - I am not able to test that currently but that should be easy to tweak and potentially is not such a bad thing. to help other people encountering Powershell for the first time, there is another command, "get-service" which seems much simpler and tidier than get-wmiobject, however, it cannot run against a remote server.

The output from the Powershell command is saved into a table and the useful parts processed to find the detail we actually need - both the Powershell command and processing of output are unmodified from Jugal's version as they work perfectly.

I handle as many errors as I have encountered, including inability to turn on or use xp_cmdshell, RPC service unavailable, server uncontactable, access denied and a final "unknown" category just in case something new comes up.

Okay, we've got some output in a table, now what? Well I guess someone needs to know about the results so the following script provides two routes, either direct to the console via Management Studio or via an HTML formatted email (designed to handle Outlook as used in my environment but the HTML is entirely standard so should be fine in any client that accepts HTML) :
       

use [DBAUtility] 

go
/*
Check State
Report State
Chris Page 2013
*/
IF OBJECT_ID ( 'usp_ReportStateOfSQLServers', 'P' ) IS NOT NULL 

    DROP PROCEDURE usp_ReportStateOfSQLServers

go

create procedure usp_ReportStateOfSQLServers (

 -- parameters could be 

 @ShowStopped  bit = 1, -- show services not set as running (as opposed to show stopped in case other values are available but not known)

 @ShowUncontactable bit = 1, -- show uncontactable servers (ignores @AutomaticOnly)

 @ShowDenied   bit = 1, -- show servers to whom access was denied

 @ShowUnknown  bit = 1, -- show servers with unknown errors

 @AutomaticOnly  bit = 1, -- show automatic start services (inc delayed) only

 @Destination  varchar(255) = 'chris.page@towergate.co.uk' -- if 'local' then print locally

)

as

begin

 set nocount on

 

 declare @OutputStopped   varchar(max)

 ,  @OutputUncontactable varchar(max)   -- includes access denied

 ,  @OutputAll    varchar(max)

 ,  @WindowsServer   varchar(128)

 ,  @SQLInstance   varchar(255)

 ,  @SQLService    varchar(128)

 ,  @ServiceAccount   varchar(1000)

 ,  @UpdatedOn    smalldatetime

 ,  @StartMode    varchar(100)

 ,  @ServiceStatus   varchar(100)

 

 select  s.WindowsServer

 ,   s.SQLInstance

 ,   s.SQLService

 ,   s.ServiceAccount

 ,   s.UpdatedOn

 ,   case 

     when s.ServiceStatus = 'Access Denied' then 'Check agent service account has rights on remote' 

     when s.ServiceStatus = 'Uncontactable' then 'Check server is up'

     else '' 

    end as StartMode

 ,   s.ServiceStatus

 into  #TempStateList -- store in temp table so local & email destination done in one query & no dynaminc SQL

 from  vSQLService s

 where  ( (@AutomaticOnly = 0 or (@AutomaticOnly=1 and s.StartMode like 'Auto%'))

     and 

     s.ServiceStatus != 'Running'

     and

     @ShowStopped = 1

    )

    or

    (@ShowUncontactable = 0 or (s.ServiceStatus = 'Uncontactable' and @ShowUncontactable=1))

    or

    (@ShowDenied = 0 or (s.ServiceStatus = 'Access Denied' and @ShowDenied=1))

    or

    (@ShowUnknown = 0 or (s.ServiceStatus = 'Unknown Error' and @ShowDenied=0))

 order by s.WindowsServer

 ,   s.SQLInstance

 ,   s.SQLService



 if lower(@Destination)='local' begin

 

  select 'Unreachable Servers'

  select * from #TempStateList t where t.ServiceStatus in ('Uncontactable','Unknown Error' ,'Access Denied')

  select 'Stopped Services'

  select * from #TempStateList t where t.ServiceStatus not in ('Running','Unknown Error','Uncontactable','Access Denied')

 end else begin

 

  declare ReportableServicesCursor cursor fast_forward for

   select * from #TempStateList

   

  set @OutputUncontactable = ''

  set @OutputStopped = ''

  open ReportableServicesCursor

  fetch next from ReportableServicesCursor into @WindowsServer, @SQLInstance, @SQLService, @ServiceAccount, @UpdatedOn, @StartMode, @ServiceStatus

  while @@FETCH_STATUS = 0 begin

   --select @WindowsServer,@SQLInstance,@SQLService,@ServiceAccount,@UpdatedOn,@StartMode,@ServiceStatus

   select @OutputUncontactable=@OutputUncontactable+'<tr>'+

       '<td>'+@WindowsServer+'</td>'+

       '<td>'+@SQLInstance+'</td>'+

       '<td>'+@SQLService+'</td>'+

       '<td>'+cast(@UpdatedOn as varchar)+'</td>'+

       '<td><font color="red">'+@ServiceStatus+'</font></td>'+

       '<td>'+@Startmode+'</td>'+

       '<td>'+@ServiceAccount+'</td>'+

       '</TR>'

   where  @ServiceStatus in ('Uncontactable','Unknown Error' ,'Access Denied')

  

   select @OutputStopped =@OutputStopped+'<tr>'+

       '<td>'+@WindowsServer+'</td>'+

       '<td>'+@SQLInstance+'</td>'+

       '<td>'+@SQLService+'</td>'+

       '<td>'+cast(@UpdatedOn as varchar)+'</td>'+

       '<td>'+@StartMode+'</td>'+

       '<td><font color="red">'+@ServiceStatus+'</font></td>'+

       '<td>'+@ServiceAccount+'</td>'+

       '</TR>'

   where @ServiceStatus  not in ('Running','Unknown Error','Uncontactable','Access Denied')

  

   fetch next from ReportableServicesCursor into @WindowsServer, @SQLInstance, @SQLService, @ServiceAccount, @UpdatedOn, @StartMode, @ServiceStatus

  end 

  close ReportableServicesCursor

  deallocate ReportableServicesCursor

 

  if @OutputUncontactable!='' or @OutputStopped !='' begin -- only output results if there are some

   set @OutputAll = '<body>'

   if @OutputUncontactable != '' begin

    set @OutputAll = @OutputAll+'The following SQL Server instances are unavailable : <br/><br/>'+

        '<table border=1><tr><td><b>Windows Server</b></td><td><b>SQL Instance</b></td><td><b>SQL Service</b></td><td><b>Updated On</b></td>'+

        '<td><b>Service Status</b></td><td><b>Notes</b></td><td><b>Script Used</b></td></tr>'+

        @OutputUncontactable+'</table>'

   end 

   if @OutputStopped != '' begin

    if @OutputAll != '' begin

     set @OutputAll = @OutputAll + '<br/><br/><br/>'

    end

    set @OutputAll = @OutputAll+

     'The following SQL Server services are stopped :<br/><br/>'+

     '<table  border=1><tr><td><b>Windows Server</b></td><td><b>SQL Instance</b></td><td><b>SQL Service</b></td><td><b>Updated On</b></td>'+

        '<td><b>Start Mode</b></td><td><b>Service Status</b></td><td><b>Service Account</b></td></tr>'+

        @OutputStopped+'</table>'

   end

   set @OutputAll = @OutputAll + '</body>'

   begin try

    exec msdb.dbo.sp_send_dbmail @recipients=@Destination, @Importance='High'

    , @subject='SQL Server Service Review', @Body=@OutputAll, @Body_Format='HTML'

   end try

   begin catch -- unable to send mail, might be a config issue (firewall for smtp) or send mail not configured / enabled / no default public profile

    raiserror ('Unable to send emails while service statuses.',16,1) with log

   end catch

  end

 end

end

 
(worth noting here that I used SimpleCode to embed the script & retain the HTML text - it pasted across to my machine and ran perfectly).

This code is a bit of a fudge because having written a means of emailling the results to myself I got fed up waiting for the emails to arrive while testing so needed an output to screen and it made sense to provide the same output. Hence, the code writes the desired output to a temporary table and then processs that differently according to which the chosen destination is.

This function is called as either :
usp_ReportStateOfSQLServers @Destination = 'local' 
-- prints direct to screen
usp_ReportStateOfSQLServers @Destination = 'email@address' 
-- send to email

You will note some additional parameters that vary the "sensitivity" of what is reported - the default settings will tell you about any SQL services set to automatic start (including delayed start) which are currently stopped and include uncontactable / access denied and unknown errors. For my environment all SQL services that are used normally on a server are set to automatic start and we expect all machines to be visible all the time so this gives me that answers I need. - I deliberately leave my laptop  on the list of servers so when I am working remotely I'll get any email that my laptop is uncontactable and that just keeps me aware that the process is alive because it does not send an email if there is nothing to tell me (so if everything is working you won't see an email - be aware, some people like to get a positive response every time, easy enough to change).

You might have noted in the Check State code that records currently in the SQLServiceState are firstly moved to the SQLServiceStateHistory table before current statuses are checked so now we need to manage how large the history table can get :

       
use [DBAUtility] 
go
/*
Check State
Manage State History
Chris Page 2013
*/
IF OBJECT_ID ( 'usp_ManageSQLServerStateHistory', 'P' ) IS NOT NULL 
    DROP PROCEDURE usp_ManageSQLServerStateHistory
go

create procedure usp_ManageSQLServerStateHistory (
 -- parameters could be 
 @KeepAllHistoryForDays   smallint = 7, -- delete anything "normal" after this length of time in days
 @KeepIssueHistoryForDays  smallint = 90-- Keep stopped / uncontactable items for this long (longer than KeepAll)
)
as
begin
 set nocount on
 insert into ProcessLog (Package, Process, ActionTaken) values ('ManageSQLServerStateHistory','Process','Start')
 
 declare @ErrMsg   varchar(255)
 ,  @ErrState  int
 ,  @ErrSeverity int

 if @KeepAllHistoryForDays > @KeepIssueHistoryForDays begin
  insert into ProcessLog (Package, Process, ActionTaken) values ('ManageSQLServerStateHistory','Check periods','Keep All longer than Keep Issues, Keep Issues aligned')
  set @KeepIssueHistoryForDays = @KeepAllHistoryForDays
 end

 begin try
  delete from SQLServiceStateHistory           -- start by deleting any non issues for the shorter period
  where UpdatedOn < DATEadd (DD,@KeepAllHistoryForDAys*-1,GETDATE()) 
  and  ServiceStatus  in ('Running')
 
  delete from SQLServiceStateHistory           -- then delete anything over the longer period
  where UpdatedOn < DATEADD (dd, @KeepIssueHistoryForDays*-1, getdate())
 end try
 begin catch
  select @ErrMsg = 'Could not delete history SQL Server State '+ERROR_MESSAGE(),@ErrState = ERROR_STATE(),@ErrSeverity = ERROR_SEVERITY()
  raiserror (@ErrMsg,@ErrState, @ErrSeverity)
  insert into ProcessLog (Package, Process, ActionTaken) values ('ManageSQLServerStateHistory','Delete History','Failed.')
 end catch

 insert into ProcessLog (Package, Process, ActionTaken) values ('ManageSQLServerStateHistory','Process','End')
end
       
 
This time there are two parameters :

  1. @KeepAllHistoryForDays -- deletes all "running" service records that exist in the database, generally speaking I am not really interested in services that are running as expected so I will only want to keep that data for a short period so I have a default value of 1 week.
  2. @KeepIssueHistoryForDays -- this deletes any records that still exist after this many days, I am working on the principal that maintaining a recent service problem history is useful and can be used in discussion with service providers as evidence to support reliability discussions (yes I am taking a negative view but if the history is empty then all is well!), hence, I expect this data to be kept for longer and have default to 90 days but could potentially be significantly longer depending on local requirements.
Next comes scheduling, this is simply creating a scheduled task on your designated monitoring server, you will need to alter the Operator name to ensure you get the failure email and the email destination listed on the scheduled task :
       
USE [msdb]
GO

/*
Check State
Schedule Checking
Chris Page 2013
*/
/****** Object:  Job [DBA Maintenance Tasks : Check & Report SQL Server Service States]    Script Date: 02/04/2013 13:39:23 ******/
BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0
/****** Object:  JobCategory [[Uncategorized (Local)]]]    Script Date: 02/04/2013 13:39:23 ******/
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode =  msdb.dbo.sp_add_job @job_name=N'DBA Maintenance Tasks : Check & Report SQL Server Service States', 
  @enabled=1, 
  @notify_level_eventlog=0, 
  @notify_level_email=2, 
  @notify_level_netsend=0, 
  @notify_level_page=0, 
  @delete_level=0, 
  @description=N'No description available.', 
  @category_name=N'[Uncategorized (Local)]', 
  @owner_login_name=N'chris.page', 
  @notify_email_operator_name=N'SQLAdministration', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
/****** Object:  Step [Check]    Script Date: 02/04/2013 13:39:23 ******/
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Check', 
  @step_id=1, 
  @cmdexec_success_code=0, 
  @on_success_action=3, 
  @on_success_step_id=0, 
  @on_fail_action=2, 
  @on_fail_step_id=0, 
  @retry_attempts=0, 
  @retry_interval=0, 
  @os_run_priority=0, @subsystem=N'TSQL', 
  @command=N'usp_checkstateofsqlservers', 
  @database_name=N'DBAUtility', 
  @flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
/****** Object:  Step [Report]    Script Date: 02/04/2013 13:39:23 ******/
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Report', 
  @step_id=2, 
  @cmdexec_success_code=0, 
  @on_success_action=1, 
  @on_success_step_id=0, 
  @on_fail_action=2, 
  @on_fail_step_id=0, 
  @retry_attempts=0, 
  @retry_interval=0, 
  @os_run_priority=0, @subsystem=N'TSQL', 
  @command=N'usp_ReportStateOfSQLServers @Destination=''email@address''', 
  @database_name=N'DBAUtility', 
  @flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'DBA Maintenance Tasks : Daily', 
  @enabled=1, 
  @freq_type=4, 
  @freq_interval=1, 
  @freq_subday_type=8, 
  @freq_subday_interval=1, 
  @freq_relative_interval=0, 
  @freq_recurrence_factor=0, 
  @active_start_date=20120914, 
  @active_end_date=99991231, 
  @active_start_time=90000, 
  @active_end_time=170000
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:

GO       
 
Last but not least is logging deployment, one day I will write something clever to keep all deployment logs on SQL Server and be able to output a list of everything that needs to be updated (automating updated might be a step to far in case there is a reason machines are out of step). This is just a screenshot :
All I've done is create a spreadsheet with conditional formatting on column B such that if the date in column B for any server (so B5 onwards) is before the date in B3 is shows the column as red. Not entirely scientific but what I've also noted in red text is that the version notes need to be reviewed before deploying to every server in the estate as potentially the update is minor can wait.

Enjoy.

No comments:

Post a Comment