Creating SQL Server index fragmentation

In a few client environments, I have seen index monitoring and maintenance procedures triggered by fragmentation levels.  I have sometimes wondered how these procedures were tested when they were put in place, other than waiting for fragmentation to occur through normal operations.  Wouldn’t it be a good idea to have a process that generates a level of fragmentation to test these processes?

Here is a process that can be used as the basis for generating index fragmentation to properly test index monitoring and maintenance procedures.  I will create a test table, populate it with data, create an index on the test table, then execute a process that keeps inserting data into the test table until the desired level of index fragmentation is reached.

During testing, 100 rows were inserted into the test table initially, then each iteration of the fragmentation process inserted another 100 rows.  It took only two iterations through the process to produce 50 percent index fragmentation.  For finer-grained control of the resulting fragmentation percentage, use a larger number of rows for the initial data insert.

NOTE – This process was created and tested on SQL Server 2016.  Be sure to set the database environment where you want to create the test table before executing these commands.

--Drop the test table if it exists (IF EXISTS syntax requires SQL Server 2016 or later)
DROP TABLE IF EXISTS dbo.TestTable
GO

--Create the test table
CREATE TABLE dbo.TestTable
(
	RowID int IDENTITY(1,1),
	MyKeyField VARCHAR(10) NOT NULL,
	MyDate DATETIME NOT NULL,
	MyInt INT NOT NULL,
	MyString VARCHAR(30) NOT NULL
)
GO

--Code to insert first rows into table 
DECLARE @RowCount INT
DECLARE @RowString VARCHAR(10)
DECLARE @Random INT
DECLARE @Upper INT
DECLARE @Lower INT
DECLARE @InsertDate DATETIME
DECLARE @s CHAR(30)

SET @Lower = -730
SET @Upper = -1
SET @RowCount = 0

WHILE @RowCount < 100  --Set this value to the number of rows desired
BEGIN
	SET @RowString = CAST(@RowCount AS VARCHAR(10))
	SELECT @Random = ROUND(((@Upper - @Lower -1) * RAND() + @Lower), 0)
	SET @InsertDate = DATEADD(dd, @Random, GETDATE())

	SET @s = (
		SELECT c1 AS [text()]
		FROM
		(
			SELECT TOP (30) c1  --Change this value to the desired length of the resulting string.
			FROM
			(
			VALUES
			  ('A'), ('B'), ('C'), ('D'), ('E'), ('F'), ('G'), ('H'), ('I'), ('J'),
			  ('K'), ('L'), ('M'), ('N'), ('O'), ('P'), ('Q'), ('R'), ('S'), ('T'),
			  ('U'), ('V'), ('W'), ('X'), ('Y'), ('Z'), ('0'), ('1'), ('2'), ('3'),
			  ('4'), ('5'), ('6'), ('7'), ('8'), ('9'), ('!'), ('@'), ('#'), ('$'),
			  ('%'), ('&'), ('*'), ('('), (')'), ('?'), ('+'), ('-'), (','), ('.')
			) AS T1(c1)
			ORDER BY ABS(CHECKSUM(NEWID()))
		) AS T2
		FOR XML PATH('')
	);

	INSERT INTO TestTable
		(MyKeyField
		,MyDate
		,MyInt
		,MyString)
	VALUES
		(REPLICATE('0', 10 - DATALENGTH(@RowString)) + @RowString
		,@InsertDate
		,CAST(RAND(@RowCount) * 1000000 AS INT)  --Scaled so the INT column receives a nonzero random value
		,@s)

	SET @RowCount = @RowCount + 1
END
GO


--Verify number of rows inserted into test table
select count(*)
from TestTable;


--Create index on test table
SET ANSI_PADDING ON
GO

CREATE NONCLUSTERED INDEX [TestTableIndex] ON [dbo].[TestTable]
(
	[MyString] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO

--Check initial fragmentation
select avg_fragmentation_in_percent
    from sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, 'LIMITED')
    where object_id = (select object_id from sys.tables where name = 'TestTable')
    AND index_type_desc = 'NONCLUSTERED INDEX'
GO


--Routine to loop through row inserts until fragmentation reaches desired level
DECLARE @FragEnd INT
DECLARE @FragPercent INT

SET @FragEnd = 30      --Set this number to degree of fragmentation desired 
SET @FragPercent = 0  
SET IDENTITY_INSERT [dbo].[TestTable] ON

WHILE @FragEnd > @FragPercent
BEGIN

  select @FragPercent = max(avg_fragmentation_in_percent)
  from sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, 'LIMITED')
  where object_id = (select object_id from sys.tables where name = 'TestTable')
  AND alloc_unit_type_desc = 'IN_ROW_DATA';

  insert into [dbo].[TestTable] ( [RowID],[MyKeyField],[MyDate],[MyInt],[MyString])
  select top(100) [RowID],[MyKeyField],[MyDate],[MyInt],[MyString]
  from dbo.TestTable;

END;
SET IDENTITY_INSERT [dbo].[TestTable] OFF
GO
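
Once the desired fragmentation level has been reached and the monitoring or maintenance procedure has been tested, the index can be rebuilt to reset fragmentation for another test run.  A minimal sketch using the table and index created above:

--Rebuild the index to remove the generated fragmentation
ALTER INDEX [TestTableIndex] ON [dbo].[TestTable] REBUILD;
GO

--Confirm the fragmentation has been reset
select avg_fragmentation_in_percent
    from sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, 'LIMITED')
    where object_id = OBJECT_ID('dbo.TestTable')
    AND index_type_desc = 'NONCLUSTERED INDEX'
GO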

 


SQL Server Migration Tool

I was recently asked to review a migration plan from SQL Server to Oracle. The actual export and import of the table data was broken into two separate tasks and the time projected was (in my opinion) overly long. I did not know the exact details of what steps were being performed during the export and import, but I immediately thought about improving the process by using the SQL Server Import and Export Wizard.

I first encountered the SQL Server Import and Export Wizard utility in the early days of working with SQL Server versions 7 and 2000, and before you ask: yes, I am old.  I attended the Microsoft launch event for SQL Server 7 while living in Denver and even got the t-shirt!

Anyway, even that early in its history, I was impressed with this tool and with the fact that such a capable utility was bundled with SQL Server at no additional charge, a practice that continues today.

The use of this tool is well-documented (https://docs.microsoft.com/en-us/sql/integration-services/import-export-data/import-and-export-data-with-the-sql-server-import-and-export-wizard?view=sql-server-2017), but I will give a brief overview in this post.

The SQL Server Import and Export Wizard can be accessed in a couple of ways.  The first is directly through the Start menu in Windows:

[Image: sql server mig tool 1]

It is interesting to note that both 32-bit and 64-bit versions are available in this manner.  Another method of access is directly from SQL Server Management Studio by right-clicking on a database, going to the Tasks sub-menu, and selecting either Import Data or Export Data.

Once selected by either of these methods, the interface is the same:

[Image: sql server mig tool 2]

Going to the next screen, you are prompted for a data source for your process.  The data sources range from .NET, to Access (remember it?), to ODBC data sources for Oracle and SQL Server.

[Image: sql server mig tool 3]

Take note that Microsoft supplies a built-in OLE DB provider for Oracle databases.  However, I have always found it easier to use the provider supplied by Oracle, which requires installing the Oracle client software on the Windows host.  If using Oracle as a source or target for an import or export process, I tend to go through the Windows control panel and set up my connections before using the SQL Server Import and Export Wizard.  I find this easier, especially being familiar with Oracle networking.

The target options for the SQL Server Import and Export Wizard are similar to the source options.  From here you get the idea that you can perform the import and export in one process, provided you take care of the target specifics first.

I recommend using this tool for your smaller data migration needs, even if you are not using SQL Server.  The drawback is that you will need access to SQL Server software to install this utility.

 

SQL Server Assessment Script

I finally get to cross an item off my back burner project list.  Last February, I was at a client assessing their entire SQL Server database footprint for migration to the cloud.  Not having done this kind of work before, I quickly cobbled together scripts that gave me a basic overview of each database.  I then combined the results into a spreadsheet.

Wouldn’t it have been nice to get this information in a single script?  You could use this script to not only get a better understanding of resource usage in your SQL Server instance, but also use it to determine needed resources for migrating to either other hardware or the cloud.

I have already produced a lengthy health check script (here) that gives quite a bit of information.  It did not, however, take measurements of CPU usage and IOPS values over a period of time.  My new SQL Server assessment script not only gathers a smaller but more critical set of information than my health check script, but also takes hourly snapshots of CPU usage and IOPS values including individual read and write values over the course of one day.  Execute this script during a normal working day or during a period of end-of-month processing, if it exists.

Here is the script in its latest form.  To adjust the number of CPU and IOPS snapshots taken and the interval between them, edit these two lines:

while (@cntr) < 24  --This is the number of snapshots to take before results are produced.

waitfor delay '01:00:00'  --This is the time delay between snapshots.  It is currently set to one hour.
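
Because the script runs for a full day at the default settings, it can be more convenient to launch it from the command line with sqlcmd than to leave a Management Studio session open.  A minimal sketch, assuming Windows authentication and the script saved as sql_server_assessment.sql:

sqlcmd -S YourServer -E -i sql_server_assessment.sql -o assessment_results.txt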

--sql_server_assessment.sql
 --Lists the resources and their usage for a single SQL Server instance
 --Intended to act as a tool for migrations to other hardware or the cloud.

use master
 GO
 ----------------------------------------------------------Node name

SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS [CurrentNodeName];
 ----------------------------------------------------------Product Version, level, and Edition

SELECT
 CASE
 WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion')) like '8%' THEN 'SQL2000'
 WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion')) like '9%' THEN 'SQL2005'
 WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion')) like '10.0%' THEN 'SQL2008'
 WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion')) like '10.5%' THEN 'SQL2008 R2'
 WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion')) like '11%' THEN 'SQL2012'
 WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion')) like '12%' THEN 'SQL2014'
 WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion')) like '13%' THEN 'SQL2016'
 WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion')) like '14%' THEN 'SQL2017'
 ELSE 'unknown'
 END AS MajorVersion,
 SERVERPROPERTY('ProductLevel') AS ProductLevel,
 SERVERPROPERTY('Edition') AS Edition,
 SERVERPROPERTY('ProductVersion') AS ProductVersion

----------------------------------------------------------License information

print N'License information';

SELECT SERVERPROPERTY('LicenseType') AS LICENSE_TYPE,
 ISNULL(SERVERPROPERTY('NumLicenses'),0) AS NUM_LICENCES;
 ----------------------------------------------------------Database listing

print N'Database list with Status and Recovery Model';
 SELECT substring(name,1,40) AS name, substring(state_desc,1,20) AS STATE,
 substring(recovery_model_desc,1,20) AS RECOVERY_MODEL
 FROM sys.databases
 order by name;
 ----------------------------------------------------------Logical CPUs
 select cpu_count as logical_cpu_count
 from sys.dm_os_sys_info;
 ----------------------------------------------------------Physical Memory

--For SQL Server versions 2008 and 2008R2 only (comment out on later versions)
 select physical_memory_in_bytes/1024/1024/1024 as GB
 from sys.dm_os_sys_info;

--For SQL Server versions after 2008R2 (comment out on 2008/2008R2)
 select physical_memory_kb/1024/1024 as GB
 from sys.dm_os_sys_info;

----------------------------------------------------------Memory usage per database

print N'Memory Usage per User Database';
 SELECT
 substring(DB_NAME(database_id),1,40) AS [Database Name]
 ,COUNT(*) * 8/1024.0 AS [Cached Size (MB)]
 FROM
 sys.dm_os_buffer_descriptors
 WHERE
 database_id > 4
 AND database_id <> 32767
 AND db_name(database_id) <> 'SSISDB'
 GROUP BY DB_NAME(database_id)
 ORDER BY [Cached Size (MB)] DESC OPTION (RECOMPILE);

----------------------------------------------------------Database file size and growth settings

print N'Size and Growth';
 select substring(b.name,1,40) AS DB_Name, substring(a.name,1,40) AS Logical_name,
 substring(a.filename,1,100) AS File_Name,
 cast((a.size * 8.00) / 1024 as numeric(12,2)) as DB_Size_in_MB,
 case when a.growth > 100 then 'In MB' else 'In Percentage' end File_Growth,
 cast(case when a.growth > 100 then (a.growth * 8.00) / 1024
 else (((a.size * a.growth) / 100) * 8.00) / 1024
 end as numeric(12,2)) File_Growth_Size_in_MB,
 case when ( maxsize = -1 or maxsize=268435456 ) then 'AutoGrowth Not Restricted' else 'AutoGrowth Restricted' end AutoGrowth_Status
 from sysaltfiles a
 join sysdatabases b on a.dbid = b.dbid
 where DATABASEPROPERTYEX(b.name, 'status') = 'ONLINE'
 order by b.name;

----------------------------------------------------------Database data and log file total size
 SELECT
 SUM(CASE WHEN type = 0 THEN MF.size * 8 / 1024.0 /1024.0 ELSE 0 END) +
 SUM(CASE WHEN type = 1 THEN MF.size * 8 / 1024.0 /1024.0 ELSE 0 END) AS Total_GB
 FROM
 sys.master_files MF
 JOIN sys.databases DB ON DB.database_id = MF.database_id
 WHERE DB.source_database_id is null;
 ----------------------------------------------------------License feature usage

IF OBJECT_ID('tempdb.dbo.##enterprise_features') IS NOT NULL
 DROP TABLE ##enterprise_features

CREATE TABLE ##enterprise_features
 (
 dbname SYSNAME,
 feature_name VARCHAR(100),
 feature_id INT
 )

EXEC sp_msforeachdb
 N' USE [?]
 IF (SELECT COUNT(*) FROM sys.dm_db_persisted_sku_features) >0
 BEGIN
 INSERT INTO ##enterprise_features
 SELECT dbname=DB_NAME(),feature_name,feature_id
 FROM sys.dm_db_persisted_sku_features
 END '
 SELECT *
 FROM ##enterprise_features;

----------------------------------------------------------CPU and IOPS for 24 hours (avg. work day or end of month)

create table #CPU (snap_time varchar(30),
 row_num int,
 db_name varchar(50),
 cpu_ms int,
 cpu_pct int)

create table #IOPS (snap_time varchar(30),
 db_id int,
 db_name varchar(50),
 reads int,
 writes int,
 total_io int)

declare @cntr int
 set @cntr = 0
 while (@cntr) < 24 --set number of 1-hour loops to perform here.
 begin;

WITH DB_CPU_Stats
 AS
 (
 SELECT convert(varchar, getdate(), 120) AS ts,
 DatabaseID, DB_Name(DatabaseID) AS [DatabaseName],
 SUM(total_worker_time) AS [CPU_Time_Ms]
 FROM sys.dm_exec_query_stats AS qs
 CROSS APPLY (
 SELECT CONVERT(int, value) AS [DatabaseID]
 FROM sys.dm_exec_plan_attributes(qs.plan_handle)
 WHERE attribute = N'dbid') AS F_DB
 GROUP BY DatabaseID
 )
 insert into #CPU (row_num,snap_time,db_name,cpu_ms,cpu_pct)
 SELECT ROW_NUMBER() OVER(ORDER BY [CPU_Time_Ms] DESC) AS [row_num],
 ts, DatabaseName,
 [CPU_Time_Ms],
 CAST([CPU_Time_Ms] * 1.0 / SUM([CPU_Time_Ms]) OVER() * 100.0 AS DECIMAL(5, 2)) AS [CPUPercent]
 FROM DB_CPU_Stats
 --WHERE DatabaseID > 4 -- system databases
 --AND DatabaseID <> 32767 -- ResourceDB
 ORDER BY row_num OPTION (RECOMPILE);

WITH D
 AS(
 SELECT
 convert(varchar, getdate(), 120) AS ts,
 database_id,DB_NAME(database_id) AS Name,
 SUM(num_of_reads) AS IO_Reads,
 SUM(num_of_writes) AS IO_Writes,
 SUM(num_of_reads + num_of_writes) AS Total_IO
 FROM sys.dm_io_virtual_file_stats(NULL, NULL)
 GROUP BY database_id,DB_NAME(database_id))
 insert into #IOPS (snap_time,db_id,db_name,reads,writes,total_io)
 SELECT
 D.ts, D.database_id, D.Name,
 SUM(D.IO_Reads) - SUM(L.num_of_reads) AS Data_Reads,
 SUM(D.IO_Writes) - SUM(L.num_of_writes) AS Data_Writes,
 SUM(D.IO_Reads+D.IO_Writes) - SUM(L.num_of_reads+L.num_of_writes) AS Total_IO
 FROM D
 JOIN sys.dm_io_virtual_file_stats(NULL, 2) L
 ON D.database_id = L.database_id
 GROUP BY D.ts,D.database_id, D.Name
 ORDER BY D.database_id;

waitfor delay '01:00:00' --change this to determine the interval between snapshots
 set @cntr = @cntr + 1

end

select db_name, avg(cpu_pct) as AVG_CPU
 from #CPU
 where db_name is not null
 group by db_name
 order by db_name;

select DB_NAME, snap_time, MAX(cpu_pct) as MAX_CPU
 from #CPU
 group by db_name, snap_time
 having MAX(cpu_pct) > 0
 order by MAX(cpu_pct) desc

select db_name, min(total_io) AS MIN_IO, max(total_io) AS MAX_IO, avg(total_io) AS AVG_IO
 from #IOPS
 group by db_name
 order by db_name;

select snap_time, db_name, reads, writes, total_io
 from #IOPS
 where total_io = (select max(total_io) from #IOPS);

drop table #CPU;

drop table #IOPS;

Keep your Oracle binaries!

I am a pack rat.  I admit it.  I rarely throw away old software or IT books.  I still have Oracle version 6 books.  The last time I got rid of software was when I threw out three boxes of 3.5″ floppy disks.  I still have old versions of Windows and other software on CD, though.  Not DVD, but CD.  Anyone need a copy of Windows NT 3.51?

My point is that we typically don’t think about preserving Oracle installation media files after the installation is concluded.  More often they are removed to reclaim space in a staging area with the thought that we can always download them again.

After the installation, flash forward to the point where upgrades have not occurred and the hardware is starting to show its age.  Then, just before the work day begins, a failure on a legacy storage appliance causes all the production databases to go down.

If you are lucky, there is a disaster recovery plan in place.  You are even luckier if it has been tested.  At the end of the failure, there are going to be lessons learned and plan revisions.

I recently experienced just such an occurrence at a client site.  A legacy storage appliance did fail, taking the production databases with it.  You are probably thinking that something like this very rarely, if ever, happens, but I have seen network storage appliance failures of this kind twice within the last five years, at different places and with different storage appliance makers.

After the initial smoke cleared, three options for recovery were available.  In order of preference, they were: recover the storage appliance; rebuild the servers and databases and recover from backups; or fail over to a disaster recovery site.

While the first option was in progress, it was discovered that the disaster recovery site option would only bring designated critical systems back online, not every production database.  It was further discovered that this option would only allow failover, not failback, and would not allow running systems from sites other than the DR site.  As such, this option was abandoned.

This left the option of rebuilding servers and databases and recovering from backups while the storage appliance was being worked on.  Unfortunately, a recent acquisition had brought many databases running on unsupported versions of Oracle into the environment.  Now there was a scramble to secure the installation media for these versions.

Installation media is available through Oracle Technology Network (https://www.oracle.com/technetwork) or Oracle Software Delivery Cloud (https://edelivery.oracle.com).  However, as of the writing of this blog, versions 11g release 1 and 10g release 2 are no longer available for direct download.  You can request previous versions of the installation media using the instructions in My Oracle Support (MOS) note 1071023.1.  Oracle provides this statement (links included) on the Oracle Technology Network site when viewing Oracle Database software downloads:

Oracle Database 10.2 and 11.1 are no longer available for download. The software is available as a media or FTP request for those customers who own a valid Oracle Database product license for any edition. To request access to these releases, follow the instructions in Oracle Support Document 1071023.1 (Requesting Physical Shipment or Download URL for Software Media) from My Oracle Support. NOTE: for Oracle Database 10.2, you should request 10.2.0.1 even if you want to install a later patch set. Once you install 10.2.0.1 you can then apply any 10.2 patch set. Similarly, for 11.1 request 11.1.0.6 which must be applied before installing 11.1.0.7. Patch sets can be downloaded from the Patches and Updates tab on My Oracle Support.

In most of the recent client sites I have been to, virtual machines are the standard, replacing individual servers.  These VMs are hosted on large appliances that allow near instantaneous recovery in the event of a failure.  In non-virtual environments, I have seen backups performed at the OS level, where the Oracle binaries are typically installed.  In either of these cases, it may not be necessary to keep copies of the Oracle installation media.

My recommendation is to keep not only the installation media for every Oracle software product in your environment, but also copies of all patches applied.  With storage as cheap and accessible as it is now, there is no real reason not to.  You never know when you will need it again.

SQL Server Health Check v3

My health check script continues to evolve.

--sql_server_health_check_v3.sql
--v3
--This is a non-intrusive script written to give an overall informational view of a
--SQL Server 2005-2016(CTP3) server instance for the purpose of first-time review.

--Version 3 additions
--User statistics identification
--License information
--Database data and log size with total

/*
Information gathered
-Node name
-License information
-Product Version, level, and Edition
-Instance parameters
-Start time and up time
-Database listing
-Database uptime
-Database data and log size with total
-Database file size and growth settings
-Is this a Cluster Node?
-Nodes in Cluster
-Is AlwaysOn enabled (2012 and above)?
-AlwaysOn status
-Memory usage per database
-Memory usage of in-memory OLTP tables
-Last backup time per database
-No log backups for FULL or BULK_LOGGED recovery model databases in last 30 days
-Databases with no backups at all in the last 30 days
-Backups for the previous week per database
-Jobs that failed in the last 24 hours
-Missing indexes
-Duplicate indexes
-High index fragmentation check
-Wait stats
-Users and roles
-Job information
-Existing linked server listing
-User statistics identification

*/
use master
GO
----------------------------------------------------------Node name

SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS [CurrentNodeName];
----------------------------------------------------------Product Version, level, and Edition

SELECT SERVERPROPERTY('productversion') AS ProductVersion,
SERVERPROPERTY ('productlevel') AS ProductLevel,
SERVERPROPERTY ('edition') AS Edition;

----------------------------------------------------------License information

print N'License information';

SELECT SERVERPROPERTY('LicenseType') AS LICENSE_TYPE,
ISNULL(SERVERPROPERTY('NumLicenses'),0) AS NUM_LICENCES;

----------------------------------------------------------Instance parameters

print N'Instance Parameter';
SELECT *
FROM sys.configurations
ORDER BY name ;
----------------------------------------------------------Database listing

print N'Database list with Status and Recovery Model';

SELECT substring(name,1,40) AS name, substring(state_desc,1,20) AS STATE,
substring(recovery_model_desc,1,20) AS RECOVERY_MODEL
FROM sys.databases
order by name;

----------------------------------------------------------Database startup time

print N'Start time';

SELECT DATEADD(ms,-sample_ms,GETDATE() ) AS StartTime
FROM sys.dm_io_virtual_file_stats(1,1);
----------------------------------------------------------Database uptime

print N'Up time';
DECLARE @server_start_time DATETIME,
@seconds_diff INT,
@years_online INT,
@days_online INT,
@hours_online INT,
@minutes_online INT,
@seconds_online INT ;

SELECT @server_start_time = login_time
FROM master.sys.sysprocesses
WHERE spid = 1 ;

SELECT @seconds_diff = DATEDIFF(SECOND, @server_start_time, GETDATE()),
@years_online = @seconds_diff / 31536000,
@seconds_diff = @seconds_diff % 31536000,
@days_online = @seconds_diff / 86400,
@seconds_diff = @seconds_diff % 86400,
@hours_online = @seconds_diff / 3600,
@seconds_diff = @seconds_diff % 3600,
@minutes_online = @seconds_diff / 60,
@seconds_online = @seconds_diff % 60 ;

SELECT @server_start_time AS server_start_time,
@years_online AS years_online,
@days_online AS days_online,
@hours_online AS hours_online,
@minutes_online AS minutes_online,
@seconds_online AS seconds_online ;

----------------------------------------------------------Database data and log size with total

print N'Data and Log Size with Total';
with fs
as
(
select database_id, type, size * 8.0 / 1024 size
from sys.master_files
)
select
name,
(select sum(size) from fs where type = 0 and fs.database_id = db.database_id) DataFileSizeMB,
(select sum(size) from fs where type = 1 and fs.database_id = db.database_id) LogFileSizeMB,
(select sum(size) from fs where type = 0 and fs.database_id = db.database_id) +
(select sum(size) from fs where type = 1 and fs.database_id = db.database_id) AS Total_MB
from sys.databases db;

--Grand total of ALL data and log files as one value
select SUM(size*8.0)/1024 AS Total_MB
from sys.master_files;
----------------------------------------------------------Database file size and growth settings

print N'Size and Growth';
select substring(b.name,1,40) AS DB_Name, substring(a.name,1,40) AS Logical_name,
substring(a.filename,1,100) AS File_Name,
cast((a.size * 8.00) / 1024 as numeric(12,2)) as DB_Size_in_MB,
case when a.growth > 100 then 'In MB' else 'In Percentage' end File_Growth,
cast(case when a.growth > 100 then (a.growth * 8.00) / 1024
else (((a.size * a.growth) / 100) * 8.00) / 1024
end as numeric(12,2)) File_Growth_Size_in_MB,
case when ( maxsize = -1 or maxsize=268435456 ) then 'AutoGrowth Not Restricted' else 'AutoGrowth Restricted' end AutoGrowth_Status
from sysaltfiles a
join sysdatabases b on a.dbid = b.dbid
where DATABASEPROPERTYEX(b.name, 'status') = 'ONLINE'
order by b.name;
----------------------------------------------------------Is this a Cluster Node?

SELECT 'Clustered', case when SERVERPROPERTY('IsClustered') = 0 then 'No'
else 'Yes' end;

----------------------------------------------------------Nodes in Cluster

print N'Cluster Nodes';

SELECT * FROM fn_virtualservernodes();
----------------------------------------------------------Is AlwaysOn enabled (2012 and above)?

SELECT 'AlwaysOn', case when SERVERPROPERTY('IsHadrEnabled') = 0 then 'No'
when SERVERPROPERTY('IsHadrEnabled') = 1 then 'Yes'
else SERVERPROPERTY('IsHadrEnabled') end;
----------------------------------------------------------AlwaysOn status

declare @c int;
declare @rd nvarchar(60);
declare @osd nvarchar(60);
declare @rhd nvarchar(60);
declare @shd nvarchar(60);
declare @csd nvarchar(60);
select @c = COUNT(name)
from sys.all_objects
where name = 'dm_hadr_availability_replica_states';
if @c = 0
print N'No AlwaysOn Status';
else
select @rd = role_desc, @osd= case when operational_state_desc is null then 'Replica is not local'
else operational_state_desc end,
@rhd = recovery_health_desc, @shd = synchronization_health_desc,
@csd = connected_state_desc
from sys.dm_hadr_availability_replica_states;
print @rd
print @osd
print @rhd
print @shd
print @csd
----------------------------------------------------------Memory usage per database

print N'Memory Usage per User Database';
SELECT
substring(DB_NAME(database_id),1,40) AS [Database Name]
,COUNT(*) * 8/1024.0 AS [Cached Size (MB)]
FROM
sys.dm_os_buffer_descriptors
WHERE
database_id > 4
AND database_id <> 32767
AND db_name(database_id) <> 'SSISDB'
GROUP BY DB_NAME(database_id)
ORDER BY [Cached Size (MB)] DESC OPTION (RECOMPILE);

----------------------------------------------------------Memory usage of in-memory OLTP tables

print N'In-memory OLTP table usage';
SELECT object_name(object_id) AS Name, *
FROM sys.dm_db_xtp_table_memory_stats;
----------------------------------------------------------Last backup time per database

SELECT substring(sdb.Name,1,40) AS DatabaseName,
COALESCE(CONVERT(VARCHAR(12), MAX(bus.backup_finish_date), 101),'-') AS LastBackUpTime
FROM sys.sysdatabases sdb
LEFT OUTER JOIN msdb.dbo.backupset bus ON bus.database_name = sdb.name
where sdb.Name <> 'tempdb'
GROUP BY sdb.Name;
----------------------------------------------------------No log backups for FULL or BULK_LOGGED recovery model databases in last 30 days
print N'Databases with FULL or BULK_LOGGED recovery model and no log backups in last 30 days';
SELECT name AS at_risk_database
FROM sys.databases
where recovery_model_desc in('FULL','BULK_LOGGED')
and name not in(
SELECT
msdb.dbo.backupset.database_name AS DBName
FROM msdb.dbo.backupmediafamily
INNER JOIN msdb.dbo.backupset ON msdb.dbo.backupmediafamily.media_set_id = msdb.dbo.backupset.media_set_id
WHERE (CONVERT(datetime, msdb.dbo.backupset.backup_start_date, 102) >= GETDATE() - 30)
and msdb..backupset.type = 'L'
group by msdb.dbo.backupset.database_name
);

----------------------------------------------------------Databases with no backups at all in the last 30 days

print N'Databases with NO backups in last 30 days';

SELECT name AS at_risk_database
FROM sys.databases
where name <> 'tempdb'
and name not in(
SELECT
substring(msdb.dbo.backupset.database_name,1,40) AS DBName
FROM msdb.dbo.backupmediafamily
INNER JOIN msdb.dbo.backupset ON msdb.dbo.backupmediafamily.media_set_id = msdb.dbo.backupset.media_set_id
WHERE (CONVERT(datetime, msdb.dbo.backupset.backup_start_date, 102) >= GETDATE() - 30)
group by msdb.dbo.backupset.database_name
);
----------------------------------------------------------Backups for the previous week per database

print N'All backups for previous week';

SELECT
CONVERT(CHAR(40), SERVERPROPERTY('Servername')) AS Server,
substring(msdb.dbo.backupset.database_name,1,40) AS DBName,
msdb.dbo.backupset.backup_start_date,
msdb.dbo.backupset.backup_finish_date,
msdb.dbo.backupset.expiration_date,
CASE msdb..backupset.type
WHEN 'D' THEN 'Database'
WHEN 'L' THEN 'Log'
WHEN 'F' THEN 'File'
WHEN 'P' THEN 'Partial'
WHEN 'I' THEN 'Differential database'
WHEN 'G' THEN 'Differential file'
WHEN 'Q' THEN 'Differential partial'
WHEN NULL THEN msdb..backupset.type
END AS backup_type,
msdb.dbo.backupset.backup_size,
substring(msdb.dbo.backupmediafamily.logical_device_name,1,50) AS logical_device_name,
substring(msdb.dbo.backupmediafamily.physical_device_name,1,50) AS physical_device_name,
substring(msdb.dbo.backupset.name,1,50) AS backupset_name,
substring(msdb.dbo.backupset.description,1,50) AS description
FROM msdb.dbo.backupmediafamily
INNER JOIN msdb.dbo.backupset ON msdb.dbo.backupmediafamily.media_set_id = msdb.dbo.backupset.media_set_id
WHERE (CONVERT(datetime, msdb.dbo.backupset.backup_start_date, 102) >= GETDATE() - 7)
ORDER BY
msdb.dbo.backupset.database_name,
msdb.dbo.backupset.backup_finish_date;
----------------------------------------------------------Jobs that failed in the last 24 hours

print N'Jobs Failing in last 24 hours';

----------------------------------------------------------Variable Declarations
DECLARE @PreviousDate datetime
DECLARE @Year VARCHAR(4)
DECLARE @Month VARCHAR(2)
DECLARE @MonthPre VARCHAR(2)
DECLARE @Day VARCHAR(2)
DECLARE @DayPre VARCHAR(2)
DECLARE @FinalDate INT
----------------------------------------------------------Initialize Variables
SET @PreviousDate = DATEADD(dd, -1, GETDATE()) --Last 1 day
SET @Year = DATEPART(yyyy, @PreviousDate)
SELECT @MonthPre = CONVERT(VARCHAR(2), DATEPART(mm, @PreviousDate))
SELECT @Month = RIGHT(CONVERT(VARCHAR, (@MonthPre + 1000000000)),2)
SELECT @DayPre = CONVERT(VARCHAR(2), DATEPART(dd, @PreviousDate))
SELECT @Day = RIGHT(CONVERT(VARCHAR, (@DayPre + 1000000000)),2)
SET @FinalDate = CAST(@Year + @Month + @Day AS INT)
--Final Logic
SELECT substring(j.[name],1,40) AS JOB,
substring(s.step_name,1,40) AS Step,
h.step_id,
substring(h.step_name,1,40) AS Step,
h.run_date,
h.run_time,
h.sql_severity,
substring(h.message,1,100) AS Message,
h.server
FROM msdb.dbo.sysjobhistory h
INNER JOIN msdb.dbo.sysjobs j
ON h.job_id = j.job_id
INNER JOIN msdb.dbo.sysjobsteps s
ON j.job_id = s.job_id AND h.step_id = s.step_id
WHERE h.run_status = 0 --Failure
AND h.run_date > @FinalDate
ORDER BY h.instance_id DESC;

----------------------------------------------------------Missing indexes

print N'Missing Indexes';
SELECT substring(so.name,1,40) AS Name
, (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) as Impact
, ddmid.equality_columns
, ddmid.inequality_columns
, ddmid.included_columns
FROM sys.dm_db_missing_index_group_stats AS ddmigs
INNER JOIN sys.dm_db_missing_index_groups AS ddmig
ON ddmigs.group_handle = ddmig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS ddmid
ON ddmig.index_handle = ddmid.index_handle
INNER JOIN sys.objects so WITH (nolock)
ON ddmid.object_id = so.object_id
WHERE ddmigs.group_handle IN (
SELECT TOP (5000) group_handle
FROM sys.dm_db_missing_index_group_stats WITH (nolock)
ORDER BY (avg_total_user_cost * avg_user_impact)*(user_seeks+user_scans) DESC);

----------------------------------------------------------Duplicate indexes

print N'Duplicate Indexes';

DECLARE @SCHEMANAME VARCHAR(30);
DECLARE @TABLENAME VARCHAR(127);
SET @SCHEMANAME = '%';  --All schemas; narrow these LIKE patterns to limit the check
SET @TABLENAME = '%';   --All tables
WITH ind_list AS(
select o.schema_id, i.object_id, i.index_id,
i.name, i.type_desc,
i.is_unique, i.is_primary_key,
STUFF( (SELECT ',' + tc.name
FROM sys.index_columns ic
JOIN sys.columns tc
ON tc.column_id = ic.column_id AND
tc.object_id = ic.object_id
WHERE ic.object_id = i.object_id AND
ic.index_id = i.index_id
AND ic.is_included_column = 0
ORDER BY ic.index_column_id
FOR XML PATH ('') ),1,1,'' ) index_columns,
STUFF( (SELECT ',' + tc.name
FROM sys.index_columns ic
JOIN sys.columns tc
ON tc.column_id = ic.column_id AND
tc.object_id = ic.object_id
WHERE ic.object_id = i.object_id AND
ic.index_id = i.index_id
AND ic.is_included_column = 1
ORDER BY ic.index_column_id
FOR XML PATH ('') ),1,1,'' ) include_columns
FROM sys.indexes i
JOIN sys.objects o ON o.object_id = i.object_id
WHERE i.index_id > 0 AND i.type_desc <> 'XML'
AND object_name(i.object_id) LIKE @TABLENAME
AND i.is_disabled = 0
AND schema_name(o.schema_id) LIKE @SCHEMANAME )
SELECT substring(schema_name(included_indexes.schema_id),1,30) AS owner,
object_name(included_indexes.object_id) table_name,
(SELECT SUM(st.row_count) FROM sys.dm_db_partition_stats st
WHERE st.object_id = included_indexes.object_id
AND st.index_id < 2 ) num_rows,
included_indexes.name included_index_name,
included_indexes.index_columns included_index_columns,
included_indexes.include_columns
included_index_include_columns,
included_indexes.type_desc included_index_type,
included_indexes.is_unique included_index_uniqueness,
included_indexes.is_primary_key included_index_PK,
(SELECT SUM(a.total_pages) * 8 FROM sys.allocation_units a
JOIN sys.partitions p ON a.container_id = p.partition_id
WHERE p.object_id = included_indexes.object_id AND
p.index_id = included_indexes.index_id
) included_index_size_kb,
including_indexes.name including_index_name,
including_indexes.index_columns including_index_columns,
including_indexes.include_columns
including_index_include_columns,
including_indexes.type_desc including_index_type,
including_indexes.is_unique including_index_uniqueness,
including_indexes.is_primary_key including_index_PK,
(SELECT SUM(a.total_pages) * 8 FROM sys.allocation_units a
JOIN sys.partitions p ON a.container_id = p.partition_id
WHERE p.object_id = including_indexes.object_id AND
p.index_id = including_indexes.index_id
) including_index_size_kb
FROM ind_list included_indexes
JOIN ind_list including_indexes
ON including_indexes.object_id = included_indexes.object_id
JOIN sys.partitions ing_p
ON ing_p.object_id = including_indexes.object_id AND
ing_p.index_id = including_indexes.index_id
JOIN sys.allocation_units ing_a
ON ing_a.container_id = ing_p.partition_id
WHERE including_indexes.index_id <> included_indexes.index_id
AND LEN(included_indexes.index_columns) <=
LEN(including_indexes.index_columns)
AND included_indexes.index_columns + ',' =
SUBSTRING(including_indexes.index_columns,1,
LEN(included_indexes.index_columns + ','))
ORDER BY 2 DESC;

----------------------------------------------------------High index fragmentation check

print N'Index with HIGH Fragmentation';

EXEC sp_MSforeachdb '
USE [?]
SELECT ''?'' AS DB_NAME,
QUOTENAME(sysind.name) AS [index_name],
indstat.*
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, ''LIMITED'')
AS indstat
INNER JOIN sys.indexes sysind ON indstat.object_id = sysind.object_id AND
indstat.index_id = sysind.index_id
where avg_fragmentation_in_percent >= 30
ORDER BY avg_fragmentation_in_percent DESC;'

use master
GO
----------------------------------------------------------Wait stats

print N'Wait Stats';

SELECT *
FROM sys.dm_os_wait_stats
where wait_time_ms > 10000
ORDER BY wait_time_ms DESC;
----------------------------------------------------------Users and roles

print N'Users and Roles';

WITH Roles_CTE(Role_Name, Username)
AS
(
SELECT
User_Name(sm.[groupuid]) as [Role_Name],
user_name(sm.[memberuid]) as [Username]
FROM [sys].[sysmembers] sm
)

SELECT
Roles_CTE.Role_Name,
[DatabaseUserName] = princ.[name],
[UserType] = CASE princ.[type]
WHEN 'S' THEN 'SQL User'
WHEN 'U' THEN 'Windows User'
WHEN 'G' THEN 'Windows Group'
WHEN 'A' THEN 'Application Role'
WHEN 'R' THEN 'Database Role'
WHEN 'C' THEN 'User mapped to a certificate'
WHEN 'K' THEN 'User mapped to an asymmetric key'
END
FROM
sys.database_principals princ
JOIN Roles_CTE on Username = princ.name
where princ.type in ('S', 'U', 'G', 'A', 'R', 'C', 'K')
ORDER BY princ.name;
----------------------------------------------------------Job information

print N'Job Information';
SELECT [JobName] = [jobs].[name]
,[Category] = [categories].[name]
,[Owner] = SUSER_SNAME([jobs].[owner_sid])
,[Enabled] = CASE [jobs].[enabled] WHEN 1 THEN 'Yes' ELSE 'No' END
,[Scheduled] = CASE [schedule].[enabled] WHEN 1 THEN 'Yes' ELSE 'No' END
,[Description] = [jobs].[description]
,[Occurs] =
CASE [schedule].[freq_type]
WHEN 1 THEN 'Once'
WHEN 4 THEN 'Daily'
WHEN 8 THEN 'Weekly'
WHEN 16 THEN 'Monthly'
WHEN 32 THEN 'Monthly relative'
WHEN 64 THEN 'When SQL Server Agent starts'
WHEN 128 THEN 'Start whenever the CPU(s) become idle'
ELSE ''
END
,[Occurs_detail] =
CASE [schedule].[freq_type]
WHEN 1 THEN 'O'
WHEN 4 THEN 'Every ' + CONVERT(VARCHAR, [schedule].[freq_interval]) + ' day(s)'
WHEN 8 THEN 'Every ' + CONVERT(VARCHAR, [schedule].[freq_recurrence_factor]) + ' week(s) on ' +
LEFT(
CASE WHEN [schedule].[freq_interval] & 1 = 1 THEN 'Sunday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 2 = 2 THEN 'Monday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 4 = 4 THEN 'Tuesday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 8 = 8 THEN 'Wednesday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 16 = 16 THEN 'Thursday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 32 = 32 THEN 'Friday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 64 = 64 THEN 'Saturday, ' ELSE '' END ,
LEN(
CASE WHEN [schedule].[freq_interval] & 1 = 1 THEN 'Sunday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 2 = 2 THEN 'Monday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 4 = 4 THEN 'Tuesday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 8 = 8 THEN 'Wednesday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 16 = 16 THEN 'Thursday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 32 = 32 THEN 'Friday, ' ELSE '' END +
CASE WHEN [schedule].[freq_interval] & 64 = 64 THEN 'Saturday, ' ELSE '' END
) - 1
)
WHEN 16 THEN 'Day ' + CONVERT(VARCHAR, [schedule].[freq_interval]) + ' of every ' + CONVERT(VARCHAR, [schedule].[freq_recurrence_factor]) + ' month(s)'
WHEN 32 THEN 'The ' +
CASE [schedule].[freq_relative_interval]
WHEN 1 THEN 'First'
WHEN 2 THEN 'Second'
WHEN 4 THEN 'Third'
WHEN 8 THEN 'Fourth'
WHEN 16 THEN 'Last'
END +
CASE [schedule].[freq_interval]
WHEN 1 THEN ' Sunday'
WHEN 2 THEN ' Monday'
WHEN 3 THEN ' Tuesday'
WHEN 4 THEN ' Wednesday'
WHEN 5 THEN ' Thursday'
WHEN 6 THEN ' Friday'
WHEN 7 THEN ' Saturday'
WHEN 8 THEN ' Day'
WHEN 9 THEN ' Weekday'
WHEN 10 THEN ' Weekend Day'
END + ' of every ' + CONVERT(VARCHAR, [schedule].[freq_recurrence_factor]) + ' month(s)'
ELSE ''
END
,[Frequency] =
CASE [schedule].[freq_subday_type]
WHEN 1 THEN 'Occurs once at ' +
STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_start_time]), 6), 5, 0, ':'), 3, 0, ':')
WHEN 2 THEN 'Occurs every ' +
CONVERT(VARCHAR, [schedule].[freq_subday_interval]) + ' Second(s) between ' +
STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_start_time]), 6), 5, 0, ':'), 3, 0, ':') + ' and ' +
STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_end_time]), 6), 5, 0, ':'), 3, 0, ':')
WHEN 4 THEN 'Occurs every ' +
CONVERT(VARCHAR, [schedule].[freq_subday_interval]) + ' Minute(s) between ' +
STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_start_time]), 6), 5, 0, ':'), 3, 0, ':') + ' and ' +
STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_end_time]), 6), 5, 0, ':'), 3, 0, ':')
WHEN 8 THEN 'Occurs every ' +
CONVERT(VARCHAR, [schedule].[freq_subday_interval]) + ' Hour(s) between ' +
STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_start_time]), 6), 5, 0, ':'), 3, 0, ':') + ' and ' +
STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_end_time]), 6), 5, 0, ':'), 3, 0, ':')
ELSE ''
END
,[AvgDurationInSec] = CONVERT(DECIMAL(10, 2), [jobhistory].[AvgDuration])
,[Next_Run_Date] =
CASE [jobschedule].[next_run_date]
WHEN 0 THEN CONVERT(DATETIME, '1900/1/1')
ELSE CONVERT(DATETIME, CONVERT(CHAR(8), [jobschedule].[next_run_date], 112) + ' ' +
STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [jobschedule].[next_run_time]), 6), 5, 0, ':'), 3, 0, ':'))
END
FROM [msdb].[dbo].[sysjobs] AS [jobs] WITH(NOLOCK)
LEFT OUTER JOIN [msdb].[dbo].[sysjobschedules] AS [jobschedule] WITH(NOLOCK)
ON [jobs].[job_id] = [jobschedule].[job_id]
LEFT OUTER JOIN [msdb].[dbo].[sysschedules] AS [schedule] WITH(NOLOCK)
ON [jobschedule].[schedule_id] = [schedule].[schedule_id]
INNER JOIN [msdb].[dbo].[syscategories] [categories] WITH(NOLOCK)
ON [jobs].[category_id] = [categories].[category_id]
LEFT OUTER JOIN
( SELECT [job_id], [AvgDuration] = (SUM((([run_duration] / 10000 * 3600) +
(([run_duration] % 10000) / 100 * 60) +
([run_duration] % 10000) % 100)) * 1.0) / COUNT([job_id])
FROM [msdb].[dbo].[sysjobhistory] WITH(NOLOCK)
WHERE [step_id] = 0
GROUP BY [job_id]
) AS [jobhistory]
ON [jobhistory].[job_id] = [jobs].[job_id];
----------------------------------------------------------Existing linked server listing
print N'Linked Server Information';
declare @x int;
select @x = COUNT(name)
from sys.all_objects
where name = 'Servers';
if @x <> 0
SELECT *
--c.name, provider, data_source, is_remote_login_enabled, b.modify_date
FROM sys.Servers a
LEFT OUTER JOIN sys.linked_logins b ON b.server_id = a.server_id
LEFT OUTER JOIN sys.server_principals c ON c.principal_id = b.local_principal_id
where a.server_id <> 0;
else
exec sp_linkedservers;
----------------------------------------------------------User statistics identification

print N'User statistics';

select s.name as STATS_NAME, SCHEMA_NAME(ob.Schema_id) AS SCHEMA_NAME, OBJECT_NAME(s.object_id) AS TABLE_NAME
FROM sys.stats s
INNER JOIN sys.Objects ob ON ob.Object_id = s.object_id
WHERE SCHEMA_NAME(ob.Schema_id) <> 'sys'
AND Auto_Created = 0 AND User_Created = 1;

Evaluating a SQL Server environment

This is a topic with more than one meaning, so I will narrow the focus to the aspect of just looking at an existing SQL Server environment for the first time. If you are experiencing performance issues with an existing environment, I recommend you start with the Microsoft (MS) SQL Server performance monitor to concentrate your effort instead of examining the SQL Server as a whole. This is especially true if there are only a few problematic processes.

It is good to start asking basic questions for better understanding of the environment to evaluate. A short checklist of the items to inquire about before proceeding includes:

  • Is this a 24×7 environment?
  • What are the peak times of usage and critical availability?
  • Are there any current performance concerns?
  • What is the current backup configuration?
  • When was the last time a recovery was performed? Practice recovery? Cloning using backups?

This information can help you determine the best approach for your evaluation, such as the best times to execute any of the more resource-intensive examinations. There are many tools to choose from when evaluating a SQL Server environment. Some of these are MS native tools and some are third-party. As a precaution, verify the presence of a recent backup or even execute a one-off full backup before evaluation.
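
If a one-off backup is needed, a copy-only backup will not disturb the existing backup chain. A minimal sketch; the database name and backup path are placeholders:

BACKUP DATABASE MyDatabase
TO DISK = N'D:\Backups\MyDatabase_preeval.bak'  --hypothetical path
WITH COPY_ONLY, INIT, STATS = 10;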

The most recognized MS built-in tool for evaluation is DBCC CHECKDB, which is included with every SQL Server version. DBCC CHECKDB is a Transact-SQL utility that checks the logical and physical integrity of all the objects in the specified database. This process also calls other DBCC routines and performs these steps while executing:

  • Runs DBCC CHECKALLOC on the database to check the consistency of disk space allocation structures.
  • Runs DBCC CHECKTABLE on every table and view in the database to check the integrity of all the pages and structures that make up tables or indexed views.
  • Runs DBCC CHECKCATALOG on the database to check for catalog consistency within the database.
  • Validates the contents of every indexed view in the database.
  • Validates link-level consistency between table metadata and file system directories and files when storing varbinary(max) data in the file system using FILESTREAM.
  • Validates the Service Broker data in the database.

The DBCC CHECKDB utility includes options to repair issues it finds, but I do not recommend the use of these options until any existing issues have been identified and understood. All of these checks and validations can take a long time to execute against a sizeable database, so there are options to exclude one or more of the above steps for time and resource considerations; an example follows the basic execution below. MS SQL Server versions 2014 through 2017 even offer the ability to execute this utility with a specified degree of parallelism. Consult the MS DBCC CHECKDB documentation for the version you are evaluating for more specific information.

Basic execution (from a SQL Server Management Studio prompt):

use <db_name>
go
dbcc checkdb
go
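
For larger databases, the step-exclusion and parallelism options mentioned above can shorten the run. A hedged sketch of two common variations; the database name is a placeholder:

--Check only physical page consistency; skips the logical checks and runs much faster
dbcc checkdb ('MyDatabase') with physical_only, no_infomsgs
go

--Full check limited to 4 CPUs, on versions that support the MAXDOP option
dbcc checkdb ('MyDatabase') with maxdop = 4
go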

 

Note – For larger database executions, use the “Results To” option under the Query menu item in SQL Server Management Studio to spool the results for later review.

Brent Ozar, a well-regarded SQL Server expert, created a tool called the First Responder Kit, or sp_Blitz for short. It is available for free from https://www.brentozar.com/first-aid/ and provides a list of findings prioritized for immediate action or informational purposes; the lower the number a finding is given, the higher its priority. This tool does install required stored procedures and an output table in the database. However, it is free, and you can use parameters at execution to avoid checking certain objects or to ignore findings over a certain priority level.

Basic execution (from a SQL Server Management Studio prompt):

After downloading the First Responder Kit compressed archive and placing it in an accessible location, execute the sp_blitz.sql file to install the database objects. Then, from a SQL Server Management Studio query prompt, execute:

sp_blitz

go

Note – By default, this script examines ALL databases present on the current SQL Server and groups the findings by order of priority, sorting by database name.
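
As one example of the parameters mentioned above, the following invocation skips checks inside user database objects and drops lower-priority findings from the output (both are sp_Blitz parameters; treat the values as a sketch):

EXEC sp_Blitz @CheckUserDatabaseObjects = 0, @IgnorePrioritiesAbove = 50;

go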

My own SQL Server Health Check script (https://wordpress.com/post/dbajonblog.wordpress.com/178), which is also free, may not have the maturity of some other tools, but it does not require any objects to be installed and provides a good overall picture of a database environment. It is a single script that can be executed the same as a standard SQL script.

 

Backups triggered automatically

Database availability is the number one goal of the database administrator.  Backup and recovery should be the second goal.  An internet search on establishing database backup and recovery processes will produce many examples.  I won’t go into my personal preferences, just emphasize the points I think are important:

  • Backup and recovery processes MUST be documented for reference by personnel responsible for maintenance and execution. I find that the lack of backup and recovery documentation is fairly common out in the world. No DBA likes to try to do an undocumented recovery at 3AM.
  • Backup processes that do not include a check for successful completion are INVALID. If the backup has been completing with an error for a month, then guess what: your last valid backup was a month ago, and if you had to do a recovery today, you just lost a month’s worth of data.
  • Recovery processes that have not been tested by performing a practice recovery or a clone are INVALID. You may have a trusted backup method, but it is not proven until you perform a recovery in your own environment.
  • If time and resources are available, add an export to your backup process. Exports are good for granular recovery of metadata and objects such as indexes and PL/SQL code (a hedged example follows this list).
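
As an illustration of the export bullet above, a nightly full Data Pump export might look like the following. This is a sketch only; the directory object and file names are assumptions, and expdp will prompt for the password:

#Full Data Pump export; DATA_PUMP_DIR must exist as a directory object in the database
expdp system FULL=y DIRECTORY=DATA_PUMP_DIR DUMPFILE=full_%U.dmp LOGFILE=full_exp.log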

That said, even solid backup processes can get blindsided by unexpected activity in the database that produces vastly more redo activity than is normally seen.  It only takes one instance of the dreaded “ORA-00257: archiver error” to send personnel from support through management looking for or demanding answers and remediation steps.

The workaround for these situations is usually increasing either the amount of space in the archived log destination or the frequency of backups.  While either of these is fairly easy to implement, each has areas of concern that should be considered prior to implementation.

Increasing the amount of space in the archived log location could be costly depending on the amount and type of storage used.  Also, if the surge of redo is not a common occurrence, all the additional space will sit unused, essentially wasted.  Increasing the backup frequency may not cost additional storage, but it does consume database resources to perform the additional backups, especially when they occur during normal business hours.

When a database is configured to use the fast recovery area (FRA) for backups and flashback, Oracle automatically monitors and alerts on its space usage.  Oracle 11.2 and 12.2 documentation both have the same information on these alerts:

The database issues a warning alert when reclaimable space is 
less than 15% and a critical alert when reclaimable space is 
less than 3%. To warn the DBA of this condition, an entry is 
added to the alert log and to the DBA_OUTSTANDING_ALERTS table 
(used by Enterprise Manager). Nevertheless, the database 
continues to consume space in the fast recovery area until 
there is no reclaimable space left.

When the recovery area is completely full, the error displayed 
is as follows, where nnnnn is the number of bytes required and 
mmmmm is the disk quota:

ORA-19809: limit exceeded for recovery files ORA-19804: cannot 
reclaim nnnnn bytes disk space from mmmmm limit

From <https://docs.oracle.com/cd/E11882_01/backup.112/e10642/rcmmaint.htm>
From <https://docs.oracle.com/en/database/oracle/oracle-database/12.2/bradv/maintaining-rman-backups.html>
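
You do not have to wait for the database to raise these alerts; FRA consumption can be checked directly at any time.  A quick sketch using the standard dynamic performance views:

-- Overall FRA limit, usage, and reclaimable space (values in bytes)
SELECT name, space_limit, space_used, space_reclaimable, number_of_files
FROM v$recovery_file_dest;

-- Usage broken down by file type (archived logs, backup pieces, flashback logs, and so on)
SELECT file_type, percent_space_used, percent_space_reclaimable
FROM v$recovery_area_usage;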

A practical database administrator should already be monitoring the alert log for errors and warnings.  Why not configure these monitoring processes to automatically trigger backups if this error is encountered?  One of my favorite alert log monitor scripts sends an email when an ORA- error is found in the alert log and sends a page or text message to designated parties if the error involves some type of corruption:

#!/bin/sh
#Alert log monitor: emails when ORA- errors appear in the alert log and
#pages the on-call group if the error involves corruption.
#Usage: <script> <ORACLE_SID> <alert log directory>

ERRMSG="Missing command line parameters"

if [ "$1" ]; then
        export ORACLE_SID=$1
else
        echo $ERRMSG
        exit 1
fi

if [ "$2" ]; then
        export ALERT_LOG_DIR=$2
else
        echo $ERRMSG
        exit 1
fi

            LOGFILE="${ALERT_LOG_DIR}/alert_${ORACLE_SID}.log"
            ERRFILE="${ALERT_LOG_DIR}/alert_error.log"
            CHECK_TIMEB="`date +%m/%d/%y-%H:%M`"
            SAVFILE="${ALERT_LOG_DIR}/alert_${ORACLE_SID}_`date '+%y%m%d-%H%M'`.log"
            GROUPID="SAPADMIN1.BASIS_ON_CALL"
            RECIP="youremail@sendwordnow.com"
            echo "  "
            echo "**"
            echo "**** $ORACLE_SID"
            echo "**"
            echo " "
            if [ -f $LOGFILE ]; then
               if [ `egrep -i "ORA-|corrupt" "${LOGFILE}" | wc -l` -eq 0 ]; then
                  echo "No errors found in alert log "
               else
                  egrep -i "ORA-|corrupt" $LOGFILE > $ERRFILE
                  echo " " >> $ERRFILE
                  echo "  Found the above error(s) in the alert log for ${ORACLE_SID}" >> $ERRFILE
                  echo "  Alert Log with error(s) saved as ${SAVFILE}" >> $ERRFILE
                  echo " " >> $ERRFILE
                  echo " " >> $ERRFILE
                  mailx -s "ALERT-Errors found in ${ORACLE_SID} Alert Log" youremail@yourcompany.com < $ERRFILE
                  if [ `egrep -i "corrupt" "${ERRFILE}" | wc -l` -gt 0 ]; then
                    mailx -s "ALERT - CORRUPTION DETECTED IN ${ORACLE_SID} ALERT LOG" YourPager@yourcompany.com < $ERRFILE
                  fi
                  mv $LOGFILE $SAVFILE
               fi
            else
               echo "No alert log found for ${ORACLE_SID} "
            fi
exit

You could easily add a line to call a separate backup script from this script.  I usually set my alert log monitor to execute once an hour, but increasing the frequency of execution will help catch redo space problems and will not cost in terms of resource or space usage.  Also, having an automatically triggered backup will save time diagnosing the issue when an archiver ‘out of space’ problem occurs.

The drawback here is that you must have FRA enabled to utilize the alert functionality.  What if usage of the FRA is not preferred?  You could use the alert log monitor script above to trigger a backup when an archiver error is seen, but that would mean the database is already in the process of halting activity when the backup is triggered.  It is better to be pre-emptive in these types of alerts by setting up monitoring to alert of a problem before an alert log message is generated.

For example, here is a snippet of code for a Linux/Unix shell script that determines how much space is available in GB for a certain filesystem.  If that amount of space falls below a certain value, a message is generated:

#!/bin/sh

space_avail="`df -k | grep /archive | awk '{print $4}'`"

if [ `expr $space_avail / 1048576` -lt 40  ]; then
echo Backup needed
fi

You could easily replace the echo command with a call to a backup script.  After testing, this script could be called at an interval of your choosing from the crontab.
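
For example, a crontab entry to run such a check every 15 minutes might look like this; the script and log paths are placeholders:

#minute hour day month weekday command
*/15 * * * * /home/oracle/scripts/archive_space_check.sh >> /tmp/archive_space_check.log 2>&1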

If you have your backup location in ASM, you can use this next piece of code to accomplish the same thing by querying the desired diskgroup:

#!/bin/sh

free_mb=`$CRS_HOME/bin/sqlplus -s "/ as sysasm"<<EOF
set heading off
select free_mb
from v\\$asm_diskgroup
where name = 'DATA'
order by name;
exit
EOF`

if [ `expr $free_mb / 1024` -lt 40  ]; then
echo Backup needed
fi

With a few simple steps, you can be proactive about excessive redo activity in your database.