Analysis Method of MSSQL Implementing Analysis Extend Event Log File

Keywords: SQL Session xml Programming

background

In the first two monthly reports, we share the method of implementing audit log function with SQL Server in June. Finally, we get the best option Extend Event from five aspects: reliability, object level, maintainability, overhead and impact on database system. In July monthly report, we quantitatively analyze the performance and swallow of SQL Server by using Extend Event to realize audit log function. The conclusion is that the impact of vomiting volume is about 0.01% on system performance and throughput. August monthly report sharing is the last issue of SQL Server audit log. This paper discusses the summary of analysis methods of Extend Event to realize audit log function, and the advantages and disadvantages of these methods.

June monthly report, details please stamp: Exploration on the Scheme of MSSQL, Realization Analysis and SQL Server Implementing Audit Log

July monthly report, details please stamp: The Impact of MSSQL Implementation Analysis Extend Event Implementing Audit Log on the Performance of SQL Server

Problem introduction

In order to be compatible with the version of SQL Server 2008 R2, we have slightly modified the extended event creation method to implement the audit log function as follows:

USE master
GO

CREATE EVENT SESSION [svrXEvent_User_Define_Testing] ON SERVER 
ADD EVENT sqlserver.sql_statement_completed
( 
	ACTION 
	( 
		sqlserver.database_id,
		sqlserver.session_id, 
		sqlserver.username, 
		sqlserver.client_hostname,
		sqlserver.client_app_name,
		sqlserver.sql_text, 
		sqlserver.plan_handle,
		sqlserver.tsql_stack,
		sqlserver.is_system,
		package0.collect_system_time
	) 
	WHERE sqlserver.username <> N'NT AUTHORITY\SYSTEM'
		AND sqlserver.username <> 'sa'
		AND sqlserver.is_system = 0		
)
ADD TARGET package0.asynchronous_file_target
( 
	SET 
		FILENAME = N'C:\Temp\svrXEvent_User_Define_Testing.xel', 
		METADATAFILE = N'C:\Temp\svrXEvent_User_Define_Testing.xem',
		MAX_FILE_SIZE = 10,
		MAX_ROLLOVER_FILES = 500
)
WITH (
	EVENT_RETENTION_MODE = NO_EVENT_LOSS,
	MAX_DISPATCH_LATENCY = 5 SECONDS,
    STARTUP_STATE=ON
);
GO


-- We need to enable event session to capture event and event data 
ALTER EVENT SESSION [svrXEvent_User_Define_Testing]
ON SERVER STATE = START;
GO

After the extended event is created and started, all sql_statement_completed event information occurring on the server side of the SQL Server database will be scrolled asynchronously by the extended event in the log file svrXEvent_User_Define_Testing.xel. The format of the log file is svrXEvent_User_Define_Testing_0_TimeStamp.xel, such as svrXEvent_User_Define_Testing_0_131465070445600.x.xel. El. Here we introduce the key issues of sharing this monthly report:

What analytical methods do audit logs have?

What are the advantages and disadvantages of these methods?

How should we choose the suitable method?

Using DMF

Extend Event (XE) uses an asynchronous way to write audit log records to the target log file, and each event is written to the log file in XML format, so we can use the dynamic management function sys.fn_xe_file_target_read_file provided by SQL Server to read and analyze the upgrade log file.

Full reading

Full audit log reading refers to the use of SQL Server DMF sys.fn_xe_file_target_read_file (path, mdpath, initial_file_name, initial_offset) without passing in initial_file_name and initial_offset. This method reads the contents of all audit log files in the audit log directory. For example, here is an example of using DMF to read all audit log files in full.

USE master
GO

SELECT *
FROM sys.fn_xe_file_target_read_file('C:\Temp\svrXEvent_User_Define_Testing*.xel', 
		'C:\Temp\svrXEvent_User_Define_Testing*.xem', null, null)

The results are as follows:

From this result, we can not clearly know which user executed which SQL statements at which time point, and how much time it took to execute. Here we need to use XML parsing to analyze the richer content in the Event_data field. Use the following query statement for more detailed information:

-- This is SQL 2008R2
;WITH events_cte
AS (
	SELECT
		[event_data] = T.C.query('.'),
		[event_name] = T.C.value('(event/@name)[1]','varchar(100)'),
		[event_time] = DATEADD(mi, DATEDIFF(mi, GETUTCDATE(), CURRENT_TIMESTAMP),T.C.value('(event/@timestamp)[1]','datetime2')),
		[client app name] = T.C.value('(event/action[@name="client_app_name"]/value/text())[1]', 'sysname'),
		[client host name] = T.C.value('(event/action[@name="client_hostname"]/value/text())[1]', 'sysname'),
		[database_id]= T.C.value('(event/action[@name="database_id"]/value/text())[1]', 'int'),
		[cpu time (ms)] = T.C.value('(event/data[@name="cpu"]/value/text())[1]', 'bigint'),
		[logical reads] = T.C.value('(event/data[@name="reads"]/value/text())[1]', 'bigint'),
		[logical writes] = T.C.value('(event/data[@name="writes"]/value/text())[1]', 'bigint'),
		[duration (ms)] = T.C.value('(event/data[@name="duration"]/value/text())[1]', 'bigint'),
		[row count] = T.C.value('(event/data[@name="row_count"]/value/text())[1]', 'bigint'),
		[sql_text] = T.C.value('(event/action[@name="sql_text"]/value/text())[1]','nvarchar(max)'),
		[session_id] = T.C.value('(event/action[@name="session_id"]/value/text())[1]','int'),
		[user_name] = T.C.value('(event/action[@name="username"]/value/text())[1]','sysname'),
		[is_system] = T.C.value('(event/action[@name="is_system"]/value/text())[1]','sysname'),
		[query_timestamp] = T.C.value('(event/action[@name="collect_system_time"]/value/text())[1]','bigint'),
		[query_time] = DATEADD(mi, DATEDIFF(mi, GETUTCDATE(), CURRENT_TIMESTAMP),T.C.value('(event/action[@name="collect_system_time"]/text/text())[1]','datetime2'))
	FROM sys.fn_xe_file_target_read_file('C:\Temp\svrXEvent_User_Define_Testing*.xel', 
		'C:\Temp\svrXEvent_User_Define_Testing*.xem', null, null)
		CROSS APPLY (SELECT CAST(event_data as XML) AS event_data) as T(C)
)
SELECT 
	
	cte.session_id,
	--cte.query_timestamp,
	--cte.[event_time],
	cte.[query_time],
	--cte.[event_name],
	cte.user_name,
	[database_name] = db.name,
	cte.[database_id],
	cte.[client host name],
	
	cte.[logical reads],
	cte.[logical writes],
	cte.[cpu time (ms)],
	cte.[duration (ms)],
	--cte.[plan_handle],
	cte.sql_text,
	sql_text_hash = CHECKSUM(cte.sql_text),
	cte.[client app name],
	cte.[event_data],
	cte.is_system
FROM events_cte as cte
	LEFT JOIN sys.databases as db
	on cte.database_id = db.database_id
ORDER BY [query_time] ASC
;

The implementation results are shown as follows:

From this result set, we can clearly know the details of each SQL statement execution, including user name, execution time point, client name, logical reading, logical writing, CPU consumption, execution time consumption, query statement details and other very important information.

Partial Reading

Using DMF sys.fn_xe_file_target_read_file (path, mdpath, initial_file_name, initial_offset) to realize audit log not only full reading, but also partial reading. I can pass in initial_file_name and initial_offset to realize specific offset (file content offset) from a log file. In order to reduce the size of audit log files read each time. For example, we started reading 94720 from the offset in the file C: Temp svrXEvent_User_Define_Testing_0_131471065758970000.xel by following methods:

USE master
GO

SELECT *
FROM sys.fn_xe_file_target_read_file('C:\Temp\svrXEvent_User_Define_Testing*.xel', 
		'C:\Temp\svrXEvent_User_Define_Testing*.xem', 'C:\Temp\svrXEvent_User_Define_Testing_0_131471065758970000.xel', 94720)

The results of the implementation are as follows:

Of course, you can also use the analysis method in "Full Read" to get some details of the audit log read, which will not be repeated here.

XEReader API

In addition to using DFM improved by SQL Server itself to analyze audit logs, we can also use XE Reader API to read audit log files programmatically. Starting from SQL Server 2012, under Shared (C: Program Files Microsoft SQL Server 110 Shared) directory, two dll files related to XEvent are provided, and the audit log files can be read using XEReader's API interface.
SQL 2012: Microsoft.SqlServer.XEvent.Linq.dll
SQL 2014: Microsoft.SqlServer.XEvent.Linq.dll and Microsoft.SqlServer.XE.Core.dll
SQL 2016: Microsoft.SqlServer.XEvent.Linq.dll and Microsoft.SqlServer.XE.Core.dll
Following is the Visual Studio 2015 programming tool, using C # programming language, written console application project, to see in detail how to use XEReader API to achieve the analysis of audit log files.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.SqlServer.XEvent.Linq;

namespace MonthlyShareExtendEventDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] xelFiles = new string[] { @"C:\Temp\svrXEvent_User_Define_Testing*.xel" };
            string[] xemFiles = new string[] { @"C:\Temp\svrXEvent_User_Define_Testing*.xem" };
            QueryableXEventData events = new QueryableXEventData(xelFiles, xemFiles);
            foreach (PublishedEvent evt in events)
            {
                Console.WriteLine("=>>>>>>>>>>>>>>>>>>" + evt.Name);

                foreach (PublishedEventField fld in evt.Fields)
                {
                    Console.WriteLine("\tField: {0} = {1}", fld.Name, fld.Value);
                }

                foreach (PublishedAction act in evt.Actions)
                {
                    Console.WriteLine("\tAction: {0} = {1}", act.Name, act.Value);
                }
                Console.WriteLine("=<<<<<<<<<<<<<<<" + evt.Name);
            }

            Console.ReadKey();

        }
    }
}

One of my screenshots yields an audit log as shown in the following figure:

Be careful:
To analyze audit logs using XEReader API, you need to rely on two installation packages: SQLSysClrTypes.msi and Shared Management Objects. msi. Please install them in advance.

Event stream read

Of course, we can also use the XEReader API event stream to read the audit log. When the client submits the query statement to the SQL Server back-end service, the event stream will capture the query behavior and analyze it. The example code is as follows:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.SqlServer.XEvent.Linq;

namespace XEStreamDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            string connString = string.Format("Data Source=.,{0};Initial Catalog={1};Integrated Security=SSPI", 1433, "master");
            string xeSessionName = @"svrXEvent_User_Define_Testing";
            using (QueryableXEventData eventData = new QueryableXEventData(connString, xeSessionName, EventStreamSourceOptions.EventStream, EventStreamCacheOptions.DoNotCache))
            {
                foreach (PublishedEvent evt in eventData)
                {
                    Console.WriteLine("=>>>>>>>>>>>>>>>>>>" + evt.Name);

                    foreach (PublishedEventField fld in evt.Fields)
                    {
                        Console.WriteLine("\tField: {0} = {1}", fld.Name, fld.Value);
                    }

                    foreach (PublishedAction act in evt.Actions)
                    {
                        Console.WriteLine("\tAction: {0} = {1}", act.Name, act.Value);
                    }
                    Console.WriteLine("=<<<<<<<<<<<<<<<" + evt.Name);
                }
            }

            Console.ReadKey();
        }
    }
}

When a query is executed, the console application captures query statements executed by the SQL Server server server, as shown in the following screenshot:

Be careful:
The method of analyzing the audit log function of SQL Server based on event flow does not support the versions of SQL Server 2008 and SQL Server 2008 R2. The lowest version requirement is SQL Server 2012. Because "Invalid object name'sys.fn_MSxe_read_event_stream" exception information will be reported in the version below SQL Server 2012, the error information is as follows.
An unhandled exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
Additional information: Invalid object name 'sys.fn_MSxe_read_event_stream'.
The abnormal information screenshot is as follows:

Comparison of three methods

This chapter introduces the comparison of three audit log analysis methods. We will measure these three methods from the following perspectives:

Dependence on SQL Server Service

Analysis Delay

Stability

Impact on SQL Server

DMF

DMF sys.fn_xe_file_target_read_file is built-in object of SQL Server itself, so using this method to analyze audit log information does not need too much programming processing, the threshold is low, and even using SSMS directly can analyze audit log files. These are the advantages of using DMF to analyze audit logs. Of course, the disadvantages of this method are also obvious: using DMF to read audit logs requires connecting to the SQL Server service, so the SQL Server service itself is required to start, because this is achieved by using the built-in dynamic management function of SQL Server; and this analysis method needs to use SQL Server to parse event_data with XML operation technology, parsing XML is a CPU secret. Centralized operation consumes system CPU resources very much. In my previous test case, the use of DMF to analyze audit log details resulted in more than 50% additional CPU overhead. As shown in the following screenshot:

XEReader API

The method of using API provided by SQL Server XEReader to read audit log files is entirely based on the operation mode of audit log files and can be independent of the service of SQL Server. In other words, no matter whether the SQL Server is closed or started, the analysis of our audit log will not be affected. These are the advantages of using the XEReader API to analyze audit logs. This method also has its drawbacks: when we analyze the audit log files currently being written by Extend Event Session objects, we don't know (or it's hard to know) which records we have analyzed and which ones we haven't analyzed yet. If this problem can not be solved, it may lead to duplication or loss of audit log records. Of course, we can also use XE loop to write audit log files. Every time we read Archive's audit log files, we can skip the reading of the current files and wait for the current files to be full of fixed size. After Archive comes out, we can read and analyze them. Another problem introduced by this improvement is that it may lead to delays in the analysis of audit logs, and the delay time is uncertain. For example, if the user query fills up the current audit log file in 10 minutes, the delay is 10 minutes; if the user query fills up the current audit log file in 1 hour, the delay will be 1 hour.

Event stream read

The method of analyzing audit log based on user query event flow has obvious advantages: the delay is very small, it can be controlled in seconds, and the real-time performance is good. It solves the problem of XEReader API query event delay. However, the disadvantage is that you also need to rely on the start state of SQL Service, otherwise you will report exceptions; when a large number of queries are executed instantaneously (in a short time), such as when the user accidentally writes a dead-loop query, restart the state of SQL Service or Extend Event Session, according to the situation I tested, this scenario will lead to loss of audit log records, which is unreliable. To guarantee.

Final conclusion

Based on the summary of the advantages and disadvantages of the above three audit log analysis methods, we summarize the comprehensive scores as follows:

DMF: Dependent on SQL Service, with a score of 0; Delay depends on the mobility efficiency of Offset, with a score of 80; Stability is guaranteed, with a score of 100; It has a greater impact on the SQL Server CPU, with a score of 0;

XEReader API: No dependence on SQL Service, score 100; delay depends on the speed of query generation, score 50; stability is guaranteed, score 100; little impact on SQL Server, score 0;

XEReader Stream: Dependent on SQL Service, scoring 0; Very low latency, scoring 100; There are unstable scenarios, scoring 50; less impact on SQL Server, scoring 100;

The comprehensive score is made into radar map as follows:

From this summary chart, the XEReader API's method of directly analyzing audit log files performs best in terms of dependency, latency, stability and impact.

Reference articles

Introducing the Extended Events Reader

Posted by SilentQ-noob- on Sat, 01 Jun 2019 15:22:43 -0700