SQL SERVER - Talking about the Solution of Deadlock Monitoring and Analysis

Keywords: SQL Server Session SQL xml less

1 background

1.1 Alarm

Recently, I have been sorting out my notes and intend to migrate them all to EVERNOTE. Tidy up to the lock part, there happens to be a case recorded by myself, reorganize and share it to you.

One day at noon, received alarm messages, DB deadlock abnormal, one minute deadlock 120.

The deadlock xml file is as follows:

 1 <deadlock-list>
 2 <deadlock victim="process810b00cf8">
 3 <process-list>
 4 <process id="process810b00cf8" taskpriority="0" logused="0" waitresource="RID: 13:1:1541136:62" waittime="7682" ownerId="3396587959" transactionname="UPDATE" lasttranstarted="2016-01-08T12:03:51.067" XDES="0xa99746d08" lockMode="U" schedulerid="41" kpid="17308" status="suspended" spid="108" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2016-01-08T12:03:51.067" lastbatchcompleted="2016-01-08T12:03:51.067" lastattention="1900-01-01T00:00:00.067" clientapp="Microsoft SQL Server Management Studio - query" hostname="test-server" hostpid="1433" loginname="xinysu" isolationlevel="read committed (2)" xactid="3396587959" currentdb="13" lockTimeout="4294967295" clientoption1="671098976" clientoption2="390200">
 5 <executionStack>
 6 <frame procname="adhoc" line="7" stmtstart="214" stmtend="484" sqlhandle="0x020000003acf4f010561e479685209fb09a7fd15239977c60000000000000000000000000000000000000000">
 7 UPDATE FinanceReceiptNoRule SET NowSeqValue=@ReturnNum,ISRUNNING='0',LastWriteTime=GETDATE() WHERE IsRunning='1' AND SeqCode=@SeqCode </frame>
 8 </executionStack>
 9 <inputbuf>
10 declare @SeqCode varchar(60)
11 declare @ReturnNum bigint
12 set @SeqCode='CGJS20160106'
13 while(1=1)
14 begin
15 UPDATE FinanceReceiptNoRule SET NowSeqValue=@ReturnNum,ISRUNNING='0',LastWriteTime=GETDATE() WHERE IsRunning='1' AND SeqCode=@SeqCode
16 end </inputbuf>
17 </process>
18 <process id="process18fd5d8cf8" taskpriority="0" logused="248" waitresource="KEY: 13:72057594040090624 (b3ade7c5980c)" waittime="4" ownerId="3396522828" transactionname="user_transaction" lasttranstarted="2016-01-08T12:03:05.310" XDES="0x18c1db63a8" lockMode="U" schedulerid="57" kpid="16448" status="suspended" spid="161" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2016-01-08T12:03:58.737" lastbatchcompleted="2016-01-08T12:03:33.847" lastattention="2016-01-08T12:03:33.850" clientapp="Microsoft SQL Server Management Studio - query" hostname="test-server" hostpid="1433" loginname="xinysu" isolationlevel="read committed (2)" xactid="3396522828" currentdb="13" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
19 <executionStack>
20 <frame procname="adhoc" line="6" stmtstart="210" stmtend="400" sqlhandle="0x020000001b4f23368af7bba99098c10dec46585804f1b4ce0000000000000000000000000000000000000000">
21 Update dbo.FinanceReceiptNoRule Set [IsRunning]='1' where SeqCode=@SeqCode and IsRunning='0' </frame>
22 </executionStack>
23 <inputbuf>
24 declare @SeqCode varchar(60)
25 declare @ReturnNum bigint
26 set @SeqCode='CGJS20160106'
27 while(1=1)
28 begin
29 Update dbo.FinanceReceiptNoRule Set [IsRunning]='1' where SeqCode=@SeqCode and IsRunning='0' 
30 end
31 </inputbuf>
32 </process>
33 </process-list>
34 <resource-list>
35 <ridlock fileid="1" pageid="1541136" dbid="13" objectname="fin_test.dbo.FinanceReceiptNoRule" id="lock51e8a3980" mode="X" associatedObjectId="72057594040025088">
36 <owner-list>
37 <owner id="process18fd5d8cf8" mode="X" />
38 </owner-list>
39 <waiter-list>
40 <waiter id="process810b00cf8" mode="U" requestType="wait" />
41 </waiter-list>
42 </ridlock>
43 <keylock hobtid="72057594040090624" dbid="13" objectname="fin_test.dbo.FinanceReceiptNoRule" indexname="PK_FINANCERECEIPTNORULE" id="lock7b2c6bc80" mode="U" associatedObjectId="72057594040090624">
44 <owner-list>
45 <owner id="process810b00cf8" mode="U" />
46 </owner-list>
47 <waiter-list>
48 <waiter id="process18fd5d8cf8" mode="U" requestType="wait" />
49 </waiter-list>
50 </keylock>
51 </resource-list>
52 </deadlock>
53 </deadlock-list>

The table structure and simulation data are as follows:

 1 --Relevant forms:
 2 CREATE TABLE [dbo].[FinanceReceiptNoRule](
 3 [SeqCode] [varchar](60) NOT NULL,
 4 [NowSeqValue] [bigint] NULL,
 5 [SeqDate] [varchar](14) NOT NULL,
 6 [IsRunning] [varchar](1) NULL,
 7 [LastWriteTime] [datetime] NULL,
 8 [Prefix] [varchar](4) NULL
 9 ) ON [PRIMARY]
10 GO
11 --Data simulation
12 INSERT [dbo].[FinanceReceiptNoRule] ([SeqCode], [NowSeqValue], [SeqDate], [IsRunning], [LastWriteTime], [Prefix]) VALUES (N'TEST20150108', 1469, N'20150108', N'0', CAST(N'2015-01-08 05:05:49.163' AS DateTime), N'TEST')
13 GO
14 INSERT [dbo].[FinanceReceiptNoRule] ([SeqCode], [NowSeqValue], [SeqDate], [IsRunning], [LastWriteTime], [Prefix]) VALUES (N'TEST20150109', 1377, N'20150109', N'0', CAST(N'2015-01-09 04:50:26.610' AS DateTime), N'TEST')
15 GO
16  
17 ALTER TABLE [dbo].[FinanceReceiptNoRule] ADD CONSTRAINT [pk_FinanceReceiptNoRule] PRIMARY KEY NONCLUSTERED 
18 (
19 [SeqCode] ASC
20 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
21 GO

1.2 How to monitor

There are many ways to capture deadlocks. Here we introduce two ways: the SQL SERVER Profiler tool and Extended Events. Profiler is relatively resource-intensive, but because it only monitors deadlock, its performance impact is not great, and its visual interface is easy to use; Extended Events consumes less resources, and records the penultimate deadlock in real time, and requires SQL statements to analyze the query record file.

How to use Profiler monitoring?
Open SSMS, click Tools, and select SQL Server Profiler, as shown below.

Log in to the DB instance that needs to be monitored and fill in the corresponding tracking attributes. First, the < General > page, as shown below. Here we pay attention to two aspects: first, choose the < TSQL-Locks > template, which can be used to monitor deadlocks, also can be used to observe the lock application and release situation, very detailed, you can use SELECT UPDATE DELETE and other statements to see the lock application and release situation; second, monitoring resu lt s storage, the proposal can be stored in a form to facilitate regular points. Analysis and statistics.

Then fill in the < Event Selection > item, just select < deadlock graph > Events, no other ticks are needed, and finally click on the run to start monitoring.

 

A common example for ten thousand years can be used to check whether the monitoring is normal, open three query windows, and execute in the following order will result in resource occupation and application mutually exclusive deadlock. When the fifth step is completed, the deadlock will occur while waiting for 1-3 seconds. The script is provided as follows:

 1 --session 1
 2 CREATE TABLE Test_DL(
 3 id int not null primary key ,
 4 name varchar(100));
 5 
 6 INSERT INTO Test_DL(id,name) select 1,'a';
 7 INSERT INTO Test_DL(id,name) select 2,'b';
 8 
 9 --session2 2 2 2 2 2 2 2 2 2 
10 BEGIN TRANSACTION
11 UPDATE Test_DL SET Name='a-test' WHERE ID=1
12 
13 --session3 3 3 3 3 3 3 3 3 3 
14 BEGIN TRANSACTION
15 UPDATE Test_DL SET Name='b-test' WHERE ID=2
16 
17 --session2 2 2 2 2 2 2 2 2 2 
18  SELECT * FROM Test_DL WHERE ID=2
19 
20 --session3 3 3 3 3 3 3 3 3 3
21  SELECT * FROM Test_DL WHERE ID=1
Simulated deadlock SQL

The deadlock Interface monitored is as follows:

How to use Extended Events for monitoring?

The script for establishing extended event monitoring is as follows: (Extended event is very good, the 2012 version supports visual operation, and you can learn about it on MSDN if you are interested: https://msdn.microsoft.com/zh-cn/library/bb630282.aspx, this article will not analyze the grammar and other knowledge points)

1 CREATE EVENT SESSION [DeadLock] ON SERVER 
2 ADD EVENT sqlserver.xml_deadlock_report 
3 ADD TARGET package0.event_file(SET filename=N'F:\events\deadlock\deadlock.xel',max_file_size=(20)),
4 ADD TARGET package0.ring_buffer(SET max_events_limit=(100),max_memory=(10240),occurrence_number=(50))
5 WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
6 GO

Query SQL as follows, here we need to note: query is based on buffer or file analysis, the number of buffer storage is generally limited, such as the above we only allocated 4M storage, file analysis is complete, but it depends on the number of retained files. Here we give buffer's query SQL as follows. File's query is interesting and can be written down by hand.

DECLARE @deadlock_xml XML
SELECT @deadlock_xml=(
                       SELECT
                              (
                                SELECT
                                      CONVERT(XML, target_data)
                                FROM sys.dm_xe_session_targets st
                                JOIN sys.dm_xe_sessions s ON s.address = st.event_session_address
                                WHERE s.name = 'deadlock' AND st.target_name = 'ring_buffer'
                              ) AS [x]
                       FOR XML PATH('') , TYPE
                      )

SELECT dateadd(hour,+6,tb.col.value('@timestamp[1]','varchar(max)')) TimePoint, tb.col.value('(data/value/deadlock/process-list/process/executionStack/frame)[1]','VARCHAR(MAX)') statement_parameter_k, tb.col.value('(data/value/deadlock/process-list/process/executionStack/frame)[2]','VARCHAR(MAX)') statement_k, tb.col.value('(data/value/deadlock/process-list/process/executionStack/frame)[3]','VARCHAR(MAX)') statement_parameter, tb.col.value('(data/value/deadlock/process-list/process/executionStack/frame)[4]','VARCHAR(MAX)') [statement], tb.col.value('(data/value/deadlock/process-list/process/@waitresource)[1]','VARCHAR(MAX)') waitresource_k, tb.col.value('(data/value/deadlock/process-list/process/@waitresource)[2]','VARCHAR(MAX)') waitresource, tb.col.value('(data/value/deadlock/process-list/process/@isolationlevel)[1]','VARCHAR(MAX)') isolationlevel_k, tb.col.value('(data/value/deadlock/process-list/process/@isolationlevel)[2]','VARCHAR(MAX)') isolationlevel, tb.col.value('(data/value/deadlock/process-list/process/@waittime)[1]','VARCHAR(MAX)') waittime_k, tb.col.value('(data/value/deadlock/process-list/process/@waittime)[2]','VARCHAR(MAX)') waittime, tb.col.value('(data/value/deadlock/process-list/process/@clientapp)[1]','VARCHAR(MAX)') clientapp_k, tb.col.value('(data/value/deadlock/process-list/process/@clientapp)[2]','VARCHAR(MAX)') clientapp, tb.col.value('(data/value/deadlock/process-list/process/@hostname)[1]','VARCHAR(MAX)') hostname_k, tb.col.value('(data/value/deadlock/process-list/process/@hostname)[2]','VARCHAR(MAX)') hostname FROM @deadlock_xml.nodes('//event') as tb(col)

This SQL can query very detailed resource contention situation, if you want to effectively use extended events, we recommend that you take a detailed look at the xml grammar of the official website (SQL SERVER support for xml is also great, looking forward to the 2016 version of json support)

Is it very clear, clear at a glance, with this can be analyzed!?

2 Analysis

Depending on the content of the xml file or the monitoring content of the extended event, you can organize the following information (the deadlock analysis at the beginning):

Look at the execution plans for transaction 1 and transaction 2 as follows:

 

Combining tables and implementation plans, deadlock processes can be roughly inferred:

Conversation 1:

During this process, Session 2 happens to apply for such a lock:

Suppose that at this point, Session 1 performs an update operation (in the same transaction):

Then a deadlock occurs (see figure below for details):

3 solution

Without RID lookup, the idea of finding data directly by index will not lead to this deadlock, that is, to re-establish the aggregated index on the primary key and discard the original non-aggregated index primary key. Because this excludes the application and holding of U-lock in RID, it keeps X-lock until the end of the transaction, and can modify the data page where the key value is located directly according to the primary key, thus reducing the time of inquiring rows in RID.

The revised implementation plan is as follows:

The process of applying for release of the lock is as follows (see screenshot for details):

Posted by blyz on Sat, 13 Apr 2019 09:18:33 -0700