Using Linear Fitting Model to Find Hidden Hazards in Test Environment Performance

Keywords: Database Python network encoding

Using Linear Fitting Model to Find Hidden Hazards in Test Environment Performance

A Classical Performance Problem

Several weeks after putting into operation, production events such as slow response of the system, high load of the database and unreleased connection of a large number of databases occurred in the production environment. The reason is that the data volume of a table keeps increasing after production, and online statements use full table scanning to access the table, resulting in longer execution time, increased waiting for database locks, continuous consumption of resources and number of triggers. Database performance issues. Such problems have plagued and annoyed us for the past few years. Of course, there are also many friends who spend a lot of tricks, such as restoring full production data, or eliminating all online full table scanning, but these methods are not very effective. First of all, the test environment has limited resources, especially not so much storage; secondly, on-line full table scanning does not violate the development specifications, and for a small amount of data full table scanning is very reasonable, so it is difficult to persuade development to make changes. So the problem ultimately has to be solved by testing itself.

Solution

"The sustained growth of performance indicators will inevitably lead to testing problems in the future." It should be a theorem in the field of performance testing. Based on this theorem, we customize a trend analysis model to identify indicators that have a sustained growth trend. The principle is very simple:

  1. Using self-developed database monitoring system to record all full table scanning statements occurring in the test environment
  2. The number of objects scanned by the whole table is counted and recorded once a day.
  3. Using python's first-order linear fitting model, the test and production environment records of the whole table scanning object are linearly fitted.
  4. According to the fitting results, the records with growth trend > 0 and fitting degree > 80% were counted.
  5. In-depth analysis based on sqlid in the records

Similarly, we also include the index of statement execution time, statement cost, operating system CPU, memory utilization, network connection number and so on. In this way, we can find out the hidden performance problems in the daily function testing, but there are no performance problems that have reached the alarm threshold.

Code

# encoding=gbk

#According to the records in the APP_SQL_COST table, statistics of COST and statements with increasing execution time are made.
import math
import cx_Oracle
import collections

#First-order linear fitting model
def linefit(x , y):
   N = float(len(x))
   sx,sy,sxx,syy,sxy=0,0,0,0,0
   for i in range(0,int(N)):
       sx  += x[i]
       sy  += y[i]
       sxx += x[i]*x[i]
       syy += y[i]*y[i]
       sxy += x[i]*y[i]
   a = (sy*sx/N -sxy)/( sx*sx/N -sxx)
   b = (sy - a*sx)/N
   if a==0:
       r=0
   else:
       r = abs(sy*sx/N-sxy)/math.sqrt((sxx-sx*sx/N)*(syy-sy*sy/N))
   return a,b,r

def func_table_full(appname,addr,dbsid):
   str0="SELECT OBJECT_OWNER, OBJECT_NAME FROM APP_SQL_PLAN_STAT WHERE APPNAME='"+APPNAME+"' AND ADDR='"+ADDR+"' AND DBSID='"+DBSID+"' GROUP BY OBJECT_OWNER, OBJECT_NAME"
   cursor0 = conn.cursor()
   cursor0.execute(str0)
   res0 = cursor0.fetchall()
   rowcount0 = len(res0)
   x=0
   result=[]
   while x<rowcount0:
       table_owner=res0[x][0]
       table_name=res0[x][1]
       str1="SELECT TABLE_NAME,rank() over(partition by TABLE_NAME order by EXEC_SEQ,CHECK_TIME,STAT_DATE) rk,nvl(rows_count,0) FROM APP_TABLE_ROWS WHERE  APPNAME='"+APPNAME+"' AND ADDR='"+ADDR+"' AND DBSID='"+DBSID+"' AND TABLE_NAME='"+table_name+"' AND OWNER='"+table_owner+"'"
       cursor1 = conn.cursor()
       cursor1.execute(str1)
       res1 = cursor1.fetchall()
       rowcount1 = len(res1)
       RK=[]
       ROWS=[]
       if rowcount1>=3:
           y = 0
           while y<rowcount1:
               RK.append(res1[y][1])
               ROWS.append(res1[y][2])
               y=y+1
           try:
               a,b,r=linefit(RK,ROWS)
           except:
               print APPNAME,ADDR,DBSID,ROWS
           if a>0 and r>0.8 :
               str2="INSERT INTO APP_GROWTH_MODEL VALUES('"+APPNAME+"','"+ADDR+"','"+DBSID+"','ROWS','"+table_owner+"."+table_name+"','"+str(max(ROWS))+"','"+str(a)+"','"+str(r)+"','"+str(len(RK))+"','')"
               cursor2 = conn.cursor()
               cursor2.execute(str2)
               cursor2.execute("commit")
       str2 = "SELECT TABLE_NAME,rank() over(partition by TABLE_NAME order by to_date(COLLECT_DATE,'YYYY-MM-DD')) rk,nvl(NUM_ROWS,0) FROM MCTEST.DB_SC_ROW_NUM WHERE   UPPER(DBSID)=UPPER('"+DBSID+"') AND TABLE_NAME='"+table_name+"' AND OWNER='"+table_owner+"'"
       cursor1 = conn.cursor()
       cursor1.execute(str2)
       res1 = cursor1.fetchall()
       rowcount1 = len(res1)
       RK=[]
       ROWS=[]
       if rowcount1>=3:
           y = 0
           while y<rowcount1:
               RK.append(res1[y][1])
               ROWS.append(res1[y][2])
               y=y+1
           try:
               a,b,r=linefit(RK,ROWS)
           except:
               print APPNAME,ADDR,DBSID,ROWS
           if a>0 and r>0.8 :
               str2="INSERT INTO APP_GROWTH_MODEL VALUES('"+APPNAME+"','"+ADDR+"','"+DBSID+"','SCROWS','"+table_owner+"."+table_name+"','"+str(max(ROWS))+"','"+str(a)+"','"+str(r)+"','"+str(len(RK))+"','')"
               #print str2
               cursor2 = conn.cursor()
               cursor2.execute(str2)
               cursor2.execute("commit")
       x=x+1



if __name__ == '__main__':
   distinct_count=3
   conn = cx_Oracle.connect('test/test@127.0.0.1/test')
   strdel="DELETE FROM APP_GROWTH_MODEL"
   cursordel=conn.cursor()
   cursordel.execute(strdel)
   cursordel.execute("commit")
   str2="SELECT APPNAME,ADDR,DBSID FROM APP_SQL_COST GROUP BY APPNAME,ADDR,DBSID"
   cursor2 = conn.cursor()
   cursor2.execute(str2)
   res2 = cursor2.fetchall()
   rowcount2 = len(res2)
   z=0
   while z<rowcount2:
       APPNAME=res2[z][0]
       ADDR=res2[z][1]
       DBSID=res2[z][2]
       table_full=func_table_full(APPNAME,ADDR,DBSID)
       str0="SELECT DISTINCT SQL_ID FROM APP_SQL_COST WHERE APPNAME='"+APPNAME+"' AND ADDR='"+ADDR+"' AND DBSID='"+DBSID+"'"
       cursor0 = conn.cursor()
       cursor0.execute(str0)
       res0 = cursor0.fetchall()
       rowcount0 = len(res0)
       x=0
       while x<rowcount0:
           SQL_ID=str(res0[x][0])
           str1="SELECT  SQL_ID,rank() over(partition by SQL_ID order by STAT_DATE,SNAP_ID,PLAN_HASH_VALUE) rk,nvl(cost,0),nvl(elapsed_time,0) FROM APP_SQL_COST WHERE APPNAME='"+APPNAME+"' AND ADDR='"+ADDR+"' AND DBSID='"+DBSID+"' AND SQL_ID='"+SQL_ID+"'"
           cursor1 = conn.cursor()
           cursor1.execute(str1)
           res1 = cursor1.fetchall()
           rowcount1 = len(res1)
           RK=[]
           COST=[]
           ELAPSED_TIME=[]
           if rowcount1>=3:
               y = 0
               while y<rowcount1:
                   #print res1[y][1],res1[y][2], res1[y][3]
                   RK.append(res1[y][1])
                   COST.append(res1[y][2])
                   ELAPSED_TIME.append(res1[y][3])
                   y=y+1
               try:
                   a,b,r=linefit(RK,COST)
               except:
                   print APPNAME,ADDR,DBSID,COST
               distinct_cost=len(set(COST))
               if a>0 and r>0.8 and distinct_cost>=distinct_count and max(COST)>1000:
                   str_ins="INSERT INTO APP_GROWTH_MODEL VALUES('"+APPNAME+"','"+ADDR+"','"+DBSID+"','COST','"+SQL_ID+"','"+str(max(COST))+"','"+str(a)+"','"+str(r)+"','"+str(len(RK))+"','')"
                   cursor_ins = conn.cursor()
                   cursor_ins.execute(str_ins)
                   cursor_ins.execute("commit")
               try:
                   a,b,r=linefit(RK,ELAPSED_TIME)
               except:
                   print APPNAME,ADDR,DBSID,ELAPSED_TIME
               distinct_time=len(set(ELAPSED_TIME))
               if a>0 and r>0.8 and distinct_time>=distinct_count  and max(ELAPSED_TIME)>60:
                   str_ins="INSERT INTO APP_GROWTH_MODEL VALUES('"+APPNAME+"','"+ADDR+"','"+DBSID+"','ELAPSED_TIME','"+SQL_ID+"','"+str(max(ELAPSED_TIME))+"','"+str(a)+"','"+str(r)+"','"+str(len(RK))+"','')"
                   cursor_ins = conn.cursor()
                   cursor_ins.execute(str_ins)
                   cursor_ins.execute("commit")
           x=x+1
       z=z+1

Posted by t_galan on Mon, 08 Apr 2019 22:42:31 -0700