Use Python to check whether other programs are stuck

Keywords: Python Java Windows encoding

I. Requirement scenario:

A large number of Windows machines run Java programs for a certain task, and these programs occasionally hang for no apparent reason. While a program is hung, system resource usage looks normal and no error is reported, so the hang is usually discovered only after the business is affected and someone traces the problem back, or by sending someone to inspect the machines regularly.

II. Ideas:

Observation shows that a hung program stops writing to its log, so the timestamp of the last log entry can be compared with the system time at regular intervals. If the gap exceeds a specified threshold, the program is treated as hung and restarted immediately. (Sadly, there are many programs with all kinds of problems and nobody to maintain them, so a band-aid fix like this is the only option.)
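
The comparison at the heart of this idea can be sketched in a few lines. This is only a minimal illustration, not the full script: the helper name `is_stalled` and the sample timestamps are made up for the example, and the log timestamp is assumed to use the `YYYY-MM-DD HH:MM:SS` layout.

```python
import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed layout of the log timestamp


def is_stalled(last_log_time, now, timeout_seconds):
    """Return True when the newest log entry is older than the threshold.

    `is_stalled` is a hypothetical helper used only for illustration.
    """
    logged_at = datetime.datetime.strptime(last_log_time, TIME_FORMAT)
    return (now - logged_at).total_seconds() > timeout_seconds


# Made-up sample values: "now" is fixed so the result is reproducible.
now = datetime.datetime(2019, 12, 5, 8, 10, 0)
print(is_stalled("2019-12-05 08:00:00", now, 300))  # 600 s of silence, over the threshold
print(is_stalled("2019-12-05 08:08:00", now, 300))  # 120 s of silence, within the threshold
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the check easy to try out with fixed values.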

III. Implementation:

# -*- coding: utf-8 -*-
import os
import time
import datetime
import subprocess
from dateutil.parser import parse


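def error1(fun):
    # Decorator: print any exception raised by fun instead of letting it
    # stop the monitoring loop.
    def error2(doc):
        try:
            return fun(doc)
        except Exception as reason:
            print(reason)
    return error2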
    
def read_log(log_file):
    # Walk backwards from the last line until a line starts with a timestamp.
    with open(log_file, 'rb') as f:
        lines = f.readlines()
    for tag in range(-1, -len(lines) - 1, -1):
        try:
            last_line = str(lines[tag], encoding='gb2312')
            print('Line %s of the log is:\n%s' % (tag, last_line))
            log_time = last_line.split(",")[0]
            print('Time extracted from this line: %s' % (log_time))
            # Raises ValueError when the text is not a timestamp.
            time.strptime(log_time, "%Y-%m-%d %H:%M:%S")
            print('The time format is correct!')
            return log_time
        except Exception as reason:
            print(reason)
    # Stop here instead of looping forever once every line has been tried.
    raise ValueError('No timestamp found in %s' % (log_file))
        
def contrast_time(log_time):
    # Drop microseconds so both times have one-second resolution.
    system_time = datetime.datetime.now().replace(microsecond=0)
    print('System time: %s' % (system_time))
    a = parse(log_time)
    interval = (system_time - a).total_seconds()
    print("The time interval is: %s seconds" % (interval))
    return interval
    
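def restart_script(interval, timeout, proc_name, cmd_file):
    current_time = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    # Record each restart in log.txt.
    with open('log.txt', 'a+') as f:
        print('%s Execute restart' % (current_time), file=f)
    # Note: /IM kills every process with the given image name, so this also
    # closes unrelated processes named proc_name or cmd.exe on the machine.
    os.system('taskkill /IM %s /F' % (proc_name))
    os.system('taskkill /IM cmd.exe /F')
    time.sleep(1)
    subprocess.Popen("%s" % (cmd_file))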
    
@error1
def main(intervals):
    try:
        print('Checking log file: %s' % (log_file))
        log_time = read_log(log_file)
        interval = contrast_time(log_time)
        if interval > timeout:
            print("Timeout detected, the program will be restarted....")
            restart_script(interval, timeout, proc_name, cmd_file)
        else:
            print("Time interval is within the specified range")
        print('==============================================')
    finally:
        # Always wait before the next check, even if this one failed.
        time.sleep(intervals)
    
if __name__ == '__main__':        
    intervals = 300
    timeout = 300
    proc_name = 'java.exe'
    log_file = 'C:\\mailboxcode\\log\\log_'
    cmd_file = 'C:\\check_log\\start_run.bat'    
    while True:
        main(intervals)
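
The script loads the entire log with `readlines()` on every check, which gets slower as the file grows. One possible alternative, sketched here under the assumption that the last timestamped entry sits within the final few kilobytes of the file, is to read only the tail. The helper name `read_tail_lines` and the 4096-byte default are illustrative choices, not part of the original script.

```python
import os


def read_tail_lines(path, max_bytes=4096, encoding="gb2312"):
    """Return the decoded lines found in the last `max_bytes` of a file.

    Hypothetical helper: name and defaults are assumptions for this sketch.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        start = max(0, size - max_bytes)
        f.seek(start)
        chunk = f.read()
    # Undecodable bytes are replaced rather than raising an error.
    lines = chunk.decode(encoding, errors="replace").splitlines()
    # When the read began mid-file, the first line is probably cut off.
    if start > 0 and lines:
        lines = lines[1:]
    return lines
```

The returned list can then be scanned from the end for the last timestamp, in the same way the script scans `lines`.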

    

IV. Running demonstration:

If the last line of the log does not contain a timestamp, the script keeps moving up one line at a time until it finds one.
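
This look-back behaviour can be tried on an in-memory example. The sketch below is a simplified stand-in for the file-based function in the script: the helper name `last_timestamp` and the sample log lines are made up for illustration, and the timestamp is assumed to be the text before the first comma.

```python
import time

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed layout of the log timestamp


def last_timestamp(lines):
    """Return the timestamp of the last line that starts with one, or None."""
    for line in reversed(lines):
        candidate = line.split(",")[0].strip()
        try:
            time.strptime(candidate, TIME_FORMAT)
        except ValueError:
            continue  # not a timestamp, look one line further up
        return candidate
    return None


# Made-up log tail: a stack trace follows the last timestamped entry.
sample = [
    "2019-12-05 08:00:01,120 INFO task started",
    "2019-12-05 08:00:07,455 ERROR request failed",
    "java.lang.IllegalStateException: connection closed",
    "    at com.example.Worker.run(Worker.java:42)",
]
print(last_timestamp(sample))  # prints 2019-12-05 08:00:07
```

The two stack-trace lines fail to parse and are skipped, so the timestamp comes from the second entry.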

Posted by orionellis on Thu, 05 Dec 2019 08:29:33 -0800