Smooth restart of a Go service

The problem

While developing in Go, after modifying the code we usually press Ctrl+C to kill the running process and then start it again with go run or go build. Once the project is online, however, killing the process directly interrupts the service, which is unacceptable in a production environment.

Solutions

After changing the code, recompile, then have the running process restart itself: the current main process forks a child process that runs the new binary.

Implementation details

How do we tell the process to restart smoothly? The answer is to register a handler for the SIGHUP signal and do the work in that handler.

What happens when the child process is forked, and what about requests that are still being served? All connections go through socket file descriptors, so we only need to pass the parent's listening socket file descriptors to the newly forked child. From then on, new requests are accepted and handled by the child. The child sends SIGTERM to the parent, which finishes the requests it is already processing and then exits. The child is now left without a parent, so it becomes an orphan process (not a zombie) and is adopted by process 1 (init). The approximate code is as follows:

server := gin.New()
group := server.Group("")
group.GET("/ping", func(c *gin.Context) {
	c.JSON(http.StatusOK, gin.H{
		"errno": 0,
		"errmsg": "success",
		"data":	"",
		"user_msg": "",
	})
})

tmpServer := endless.NewServer(fmt.Sprintf(":%d", Port), server) // Port is an int defined elsewhere
tmpServer.BeforeBegin = func(addr string) {
	log.Printf("Actual pid is %d", syscall.Getpid())
}
err := tmpServer.ListenAndServe()
if err != nil {
	log.Printf("Server err: %v", err)
}

Here are the relevant parts of the endless source code.

ListenAndServe:

func (srv *endlessServer) ListenAndServe() (err error) {
	addr := srv.Addr
	if addr == "" {
		addr = ":http"
	}

	go srv.handleSignals() // start the signal-handling goroutine

	l, err := srv.getListener(addr)
	if err != nil {
		log.Println(err)
		return
	}

	srv.EndlessListener = newEndlessListener(l, srv)

	if srv.isChild {
		syscall.Kill(syscall.Getppid(), syscall.SIGTERM) // the child asks its parent to shut down by sending it SIGTERM
	}

	srv.BeforeBegin(srv.Addr)

	return srv.Serve()
}
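ListenAndServe calls srv.getListener, which is not shown above. In the child, it has to rebuild a listener from a descriptor inherited from the parent. The sketch below illustrates that idea with the standard library only and is not endless's actual code. cloneListener and serveOnInheritedListener are hypothetical names, and everything runs in a single process: the duplicated descriptor plays the role of the one a real child would open with os.NewFile(3, ...).

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// cloneListener builds a second listener on the same socket, the way a
// child would from an inherited file descriptor.
func cloneListener(l *net.TCPListener) (net.Listener, error) {
	f, err := l.File() // duplicate of the listening socket's descriptor
	if err != nil {
		return nil, err
	}
	defer f.Close() // FileListener keeps its own copy
	return net.FileListener(f)
}

// serveOnInheritedListener closes the original listener and checks that
// the clone still accepts requests on the same address.
func serveOnInheritedListener() (string, error) {
	orig, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	inherited, err := cloneListener(orig.(*net.TCPListener))
	if err != nil {
		orig.Close()
		return "", err
	}
	orig.Close() // the "parent" stops listening; the socket itself stays open

	mux := http.NewServeMux()
	mux.HandleFunc("/ping", func(w http.ResponseWriter, _ *http.Request) {
		io.WriteString(w, "pong")
	})
	srv := &http.Server{Handler: mux}
	go srv.Serve(inherited)
	defer srv.Close()

	resp, err := http.Get("http://" + inherited.Addr().String() + "/ping")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := serveOnInheritedListener()
	fmt.Println(body, err)
}
```

Because both listeners refer to the same kernel socket, the port is never released during the handover, so clients are not refused in the gap between the old and new process.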

handleSignals:

func (srv *endlessServer) handleSignals() {
	var sig os.Signal

	signal.Notify(
		srv.sigChan,
		hookableSignals...,
	)

	pid := syscall.Getpid()
	for {
		sig = <-srv.sigChan
		srv.signalHooks(PRE_SIGNAL, sig)
		switch sig {
		case syscall.SIGHUP:
			log.Println(pid, "Received SIGHUP. forking.")
			err := srv.fork() // fork the child process
			if err != nil {
				log.Println("Fork err:", err)
			}
		case syscall.SIGUSR1:
			log.Println(pid, "Received SIGUSR1.")
		case syscall.SIGUSR2:
			log.Println(pid, "Received SIGUSR2.")
			srv.hammerTime(0 * time.Second)
		case syscall.SIGINT:
			log.Println(pid, "Received SIGINT.")
			srv.shutdown()
		case syscall.SIGTERM:
			log.Println(pid, "Received SIGTERM.")
			srv.shutdown()
		case syscall.SIGTSTP:
			log.Println(pid, "Received SIGTSTP.")
		default:
			log.Printf("Received %v: nothing i care about...\n", sig)
		}
		srv.signalHooks(POST_SIGNAL, sig)
	}
}
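The shutdown method called for SIGINT and SIGTERM is not shown here. What it has to achieve is that requests already in flight finish before the parent exits. As a rough illustration of that behaviour, and not of how endless implements it, the standard library's http.Server.Shutdown can be used. finishInFlight is a hypothetical name, and the 200 ms sleep merely simulates a slow request.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

type result struct {
	body string
	err  error
}

// finishInFlight starts a slow request, shuts the server down while the
// request is still running, and returns the response the client received.
func finishInFlight() (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}

	started := make(chan struct{})
	mux := http.NewServeMux()
	mux.HandleFunc("/slow", func(w http.ResponseWriter, _ *http.Request) {
		close(started) // tell the caller the request is now in flight
		time.Sleep(200 * time.Millisecond)
		io.WriteString(w, "done")
	})
	srv := &http.Server{Handler: mux}
	go srv.Serve(l)

	resCh := make(chan result, 1)
	go func() {
		resp, err := http.Get("http://" + l.Addr().String() + "/slow")
		if err != nil {
			resCh <- result{"", err}
			return
		}
		defer resp.Body.Close()
		b, err := io.ReadAll(resp.Body)
		resCh <- result{string(b), err}
	}()

	select {
	case <-started:
	case r := <-resCh: // the request ended before reaching the handler
		return r.body, r.err
	}

	// Shutdown stops accepting new connections and waits for active ones.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return "", err
	}
	r := <-resCh
	return r.body, r.err
}

func main() {
	body, err := finishInFlight()
	fmt.Println(body, err)
}
```

The timeout on the context bounds how long the old process may linger if a request never completes.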

fork, which passes the socket file descriptors to the child process:

func (srv *endlessServer) fork() (err error) {
	runningServerReg.Lock()
	defer runningServerReg.Unlock()

	// only one server instance should fork!
	if runningServersForked {
		return errors.New("Another process already forked. Ignoring this one.")
	}

	runningServersForked = true

	var files = make([]*os.File, len(runningServers))
	var orderArgs = make([]string, len(runningServers))
	// get the accessor socket fds for _all_ server instances
	for _, srvPtr := range runningServers {
		// introspect.PrintTypeDump(srvPtr.EndlessListener)
		switch srvPtr.EndlessListener.(type) {
		case *endlessListener:
			// normal listener
			files[socketPtrOffsetMap[srvPtr.Server.Addr]] = srvPtr.EndlessListener.(*endlessListener).File()
		default:
			// tls listener
			files[socketPtrOffsetMap[srvPtr.Server.Addr]] = srvPtr.tlsInnerListener.File()
		}
		orderArgs[socketPtrOffsetMap[srvPtr.Server.Addr]] = srvPtr.Server.Addr
	}

	env := append(
		os.Environ(),
		"ENDLESS_CONTINUE=1",
	)
	if len(runningServers) > 1 {
		env = append(env, fmt.Sprintf(`ENDLESS_SOCKET_ORDER=%s`, strings.Join(orderArgs, ",")))
	}

	// log.Println(files)
	path := os.Args[0]
	var args []string
	if len(os.Args) > 1 {
		args = os.Args[1:]
	}

	cmd := exec.Command(path, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	cmd.ExtraFiles = files
	cmd.Env = env

	// cmd.SysProcAttr = &syscall.SysProcAttr{
	// 	Setsid:  true,
	// 	Setctty: true,
	// 	Ctty:    ,
	// }

	err = cmd.Start()
	if err != nil {
		log.Fatalf("Restart: Failed to launch, error: %v", err)
	}

	return
}
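The key line in fork is cmd.ExtraFiles = files: entry i of ExtraFiles becomes file descriptor 3+i in the child. This can be seen in isolation with a pipe instead of a socket. In the sketch below, readFromChildFD3 and the MSG environment variable are hypothetical names, and a POSIX sh is assumed to be on the PATH.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"os/exec"
)

// readFromChildFD3 starts a shell child that inherits the write end of a
// pipe as file descriptor 3 and returns what the child writes to it.
func readFromChildFD3(msg string) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	cmd := exec.Command("sh", "-c", `printf '%s' "$MSG" >&3`)
	cmd.Env = append(os.Environ(), "MSG="+msg)
	cmd.ExtraFiles = []*os.File{w} // ExtraFiles[0] is fd 3 in the child

	if err := cmd.Start(); err != nil {
		w.Close()
		return "", err
	}
	w.Close() // close the parent's copy so the read below sees EOF

	out, err := io.ReadAll(r)
	if err != nil {
		return "", err
	}
	return string(out), cmd.Wait()
}

func main() {
	got, err := readFromChildFD3("hello")
	fmt.Printf("%q %v\n", got, err)
}
```

The child needs no knowledge of how the descriptor was created; it only needs to know its number, which is why fork above also passes ENDLESS_SOCKET_ORDER when there is more than one server.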

Test run

1. Compile and start the program

localhost:go why$ go build main.go 
localhost:go why$ ./main
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:   export GIN_MODE=release
 - using code:  gin.SetMode(gin.ReleaseMode)

[GIN-debug] GET    /ping                     --> main.main.func1 (1 handlers)
2020/02/08 17:52:26 Actual pid is 16333

2. Call the API

curl -XGET localhost:777/ping

{
    "data": "",
    "errmsg": "success",
    "errno": 0,
    "user_msg": ""
}

3. Change the code

c.JSON(http.StatusOK, gin.H{
	"errno": 0,
	"errmsg": "success",
	"data":	"new data",
	"user_msg": "",
})

4. Recompile

go build main.go

5. Trigger a smooth restart with SIGHUP (kill -1). Afterwards you can see that the new process (PID 16350) has a parent PID of 1

whydeMacBook-Pro:go why$ kill -1 16333
whydeMacBook-Pro:go why$ lsof -i:777
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
main    16350  why    3u  IPv6 0x8dbd126b94b08875      0t0  TCP *:multiling-http (LISTEN)
main    16350  why    6u  IPv6 0x8dbd126b94b08875      0t0  TCP *:multiling-http (LISTEN)
whydeMacBook-Pro:go why$ ps -ef | grep 16350
  501 16350     1   0  5:53PM ttys004    0:00.03 /var/folders/_s/jfrm6_712w58sytpc753pmr40000gn/T/go-build001868412/b001/exe/main
  501 16395 34106   0  5:56PM ttys007    0:00.00 grep 16350
whydeMacBook-Pro:go why$

6. Call the API again and check the result

{
    "data": "new data",
    "errmsg": "success",
    "errno": 0,
    "user_msg": ""
}

AbleYu

Posted by benrussell on Sat, 08 Feb 2020 03:31:02 -0800
