Performance test of three HTTP clients in Java and Go

After working through "Practice of HTTP client in Golang language" and "Six implementations of HTTPServer development in Go language", I naturally moved on to an HTTP client performance test in Java and Go.

I previously wrote "100000 QPS, the ultimate duel between K6, Gatling and FunTester!" and "120000 QPS per machine -- Revenge of FunTester". To reach 120,000 QPS in the latter, I deleted all code except the statistics. This time I won't go to such extremes. Besides, during my initial tests I found that the laptop's CPU usage tops out at about 80%; I don't know whether macOS limits it. I also feel bad about wearing out my own computer.

Server

The server is still the moco service from the FunTester framework. The code is as follows:

class Share extends MocoServer {

    static void main(String[] args) {
        def util = new ArgsUtil(args)
        def server = getServerNoLog(util.getIntOrdefault(0,12345))
        server.response("Have Fun ~ Tester !")
//        server.response(delay(textRes("Have Fun ~ Tester !"),10))
        def run = run(server)
        waitForKey("fan")
        run.stop()
    }
}

This test uses two kinds of service: an HTTP service with no delay, and low-latency services with delays of 5 ms and 10 ms. That makes three HTTP services in total. Because Go's HTTP libraries can also be used to build HTTP services, I will write another article later comparing the performance of three HTTP servers: Java netty, Go (net/http) and Go (/valyala/fasthttp).

Test cases

FunTester

The FunTester test framework uses Java HttpClient: it wraps the HttpClient API, and the test is then run with the FunTester performance test framework. In actual measurements, the performance overhead of the wrapper and the framework is negligible.

class HttpClientTest extends FunLibrary {

    static final String uri = "http://localhost:12345/test/fun"
    static final HttpRequestBase get = FunLibrary.getHttpGet(uri)

    static final int thread = 10
    static final int times = 10000

    public static void main(String[] args) {
        RUNUP_TIME = 0
        def tester = new FunTester()
        new Concurrent(tester, thread, DEFAULT_STRING).start()
    }

    private static class FunTester extends FixedThread<HttpRequestBase> {

        FunTester() {
            super(get, times, true)
        }

        @Override
        protected void doing() throws Exception {
            FunLibrary.executeOnly(get)
        }

        @Override
        FixedThread clone() {
            return new FunTester()
        }
    }

}

Go(net/http)

Here I wrote a test method that uses Go's goroutines and channels. It is rough, but it works. The code changes from time to time; reply "git" in the message backend to get the Git repository addresses of several projects, including this one.

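// Imports needed from the standard library; the funtester helper package
// must be imported as well (its import path is not shown in this article).
import (
	"log"
	"sync/atomic"
	"testing"
	"time"
)

// key is the stop flag: it is set to 1 as soon as any goroutine finishes,
// so the remaining goroutines stop too. It is accessed atomically to avoid
// a data race between goroutines.
var key int32

const (
	url    = "http://localhost:8001/test/fun"
	thread = 20
	times  = 10000
)

func TestPer(t *testing.T) {
	get := funtester.Get(url, nil)
	c := make(chan int)

	start := time.Now()
	for i := 0; i < thread; i++ {
		go func() {
			sum := 0
			for j := 0; j < times; j++ {
				if atomic.LoadInt32(&key) == 1 {
					break
				}
				funtester.Response(get)
				sum++
			}
			atomic.StoreInt32(&key, 1)
			c <- sum // report how many requests this goroutine completed
		}()
	}
	total := 0
	for i := 0; i < thread; i++ {
		total += <-c
	}
	elapsed := time.Since(start).Seconds()
	log.Printf("Total time: %f", elapsed)
	log.Printf("Total requests: %d", total)
	log.Printf("QPS: %f", float64(total)/elapsed)
}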

Go(/valyala/fasthttp)

This is similar to net/http. The difference is that /valyala/fasthttp cannot reuse the same request object for load testing, so a new object has to be created for every request. Even so, it measured as more efficient; the fasthttp object pool really is good. It is claimed to be 10 times as fast as net/http, though, which is a bit of an exaggeration.

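// Imports needed from the standard library; the funtester helper package
// must be imported as well (its import path is not shown in this article).
import (
	"log"
	"sync/atomic"
	"testing"
	"time"
)

// key is the stop flag: it is set to 1 as soon as any goroutine finishes,
// so the remaining goroutines stop too. It is accessed atomically to avoid
// a data race between goroutines.
var key int32

const (
	url    = "http://localhost:8001/test/fun"
	thread = 20
	times  = 10000
)

func TestPerFast(t *testing.T) {
	c := make(chan int)
	start := time.Now()
	for i := 0; i < thread; i++ {
		go func() {
			sum := 0
			for j := 0; j < times; j++ {
				if atomic.LoadInt32(&key) == 1 {
					break
				}
				// The request object cannot be shared, so build one per request.
				get := funtester.FastGet(url, nil)
				funtester.FastResponse(get)
				sum++
			}
			atomic.StoreInt32(&key, 1)
			c <- sum // report how many requests this goroutine completed
		}()
	}
	total := 0
	for i := 0; i < thread; i++ {
		total += <-c
	}
	elapsed := time.Since(start).Seconds()
	log.Printf("Total time: %f", elapsed)
	log.Printf("Total requests: %d", total)
	log.Printf("QPS: %f", float64(total)/elapsed)
}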
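Creating an object for every request sounds expensive, but an object pool makes it cheap: objects are taken from the pool, used, reset and put back instead of being allocated again. The sketch below shows that acquire/release pattern with `sync.Pool` from the standard library. It is not fasthttp's code; the `request` type and the `acquireRequest`/`releaseRequest` names are hypothetical and only mirror the idea.

```go
package main

import (
	"fmt"
	"sync"
)

// request is a hypothetical stand-in for a reusable request object.
type request struct {
	method string
	uri    string
	body   []byte
}

// reset clears the fields but keeps the body's backing array for reuse.
func (r *request) reset() {
	r.method = ""
	r.uri = ""
	r.body = r.body[:0]
}

var requestPool = sync.Pool{
	New: func() interface{} { return new(request) },
}

// acquireRequest returns an empty request, reusing a pooled one if available.
func acquireRequest() *request {
	return requestPool.Get().(*request)
}

// releaseRequest resets r and puts it back. r must not be used afterwards.
func releaseRequest(r *request) {
	r.reset()
	requestPool.Put(r)
}

func main() {
	for i := 0; i < 3; i++ {
		req := acquireRequest()
		req.method = "GET"
		req.uri = "http://localhost:8001/test/fun"
		req.body = append(req.body, "payload"...)
		fmt.Printf("request %d: %s %s (%d body bytes)\n", i, req.method, req.uri, len(req.body))
		releaseRequest(req)
	}
}
```

Resetting on release is the important design choice here: whoever acquires an object next always starts from a clean state, and the body buffer grown by an earlier request can be reused without a new allocation.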

Test

No delay service

1 thread

| Framework | CPU (%) | Memory | QPS |
| --- | --- | --- | --- |
| FunTester | 51.04 | 354.9 MB | 17715 |
| Go(net/http) | 104.26 | 14.8 MB | 13120 |
| Go(/valyala/fasthttp) | 81.67 | 5.3 MB | 20255 |

5 threads

| Framework | CPU (%) | Memory | QPS |
| --- | --- | --- | --- |
| FunTester | 230.08 | 555.5 MB | 59626 |
| Go(net/http) | 323.45 | 14.9 MB | 43143 |
| Go(/valyala/fasthttp) | 215.73 | 6.4 MB | 68659 |

10 threads

| Framework | CPU (%) | Memory | QPS |
| --- | --- | --- | --- |
| FunTester | 356.43 | 685.2 MB | 81795 |
| Go(net/http) | 573.08 | 1.36 GB | 36431 |
| Go(/valyala/fasthttp) | 321.85 | 6.8 MB | 82093 |

At this point the CPU is basically running at full load, which in practice means about 80%. I don't know whether this is a macOS limit.

The test results are very clear. Overall, FunTester and /valyala/fasthttp are close: /valyala/fasthttp has a small edge in CPU usage, but its memory footprint is almost unbelievably low. net/http, by contrast, is clearly worse. At low concurrency its memory usage is good, but it uses more CPU and delivers lower QPS than both FunTester and /valyala/fasthttp. At 10 threads its CPU is saturated and its memory soars to 1.36 GB, which I can't explain yet. I hope to understand this after further study.

Delay 5ms service

10 threads

| Framework | CPU (%) | Memory | QPS |
| --- | --- | --- | --- |
| FunTester | 19.63 | 163.9 MB | 1671 |
| Go(net/http) | 31.18 | 14.2 MB | 1440 |
| Go(/valyala/fasthttp) | 15.63 | 6.8 MB | 1709 |

20 threads

| Framework | CPU (%) | Memory | QPS |
| --- | --- | --- | --- |
| FunTester | 36.88 | 235.3 MB | 3318 |
| Go(net/http) | 48.47 | 14.4 MB | 2400 |
| Go(/valyala/fasthttp) | 32.81 | 7.5 MB | 3339 |

The conclusion is similar to that for the no-delay service. Overall, /valyala/fasthttp > FunTester > net/http. Even with the advantage of being written in Go, net/http loses to FunTester, which is written in Java, on everything except memory.

Conclusion

/valyala/fasthttp is really impressive. If you use Go for HTTP performance testing, I recommend skipping net/http and going straight to /valyala/fasthttp.

PS: Next time I will test the performance of three HTTP servers. Stay tuned.

Posted by colmtourque on Fri, 03 Dec 2021 01:15:19 -0800