A simple C++11 thread pool: implementation and tests
Without further ado, here is the code:
```cpp
// threadpool.h
#pragma once
#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <thread>
#include <vector>

const int MAX_THREADS = 1000;
typedef std::function<void(void)> Task;

class threadPool {
public:
    threadPool(int number = 1);
    ~threadPool();
    bool append(Task task);

private:
    static void* worker(void* arg);
    void run();

private:
    std::vector<std::thread> work_threads;
    std::queue<Task> tasks_queue;
    std::mutex queue_mutex;
    std::condition_variable condition;
    bool stop;
};

threadPool::threadPool(int number) : stop(false) {
    if (number <= 0 || number > MAX_THREADS) {
        throw std::invalid_argument("bad thread count");
    }
    for (int i = 0; i < number; i++) {
        // std::cout << "create the " << i << "th thread" << std::endl;
        work_threads.emplace_back(threadPool::worker, this);
    }
}

inline threadPool::~threadPool() {
    {
        std::unique_lock<std::mutex> lock(queue_mutex);
        stop = true;
    }
    condition.notify_all();
    // Join OUTSIDE the lock: a worker woken by notify_all must be able to
    // re-acquire the mutex to observe stop, otherwise the destructor deadlocks.
    for (auto& ww : work_threads)
        ww.join();
}

bool threadPool::append(Task task) {
    {
        std::unique_lock<std::mutex> lock(queue_mutex);
        tasks_queue.push(task);
    }
    condition.notify_one();
    return true;
}

void* threadPool::worker(void* arg) {
    threadPool* pool = (threadPool*)arg;
    pool->run();
    return pool;
}

void threadPool::run() {
    while (true) {
        std::unique_lock<std::mutex> lk(this->queue_mutex);
        // Wake either when a task arrives or when the pool is stopping; without
        // the stop check in the predicate, workers sleeping on an empty queue
        // would never exit and ~threadPool() would hang in join().
        this->condition.wait(lk, [this] { return stop || !this->tasks_queue.empty(); });
        if (this->tasks_queue.empty()) {
            return; // stop was set and no work remains
        }
        Task task = tasks_queue.front();
        tasks_queue.pop();
        // Unlock before running the task so other workers can fetch work concurrently.
        lk.unlock();
        task();
    }
}
```
The tests below measure how the number of tasks and the number of worker threads affect the total completion time.
The task used in the test code is compute-intensive: a for loop accumulating a sum. The code is as follows:
```cpp
// main.cpp
#include <iostream>
#include <mutex>
#include <Windows.h>
#include "threadpool.h"
using namespace std;

int show_res = 0;
mutex data_mutex;

class Test {
public:
    // Compute-intensive task: accumulate in a long loop, then bump the counter.
    void process_no_static_bind(const int i, const int j) {
        int res = 0;
        for (int n = 0; n < 50000000; n++) {
            res += (i + j);
        }
        // Sleep(500); // enable to make the task IO-intensive instead
        unique_lock<mutex> lk(data_mutex);
        show_res++;
    }
};

int main() {
    threadPool pool(1); // vary the pool size for each test run
    int test_num = 100;
    Test tt_bind;

    ULONGLONG start = GetTickCount64();
    cout << "start tag=" << show_res << endl;
    for (int i = 0; i < test_num; i++) {
        pool.append(std::bind(&Test::process_no_static_bind, &tt_bind, i, 4));
    }
    // Poll until every task has reported completion. (The unlocked read of
    // show_res is tolerated here; this is only a timing test.)
    while (show_res < test_num) {
        Sleep(10);
    }
    cout << "end tag=" << show_res << endl;
    ULONGLONG end = GetTickCount64();
    cout << "spend time:" << end - start << " milliseconds" << endl;
    exit(0);
}
```
Test results:
Test one:
Number of tasks performed: 10
Threads | Time |
---|---|
10 | 1203ms |
9 | 1078ms |
8 | 1329ms |
7 | 1250ms |
6 | 1484ms |
5 | 1359ms |
4 | 1765ms |
3 | 1266ms |
2 | 1984ms |
1 | 3547ms |
Test two:
A Sleep(100) is added to each task.
Number of tasks performed: 10
Threads | Time |
---|---|
10 | 1281ms |
9 | 1265ms |
8 | 1563ms |
7 | 1563ms |
6 | 1547ms |
5 | 1421ms |
4 | 1765ms |
3 | 1578ms |
2 | 2485ms |
1 | 4453ms |
Combining tests one and two
suggests a conjecture: the time for a single thread to complete all tasks, divided by the number of cores (4 here), is approximately the time needed when multiple threads execute the tasks concurrently.
So more threads does not automatically mean faster execution.
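As a rough check of this conjecture against test one (assuming the test machine really has 4 cores, as stated):

```
single-thread time / cores = 3547 ms / 4 ≈ 887 ms
```

The best multi-thread results above (1078–1203 ms at 9–10 threads) are in the same ballpark; the remaining gap is plausibly locking, scheduling, and thread-switching overhead.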
At the same time, we need to consider whether the workload is CPU-intensive or IO-intensive:
CPU-intensive test: each task performs a large amount of computation or loop iterations
IO-intensive test: each task blocks in the Sleep() function, simulating a wait on IO
Test three:
CPU intensive:
Number of tasks performed: 10
Threads | Time |
---|---|
10 | 1156ms |
9 | 1125ms |
8 | 1531ms |
7 | 1453ms |
6 | 1515ms |
5 | 1235ms |
4 | 1703ms |
3 | 1406ms |
2 | 2125ms |
1 | 3641ms |
Test four:
IO intensive:
Number of tasks performed: 10
Threads | Time |
---|---|
10 | 625ms |
9 | 640ms |
8 | 1141ms |
7 | 1109ms |
6 | 1140ms |
5 | 1140ms |
4 | 1719ms |
3 | 1719ms |
2 | 2719ms |
1 | 4984ms |
With many threads, IO-intensive tasks run more efficiently than CPU-intensive ones.
This is because CPU-intensive work is limited by the number of CPU cores: only as many threads as there are cores can actually execute at once, so threads beyond the core count do not run simultaneously.
IO-intensive work, however, depends far less on the CPU: while one thread waits for an IO result, other threads can keep using the CPU.
CPU-intensive work depends heavily on the CPU: while one thread occupies a core with computation, other threads cannot use that core.
So for IO-intensive workloads, using roughly as many threads as there are tasks can improve execution speed.
Test five:
IO intensive:
Number of tasks performed: 100
Threads | Time |
---|---|
10 | 5062ms |
20 | 2563ms |
50 | 1078ms |
100 | 641ms |
200 | 640ms |
The results of test five show that for IO-intensive tasks, raising the thread count well beyond the number of CPU cores greatly improves execution speed.
With many threads, the CPU time one thread would spend waiting for an IO result can be used by the other threads instead.
Test six:
CPU intensive:
Number of tasks performed: 100
Threads | Time |
---|---|
10 | 5672ms |
20 | 5766ms |
50 | 5250ms |
100 | 5062ms |
200 | 5031ms |
4 | 8047ms |
1 | 19172ms |
The results of test six show that for CPU-intensive tasks, raising the thread count beyond the number of CPU cores improves execution speed only slightly.
One possible cause of the small speedup is that with more threads, this task set accounts for a larger share of the runnable threads in the system, so the scheduler grants it a larger share of CPU time.
That CPU time is taken away from other processes: with more than 50 threads running, mouse response became noticeably sluggish.
And switching between threads itself costs time.