This article was first published on my personal blog: https://kezunlin.me/post/8d877e63/ Welcome to read!
Running a Caffe net in multiple threads in C++
Guide
set_mode
Caffe fails to use GPU in a new thread ???
see here
The answer quoted from that issue:

the `Caffe::mode_` variable that controls this is thread-local, so ensure you're calling `caffe.set_mode_gpu()` in each thread before running any Caffe functions. That should solve your issue.

Problem: Caffe's GPU mode has no effect under multithreading. If GPU mode is set in the main thread and the network is then called from a worker thread to run detection, GPU mode does not take effect. The worker thread falls back to the default CPU mode, so detection is about 10 times slower than on the GPU.

Solution: call set_mode in the worker thread, then call the network to run detection.
(1) Create the network in the main thread. The static network is stored in the global static data area, so the worker thread can use it directly.
(2) To detect in a worker thread, call set_mode in that worker thread first, then call the network.

Conclusion:
(1) Caffe's set_mode must be called in the same thread that runs the net's forward pass. Otherwise the default CPU mode is used and detection is very slow.
(2) Caffe's network initialization can happen in either the main thread or a worker thread.
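To see why the thread that sets the mode matters, here is a minimal sketch that does not use Caffe at all. The names `Mode`, `g_mode`, `set_mode`, `mode` and `mode_seen_by_worker` are hypothetical stand-ins for Caffe's per-thread state, chosen only for illustration. The sketch shows that a `thread_local` variable set in one thread keeps its default value in every other thread.

```cpp
#include <thread>

// Hypothetical stand-in for Caffe's per-thread mode; this is not Caffe's code.
enum class Mode { CPU, GPU };

// Every thread gets its own copy of g_mode, initialised to CPU.
thread_local Mode g_mode = Mode::CPU;

void set_mode(Mode m) { g_mode = m; }  // affects the calling thread only
Mode mode() { return g_mode; }         // reads the calling thread's copy

// Starts a worker thread and returns the mode that worker observes.
// When set_in_worker is true, the worker sets GPU mode itself first.
Mode mode_seen_by_worker(bool set_in_worker) {
    Mode seen = Mode::CPU;
    std::thread worker([&seen, set_in_worker] {
        if (set_in_worker) {
            set_mode(Mode::GPU);
        }
        seen = mode();
    });
    worker.join();  // join before reading `seen`
    return seen;
}
```

Calling `set_mode(Mode::GPU)` in the main thread and then `mode_seen_by_worker(false)` corresponds to the slow case described above: the worker still sees the default CPU mode. `mode_seen_by_worker(true)` corresponds to the fix.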
code example
    #include <iostream>
    #include <string>
    #include <thread>

    #include <gtest/gtest.h>
    #include <glog/logging.h>

    #include <boost/date_time/posix_time/posix_time.hpp>

    // opencv
    #include <opencv2/core.hpp>
    #include <opencv2/highgui.hpp>
    #include <opencv2/imgproc.hpp>

    using namespace std;

    #include "algorithm/algorithm.h"
    using namespace kezunlin::algorithm;

    #pragma region net-demo

    void topwire_demo(bool run_in_worker_thread)
    {
        if (run_in_worker_thread) {
            CaffeApi::set_mode(true, 0, 1234); // set in worker thread-1, use GPU-0
        }

        // do net detect
        // ...
    }

    void railway_demo(bool run_in_worker_thread)
    {
        if (run_in_worker_thread) {
            CaffeApi::set_mode(true, 0, 1234); // set in worker thread-1, use GPU-0
        }

        // do net detect
        // ...
    }

    void sidewall_demo(bool run_in_worker_thread)
    {
        if (run_in_worker_thread) {
            CaffeApi::set_mode(true, 0, 1234); // set in worker thread-1, use GPU-0
        }

        // do net detect
        // ...
    }

    void lockcatch_demo(bool run_in_worker_thread)
    {
        if (run_in_worker_thread) {
            CaffeApi::set_mode(true, 0, 1234); // set in worker thread-1, use GPU-0
        }

        // do net detect
        // ...
    }

    #pragma endregion

    #pragma region worker-thread-demo

    void worker_thread_topwire_demo(bool run_in_worker_thread)
    {
        std::thread thr(topwire_demo, run_in_worker_thread);
        thr.join();
    }

    void worker_thread_railway_demo(bool run_in_worker_thread)
    {
        std::thread thr(railway_demo, run_in_worker_thread);
        thr.join();
    }

    void worker_thread_sidewall_demo(bool run_in_worker_thread)
    {
        std::thread thr(sidewall_demo, run_in_worker_thread);
        thr.join();
    }

    void worker_thread_lockcatch_demo(bool run_in_worker_thread)
    {
        std::thread thr(lockcatch_demo, run_in_worker_thread);
        thr.join();
    }

    #pragma endregion

    enum DETECT_TYPE {
        SET_IN_MAIN_DETECT_IN_MAIN,     // Main thread set mode, main thread detection, about 40ms, using GPU
        SET_IN_WORKER_DETECT_IN_WORKER, // Sub thread set mode, sub thread detection, about 40ms, using GPU
        SET_IN_MAIN_DETECT_IN_WORKER    // Main thread set mode, sub thread detection, 400ms or so, 10 times slower, GPU not used
    };

    void thread_demo()
    {
        // Only the last assignment takes effect; comment out lines to switch scenarios.
        DETECT_TYPE detect_type = SET_IN_MAIN_DETECT_IN_MAIN;
        detect_type = SET_IN_WORKER_DETECT_IN_WORKER;
        detect_type = SET_IN_MAIN_DETECT_IN_WORKER;

        init_algorithm_api();

        switch (detect_type)
        {
        case SET_IN_MAIN_DETECT_IN_MAIN:
            topwire_demo(false);
            railway_demo(false);
            sidewall_demo(false);
            lockcatch_demo(false);
            break;
        case SET_IN_WORKER_DETECT_IN_WORKER:
            worker_thread_topwire_demo(true);
            worker_thread_railway_demo(true);
            worker_thread_sidewall_demo(true);
            worker_thread_lockcatch_demo(true);
            break;
        case SET_IN_MAIN_DETECT_IN_WORKER:
            worker_thread_topwire_demo(false);
            worker_thread_railway_demo(false);
            worker_thread_sidewall_demo(false);
            worker_thread_lockcatch_demo(false);
            break;
        default:
            break;
        }

        free_algorithm_api();
    }

    void test_algorithm_api()
    {
        thread_demo();
    }

    TEST(algorithn_test, test_algorithm_api) {
        test_algorithm_api();
    }
- SET_IN_MAIN_DETECT_IN_MAIN: set_mode in the main thread, detection in the main thread, about 40 ms, GPU used
- SET_IN_WORKER_DETECT_IN_WORKER: set_mode in the worker thread, detection in the worker thread, about 40 ms, GPU used
- SET_IN_MAIN_DETECT_IN_WORKER: set_mode in the main thread, detection in the worker thread, about 400 ms, 10 times slower, GPU not used
History
- 20180712: created.
Copyright
- Post author: kezunlin
- Post link: https://kezunlin.me/post/8d877e63/
- Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 3.0 unless stated otherwise.