Preface
For most machine learning practitioners, TensorFlow (TF) is an excellent open-source machine learning framework for Python. Some developers, however, need to train their models in a Python environment and then deploy them in a C++ environment, and Docker is commonly used for this kind of deployment and testing.
To meet this need, and drawing on the pitfalls I hit and the late nights I spent along the way, I have written down how to build a Docker image with a TensorFlow C++ environment. The article covers the following:
- How to compile the TensorFlow C++ API manually (general TF C++ API compilation);
- How to compile the TF C++ API automatically while building a Docker image;
- How to test your installation;
- Possible problems.
Preparation
- Docker CLI, used to build the Docker image; the latest version is fine;
- Bazel, used to compile TF; for TF r1.14, Bazel >= 0.26.1 works.
In the author's experience, the more CPU cores and threads the machine has, the faster the compilation, so if you have access to a machine with many cores and threads, compiling there will save some time.
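Before starting, you can quickly confirm that both tools are available on the machine (or inside the base image) you will compile on:

# Check that the required tools are on the PATH and new enough
$ docker --version    # any recent Docker release works for building the image
$ bazel version       # should report 0.26.1 or newer for TF r1.14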
1) Manual compilation of TF C++ API (generic C++ API compilation)
The TF C++ API can currently only be built from source, so first clone the repository:
# Clone TensorFlow from GitHub into the tensorflow_src folder
$ git clone https://github.com/tensorflow/tensorflow.git tensorflow_src
# Check out the version you need. Take r1.14 as an example; other versions may require different CUDA, cuDNN, NCCL, and Bazel versions.
$ cd tensorflow_src
$ git checkout r1.14
Next, configure the build.
$ ./configure
You have bazel 0.26.1 installed.
# Specify a Python environment: either the system Python or a Python executable inside a virtualenv/conda environment
Please specify the location of python. [Default is /usr/bin/python]:
# Specify the installation location of the Python packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python3.5/site-packages]
# Whether to enable XLA (Accelerated Linear Algebra) support; default is Yes
Do you wish to build TensorFlow with XLA JIT support? [Y/n]:
XLA JIT support will be enabled for TensorFlow.
# Whether to enable OpenCL SYCL support; default is No
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
No OpenCL SYCL support will be enabled for TensorFlow.
# Whether to enable ROCm support; default is No
Do you wish to build TensorFlow with ROCm support? [y/N]:
No ROCm support will be enabled for TensorFlow.
# Whether to enable CUDA; default is No, answer Yes here because we use the GPU
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
CUDA support will be enabled for TensorFlow.
# Whether to enable TensorRT support; default is No
Do you wish to build TensorFlow with TensorRT support? [y/N]:
No TensorRT support will be enabled for TensorFlow.
# If cuDNN is installed in a different location than CUDA, its installation path must be provided as well: enter a comma-separated list of paths under which the cuDNN headers and shared libraries can be found.
Found CUDA 10.0 in:
    /usr/local/cuda/lib64
    /usr/local/cuda/include
Found cuDNN 7 in:
    /usr/local/cuda/lib64
    /usr/local/cuda/include
# Specify the CUDA compute capabilities to build for; the default value is fine
Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 7.5]:
# Whether to use clang as the CUDA compiler; default is No
Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.
# Specify the gcc version; the default gcc is fine
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
# Whether to enable MPI support; default is No
Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.
# Specify the optimization flags used at bazel compile time; the default value is fine, and more can be added later if needed
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
# No Android, default is No
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
# The following can be appended to the bazel build command line as flags
  --config=mkl             # Build with MKL support.
  --config=monolithic      # Config for mostly static monolithic build.
  --config=gdr             # Build with GDR support.
  --config=verbs           # Build with libverbs support.
  --config=ngraph          # Build with Intel nGraph support.
  --config=numa            # Build with NUMA support.
  --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
  --config=noaws           # Disable AWS S3 filesystem support.
  --config=nogcp           # Disable GCP support.
  --config=nohdfs          # Disable HDFS support.
  --config=noignite        # Disable Apache Ignite support.
  --config=nokafka         # Disable Apache Kafka support.
  --config=nonccl          # Disable NVIDIA NCCL support.
Configuration finished
After the configuration is complete, the actual compilation begins:
$ bazel build --config=opt --config=cuda --action_env="LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" //tensorflow:libtensorflow_cc.so
# Adding --config=monolithic produces a single libtensorflow_cc.so library (fewer libraries to generate and link against); without it, a libtensorflow_framework.so library is generated as well.
# The pros and cons of this option are discussed under "Possible problems" below.
Compilation can take a long time. In the author's experience, it took about half an hour on a 10-core/20-thread CPU and about 50 minutes on a 6-core/12-thread CPU, so feel free to go get a meal in the meantime.
After compiling, copy the header files and shared libraries to a suitable location.
# Take /usr/local/include/tf as an example
$ sudo mkdir -p /usr/local/include/tf/tensorflow
$ sudo cp -r bazel-genfiles/ /usr/local/include/tf
$ sudo cp -r tensorflow/cc /usr/local/include/tf/tensorflow
$ sudo cp -r tensorflow/core /usr/local/include/tf/tensorflow
$ sudo cp -r third_party /usr/local/include/tf
# Put the shared libraries under /usr/local/lib separately
$ sudo cp -r bazel-bin/tensorflow/libtensorflow* /usr/local/lib
# The resulting directory structure should be:
# /usr/local/include/
# |_ tf/
#    |_ tensorflow/
#    |  |_ cc/
#    |  |_ core/
#    |_ bazel-genfiles/
#    |_ third_party/
Finally, some post-installation work is needed to add the third-party dependencies TF requires: the libraries in the third_party directory you just copied are incomplete, and TF provides a script to fetch the rest.
# Run from the tensorflow_src directory
# Execute the script that downloads the third-party dependencies; it creates a downloads folder full of third-party libraries
$ ./tensorflow/contrib/makefile/download_dependencies.sh
# Move all the libraries in downloads to /usr/local/include, or to any other location you prefer.
# I moved them all to /usr/local/include so that I have fewer include paths to write when compiling my own projects
$ sudo cp -r tensorflow/contrib/makefile/downloads/* /usr/local/include
# The proto files of TF r1.14 require protobuf 3.7.1, so check out that version in this third-party library
$ cd /usr/local/include/protobuf
$ git checkout v3.7.1
# If libtensorflow_framework.so does not exist under /usr/local/lib, create a symbolic link to libtensorflow_framework.so.1
$ cd /usr/local/lib
$ ln -s libtensorflow_framework.so.1 libtensorflow_framework.so
At this point, the manual TF installation is complete.
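Before moving on, it is worth sanity-checking the copied headers and libraries by compiling a trivial program against them. The following is only a sketch: it assumes the paths used above and only prints the TF version macro; a fuller test follows in part 3.

# Minimal smoke test: compile and run a program that only prints the TF version
$ cat > /tmp/tf_version.cc << 'EOF'
#include <iostream>
#include "tensorflow/core/public/version.h"  // found under /usr/local/include/tf

int main() {
    std::cout << "TensorFlow " << TF_VERSION_STRING << std::endl;
    return 0;
}
EOF
$ g++ -std=c++11 /tmp/tf_version.cc -o /tmp/tf_version \
      -I/usr/local/include/tf -L/usr/local/lib \
      -ltensorflow_cc -ltensorflow_framework
$ /tmp/tf_version   # should print something like "TensorFlow 1.14.0"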
2) How to compile TF C++ API automatically when building Docker image
The basic idea is to put the installation steps above into a bash script and execute that script from the Dockerfile with a RUN instruction. To skip the interactive prompts of configure, the script defines the required environment variables up front:
# Pull the TF repo from git and check out r1.14 (omitted here, see part 1)

# Python path options
export PYTHON_BIN_PATH=$(which python3)
export PYTHON_LIB_PATH="$($PYTHON_BIN_PATH -c 'import site; print(site.getsitepackages()[0])')"

# Build parameters
export TF_NEED_CUDA=1
export TF_NEED_GCP=0
export TF_CUDA_COMPUTE_CAPABILITIES=5.2,6.1,7.0,7.5
export TF_NEED_HDFS=0
export TF_NEED_OPENCL=0
export TF_NEED_JEMALLOC=0
export TF_ENABLE_XLA=1
export TF_NEED_VERBS=0
export TF_CUDA_CLANG=0
export TF_DOWNLOAD_CLANG=0
export TF_NEED_MKL=0
export TF_DOWNLOAD_MKL=0
export TF_NEED_MPI=0
export TF_NEED_S3=0
export TF_NEED_KAFKA=0
export TF_NEED_GDR=0
export TF_NEED_OPENCL_SYCL=0
export TF_SET_ANDROID_WORKSPACE=0
export TF_NEED_AWS=0
export TF_NEED_IGNITE=0
export TF_NEED_ROCM=0

# Compiler parameters
export GCC_HOST_COMPILER_PATH=$(which gcc)

# Optimization flags for bazel
export CC_OPT_FLAGS="-march=native"

# CUDA and cuDNN parameters
export CUDA_TOOLKIT_PATH=$CUDA_HOME
# A comma-separated list of paths containing the cuDNN headers and .so files, if cuDNN is installed somewhere other than CUDA
export CUDNN_INSTALL_PATH="/usr/include,/usr/lib/x86_64-linux-gnu"
export TF_CUDA_VERSION=10.0    # CUDA version
export TF_CUDNN_VERSION=7.6    # cuDNN version; writing just 7 also works, or the exact version such as 7.6.1 if you know it
export TF_NEED_TENSORRT=0
export TF_NCCL_VERSION=2.4     # NCCL version, handled the same way as the cuDNN version
export NCCL_INSTALL_PATH=$CUDA_HOME

# These two lines are important for the linking step
export LD_LIBRARY_PATH="$CUDA_TOOLKIT_PATH/lib64:${LD_LIBRARY_PATH}"
ldconfig

./configure
bazel build --config=opt --config=cuda --action_env="LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" //tensorflow:libtensorflow_cc.so

# Then add the steps from part 1 (copying headers, downloading the third-party dependencies, and so on), omitted here
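In the Dockerfile itself, the only remaining work is to copy this script into the image and run it. A minimal sketch of that invocation is shown below; the script name build_tf.sh and the image tag are assumptions of this example, not part of the steps above.

# Build the image from the directory that contains the Dockerfile and build_tf.sh
# (inside the Dockerfile, the script is executed with something like: COPY build_tf.sh /tmp/  followed by  RUN bash /tmp/build_tf.sh)
$ docker build -t tf-cc:r1.14 .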
3) Test your installation
Here is a handy off-the-shelf GitHub repo: https://github.com/lysukhin/tensorflow-object-detection-cpp . We can use its code to test the installation. The code reads frames from a camera and detects human hands in them, so C++ OpenCV support is required; if you do not have OpenCV, you can still refer to how its CMakeLists.txt is written.
# Clone the repo into a folder named tf_test
$ git clone https://github.com/lysukhin/tensorflow-object-detection-cpp tf_test
$ cd tf_test && mkdir build
Next, we modify some paths in CMakeLists.txt so that they point to the headers and libraries we installed:
# New CMakeLists.txt; replace the original file with this
cmake_minimum_required(VERSION 3.7)
project(tf_detector_example)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -g -Wall")

# Use this one if you want to do video real time detection
set(SOURCE_FILES do_infer.cc)
add_executable(tf_detector_example ${SOURCE_FILES})

# OpenCV libs
find_package(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})
target_link_libraries(tf_detector_example ${OpenCV_LIBS})

# ==================== PATHS TO SPECIFY! ==================== #
# Third-party library paths required by TensorFlow. If eigen, absl, and protobuf are not specified explicitly, you may get "no such file or directory" errors.
include_directories("/usr/local/include")
include_directories("/usr/local/include/eigen")
include_directories("/usr/local/include/absl")
include_directories("/usr/local/include/protobuf/src")

# TensorFlow header file paths
include_directories("/usr/local/include/tf/")
include_directories("/usr/local/include/tf/bazel-genfiles/")
include_directories("/usr/local/include/tf/tensorflow/")
include_directories("/usr/local/include/tf/third_party/")

# TensorFlow shared library paths
target_link_libraries(tf_detector_example "/usr/local/lib/libtensorflow_cc.so")
target_link_libraries(tf_detector_example "/usr/local/lib/libtensorflow_framework.so")
After that, we can compile and run:
$ cd build && cmake ..
$ make
$ ./tf_detector_example
If it links and runs correctly, congratulations: everything went well.
4) Possible problems
- NVIDIA NCCL was not found during configure: NCCL is usually installed together with CUDA, but depending on the base image used to build the Docker image it may not be found under the CUDA path, in which case NCCL needs to be installed manually (see the sketch after this list).
- Incompatible protobuf version: the typical error is This file was generated by an older version of protoc which is... or This file was generated by a newer version of protoc which is... The files in question are generated by protobuf; open them and check the version number shown in the macro at the top. If that number is 3007001, the corresponding protobuf version is 3.7.1; check out the matching version in the protobuf/ folder accordingly.
- Duplicate registration of device factory for type GPU with the same priority 210: occurs at runtime and is caused by adding --config=monolithic to the bazel build. Removing this flag and recompiling solves the problem, but in some setups building without the flag conflicts with OpenCV; see https://github.com/tensorflow/tensorflow/issues/14826 for details. The author's current build does not use this flag and has shown no problems so far, though this remains to be observed.
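For the NCCL case above, a manual install on an Ubuntu-based image can be done through NVIDIA's apt packages. This is only a sketch: the package names below assume Ubuntu with NVIDIA's CUDA repository configured, and the versions must match the CUDA/NCCL versions chosen in part 2.

# Install the NCCL runtime and headers (package names assume Ubuntu + NVIDIA's CUDA apt repository)
$ sudo apt-get update
$ sudo apt-get install -y libnccl2 libnccl-dev
# Then point the configure step at the install location, e.g. via NCCL_INSTALL_PATH as in part 2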
13 August 2019