TensorRT is NVIDIA's library for accelerating inference of models trained on its own GPUs. After we train a model, TensorRT can map the model's network layers one-to-one onto its optimized implementations, thereby speeding up inference deployment, even for large models. Recently I have been accelerating some models with TensorRT, and I will use two articles to record the process and the pitfalls encountered along the way. This article records the TensorRT conversion of ordinary models; the conversion of Transformer-class models will be recorded in the next article. The installation documented in this article is based on TensorRT 6.0.
1. Installation of TensorRT
1.1 Download the installation package from the official website
First, we download the installation package from the TensorRT section of NVIDIA's official website at https://developer.nvidia.com/nvidia-tensorrt-download . Because I use Ubuntu 18.04, Python 3.6, and CUDA 10.1, I chose the TensorRT 6.0.1.5 GA for Ubuntu 18.04 and CUDA 10.1 tar package. The correspondence between system versions and Python versions is described on the official website; you can check it against your own setup.
1.2 Installation of TensorRT and PyCUDA
Before installing TensorRT, we first use pip to install PyCUDA:
pip install pycuda
Then we install TensorRT.
## Unzip the installation package
tar zxvf TensorRT-6.0.1.5.Ubuntu-18.04.x86_64-gnu.cuda-10.1.cudnn7.6.tar.gz
## Copy TensorRT's libraries and headers into the system paths
sudo cp -r ~/TensorRT-6.0.1.5/lib/* /usr/lib
sudo cp -r ~/TensorRT-6.0.1.5/include/* /usr/include
# Install the TensorRT Python wheel
pip install ~/TensorRT-6.0.1.5/python/tensorrt-6.0.1.5-cp36-none-linux_x86_64.whl
# Install UFF, a tool for converting TensorFlow models
pip install ~/TensorRT-6.0.1.5/uff/uff-0.6.5-py2.py3-none-any.whl
# Install graphsurgeon
pip install ~/TensorRT-6.0.1.5/graphsurgeon/graphsurgeon-0.4.1-py2.py3-none-any.whl
# Finally, add TensorRT's lib directory to the library path
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/TensorRT-6.0.1.5/lib
So far, we have installed TensorRT without hitting any pitfalls. Use the following statements to test whether the installation was successful:
import tensorrt
import uff
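If the imports succeed, we can go one step further and confirm that the native libraries actually load. Below is a minimal sanity-check sketch (nothing here is specific to any model):

import tensorrt as trt

# Print the version to confirm the wheel matches the tar package we installed
print(trt.__version__)  # expect 6.0.1.5

# Creating a Logger and a Builder forces the native TensorRT libraries to load,
# so this fails immediately if LD_LIBRARY_PATH is not set up correctly
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
print("TensorRT builder created successfully")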
2. Use of TensorRT
We use the end_to_end_tensorflow_mnist sample shipped with TensorRT as a simple illustration.
## First, go to the sample code directory
cd ~/TensorRT-6.0.1.5/samples/python/end_to_end_tensorflow_mnist
## Then, create a models folder
mkdir models
## Run model.py; this produces lenet5.pb in the models folder
python model.py
## Convert the model to UFF
convert-to-uff ./models/lenet5.pb
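The sample's model.py trains a small LeNet-5-style network on MNIST and freezes it into lenet5.pb. For readers who want to see roughly what such a script looks like, here is a minimal sketch under TensorFlow 1.x; the layer sizes are illustrative assumptions and not copied from the sample:

import tensorflow as tf

# Build and train a small LeNet-5-style network with tf.keras (TF 1.x)
model = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(20, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(50, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(500, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# ... model.fit(...) on MNIST goes here ...

# Freeze the trained graph: replace variables with constants and serialize.
# The UFF converter works on a frozen GraphDef, not on checkpoints.
sess = tf.keras.backend.get_session()
frozen = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph_def, output_node_names=[model.output.op.name])
tf.train.write_graph(frozen, "models", "lenet5.pb", as_text=False)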
At this point, we get the .pb file generated by the model, and next we use convert-to-uff to convert it. However, we run into the following error:
bash: convert-to-uff: command not found
We apply the solution given in https://forums.developer.nvidia.com/t/convert-to-uff-command-not-found/116782 :
UFF_PATH="$(python -c 'import uff; print(uff.__path__[0])')"
chmod +x ${UFF_PATH}/bin/convert_to_uff.py
ln -sf ${UFF_PATH}/bin/convert_to_uff.py /usr/local/bin/convert-to-uff
Now run the conversion command again, and we get the lenet5.uff file in the models folder.
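Incidentally, the same conversion can also be done in-process with the uff package instead of the command-line wrapper. A minimal sketch, assuming the paths used above:

import uff

# Convert the frozen TensorFlow graph to UFF directly from Python;
# with no output_nodes given, the converter tries to deduce the outputs itself
uff.from_tensorflow_frozen_model(
    "./models/lenet5.pb",
    output_filename="./models/lenet5.uff")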
## Test the converted model
python sample.py
Here we encounter a problem: "Could not find 8.pgm. Searched in data paths: ['~/TensorRT-6.0.1.5/data/mnist']". We follow the answer in https://github.com/NVIDIA/TensorRT/issues/256#issuecomment-568382745 (replace /opt/tensorrt with your own install path, here ~/TensorRT-6.0.1.5):
python /opt/tensorrt/data/mnist/generate_pgms.py \
    -d /opt/tensorrt/data/mnist/train-images-idx3-ubyte \
    -l /opt/tensorrt/data/mnist/train-labels-idx1-ubyte \
    -o /opt/tensorrt/data/mnist
However, we will find that the two files train-images-idx3-ubyte and train-labels-idx1-ubyte do not exist. We can download them from https://github.com/Manuel4131/GoMNIST/tree/master/data and then unzip them with gzip -d in the corresponding folder.
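For convenience, the download and decompression can be scripted. A small sketch, assuming the raw-file URL pattern of the GoMNIST repository above and that we run it inside the data/mnist folder (both are assumptions; adjust paths to your install):

import gzip
import shutil
import urllib.request

# Assumed raw-download URLs derived from the GoMNIST repository linked above
BASE = "https://github.com/Manuel4131/GoMNIST/raw/master/data/"

for name in ("train-images-idx3-ubyte.gz", "train-labels-idx1-ubyte.gz"):
    urllib.request.urlretrieve(BASE + name, name)
    # Equivalent to `gzip -d`: decompress and strip the .gz suffix
    with gzip.open(name, "rb") as src, open(name[:-3], "wb") as dst:
        shutil.copyfileobj(src, dst)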
After all the above steps are completed, we can run python sample.py again; this time the sample runs to completion and prints its inference results.
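To make what sample.py does more concrete, here is a condensed sketch of the typical UFF workflow with the TensorRT 6 Python API: parse the .uff file, build an engine, and run inference through PyCUDA buffers. The input/output node names and shapes below are chosen to match the MNIST sample but should be treated as assumptions:

import numpy as np
import pycuda.autoinit  # creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(uff_path):
    # Parse the UFF model and build a TensorRT engine (TensorRT 6 API)
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network() as network, \
         trt.UffParser() as parser:
        builder.max_workspace_size = 1 << 28
        # Assumed node names/shape for the LeNet-5 MNIST sample
        parser.register_input("input_1", (1, 28, 28))
        parser.register_output("dense_1/Softmax")
        parser.parse(uff_path, network)
        return builder.build_cuda_engine(network)

def infer(engine, image):
    # image: float32 array of shape (1, 28, 28), normalized like the training data
    h_input = np.ravel(image).astype(np.float32)
    h_output = np.empty(10, dtype=np.float32)
    d_input = cuda.mem_alloc(h_input.nbytes)
    d_output = cuda.mem_alloc(h_output.nbytes)
    with engine.create_execution_context() as context:
        cuda.memcpy_htod(d_input, h_input)
        context.execute(1, [int(d_input), int(d_output)])
        cuda.memcpy_dtoh(h_output, d_output)
    return int(np.argmax(h_output))

engine = build_engine("./models/lenet5.uff")
print("Predicted digit:", infer(engine, np.random.rand(1, 28, 28)))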
3. Summary
In this article, we covered the installation and basic use of TensorRT. However, the author mainly works in the NLP direction, and the model he hopes to accelerate is BERT. Using convert-to-uff directly can only convert simple network layers, not a Transformer. Therefore, in the next article, the author will record the steps for converting BERT-class models with TensorRT and the pitfalls encountered along the way.