Preface: I strongly object to people who copy and paste other people's blog posts without knowing whether the steps actually work. I have read countless blogs, tried many methods, and summarized the pitfalls here. Almost every article builds from source, and the build steps are all similar, but nobody has ever written down how much swap a Jetson TX2 needs for the build to succeed, and nobody explains how to supply the missing files. No infringement of anyone's intellectual property is intended; please do not misunderstand.
The environment is as follows
- System: Ubuntu 18.04
- jetpack: 4.5.1
- cuda: 10.2
- cudnn: 8.0.0
- pytorch: 1.7
- torchvision: 0.8.1
- opencv: 4.5.4
- archiconda3
Installation and backup of system image
System Image Recovery
Enter burn mode:
Connect the board through the USB data cable to the host, here an Ubuntu 18.04 virtual machine. Press and hold the recovery key, then press the reset key to enter burn mode.
Note: flashing is normally done from a separate Linux host, either a physical machine or a virtual machine, connected to the board.
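Before flashing, it can help to confirm from the host that the board is visible over USB. This is a minimal sketch, assuming `lsusb` is installed on the host and relying on NVIDIA's USB vendor ID `0955`. The helper name `in_recovery_mode` is my own, and it reads the `lsusb` listing from standard input so it can be tried without a board attached. A board that has booted normally can also enumerate with this vendor ID, so treat a match as a hint rather than proof.

```shell
# Hypothetical helper: succeeds if the lsusb listing on stdin contains
# a device with NVIDIA's USB vendor ID (0955).
in_recovery_mode() {
    grep -qiE 'ID 0955:[0-9a-f]{4}'
}

# Usage on the host (uncomment to run against real hardware):
# lsusb | in_recovery_mode && echo "NVIDIA device found" || echo "no NVIDIA device found"
```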
Image installation
    # Replace the image file. This example replaces the system image with the newly cloned image
    root@mm-desktop:~/tx2-NX/tx2-BIOS-4.5/tx2-nx4.5.1sdk# cd bootloader/
    root@mm-desktop:~/tx2-NX/tx2-BIOS-4.5/tx2-nx4.5.1sdk/bootloader# mv system.img system.img.bak
    root@mm-desktop:~/tx2-NX/tx2-BIOS-4.5/tx2-nx4.5.1sdk/bootloader# cp ../clone.img system.img
    root@mm-desktop:~/tx2-NX/tx2-BIOS-4.5/tx2-nx4.5.1sdk/bootloader# cd ../../
    # Burn image
    root@mm-desktop:~/tx2-NX/tx2-BIOS-4.5# ./run_tx2_bios_4.5.1.sh
Image backup
    root@mm-desktop:~/tx2/tx2-BIOS-4.5# ls
    run_tx2_bios_4.5.1.sh  tx2-nx4.5.1sdk  tx2-nx4.5.1sdk.tar.gz
    root@mm-desktop:~/tx2/tx2-BIOS-4.5# cd tx2-nx4.5.1sdk/
    root@mm-desktop:~/tx2/tx2-BIOS-4.5/tx2-nx4.5.1sdk# sudo ./flash.sh -r -k APP -G clone.img jetson-xavier-nx-devkit-tx2-nx mmcblk0p1
    ...
    [ 13.2921 ] Reading partition
    [ 13.2964 ] tegradevflash_v2 --read APP /home/xm/tx2-NX/tx2-BIOS-4.5/tx2-nx4.5.1sdk/clone.img
    [ 13.3010 ] Bootloader version 01.00.0000
    [ 13.4607 ] [................................................] 100%
    [ 1919.0289 ] *** The [APP] has been read successfully. ***
    Converting RAW image to Sparse image...
    # After success, clone.img and clone.img.raw files will be generated in this directory
Trim the system desktop
Why the system has to be trimmed is explained in detail later.
    apt-get purge xorg* -y
    apt-get purge x11* -y
    apt-get purge gnome* -y
    apt-get purge printer-driver-* -y
    apt-get purge libreoffice* -y
    apt autoremove -y
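To see how much storage the purge actually frees, the available space on the root filesystem can be measured before and after. A minimal sketch using POSIX `df`; the function name `avail_mib` is my own and not part of any tool.

```shell
# Print the space available on the filesystem holding a path, in MiB.
# $1 = path to check (default: /)
avail_mib() {
    df -Pk "${1:-/}" | awk 'NR==2 { printf "%d\n", $4 / 1024 }'
}

# Usage around the purge commands:
# before=$(avail_mib /)
# ... run the apt-get purge commands ...
# after=$(avail_mib /)
# echo "freed $((after - before)) MiB"
```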
cuda, cudnn installation
    cd install_cuda
    bash install_package.sh
    # Select 1 to install cuda, and then select 2 to install cudnn
Next, move cuda to another hard disk. This requires that the system has a hard disk for storing files in addition to its built-in 16GB of storage. The reason is explained later.
    # /files is the mount directory of the other hard disk
    mv /usr/local/cuda-10.2 /files
    # Modify cuda environment variable
    vim ~/.bashrc
    # Amend as follows:
    export CUDA_HOME=/files/cuda-10.2
    export LD_LIBRARY_PATH=/files/cuda-10.2/lib64:$LD_LIBRARY_PATH
    export PATH=/files/cuda-10.2/bin:$PATH
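Instead of editing the file by hand, the three variables can be appended by a small script that does nothing if they are already there. This is only a sketch: `add_cuda_env` is a hypothetical helper of mine, and it writes the same three variables as above, with the library and binary paths expressed through `CUDA_HOME`.

```shell
# Hypothetical helper: append the CUDA variables to a shell rc file once.
# $1 = CUDA directory (e.g. /files/cuda-10.2)
# $2 = rc file to modify (default: ~/.bashrc)
add_cuda_env() {
    local cuda_dir=$1
    local rc_file=${2:-$HOME/.bashrc}
    # Skip if this exact CUDA_HOME line is already present
    if grep -qxF "export CUDA_HOME=$cuda_dir" "$rc_file" 2>/dev/null; then
        return 0
    fi
    {
        printf 'export CUDA_HOME=%s\n' "$cuda_dir"
        printf 'export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH\n'
        printf 'export PATH=$CUDA_HOME/bin:$PATH\n'
    } >> "$rc_file"
}

# Usage:
# add_cuda_env /files/cuda-10.2
# source ~/.bashrc
```

After reloading the shell, `nvcc --version` is a quick way to confirm that the relocated toolkit is the one being picked up.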
archiconda3
Why archiconda3? First, it makes environments easy to manage: as a tester I need to install many different environments to test different models. Second, the board's built-in storage is not enough. The solution I came up with is to mount a hard disk and install archiconda3 on it, which solves the storage problem.
bash archiconda3.sh
Install pytorch, torchvision, opencv-python, onnxruntime-gpu
Install pytorch
The reason I finally settled on pytorch 1.7 is that compiling the latest version, pytorch 1.11, from source needs a lot of memory. If there is not enough memory, the shortfall can only be made up with swap; pytorch compiles once the swap is extended to 8GB. The system itself takes 5-6 GB of storage, and cudnn takes several hundred MB in the root directory. After following the steps above I had only 9.1GB left, which is why the system desktop has to be trimmed and cuda has to be moved off the root disk. Note: I do not know how to use a minimal system image, and I am not in a position to provide one.
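For reference, an 8GB swap file can be set up with the standard `fallocate`, `mkswap` and `swapon` tools. This is a sketch under assumptions: the helper `make_swap` and the path `/files/swapfile` are my own choices for illustration, and by default the commands are only printed so nothing is changed until `DRY_RUN=0` is set and the script is run as root.

```shell
# Hypothetical helper: create and enable a swap file.
# $1 = size in GiB, $2 = swap file path.
# With DRY_RUN=1 (the default here) the commands are only printed;
# set DRY_RUN=0 and run as root to actually execute them.
make_swap() {
    local size_gb=$1 path=$2
    local run=""
    if [ "${DRY_RUN:-1}" = "1" ]; then
        run="echo"
    fi
    $run fallocate -l "${size_gb}G" "$path"   # reserve the space
    $run chmod 600 "$path"                    # swap files must not be world-readable
    $run mkswap "$path"                       # format as swap
    $run swapon "$path"                       # enable it
}

# Assumed location: a swap file on the mounted hard disk
make_swap 8 /files/swapfile
```

`free -h` shows whether the swap is active. Once everything is built, `swapoff /files/swapfile` followed by deleting the file releases it again.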
Installing pytorch1.11
Many files will be missing from the third_party directory. I don't know whether this is caused by my network or by something else. Just fetch them from GitHub one by one.
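One way to find out which third_party components are missing is to list the sub-directories that are still empty. A minimal sketch; `empty_third_party` is a hypothetical helper name. The `git submodule` commands in the usage comment assume the source was obtained with `git clone`, in which case third_party is populated through submodules and they may save fetching each one by hand.

```shell
# List sub-directories of third_party that are still empty.
# $1 = path to the pytorch source tree (default: current directory)
empty_third_party() {
    find "${1:-.}/third_party" -mindepth 1 -maxdepth 1 -type d -empty | sort
}

# Usage inside the source tree:
# empty_third_party .
# git submodule sync
# git submodule update --init --recursive
```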
First, install the dependencies
    apt install libatlas-dev liblapack-dev
    apt install liblapacke-dev checkinstall
    apt install libffi-dev
    apt install ninja-build
    # cmake version > 3.17 is required
    apt install cmake
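Since the cmake version matters, it is worth checking what was actually installed; the version packaged for Ubuntu 18.04 may be older than required. A minimal sketch using `sort -V` from GNU coreutils. The helper `version_ge` is my own, and it tests "greater than or equal", which is an assumption about where the real cutoff lies.

```shell
# Succeeds if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage: compare the installed cmake against the required minimum
# installed=$(cmake --version | awk 'NR==1 { print $3 }')
# version_ge "$installed" 3.17 || echo "cmake $installed is too old"
```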
Next come the Python packages required by the environment. The whole process took so long that I am no longer sure exactly which of them are needed, but the system packages above must be installed.
    pip install cython
    pip install numpy
    pip install pyyaml
    pip install scikit-build
    pip install cffi
    pip install typing-extensions
    pip install dataclass
Finally, install pytorch 1.11
    cd torch
    mkdir build
    cd build
    cmake ..
    make -j6
    cd ..
    # Running this last step directly kept reporting errors, so I tried building with cmake first, which works
    python setup.py install
This approach is troublesome and breaks easily. Even when nothing goes wrong, compilation takes 5-6 hours; I never saw the 2 hours mentioned in other blogs. Perhaps it is a version issue: the newer the version, the more features it supports and the more files have to be compiled. So after a lot of exploring I found a simpler way; read on.
Installing pytorch 1.7
Why 1.7? It is the version I happened to download; I did not download any other version of pytorch. It comes as a whl file, which means NVIDIA has already compiled it.
Here is the link to the installation packages provided by NVIDIA: https://elinux.org/Jetson_Zoo#PyTorch_.28Caffe2.29 The latest prebuilt version there is only 1.10.
    apt-get install libopenblas-base libopenmpi-dev python3-pip
    pip install torch-1.7.0...
This is simple and convenient, but a new problem appears: I could not find a matching torchvision whl package from NVIDIA. After searching, the only option was to compile it from source. The torchvision version must match the torch version exactly, otherwise you get an endless stream of errors that will make you doubt your life. The matching version can only be looked up on the official website: https://pytorch.org/get-started/previous-versions/
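To avoid picking the wrong branch, the pairing can be written down once and reused. The table below is a sketch listing only the pairs I am aware of for these releases; `torchvision_for` is a hypothetical helper, and the pairs should be checked against the official page before relying on them.

```shell
# Print the torchvision release that pairs with a given torch release.
# Only a handful of known pairs are listed; anything else is an error.
torchvision_for() {
    case "$1" in
        1.7.0)  echo 0.8.1 ;;
        1.7.1)  echo 0.8.2 ;;
        1.8.0)  echo 0.9.0 ;;
        1.9.0)  echo 0.10.0 ;;
        1.10.0) echo 0.11.1 ;;
        1.11.0) echo 0.12.0 ;;
        *) echo "unknown torch version: $1" >&2; return 1 ;;
    esac
}

# Usage: pick the branch to clone for the installed torch
# git clone -b "v$(torchvision_for 1.7.0)" https://github.com/pytorch/vision
```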
Installing torchvision
This still needs 8GB of swap
    git clone -b v0.8.1 https://github.com/pytorch/vision vision-0.8.1
    cd vision-0.8.1
    sudo python3 setup.py install
Install opencv-python
Some files are missing. I'm not sure whether my network is to blame, but the downloads during installation are very slow and easily fail with errors about files such as boostdesc_bgm.i and vgg_generated_48.i. They can be downloaded from https://download.csdn.net/download/m0_37661841/44322678
    git clone https://github.com/opencv/opencv.git
    # Assumed step: enter the source tree (implied by the relative paths below)
    cd opencv
    git checkout <the OpenCV version you need; the latest version is recommended>
    git clone git://github.com/Itseez/opencv_contrib
    # Assumed step: build in a separate directory (implied by the trailing ..)
    mkdir build
    cd build
    # OPENCV_EXTRA_MODULES_PATH is the contrib path.
    # PYTHON_EXECUTABLE is your virtual environment's Python executable;
    # specify the result of echo $(which python).
    cmake -D CMAKE_BUILD_TYPE=RELEASE \
          -D BUILD_opencv_python3=YES \
          -D CMAKE_INSTALL_PREFIX=./install \
          -D INSTALL_PYTHON_EXAMPLES=ON \
          -D INSTALL_C_EXAMPLES=OFF \
          -D OPENCV_ENABLE_NONFREE=ON \
          -D OPENCV_EXTRA_MODULES_PATH=../opencv_contrib/modules \
          -D PYTHON_EXECUTABLE=~/env/bin/python \
          -D BUILD_EXAMPLES=ON ..
    make install
Install onnxruntime-gpu
pip install onnxruntime-gpu==1.7
yolov5 test
My test includes some additional logic processing, so performance may be slightly worse than the original model, but not by much.
Model | Input size | Speed |
---|---|---|
yolov5s | 640*384 | 90ms |
yolov5s6 | 1280*768 | 270ms |
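
For comparison with other benchmarks, the timings can be converted to frames per second. This is simple arithmetic, assuming the speed column is the per-image latency; `fps_from_ms` is a hypothetical helper.

```shell
# Convert a per-frame latency in milliseconds to frames per second.
fps_from_ms() {
    awk -v ms="$1" 'BEGIN { printf "%.1f\n", 1000 / ms }'
}

fps_from_ms 90    # yolov5s,  640*384
fps_from_ms 270   # yolov5s6, 1280*768
```

Under that assumption the two models run at roughly 11 and 3.7 frames per second.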
After the installation, the swap can be released; deleting its files frees 4GB of storage. archiconda3 can be moved to the root directory for deployment, so the board can be used without the hard disk attached.
In closing: these are really all small problems: not enough memory, not enough storage, missing files, which files can be moved and which cannot, and so on. But the blog authors just copy and paste, and nobody records these small problems. I expanded the swap from 2GB to 4GB, then to 6GB, then to 8GB, kept trying until I doubted my life, spent half a month climbing out of the pit, and finally made it ashore. Performance tests for other models will be added later.