- Environment description
    [root@localhost Python-3.6.6]# cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 7.4 (Maipo)
    [root@localhost Python-3.6.6]# uname -a
    Linux localhost.localdomain 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
    [root@localhost Python-3.6.6]# getenforce
    Disabled
    [root@localhost Python-3.6.6]# systemctl status firewalld.service
    ● firewalld.service - firewalld - dynamic firewall daemon
       Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
       Active: inactive (dead)
         Docs: man:firewalld(1)
- requests and selenium library installation
    pip3 install requests
    pip3 install selenium
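To confirm that both packages landed in the interpreter that `python3` actually starts, a quick check along these lines can help. This is only a sketch: the helper name `missing_packages` is made up for illustration, and it relies solely on the standard-library `importlib.util.find_spec`.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two libraries installed above; any other top-level package name works too.
missing = missing_packages(["requests", "selenium"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("requests and selenium are importable")
```

If a package is reported as missing even though `pip3 install` succeeded, `pip3` and `python3` probably belong to different Python installations.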
- chromedriver installation
    yum install Xvfb
    yum install libXfont
    yum install xorg-x11-fonts*

    vim /etc/yum.repos.d/google.repo
    [google]
    name=Google-x86_64
    baseurl=http://dl.google.com/linux/rpm/stable/x86_64
    enabled=1
    gpgcheck=0
    gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub

    yum install google-chrome-stable
    yum install GConf2-3.2.6-8.el7.x86_64
    wget http://chromedriver.storage.googleapis.com/70.0.3538.67/chromedriver_linux64.zip
    unzip chromedriver_linux64.zip
    mv chromedriver /usr/bin
    chmod +x /usr/bin/chromedriver
    chromedriver
    Starting ChromeDriver (v2.9.248304) on port 9515

    # Verification
    python3
    >>> from selenium import webdriver
    >>> browser = webdriver.Chrome()
    # A blank Chrome window will pop up.
    # By default, the root user cannot launch Chrome. It is recommended to create a separate user for Chrome.
- GeckoDriver installation
    yum install firefox
    wget https://github.com/mozilla/geckodriver/releases/download/v0.23.0/geckodriver-v0.23.0-linux64.tar.gz
    tar xf geckodriver-v0.23.0-linux64.tar.gz -C /usr/bin
    chmod +x /usr/bin/geckodriver

    # Verification
    python3
    >>> from selenium import webdriver
    >>> browser = webdriver.Firefox()
    # A blank Firefox window will pop up.
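By default Selenium looks for the driver executables on `PATH`, which is why both binaries are placed in `/usr/bin`. If `webdriver.Chrome()` or `webdriver.Firefox()` complains that the driver cannot be found, a small check like the following shows what Python sees. It is a sketch using only the standard-library `shutil.which`; the function name `locate_drivers` is hypothetical.

```python
import shutil


def locate_drivers(names=("chromedriver", "geckodriver")):
    """Map each driver name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


for name, path in locate_drivers().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

A driver that shows up as not found was either moved under a different file name or is missing its executable bit.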
With the setup above, we can use Chrome or Firefox to fetch web pages, but there is a problem: a browser window has to stay open for as long as the program runs. To avoid this, we can use PhantomJS, a headless browser.
- PhantomJS installation
    wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2
    tar xf phantomjs-2.1.1-linux-x86_64.tar.bz2
    cd phantomjs-2.1.1-linux-x86_64/bin
    mv phantomjs /usr/bin/
    chmod +x /usr/bin/phantomjs

    # Verification
    python3
    >>> from selenium import webdriver
    >>> browser = webdriver.PhantomJS()
    >>> browser.get('https://www.baidu.com')
    >>> print(browser.current_url)
    https://www.baidu.com/
    >>>
    # No browser window opens this time, but print still shows the requested address, so PhantomJS is working normally.
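Newer Selenium releases have deprecated and later removed the PhantomJS driver, so it may be worth knowing that the Chrome installed earlier can also run without a window. The following is a rough sketch of that alternative, not part of the original setup: the function names are made up, `--headless`, `--disable-gpu` and `--no-sandbox` are Chrome command-line switches, and the `options=` keyword assumes a reasonably recent Selenium 3.x or later.

```python
def headless_chrome_arguments(running_as_root=False):
    """Build Chrome command-line switches for running without a window."""
    args = ["--headless", "--disable-gpu"]
    if running_as_root:
        # Chrome refuses to start as root unless its sandbox is disabled.
        args.append("--no-sandbox")
    return args


def open_headless_chrome(running_as_root=False):
    """Start Chrome through chromedriver with no visible window."""
    # Imported here so the helper above can be used without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in headless_chrome_arguments(running_as_root):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Usage would mirror the PhantomJS session: call `browser = open_headless_chrome()`, then `browser.get('https://www.baidu.com')`, read `browser.current_url`, and finish with `browser.quit()`. Creating a separate non-root user, as recommended earlier, is still preferable to passing `running_as_root=True`.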
- aiohttp installation
aiohttp is an HTTP request library similar to requests. The difference is that aiohttp is asynchronous: it is built on asyncio and provides both an HTTP client and a web server.
The installation method is as follows:
    pip3 install aiohttp
    # Optional: cchardet for character-encoding detection and aiodns for faster DNS resolution
    pip3 install cchardet aiodns
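To give an idea of what "asynchronous" means in practice, here is a minimal sketch that requests several pages concurrently through one shared `aiohttp.ClientSession`. The function names `fetch_status` and `fetch_all` are illustrative only, and the URLs you pass in are up to you.

```python
import asyncio


async def fetch_status(session, url):
    """Request `url` with the given session and return (url, HTTP status)."""
    async with session.get(url) as response:
        return url, response.status


async def fetch_all(urls):
    """Fetch all URLs concurrently over one shared aiohttp session."""
    import aiohttp  # third-party; installed with pip3 above

    async with aiohttp.ClientSession() as session:
        tasks = [fetch_status(session, url) for url in urls]
        return await asyncio.gather(*tasks)
```

On the Python 3.6 used in this environment, run it with `asyncio.get_event_loop().run_until_complete(fetch_all(['https://www.baidu.com']))`. From Python 3.7 onward, `asyncio.run(fetch_all([...]))` does the same thing. With requests, the same loop would wait for each response before sending the next request.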