python3 crawler grabs job information code of Zhilian

Keywords: Python JSON Windows Firefox

On the code, if you have any questions, please leave a message.

# -*- coding: utf-8 -*-
"""
Created on Tue Aug  7 20:41:09 2018
@author: brave-man
blog: http://www.cnblogs.com/zrmw/
"""

import requests
from bs4 import BeautifulSoup
import json

def getDetails(url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0'}
    res = requests.get(url, headers = headers)
    res.encoding = 'utf-8'
    soup = BeautifulSoup(res.text, 'html.parser')
    soup = json.loads(str(soup))
    
    try:
        with open('jobDetails.txt', 'w') as f:
            print('Establish {} Successful file'.format('jobDetails.txt'))
    except:
        print('failure')
    
    details = {}    
    for i in soup['data']['results']:
        jobName = i['jobName']
        salary = i['salary']
        company = i['company']['name']
        companyUrl = i['company']['url']
        positionURL = i['positionURL']
        details = {'jobName': jobName,
                   'salary': salary,
                   'company': company,
                   'companyUrl': companyUrl,
                   'positionURL': positionURL
                   }
#        print(details)
        toFile(details)

def toFile(d):
    dj = json.dumps(d)
    try:
        with open('jobDetails.txt', 'a') as f:
            f.write(dj)
#            print('sucessful')
    except:
        print('Error')

def main():
    url = 'https://fe-api.zhaopin.com/c/i/sou?pageSize=60&cityId=635&workExperience=-1&education=-1&companyType=-1&employmentType=-1&jobWelfareTag=-1&kw=python&kt=3&lastUrlQuery={"jl":"635","kw":"python","kt":"3"}'
    getDetails(url)

if __name__ == "__main__":
    main()

After executing the above code, a TXT file, jobDetails.txt, will be created under the same directory as the code to save the position information.

This is just the code to get one page of recruitment information, which will be added later. How to get the url and the code of all pages of recruitment information.

There is still a little hole in the recruitment website of Zhilian, that is, not all the recruitment position details pages are in the official website format of Zhilian. After clicking on a recruitment position, the link is directed to the recruitment website of the official website of a company, which will be dealt with in detail later.

Posted by sonny on Sun, 05 Jan 2020 10:45:34 -0800