Come on, Wuhan! Crawling Baidu migration map data

Keywords: JSON Windows

Recently, in order to build a model that predicts the spread of the virus, we needed to crawl Baidu's migration data. The process is written up here.

1, Find the pattern

First, open http://qianxi.baidu.com/ and query the information for any city, for example Beijing on January 29:

The data can be found here. Let's look at the headers:

Found: Request URL: http://huiyan.baidu.com/migration/cityrank.jsonp?dt=province&id=110000&type=move_out&date=20200129&callback=json_1580386984204_
Ignoring the trailing callback parameter, only two parameters, id and date, need to change in order to crawl the data in batches. The dates are already known, so what remains is to find the id of each region.
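
As a minimal sketch of that observation, the request URL can be assembled from the two varying parameters. The helper name `build_cityrank_url` is hypothetical; the endpoint and parameter names follow the URL used by the script in section 2, and the callback parameter is left out on the assumption that the server does not require it (that script does not send it either).

```python
from urllib.parse import urlencode

BASE = "http://huiyan.baidu.com/migration/cityrank.jsonp"

def build_cityrank_url(region_id, date, direction="move_out"):
    """Build the request URL for one region and one day (date as YYYYMMDD)."""
    params = {
        "dt": "province",
        "id": region_id,
        "type": direction,  # "move_out" or "move_in"
        "date": date,
    }
    return BASE + "?" + urlencode(params)

print(build_cityrank_url("110000", "20200129"))
```

Looping this helper over every id and every date yields the full set of URLs to crawl.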

Next, find http://qianxi.cdn.bcebos.com/app/index.js?b8e016517dc0b92ce531 in the page source.
Opening it reveals the id corresponding to each region:

2, Code:

#coding:utf-8
import urllib.request
import os
import pandas as pd



CODE = "Beijing|110000,Tianjin|120000,Xingan Meng|152200,Chaohu Lake|340181,Ding An|469021,Tuen Chang|469022,Chengmai|469023,ascend a height|469024,Haidong Prefecture|630200,Hong Kong|810000,Macao|820000,Changdu area|540300,Shannan Prefecture|540500,Shigatse region|540200,Nagqu Prefecture|540600,nyingchi prefecture|540400,Turpan region|650400,Tongren area|520600,Bijie area|520500,Shijiazhuang|130100,Tangshan|130200,Qinghuangdao|130300,Handan|130400,Xingtai|130500,Baoding|130600,Zhangjiakou|130700,Chengde|130800,Cangzhou|130900,Langfang|131000,Hengshui|131100,Taiyuan|140100,Da Tong|140200,Yangquan|140300,CiH|140400,Jincheng|140500,Shuozhou|140600,Jinzhong|140700,Yuncheng|140800,Xinzhou|140900,Linfen|141000,Lvliang|141100,Hohhot|150100,Baotou|150200,Wuhai|150300,Chifeng|150400,Tongliao|150500,Erdos|150600,Hulun Buir|150700,Bayannaoer|150800,Wulanchabu|150900,Shenyang|210100,Dalian|210200,Anshan|210300,Fushun|210400,Benxi|210500,Dandong|210600,Jinzhou|210700,Yingkou|210800,Fuxin|210900,Liaoyang|211000,Panjin|211100,Tieling|211200,Chaoyang|211300,Huludao|211400,Changchun|220100,Siping|220300,Liaoyuan|220400,make well-connected|220500,Mount Bai|220600,Songyuan|220700,Baicheng|220800,Harbin|230100,Qiqihar|230200,Jixi|230300,Hegang|230400,Shuangyashan|230500,Daqing|230600,Yichun|230700,Jiamusi|230800,Qitaihe|230900,Mudanjiang|231000,Heihe|231100,Suihua|231200,Shanghai|310000,Nanjing|320100,Wuxi|320200,Xuzhou|320300,Changzhou|320400,Suzhou|320500,Nantong|320600,Lianyungang|320700,Huaian|320800,ynz|320900,Yangzhou|321000,Zhenjiang|321100,Taizhou|321200,Suqian|321300,Hangzhou|330100,Ningbo|330200,Wenzhou|330300,Jiaxing|330400,Huzhou|330500,Shaoxing|330600,Jinhua|330700,Quzhou|330800,Zhoushan|330900,Taizhou|331000,Lishui|331100,Hefei|340100,Wuhu|340200,Bengbu|340300,Huainan|340400,Ma'anshan|340500,Huaibei|340600,Tongling|340700,Anqing|340800,Mount 
Huangshan|341000,Chuzhou|341100,Fuyang|341200,Suzhou|341300,Lu'an|341500,Bozhou|341600,Chizhou|341700,Xuancheng|341800,Fuzhou|350100,Xiamen|350200,Putian|350300,Sanming|350400,Quanzhou|350500,Zhangzhou|350600,Nanping|350700,Longyan|350800,Ningde|350900,Nanchang|360100,Jingdezhen|360200,Pingxiang|360300,Jiujiang|360400,Xinyu|360500,Yingtan|360600,Ganzhou|360700,Ji'an|360800,Yichun|360900,Fuzhou|361000,Shangrao|361100,Ji'nan|370100,Qingdao|370200,Zibo|370300,Zaozhuang|370400,doy|370500,Yantai|370600,Weifang|370700,Jining|370800,Tai'an|370900,Weihai|371000,sunshine|371100,Laiwu prefecture level city in Shandong|370100,Linyi|371300,Texas|371400,Liaocheng|371500,Binzhou|371600,Heze|371700,Zhengzhou|410100,Kaifeng|410200,Luoyang|410300,Pingdingshan|410400,Anyang|410500,Hebi|410600,Xinxiang|410700,Jiaozuo|410800,Puyang|410900,Xuchang|411000,Luohe|411100,Sanmenxia|411200,Nanyang|411300,Shangqiu|411400,Xinyang|411500,Zhoukou|411600,Zhumadian|411700,Wuhan|420100,Huangshi|420200,Shiyan|420300,Yichang|420500,Xiangyang|420600,Ezhou|420700,Jingmen|420800,Xiaogan|420900,Jingzhou|421000,Huanggang|421100,Xianning|421200,Suizhou|421300,peach of immortality|429004,Qianjiang|429005,Tianmen|429006,Changsha|430100,Zhuzhou|430200,Xiangtan|430300,city in Hunan|430400,Shaoyang|430500,Yueyang|430600,Changde|430700,Zhangjiajie|430800,Yiyang|430900,Chenzhou|431000,Yongzhou|431100,Huaihua|431200,Loudi|431300,Guangzhou|440100,Shaoguan|440200,Shenzhen|440300,Zhuhai|440400,Shantou|440500,Foshan|440600,Jiangmen|440700,Zhanjiang|440800,Maoming|440900,Zhaoqing|441200,Huizhou|441300,Meizhou|441400,Shanwei|441500,Heyuan|441600,Yangjiang|441700,Qingyuan|441800,Dongguan|441900,Zhongshan|442000,Chaozhou|445100,Jieyang|445200,Yunfu|445300,Nanning|450100,city in Guangxi|450200,Guilin|450300,Wuzhou|450400,The North Sea|450500,Port of Fangcheng|450600,Qinzhou|450700,Guigang|450800,Yulin|450900,Baise|451000,Hezhou|451100,Hechi|451200,Guest|451300,Chongzuo|451400,Haikou|460100,Sanya|460200,Five Fingers 
Group|469001,Qionghai|469002,Danzhou|460400,God of Literature|469005,Wanning|469006,east|469007,Chongqing|500000,Chengdu|510100,Zigong|510300,Panzhihua|510400,Luzhou|510500,Deyang|510600,Mianyang|510700,Guangyuan|510800,Suining|510900,Neijiang|511000,Leshan|511100,Nao|511300,Meishan|511400,Yibin|511500,Guang'an|511600,Dazhou|511700,Ya'an|511800,Bazhong|511900,Ziyang|512000,Guiyang|520100,Liupanshui|520200,Zunyi|520300,Anshun|520400,Kunming|530100,Qujing|530300,Yuxi|530400,Baoshan|530500,Zhaotong|530600,Lijiang|530700,Lincang|530900,Pu'er Tea|530800,Lhasa|540100,Xi'an|610100,Tongchuan|610200,Baoji|610300,Xianyang|610400,Weinan|610500,Yan'an|610600,Hanzhoung|610700,Yulin|610800,Ankang|610900,Shangluo|611000,Lanzhou|620100,Jiayuguan|620200,Jinchang|620300,silver|620400,Tianshui|620500,Wuwei|620600,Zhangye|620700,Pingliang|620800,Jiuquan|620900,Qingyang|621000,Dingxi|621100,Longnan|621200,Xining|630100,Yinchuan|640100,Shizuishan|640200,Wu Zhong|640300,Guyuan|640400,Central defender|640500,Urumqi|650100,Karamay|650200,Shihezi|659001,Alar|659002,Tumu Shuker|659003,Wu Jia Qu|659004,Enshi|422800,Enshi Tujia and Miao Autonomous Prefecture |422800,Yanbian|222400,Yanbian Korean Autonomous Prefecture|222400,Shennongjia area|429021,Shennongjia Forestry District |429021,Xiangxi Prefecture|433100,Xiangxi Tujia and Miao Autonomous Prefecture|433100,Daxinganling area|232700,Baisha County|469025,Baisha Li Autonomous County |469025,Changjiang Li Autonomous County|469026,Ledong Li Autonomous County |469027,Lingshui Li Autonomous County |469028,Baoting Li and Miao Autonomous County|469029,Qiongzhong Li and Miao Autonomous County|469030,Aba Prefecture|513200,Aba Tibetan and Qiang Autonomous Prefecture|513200,The state of Gansu|513300,Ganzi Tibetan Autonomous Prefecture|513300,Liangshan Prefecture|513400,Liangshan Yi Autonomous Prefecture|513400,Qianxinan Buyei and Miao Autonomous Prefecture |522300,Qiandongnan Miao and Dong Autonomous Prefecture|522600,Qiannan Buyi and Miao Autonomous 
Prefecture|522700,Chuxiong|532300,Chuxiong Yi Autonomous Prefecture|532300,Honghe Prefecture|532500,Honghe Hani and Yi Autonomous Prefecture|532500,Wenshan Prefecture|532600,Wenshan Zhuang and Miao Autonomous Prefecture|532600,Xishuangbanna Dai Autonomous Prefecture|532800,Dali Prefecture|532900,Dali Bai Autonomous Prefecture|532900,Dehong|533100,Dehong Dai Jingpo Autonomous Prefecture|533100,Nujiang|533300,Nujiang Lisu Autonomous Prefecture|533300,Diqing Prefecture|533400,Diqing Tibetan Autonomous Prefecture|533400,Ali Area|542500,Linxia Hui Autonomous Prefecture|622900,Gannan Tibetan Autonomous Prefecture|623000,Haibei Tibetan Autonomous Prefecture|632200,Huangnan Tibetan Autonomous Prefecture|632300,Hainan Tibetan Autonomous Prefecture|632500,Golog Tibetan Autonomous Prefecture|632600,Yushu Tibetan Autonomous Prefecture|632700,Haixi Mongol and Tibetan Autonomous Prefecture |632800,Changji Hui Autonomous Prefecture|652300,Bortala Mongolian Autonomous Prefecture|652700,Bayingolin Mongolian Autonomous Prefecture|652800,Hami area|650500,Hami|650500,Aksu Region|652900,Kizilsu Kirgiz Autonomous Prefecture|653000,Ili Kazak Autonomous Prefecture|654000,Kashi area|653100,Hotan Prefecture|653200,Tacheng area|654200,Altay Region|654300,Xilingol League|152500,alxa league|152900"
# Split the "name|id" pairs into a dict that maps each id to its region name
name = [x.split("|")[0] for x in CODE.split(',')]
number = [x.split("|")[1] for x in CODE.split(',')]
code = dict(zip(number, name))
# Dates to crawl, as YYYYMMDD integers: 20200110 through 20200125
time = list(range(20200110, 20200126))
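# Work inside a 'Wuhan' folder and note which regions already have a file
if not os.path.exists('Wuhan'):
    os.mkdir('Wuhan')
os.chdir('Wuhan')
exist = os.listdir()
exist = {i[:-5] for i in exist}  # strip the '.xlsx' suffix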

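def Open(url):
    # Send a browser-like User-Agent; replace it with your own browser's value
    heads = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
    req = urllib.request.Request(url, headers=heads)
    # Pass the Request object (not the bare URL) so the headers are actually sent
    response = urllib.request.urlopen(req)
    html = response.read()
    # unicode_escape turns the \uXXXX sequences in the response into characters
    return html.decode('unicode_escape')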

def conserve(html, time, name):
    global work
    city = []
    value = []
    for i in html['list']:
        city.append(i['province_name'] + i['city_name'])
        value.append(i['value'])
    res = {'City':city, 'Proportion':value}
    res = pd.DataFrame(res)
    res.to_excel(excel_writer=work, sheet_name=time)

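def main():
    global work
    for num, name in code.items():
        # Skip regions that already have a workbook from an earlier run
        if name in exist:
            continue
        f = pd.DataFrame()
        f.to_excel(name + '.xlsx')
        work = pd.ExcelWriter(name + '.xlsx')
        for t in time:
            try:
                print(name, t)
                url = 'http://huiyan.baidu.com/migration/cityrank.jsonp?dt=province&id=' + num + '&type=move_in&date=' + str(t)
                # The response is JSONP, callback({...}): keep what follows '(' and drop the final ')'
                html = Open(url).split('(')[1][:-1]
                # Note: eval() runs whatever the server returns; json.loads() is the safer parser
                conserve(eval(html)['data'], str(t), name)
            except SyntaxError:
                pass

        # ExcelWriter.save() was removed in pandas 2.0; close() writes the file
        work.close()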


main()
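
The script above strips the JSONP wrapper with split('(') and parses the remainder with eval. As a hedged alternative, the sketch below does the same unwrapping with a regular expression and json.loads, which never executes the response. The function name `parse_jsonp` and the sample response are made up for illustration; only the keys `data`, `list`, `province_name`, `city_name` and `value` come from the script.

```python
import json
import re

def parse_jsonp(text):
    """Return the JSON object wrapped in a JSONP response such as cb({...})."""
    # Greedy match: from the first '(' to the last ')'
    match = re.search(r"\((.*)\)", text, re.S)
    if match is None:
        raise ValueError("response does not look like JSONP")
    return json.loads(match.group(1))

# Made-up response, shaped like what the script above expects
sample = 'cb({"data": {"list": [{"province_name": "B", "city_name": "A", "value": 1.5}]}})'
data = parse_jsonp(sample)["data"]
print(data["list"][0]["value"])
```

json.loads decodes \uXXXX escapes itself, so with this approach the response body can be decoded as UTF-8 instead of unicode_escape.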

3, Results:


If an error is reported, just run the script again: the data for some places is empty, and regions that already have a file are skipped. To crawl other dates, modify the time list. Remember to replace the User-Agent in heads with your own.
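
One caveat when changing the time list: range(20200110, 20200126) only works while every date falls within a single month, because a range that crosses a month boundary would also produce nonexistent dates such as 20200132. A minimal sketch of a calendar-aware replacement follows; the helper name `date_range` is hypothetical.

```python
from datetime import date, timedelta

def date_range(start, end):
    """Return every day from start to end (inclusive) as YYYYMMDD integers."""
    days = (end - start).days
    return [int((start + timedelta(days=n)).strftime("%Y%m%d")) for n in range(days + 1)]

# Example: January 25 through February 5, 2020
time = date_range(date(2020, 1, 25), date(2020, 2, 5))
print(time)
```

The result has the same shape as the original time list, so the rest of the script can use it unchanged.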

If the migration scale index is needed:

#coding:utf-8
import urllib.request
import os
import pandas as pd
import json



CODE = "Beijing|110000,Tianjin|120000,Xingan Meng|152200,Chaohu Lake|340181,Ding An|469021,Tuen Chang|469022,Chengmai|469023,ascend a height|469024,Haidong Prefecture|630200,Hong Kong|810000,Macao|820000,Changdu area|540300,Shannan Prefecture|540500,Shigatse region|540200,Nagqu Prefecture|540600,nyingchi prefecture|540400,Turpan region|650400,Tongren area|520600,Bijie area|520500,Shijiazhuang|130100,Tangshan|130200,Qinghuangdao|130300,Handan|130400,Xingtai|130500,Baoding|130600,Zhangjiakou|130700,Chengde|130800,Cangzhou|130900,Langfang|131000,Hengshui|131100,Taiyuan|140100,Da Tong|140200,Yangquan|140300,CiH|140400,Jincheng|140500,Shuozhou|140600,Jinzhong|140700,Yuncheng|140800,Xinzhou|140900,Linfen|141000,Lvliang|141100,Hohhot|150100,Baotou|150200,Wuhai|150300,Chifeng|150400,Tongliao|150500,Erdos|150600,Hulun Buir|150700,Bayannaoer|150800,Wulanchabu|150900,Shenyang|210100,Dalian|210200,Anshan|210300,Fushun|210400,Benxi|210500,Dandong|210600,Jinzhou|210700,Yingkou|210800,Fuxin|210900,Liaoyang|211000,Panjin|211100,Tieling|211200,Chaoyang|211300,Huludao|211400,Changchun|220100,Siping|220300,Liaoyuan|220400,make well-connected|220500,Mount Bai|220600,Songyuan|220700,Baicheng|220800,Harbin|230100,Qiqihar|230200,Jixi|230300,Hegang|230400,Shuangyashan|230500,Daqing|230600,Yichun|230700,Jiamusi|230800,Qitaihe|230900,Mudanjiang|231000,Heihe|231100,Suihua|231200,Shanghai|310000,Nanjing|320100,Wuxi|320200,Xuzhou|320300,Changzhou|320400,Suzhou|320500,Nantong|320600,Lianyungang|320700,Huaian|320800,ynz|320900,Yangzhou|321000,Zhenjiang|321100,Taizhou|321200,Suqian|321300,Hangzhou|330100,Ningbo|330200,Wenzhou|330300,Jiaxing|330400,Huzhou|330500,Shaoxing|330600,Jinhua|330700,Quzhou|330800,Zhoushan|330900,Taizhou|331000,Lishui|331100,Hefei|340100,Wuhu|340200,Bengbu|340300,Huainan|340400,Ma'anshan|340500,Huaibei|340600,Tongling|340700,Anqing|340800,Mount 
Huangshan|341000,Chuzhou|341100,Fuyang|341200,Suzhou|341300,Lu'an|341500,Bozhou|341600,Chizhou|341700,Xuancheng|341800,Fuzhou|350100,Xiamen|350200,Putian|350300,Sanming|350400,Quanzhou|350500,Zhangzhou|350600,Nanping|350700,Longyan|350800,Ningde|350900,Nanchang|360100,Jingdezhen|360200,Pingxiang|360300,Jiujiang|360400,Xinyu|360500,Yingtan|360600,Ganzhou|360700,Ji'an|360800,Yichun|360900,Fuzhou|361000,Shangrao|361100,Ji'nan|370100,Qingdao|370200,Zibo|370300,Zaozhuang|370400,doy|370500,Yantai|370600,Weifang|370700,Jining|370800,Tai'an|370900,Weihai|371000,sunshine|371100,Laiwu prefecture level city in Shandong|370100,Linyi|371300,Texas|371400,Liaocheng|371500,Binzhou|371600,Heze|371700,Zhengzhou|410100,Kaifeng|410200,Luoyang|410300,Pingdingshan|410400,Anyang|410500,Hebi|410600,Xinxiang|410700,Jiaozuo|410800,Puyang|410900,Xuchang|411000,Luohe|411100,Sanmenxia|411200,Nanyang|411300,Shangqiu|411400,Xinyang|411500,Zhoukou|411600,Zhumadian|411700,Wuhan|420100,Huangshi|420200,Shiyan|420300,Yichang|420500,Xiangyang|420600,Ezhou|420700,Jingmen|420800,Xiaogan|420900,Jingzhou|421000,Huanggang|421100,Xianning|421200,Suizhou|421300,peach of immortality|429004,Qianjiang|429005,Tianmen|429006,Changsha|430100,Zhuzhou|430200,Xiangtan|430300,city in Hunan|430400,Shaoyang|430500,Yueyang|430600,Changde|430700,Zhangjiajie|430800,Yiyang|430900,Chenzhou|431000,Yongzhou|431100,Huaihua|431200,Loudi|431300,Guangzhou|440100,Shaoguan|440200,Shenzhen|440300,Zhuhai|440400,Shantou|440500,Foshan|440600,Jiangmen|440700,Zhanjiang|440800,Maoming|440900,Zhaoqing|441200,Huizhou|441300,Meizhou|441400,Shanwei|441500,Heyuan|441600,Yangjiang|441700,Qingyuan|441800,Dongguan|441900,Zhongshan|442000,Chaozhou|445100,Jieyang|445200,Yunfu|445300,Nanning|450100,city in Guangxi|450200,Guilin|450300,Wuzhou|450400,The North Sea|450500,Port of Fangcheng|450600,Qinzhou|450700,Guigang|450800,Yulin|450900,Baise|451000,Hezhou|451100,Hechi|451200,Guest|451300,Chongzuo|451400,Haikou|460100,Sanya|460200,Five Fingers 
Group|469001,Qionghai|469002,Danzhou|460400,God of Literature|469005,Wanning|469006,east|469007,Chongqing|500000,Chengdu|510100,Zigong|510300,Panzhihua|510400,Luzhou|510500,Deyang|510600,Mianyang|510700,Guangyuan|510800,Suining|510900,Neijiang|511000,Leshan|511100,Nao|511300,Meishan|511400,Yibin|511500,Guang'an|511600,Dazhou|511700,Ya'an|511800,Bazhong|511900,Ziyang|512000,Guiyang|520100,Liupanshui|520200,Zunyi|520300,Anshun|520400,Kunming|530100,Qujing|530300,Yuxi|530400,Baoshan|530500,Zhaotong|530600,Lijiang|530700,Lincang|530900,Pu'er Tea|530800,Lhasa|540100,Xi'an|610100,Tongchuan|610200,Baoji|610300,Xianyang|610400,Weinan|610500,Yan'an|610600,Hanzhoung|610700,Yulin|610800,Ankang|610900,Shangluo|611000,Lanzhou|620100,Jiayuguan|620200,Jinchang|620300,silver|620400,Tianshui|620500,Wuwei|620600,Zhangye|620700,Pingliang|620800,Jiuquan|620900,Qingyang|621000,Dingxi|621100,Longnan|621200,Xining|630100,Yinchuan|640100,Shizuishan|640200,Wu Zhong|640300,Guyuan|640400,Central defender|640500,Urumqi|650100,Karamay|650200,Shihezi|659001,Alar|659002,Tumu Shuker|659003,Wu Jia Qu|659004,Enshi|422800,Enshi Tujia and Miao Autonomous Prefecture |422800,Yanbian|222400,Yanbian Korean Autonomous Prefecture|222400,Shennongjia area|429021,Shennongjia Forestry District |429021,Xiangxi Prefecture|433100,Xiangxi Tujia and Miao Autonomous Prefecture|433100,Daxinganling area|232700,Baisha County|469025,Baisha Li Autonomous County |469025,Changjiang Li Autonomous County|469026,Ledong Li Autonomous County |469027,Lingshui Li Autonomous County |469028,Baoting Li and Miao Autonomous County|469029,Qiongzhong Li and Miao Autonomous County|469030,Aba Prefecture|513200,Aba Tibetan and Qiang Autonomous Prefecture|513200,The state of Gansu|513300,Ganzi Tibetan Autonomous Prefecture|513300,Liangshan Prefecture|513400,Liangshan Yi Autonomous Prefecture|513400,Qianxinan Buyei and Miao Autonomous Prefecture |522300,Qiandongnan Miao and Dong Autonomous Prefecture|522600,Qiannan Buyi and Miao Autonomous 
Prefecture|522700,Chuxiong|532300,Chuxiong Yi Autonomous Prefecture|532300,Honghe Prefecture|532500,Honghe Hani and Yi Autonomous Prefecture|532500,Wenshan Prefecture|532600,Wenshan Zhuang and Miao Autonomous Prefecture|532600,Xishuangbanna Dai Autonomous Prefecture|532800,Dali Prefecture|532900,Dali Bai Autonomous Prefecture|532900,Dehong|533100,Dehong Dai Jingpo Autonomous Prefecture|533100,Nujiang|533300,Nujiang Lisu Autonomous Prefecture|533300,Diqing Prefecture|533400,Diqing Tibetan Autonomous Prefecture|533400,Ali Area|542500,Linxia Hui Autonomous Prefecture|622900,Gannan Tibetan Autonomous Prefecture|623000,Haibei Tibetan Autonomous Prefecture|632200,Huangnan Tibetan Autonomous Prefecture|632300,Hainan Tibetan Autonomous Prefecture|632500,Golog Tibetan Autonomous Prefecture|632600,Yushu Tibetan Autonomous Prefecture|632700,Haixi Mongol and Tibetan Autonomous Prefecture |632800,Changji Hui Autonomous Prefecture|652300,Bortala Mongolian Autonomous Prefecture|652700,Bayingolin Mongolian Autonomous Prefecture|652800,Hami area|650500,Hami|650500,Aksu Region|652900,Kizilsu Kirgiz Autonomous Prefecture|653000,Ili Kazak Autonomous Prefecture|654000,Kashi area|653100,Hotan Prefecture|653200,Tacheng area|654200,Altay Region|654300,Xilingol League|152500,alxa league|152900"
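# Split the "name|id" pairs into a dict that maps each id to its region name
name = [x.split("|")[0] for x in CODE.split(',')]
number = [x.split("|")[1] for x in CODE.split(',')]
code = dict(zip(number, name))
# Not used below: the date span is fixed in the request URL (startDate and endDate)
time = list(range(20200110, 20200126))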
# Work inside a 'Wuhan 1' folder
if not os.path.exists('Wuhan 1'):
    os.mkdir('Wuhan 1')
os.chdir('Wuhan 1')

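def Open(url):
    # Send a browser-like User-Agent; replace it with your own browser's value
    heads = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
    req = urllib.request.Request(url, headers=heads)
    # Pass the Request object (not the bare URL) so the headers are actually sent
    response = urllib.request.urlopen(req)
    html = response.read()
    # unicode_escape turns the \uXXXX sequences in the response into characters
    return html.decode('unicode_escape')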

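def conserve(html, name):
    # html['list'] maps each date to that day's migration scale index
    times = []
    value = []
    for i in html['list']:
        times.append(i)
        value.append(html['list'][i])
    res = {'time':times, 'Migration scale index':value}
    res = pd.DataFrame(res)
    work = pd.ExcelWriter(name + '.xlsx')
    res.to_excel(excel_writer=work)
    # ExcelWriter.save() was removed in pandas 2.0; close() writes the file
    work.close()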


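def main():
    for num, name in code.items():
        f = pd.DataFrame()
        f.to_excel(name + '.xlsx')
        try:
            print(name)
            url = 'http://huiyan.baidu.com/migration/historycurve.jsonp?dt=province&id=' + num + '&type=move_out&startDate=20200110&endDate=20200125'
            # The response is JSONP, callback({...}): keep what follows '(' and drop the final ')'
            html = Open(url).split('(')[1][:-1]
            # json.loads parses the payload without executing it, unlike eval
            conserve(json.loads(html)['data'], name)
        except Exception as e:
            # Some regions return no data; report the failure and move on
            print(name, 'failed:', e)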



main()

In addition, to get the move-in data instead, only the URL needs to be changed, to 'http://huiyan.baidu.com/migration/historycurve.jsonp?dt=province&id=' + num + '&type=move_in&startDate=20200110&endDate=20200125'


Posted by heckenschutze on Thu, 30 Jan 2020 07:49:47 -0800