Python crawler: querying station information
Contents:
1. Find the URL to query
2. Analyze the information
3. Process the information
1. Find the URL that serves the station information.
2. Analyze the station information: the records for individual stations are separated by "@", and the fields inside each record are separated by "|".
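The splitting described above can be tried on a small inline sample without touching the network. This is only a sketch: the function name parse_stations is hypothetical, the two-record SAMPLE string is an illustrative excerpt rather than real downloaded data, and the layout it assumes (a quoted string of "@"-separated records, each with six "|"-separated fields ending in a sequence number) is inferred from the code in this post.

```python
def parse_stations(js_text):
    """Parse station text into a dict keyed by station sequence number."""
    # Keep only what sits between the first and the last single quote.
    start = js_text.index("'") + 1
    end = js_text.rindex("'")
    body = js_text[start:end]

    stations = {}
    for record in body.split("@"):
        if not record:  # the text starts with "@", so the first piece is empty
            continue
        fields = record.split("|")
        stations[int(fields[-1])] = {
            "cname": fields[1],  # Chinese station name
            "id": fields[2],     # station ID used in query URLs
            "qp": fields[3],     # full spelling
            "jx": fields[4],     # abbreviated spelling
        }
    return stations


# Illustrative sample only, not the real file contents.
SAMPLE = "var station_names ='@bjb|北京北|VAP|beijingbei|bjb|0@bjd|北京东|BOP|beijingdong|bjd|1';"

stations = parse_stations(SAMPLE)
print(stations[1]["cname"], stations[1]["id"])
```

Locating the quotes instead of slicing off a fixed number of characters is a design choice of this sketch; it tolerates a trailing newline in the response, which a fixed slice would not.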
Station information query
# Station information query
import requests

# 1. Fetch the station information URL and read it. Based on the shape of the
#    returned text, strip the useless parts and convert the rest to a list.
url = "https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.9090"
txt = requests.get(url).text
# print(txt)
# txt[:-2] drops the last two characters; [1:] drops the text before the first "@".
inf = txt[:-2].split("@")[1:]  # a list with one entry per station
# print(inf)

# 2. Split each entry on "|" and store it in a dictionary: the station sequence
#    number is the key, the remaining fields are the value.
stations = {}
for record in inf:
    rlist = record.split("|")
    stations[int(rlist[-1])] = {"cname": rlist[1], "id": rlist[2], "qp": rlist[3], "jx": rlist[4]}
# print(stations[0])
# print(stations.get(2848))
# print(stations.values())

# 3. Check whether the input matches any station. If there is exactly one match,
#    print it. If there are several, list them all and let the user choose.
#    If there is none, print a prompt and ask again.
while True:
    s1 = input("Departure station:")
    result = []
    for station in stations.values():
        if s1 in station.values():
            # print(station)
            result.append(station)
    if result:
        break
    print("There is no such station!")
    print("Please re-enter!")

if len(result) == 1:
    resultId = result[0]["id"]
    print("The departure station you entered is %s; the corresponding station ID is %s" % (result[0]["cname"], resultId))
else:
    print("Your input is ambiguous. Please select from the following stations:")
    for i in range(len(result)):
        print(i + 1, result[i]["cname"], result[i]["id"])
    sel = int(input("Your choice:")) - 1
    resultId = result[sel]["id"]
    print("The departure station you entered is %s; the corresponding station ID is %s" % (result[sel]["cname"], resultId))

while True:
    s2 = input("Destination station:")
    result2 = []
    for station in stations.values():
        if s2 in station.values():
            # print(station)
            result2.append(station)
    if result2:
        break
    print("There is no such station!")
    print("Please re-enter!")

if len(result2) == 1:
    result2Id = result2[0]["id"]
    print("The destination station you entered is %s; the corresponding station ID is %s" % (result2[0]["cname"], result2Id))
else:
    print("Your input is ambiguous. Please select from the following stations:")
    for i in range(len(result2)):
        print(i + 1, result2[i]["cname"], result2[i]["id"])
    sel2 = int(input("Your choice:")) - 1
    result2Id = result2[sel2]["id"]
    # Index with sel2 here; using sel would print the wrong station name.
    print("The destination station you entered is %s; the corresponding station ID is %s" % (result2[sel2]["cname"], result2Id))

# Build the ticket query URL (the URL pattern was found with the browser's developer tools).
qurl = "https://kyfw.12306.cn/otn/leftTicket/queryZ?leftTicketDTO.train_date=2019-01-14&leftTicketDTO.from_station=%s&leftTicketDTO.to_station=%s&purpose_codes=ADULT"
print(qurl % (resultId, result2Id))
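The matching and URL-building steps of the script above can also be written as two small functions, which removes the duplicated departure/destination loops and makes the logic testable without user input. This is a sketch, not the author's code: find_stations, build_query_url, and the date parameter are hypothetical names, and SAMPLE_STATIONS is made-up data in the same shape as the stations dictionary (its third record is fictional and exists only to produce an ambiguous match). The query URL and its parameter names are copied from the script.

```python
from urllib.parse import urlencode

# Made-up sample in the shape of the `stations` dictionary; record 2 is fictional.
SAMPLE_STATIONS = {
    0: {"cname": "北京北", "id": "VAP", "qp": "beijingbei", "jx": "bjb"},
    1: {"cname": "北京东", "id": "BOP", "qp": "beijingdong", "jx": "bjd"},
    2: {"cname": "示例站", "id": "XXX", "qp": "shilizhan", "jx": "bjb"},
}

QUERY_BASE = "https://kyfw.12306.cn/otn/leftTicket/queryZ"


def find_stations(stations, query):
    """Return every record in which some field equals the query exactly."""
    return [s for s in stations.values() if query in s.values()]


def build_query_url(from_id, to_id, date):
    """Assemble the ticket query URL from two station IDs and a date string."""
    params = {
        "leftTicketDTO.train_date": date,
        "leftTicketDTO.from_station": from_id,
        "leftTicketDTO.to_station": to_id,
        "purpose_codes": "ADULT",
    }
    return QUERY_BASE + "?" + urlencode(params)


unique = find_stations(SAMPLE_STATIONS, "beijingdong")  # one match
ambiguous = find_stations(SAMPLE_STATIONS, "bjb")       # two matches
missing = find_stations(SAMPLE_STATIONS, "nowhere")     # no match

print(len(unique), len(ambiguous), len(missing))
print(build_query_url("VAP", "BOP", "2019-01-14"))
```

An interactive wrapper would call find_stations in a loop, re-prompt while the list is empty, and show a numbered menu when it holds more than one record, as the script does.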
2. Find all station names that share the same full spelling
# Find all station names that share the same full spelling
import requests

url = "https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.9090"
txt = requests.get(url).text
inf = txt[:-2].split("@")[1:]

stations = {}
for record in inf:
    rlist = record.split("|")
    stations[int(rlist[-1])] = {"cname": rlist[1], "id": rlist[2], "qp": rlist[3], "jx": rlist[4]}

# Collect the "qp" value of every station in a list (pyin).
pyin = []
for station in stations.values():
    pyin.append(station["qp"])
npy = list(set(pyin))  # use a set to remove duplicates from the list
npy.sort()             # sort the list

# Count the stations per spelling: the spelling is the key, the count is the value.
c = {}
for station in stations.values():
    c[station["qp"]] = c.get(station["qp"], 0) + 1
# print(c)

# A count greater than 1 means several station names share that spelling.
c2 = []
for k, v in c.items():
    if v > 1:
        c2.append(k)  # add every matching spelling to the new list
c2.sort()
# print(c2)

# Traverse the list and print the stations for each shared spelling.
for p in c2:
    print(p, end=":")
    for station in stations.values():
        if p == station["qp"]:
            print(station["cname"])
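The count-then-rescan approach above loops over all stations once per shared spelling. A possible alternative, sketched here with collections.defaultdict, groups the names in a single pass. The function name group_by_spelling is hypothetical, and the sample records are fictional placeholders in the shape of the stations dictionary, not real station data.

```python
from collections import defaultdict


def group_by_spelling(stations):
    """Map each "qp" spelling shared by 2+ stations to the list of their names."""
    groups = defaultdict(list)
    for station in stations.values():
        groups[station["qp"]].append(station["cname"])
    # Keep only spellings used by more than one station, sorted by spelling.
    return {qp: names for qp, names in sorted(groups.items()) if len(names) > 1}


# Fictional placeholder records, for illustration only.
SAMPLE_STATIONS = {
    0: {"cname": "示例甲", "id": "AAA", "qp": "shili", "jx": "sl"},
    1: {"cname": "示例乙", "id": "BBB", "qp": "shili", "jx": "sl"},
    2: {"cname": "其他站", "id": "CCC", "qp": "qitazhan", "jx": "qtz"},
}

shared = group_by_spelling(SAMPLE_STATIONS)
for qp, names in shared.items():
    print(qp + ":", ", ".join(names))
```

Unlike the script, this prints all names for a spelling on one line; the grouping itself is the same.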