Group tour
import time
import random
import math

# Each person's name and home airport
people=[('Seymour','BOS'),
        ('Franny','DAL'),
        ('Zooey','CAK'),
        ('Walt','MIA'),
        ('Buddy','ORD'),
        ('Les','OMA')]

# Everyone is flying to LaGuardia airport in New York
destination='LGA'
Flight data: schedule.txt
Each line holds: origin, destination, departure time, arrival time, price
Code in optimization.py for loading the data
flights={}
for line in open('schedule.txt'):
    origin,dest,depart,arrive,price=line.strip().split(',')
    # key: (origin,dest)  value: list of (depart,arrive,price)
    flights.setdefault((origin,dest),[])
    flights[(origin,dest)].append((depart,arrive,int(price)))

def getminutes(t):
    # Convert an 'HH:MM' string to minutes since midnight
    x=time.strptime(t,'%H:%M')
    return x[3]*60+x[4]  # x[3] is hours; x[4] is minutes
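To make the shape of the flights dictionary concrete, here is a minimal sketch that parses a few lines from a string instead of from schedule.txt. The sample lines are hypothetical stand-ins in the same five-field format, not a copy of the real data file.

```python
import time

# Hypothetical sample lines in the schedule.txt format:
# origin,destination,departure,arrival,price
sample = """LGA,OMA,6:19,8:13,239
OMA,LGA,6:11,8:31,249
LGA,OMA,8:04,10:59,136"""

flights = {}
for line in sample.splitlines():
    origin, dest, depart, arrive, price = line.strip().split(',')
    # One list of (depart, arrive, price) tuples per (origin, dest) pair
    flights.setdefault((origin, dest), []).append((depart, arrive, int(price)))

def getminutes(t):
    # 'H:MM' or 'HH:MM' -> minutes since midnight
    x = time.strptime(t, '%H:%M')
    return x[3] * 60 + x[4]

print(flights[('LGA', 'OMA')])
print(getminutes('8:13'))
```

A solution such as r then simply stores, for each person, the index of the chosen flight in the outbound list and in the return list.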
Printing a schedule in a readable form
# Two numbers per person: index of the outbound flight, index of the return flight
r=[1,4,3,2,7,3,6,3,2,4,5,3]

def printschedule(r):
    for d in range(len(r)//2):
        name=people[d][0]
        origin=people[d][1]
        out=flights[(origin,destination)][r[2*d]]
        ret=flights[(destination,origin)][r[2*d+1]]
        print('%10s%10s %5s-%5s $%3s %5s-%5s $%3s'%(name,origin,
                                                    out[0],out[1],out[2],
                                                    ret[0],ret[1],ret[2]))
Cost function
The lower the returned value, the better the solution.
This function adds up the total ticket price and the total time the family members spend waiting at the airport. If the latest arrival is later in the day than the earliest return departure, a further $50 is added for an extra day of car rental.
sol=r

def schedulecost(sol):
    totalprice=0
    latestarrival=0
    earliestdep=24*60

    for d in range(len(sol)//2):
        # Get the outbound and return flights
        origin=people[d][1]
        outbound=flights[(origin,destination)][int(sol[2*d])]
        returnf=flights[(destination,origin)][int(sol[2*d+1])]

        # Total price: round trip
        totalprice+=outbound[2]
        totalprice+=returnf[2]

        # Record the latest arrival time and the earliest departure time
        if latestarrival<getminutes(outbound[1]): latestarrival=getminutes(outbound[1])
        if earliestdep>getminutes(returnf[0]): earliestdep=getminutes(returnf[0])

    # Everyone has to wait at the airport until the last person arrives.
    # They also all get to the airport together and wait for their return flights.
    totalwait=0
    for d in range(len(sol)//2):
        origin=people[d][1]
        outbound=flights[(origin,destination)][int(sol[2*d])]
        returnf=flights[(destination,origin)][int(sol[2*d+1])]
        totalwait+=latestarrival-getminutes(outbound[1])
        totalwait+=getminutes(returnf[0])-earliestdep

    # Extra day of car rental
    if latestarrival>earliestdep: totalprice+=50

    return totalprice+totalwait
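The waiting-time part of the cost is easy to check by hand. The sketch below uses three made-up arrival and departure times (assumptions for illustration, not taken from schedule.txt) and applies the same rule as schedulecost: everyone waits for the latest arrival, and everyone goes to the airport in time for the earliest departure.

```python
def minutes(t):
    # 'HH:MM' -> minutes since midnight
    h, m = t.split(':')
    return int(h) * 60 + int(m)

# Hypothetical times for three travellers
arrivals = ['8:00', '9:30', '10:15']
departures = ['15:00', '16:20', '17:05']

latestarrival = max(minutes(a) for a in arrivals)
earliestdep = min(minutes(d) for d in departures)

# Wait after landing until the last person has arrived
wait_in = sum(latestarrival - minutes(a) for a in arrivals)
# Wait at the airport from the earliest departure until one's own flight
wait_out = sum(minutes(d) - earliestdep for d in departures)

totalwait = wait_in + wait_out
extra_car_day = latestarrival > earliestdep
print(wait_in, wait_out, totalwait, extra_car_day)  # 180 205 385 False
```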
Random search
Random search serves as the baseline against which the other algorithms are evaluated.
domain is a list of (min, max) pairs, one for each person's outbound flight and one for each return flight; every flight index lies in the range 0 to 9.
domain=[(0,9)]*(len(optimization.people)*2)

def randomoptimize(domain,costf):
    best=99999999
    bestr=None
    for i in range(1000):
        # Create a random solution, e.g. r=[1,4,3,2,7,3,6,3,2,4,5,3]
        r=[random.randint(domain[j][0],domain[j][1])
           for j in range(len(domain))]
        cost=costf(r)
        # Keep the solution if it is the best so far
        if cost<best:
            best=cost
            bestr=r
    return bestr
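Random search can be tried without the flight data by plugging in any cost function. The following is a small sketch under that assumption: toy_cost and target are hypothetical names introduced only for this example, and the tries parameter is added for convenience.

```python
import random

def randomoptimize(domain, costf, tries=1000):
    best = float('inf')
    bestr = None
    for _ in range(tries):
        # Draw one value per position, inside that position's (min, max) range
        r = [random.randint(lo, hi) for (lo, hi) in domain]
        cost = costf(r)
        if cost < best:
            best, bestr = cost, r
    return bestr

# Hypothetical toy cost: distance from an arbitrary target vector
target = [3, 1, 4, 1]

def toy_cost(vec):
    return sum(abs(v - t) for v, t in zip(vec, target))

domain = [(0, 9)] * len(target)
random.seed(0)
best = randomoptimize(domain, toy_cost)
print(best, toy_cost(best))
```

Because every guess is independent, the result only improves by luck, which is what makes it a useful baseline.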
Hill climbing
Trying solutions at random is very inefficient, because it makes no use of the good solutions that have already been found.
Start with a random schedule and look at all of its neighbors, that is, all the schedules in which one person takes a slightly earlier or slightly later flight. Compute the cost of each neighboring schedule; the one with the lowest cost becomes the new solution. Repeat until no neighboring schedule improves the cost.
As in random search, domain gives the (min, max) range of the flight index for each person's outbound and return flight.
def hillclimb(domain,costf):
    # Create a random solution sol in the same format as r
    sol=[random.randint(domain[i][0],domain[i][1])
         for i in range(len(domain))]

    # Main loop
    while 1:
        # Create a list of neighboring solutions
        neighbors=[]
        for j in range(len(domain)):
            # One step away from the current value in each direction
            if sol[j]>domain[j][0]:
                # Copy of sol with element j decreased by one
                neighbors.append(sol[0:j]+[sol[j]-1]+sol[j+1:])
            if sol[j]<domain[j][1]:
                # Copy of sol with element j increased by one
                neighbors.append(sol[0:j]+[sol[j]+1]+sol[j+1:])

        # Find the best solution among the neighbors
        current=costf(sol)
        best=current
        for j in range(len(neighbors)):
            cost=costf(neighbors[j])
            if cost<best:
                best=cost
                sol=neighbors[j]

        # Stop when no neighbor is better
        if best==current:
            break

    return sol
The returned sol is the best schedule found, in the same format as r.
sol=optimization.hillclimb(domain,optimization.schedulecost)
optimization.schedulecost(sol)
# Print flight schedule
# sol=[1,4,3,2,7,3,6,3,2,4,5,3]
optimization.printschedule(sol)
The final result may be a local optimum rather than the global optimum.
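The local-optimum problem is easy to reproduce on a one-variable toy problem. In this sketch the costs table is made up for illustration, and hillclimb_from is a hypothetical variant of hillclimb that takes an explicit starting point so that the outcome is deterministic.

```python
# Hypothetical cost for x = 0..9: local minimum at x=1, global minimum at x=7
costs = [5, 3, 4, 6, 7, 5, 2, 0, 1, 4]

def costf(sol):
    return costs[sol[0]]

def hillclimb_from(sol, domain, costf):
    while True:
        # Build all neighbors one step away in each direction
        neighbors = []
        for j in range(len(domain)):
            if sol[j] > domain[j][0]:
                neighbors.append(sol[:j] + [sol[j] - 1] + sol[j + 1:])
            if sol[j] < domain[j][1]:
                neighbors.append(sol[:j] + [sol[j] + 1] + sol[j + 1:])
        # Move to the cheapest neighbor if it beats the current solution
        current = costf(sol)
        best = current
        for n in neighbors:
            c = costf(n)
            if c < best:
                best, sol = c, n
        if best == current:
            break
    return sol

domain = [(0, 9)]
print(hillclimb_from([0], domain, costf))  # [1]  stuck in the local minimum
print(hillclimb_from([5], domain, costf))  # [7]  reaches the global minimum
```

Which minimum is reached depends entirely on the starting point, which is why running hill climbing from several random starts is a common workaround.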
Simulated annealing
Simulated annealing can help avoid getting stuck in a local optimum.
In some cases it is necessary to move to a worse solution before a better one can be reached. Simulated annealing works not only because it always accepts a better solution, but also because it is willing to accept a worse solution near the beginning of the annealing process. As annealing proceeds, the algorithm becomes less and less likely to accept a worse solution, until at the end it accepts only better ones.
A worse solution is accepted with probability e^(-(eb-ea)/T), so the algorithm is more willing to accept a slightly worse solution than a much worse one.
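The acceptance rule of simulated annealing, p = e^(-(eb-ea)/T), can be explored on its own. The sketch below evaluates it for a slightly worse and a much worse candidate at several temperatures; the cost values 1000, 1010 and 1500 are arbitrary numbers chosen for illustration.

```python
import math

def accept_probability(ea, eb, T):
    # Probability of accepting a worse solution (eb > ea) at temperature T
    return math.exp(-(eb - ea) / T)

for T in (10000.0, 100.0, 1.0):
    slightly_worse = accept_probability(1000, 1010, T)
    much_worse = accept_probability(1000, 1500, T)
    print('T=%8.1f  slightly worse: %.4f  much worse: %.4f'
          % (T, slightly_worse, much_worse))
```

At a high temperature both probabilities are close to 1; as T falls they drop toward 0, and the much worse candidate is rejected long before the slightly worse one.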
def annealingoptimize(domain,costf,T=10000.0,cool=0.95,step=1):
    # Random initial values
    vec=[float(random.randint(domain[i][0],domain[i][1]))
         for i in range(len(domain))]

    while T>0.1:
        # Choose one of the indices
        i=random.randint(0,len(domain)-1)

        # Choose a direction in which to change it
        dir=random.randint(-step,step)

        # Create a new list with one of the values changed, kept inside the domain
        vecb=vec[:]
        vecb[i]+=dir
        if vecb[i]<domain[i][0]: vecb[i]=domain[i][0]
        elif vecb[i]>domain[i][1]: vecb[i]=domain[i][1]

        # Cost of the current solution and of the new solution
        ea=costf(vec)
        eb=costf(vecb)

        # Accept if better, or with probability e^(-(eb-ea)/T) if worse
        if (eb<ea or random.random()<pow(math.e,-(eb-ea)/T)):
            vec=vecb

        # Decrease the temperature
        T=T*cool
    return vec
sol=optimization.annealingoptimize(domain,optimization.schedulecost)
optimization.schedulecost(sol)
# Print flight schedule
# sol=[1,4,3,2,7,3,6,3,2,4,5,3]
optimization.printschedule(sol)
Genetic algorithm
First, we randomly generate a set of solutions, which we call the population.
At each step of the optimization, the algorithm evaluates the cost function for the whole population to get a ranked list of solutions.
After the solutions are ranked, a new population is created.
The top solutions in the current population are added to the new population as they are. This is called elitism.
The rest of the new population consists of new solutions formed by modifying the best solutions.
There are two ways to modify a solution:
Mutation: make a small change to one number in an existing solution.
Crossover: take two of the best solutions and combine them in some way.
The process is repeated with each new population until the maximum number of generations is reached.
def geneticoptimize(domain,costf,popsize=50,step=1,
                    mutprob=0.2,elite=0.2,maxiter=100):
    # Mutation operation: change one number
    def mutate(vec):
        i=random.randint(0,len(domain)-1)
        if random.random()<0.5 and vec[i]>domain[i][0]:
            return vec[0:i]+[vec[i]-step]+vec[i+1:]
        elif vec[i]<domain[i][1]:
            return vec[0:i]+[vec[i]+step]+vec[i+1:]
        # vec[i] is already at its upper bound: return an unchanged copy
        return vec[:]

    # Crossover operation: start of r1 joined to the end of r2
    def crossover(r1,r2):
        i=random.randint(1,len(domain)-2)
        return r1[0:i]+r2[i:]

    # Build the initial population
    pop=[]
    for i in range(popsize):
        vec=[random.randint(domain[j][0],domain[j][1])
             for j in range(len(domain))]
        pop.append(vec)

    # How many winners survive from each generation
    topelite=int(elite*popsize)

    # Main loop
    for i in range(maxiter):
        scores=[(costf(v),v) for v in pop]
        scores.sort()
        ranked=[v for (s,v) in scores]

        # Start with the pure winners
        pop=ranked[0:topelite]

        # Add mutated and bred forms of the winners
        while len(pop)<popsize:
            if random.random()<mutprob:
                # Mutation
                c=random.randint(0,topelite)
                pop.append(mutate(ranked[c]))
            else:
                # Crossover
                c1=random.randint(0,topelite)
                c2=random.randint(0,topelite)
                pop.append(crossover(ranked[c1],ranked[c2]))

        # Print the current best cost
        print(scores[0][0])

    return scores[0][1]  # best schedule found
popsize: population size
mutprob: the probability that a new member of the population is produced by mutation rather than by crossover
elite: the fraction of the population that is considered good and allowed to pass on to the next generation
maxiter: the number of generations to run
sol=optimization.geneticoptimize(domain,optimization.schedulecost)
optimization.schedulecost(sol)
# Print flight schedule
# sol=[1,4,3,2,7,3,6,3,2,4,5,3]
optimization.printschedule(sol)
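The two genetic operators, mutation and crossover, can also be tried in isolation. In this sketch mutate_at and crossover_at are hypothetical deterministic variants: the position is passed in instead of being drawn at random, and the mutated value is clamped to the domain, which is a simplification rather than the bound check used in geneticoptimize.

```python
def mutate_at(vec, i, delta, domain):
    # Change one number by delta, keeping it inside its (min, max) range
    lo, hi = domain[i]
    new_value = max(lo, min(hi, vec[i] + delta))
    return vec[:i] + [new_value] + vec[i + 1:]

def crossover_at(r1, r2, i):
    # First i numbers from r1, the rest from r2
    return r1[:i] + r2[i:]

domain = [(0, 9)] * 6
a = [1, 4, 3, 2, 7, 3]
b = [6, 3, 2, 4, 5, 3]

print(mutate_at(a, 2, 1, domain))  # [1, 4, 4, 2, 7, 3]
print(crossover_at(a, b, 3))       # [1, 4, 3, 4, 5, 3]
```

Both operators return new lists and leave their inputs untouched, so elite members of the population are not modified by accident.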
Real flight search
Kayak API
Get access to Kayak's XML interface
minidom package
minidom implements a standard way of treating an XML document as a tree of objects. The package takes an open XML file or a string of XML as input and returns an object from which information can easily be extracted.
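A minimal sketch of minidom in use, parsing XML from a string. The XML text here is a made-up example and not an actual Kayak response.

```python
import xml.dom.minidom

# Hypothetical XML document, for illustration only
xml_text = '<ident><sid>abc123</sid><error></error></ident>'

doc = xml.dom.minidom.parseString(xml_text)

# getElementsByTagName returns a list of matching nodes;
# the text inside a tag is the data of its first child node
sid = doc.getElementsByTagName('sid')[0].firstChild.data
print(sid)  # abc123
```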
Flight search
Create a new file called kayak.py:
Write the code that gets a new Kayak session using the developer key, then parse the returned XML to get the contents of the sid tag.
import time
import urllib2
import xml.dom.minidom

kayakkey='YOURKEYHERE'

def getkayaksession():
    # Construct the URL that starts a session
    url='http://www.kayak.com/k/ident/apisession?token=%s&version=1'%kayakkey

    # Parse the resulting XML
    doc=xml.dom.minidom.parseString(urllib2.urlopen(url).read())

    # Find the contents of the sid tag
    sid=doc.getElementsByTagName('sid')[0].firstChild.data
    return sid
Optimizing student dormitory assignments
(omitted)
Network visualization
(omitted)