Homework: The travelling delivery drone problem¶
Topic: discrete optimization, simulated annealing
Template: A3_salesman.py
Theory¶
The travelling salesman problem is a famous, difficult optimization problem. The problem is this: You have a set of points on a map. We call these cities, but in a more general case they could be any kinds of points that have some measurable distance. Your task is to visit each point exactly once and then return to the starting point taking the shortest route possible.
The problem is simple, but since the number of permutations (different ways to order the cities) grows as \(N!\), where \(N\) is the number of cities, it is impossible to test all the possible routes when the number of cities goes past about 10. Since the problem is discrete, you also cannot calculate derivatives to help in the optimization task. Instead, your task is to find as good a solution as you can using the simulated annealing method.
Simulated annealing uses the same algorithm as Metropolis Monte Carlo, which we used previously to sample states from the canonical ensemble. In the canonical ensemble, the total energy of the system is not constant because the system can exhange energy with its environment, and the probability of finding the system in a state with total energy \(E\) is given by the Boltzmann factor
On one hand, if the temperature is high, the system can have high energies and access many different states. On the other hand, if the temperature is low, the probability to find the system in a high energy state is very low. That means the system must end up in an energy minimum as temperature approaches zero.
Now, we are obviously not interested in any real energy or temperature, but we can formulate the optimization problem using similar principles. Let us denote the total length of the salesman’s route as \(L\). Let us also introduce a “temperature-like” quantity \(T\). If we now use the Metropolis algorithm to build a simulation that treats \(L\) like energy and \(T\) like temperature, it will sample the routes according to probabilities
When \(T\) is large, the algorithm samples both long and short routes. But when \(T\) approaches zero, the algorithm should only find short routes. There is no guarantee it will find the shortest route, but it should converge towards some local minimum. Running the algorithm several times will likely locate several short routes and possibly also the shortest one.
The algorithm works by starting from some initial route (e.g., random), assigning a temperature and repeating the following:
Produce a new route by making a small random change to the current route.
Calculate the change in route length, \(\Delta L\).
If the new route is shorter than the old one, accept the new route.
If the new route is longer than the old one, accept the new route with probability
\[P = e^{- \frac{\Delta L}{T}}.\]Otherwise keep the old route.
Lower the temperature by a small amount.
The efficiency of the algorithm depends strongly on how you produce new routes. Firstly, the changes you make must be small enough. Large changes will produce essentially random states and the algorithm will not converge. But it also matters how, you make these small changes. A bad choice can create lots of local minima where the algorithm gets easily stuck. I especially suggest you try to figure out a simple operation that can get rid of crossings, because the optimal route clearly cannot have any crossings, and getting rid of them is important.
Also note that besides the 10 and 20 city cases, your program will be tested using a 40 city geography. It is not expected to solve the 40 city case exactly, but it should produce a somewhat reasonable solution.
Task¶
To complete this assignment, you need to complete the following tasks, submit your code as a plain .py file, and answer the questions. In the computing tasks, you must implement the incomplete functions so that they operate exactly as described in the documentation.
- Computing task:
Implement a function to calculate route length.
Implement a function to calculate the Boltzmann factor.
Come up with 1-3 good operations for making small changes to the route and implement them.
Implement a function to run simulated annealing.
Find the shortest routes for the provided 10 and 20 city geographies.
Adjust the simulation parameters so that your program always produces at least a reasonable result.
- Questions to answer:
What kinds of operations did you end up using for generating the new routes? Why did you choose them?
How do you vary temperature in your simulation? Why did you end up with these parameters?
Are the best solutions you found reasonable? Include pictures of optimized routes and explain why do you believe them to be good (or bad) solutions.
How reliably do you think your program solves the problem with 20 cities or more? Is there a lot of variance in the end result? Include data in your explanation! What steps did you take to make your program more reliable?
A3_salesman.py¶
- A3_salesman.Boltzmann_P(new, old, T)[source]¶
The Boltzmann probaility for accepting a change.
If the new value is better (smaller) than the old one, the change is always accepted. In this case, the function may return any number greater or equal to one.
If the new value is worse (larger) than the old one, the function returns the Boltzmann probability, which is calculated as follows: Denote the old and new values as \(x_\text{old}\) and \(x_\text{new}\), respectively. The change in the value is then \(\Delta x = x_\text{new} - x_\text{old}\), and the Boltzmann probability for accepting such a change is
\[P_{\text{old} \to \text{new}} = e^{-\frac{\Delta x}{T} }.\]The result is physically analogous to the canonical ensemble in statistical physics, where the probability that the system at temperature \(T\) is in a state \(i\) with total energy \(E_i\) is given by
\[P_i = \frac{1}{Z} e^{-\frac{E_i}{k_B T} },\]where \(Z\) is the partition function. Especially the ratio of the probabilites between two states is
\[\frac{P_j}{P_i} = \frac{ \frac{1}{Z}e^{-\frac{E_j}{k_B T} } }{ \frac{1}{Z}e^{-\frac{E_i}{k_B T} } } = e^{-\frac{E_j - E_i}{k_B T} } = e^{-\frac{\Delta E}{k_B T} }.\]The differences between the canonical distribution and the values calculated by this function is that there is no real temperature in this function. The argument T acts like temperature, but we are not simulating a physical system. Therefore we set \(k_B = 1\) since the temperature scale is arbitrary anyway.
Note
This function is incomplete!
- Parameters:
new (float) – new value
old (float) – old value
T (float) – virtual temperature
- Returns:
the Boltzmann probability
- Return type:
float
- A3_salesman.calculate_distances(cities)[source]¶
Calculate the distances between all pairs of cities.
Since the positions of all cities are given as xy coordinates on a plane, the distances are calculated simply by Pythagorean theorem.
The distances are stored in an N x N array, where the element (i,j) is the distance between cities i and j. The function returns this array.
- Parameters:
cities (array) – city coordinates as N x 2 array
- Returns:
city-city distances in a matrix
- Return type:
array
- A3_salesman.change_route_randomly(route)[source]¶
Make random changes to the given route.
A new route is returned. The original route is not modified.
Note
This function is incomplete!
- Parameters:
route (array) – integer sequence telling the order in which the cities are visited
- Returns:
the new route
- Return type:
array
- A3_salesman.create_random_route(cities)[source]¶
Create a random route.
The route is represented as an array of integers, where each integer is the index of a city from the list of cities and the order of these indices represents the order in which the cities are visited.
Since each city is visited exactly once, the route must be a vector containing the numbers 0, 1, …, N-1 where N is the number of cities. This function generates one such vector where the numbers are randomly ordered.
- Parameters:
cities (array) – city coordinates as N x 2 array (only the length N matters)
- A3_salesman.get_trip_length(route, distances)[source]¶
Calculates the length of the route for a closed round-trip that returns to the starting city in the end.
Here route is defined as an integer vector whose elements are indices of all cities in visit order, and distances is a matrix storing city-city distances for all city pairs.
More precisely, if route contains the elements
\[[ i_0, i_1, i_2, ... i_{N-1} ],\]and we denote elements of the distances matrix as \(d_{i,j}\), the distance between, say, the first and second city is \(d_{i_0, i_1}\) and the total trip length is calculated as
\[\sum_{n = 0}^{N-2} d_{i_n, i_{n+1}} + d_{i_{N-1}, i_0}.\]The last term in the sum is the length of the final stretch of the trip where the traveler returns to the starting location.
Note
This function is incomplete!
- Parameters:
route (array) – integer sequence telling the order in which the cities are visited
distances (array) – matrix storing the distances between all cities, as calculated by
calculate_distances()
- Returns:
the total length of the trip
- Return type:
float
- A3_salesman.main(cityfile, routefile=None)[source]¶
The main program.
Reads the coordinates of all the cities from the given file and also tries to read an initial guess for the route from a file.
The city file must exist, because it defines the problem to be optimized.
The route file is optional and if no valid file is found, a random route is created as a starting point.
- Parameters:
cityfile (str) – name of the file containing city coordinates
routefile (str) – name of the optional file containing the initial route as a sequence of integers (city indices)
- A3_salesman.optimize_route(initial_route, distances)[source]¶
Searches for as short a route as possible.
The optimization is carried out using several simulated annealing cycles by repeatedly calling the function
run_annealing_cycle()
.It is beneficial to run several annealing cycles since there may be several local optima and one annealing cycle can only converge to one of them. So in order to search for the global optimum, one should anneal many times to find different optima and finally choose the best one.
- Parameters:
initial_route (array) – starting route
distances (array) – matrix storing the distances between all cities, as calculated by
calculate_distances()
Note
Modify this function to adjust performance!
- Returns:
the best route found during the simulation
- Return type:
array
- A3_salesman.print_progress(step, total)[source]¶
Prints a progress bar.
- Parameters:
step (int) – progress counter
total (int) – counter at completion
- A3_salesman.read_cities_from_file(filename)[source]¶
Read the coordinates of cities from a file.
Each line in the file should contain the x and y coordinates of one city.
- Parameters:
filename (str) – name of the file to read
- Returns:
N x 2 array containing xy coordinates for all cities.
- Return type:
array
- A3_salesman.read_route_from_file(filename)[source]¶
Read a route from file.
For a route of N cities, the file should contain the numbers 0, …, N-1 in some order, each number on its own line.
The function returns an integer array containing the read numbers.
- Parameters:
filename (str) – name of the file to read
- Returns:
sequence of integers representing the route
- Return type:
array
- A3_salesman.run_annealing_cycle(route, distances, initial_temperature=50.0, trials=10000, final_temperature=1.0)[source]¶
Perform simulated annealing optimization for the route length.
Annealing is a heat treatment where heating up a physical material, say, a metal, increases the mobility of the particles that make up the material. Especially defects become more mobile and may thus diffuse out of the material or combine to eliminate each other. Thus, annealing can remove defects from the material. Once the material is cooled, it will contain very few defects.
Theoretically, annealing allows the system to sample the phase space so that it can find a state that minimizes the total energy. Simulated annealing uses the same idea for an abstract optimization problem. If we treat the length of our route as if it was the energy of a physical system, we can define a virtual temperature and allow the system to evolve according to the probabilities given by statistical physics. When our “temperature” is high, we will find new solutions. Then, as “temperature” gradually decreases, the solution should converge to some local optimum.
The annealing algorithm works by repeating the following steps:
A copy of the current route is made to prevent accidental changes.
This copy is modified using
change_route_randomly()
.The length of the new route is measured using
get_trip_length()
.The probability for accepting the new route is calculated using
Boltzmann_P()
.The new route is randomly accepted or discarded according to this probability.
If the new route was accepted, it becomes the current route. Otherwise it is discarded and the old route remains current.
If the current route is the shortest one yet found, it is saved as the best route.
Note
This function is incomplete!
- Parameters:
route (array) – starting route
distances (array) – matrix storing the distances between all cities, as calculated by
calculate_distances()
initial_temperature (float) – virtual temperature in the beginning
final_temperature (float) – virtual temperature in the end
trials (int) – the number of times a new route is generated and tried
- Returns:
the best route found during the simulation
- Return type:
array
- A3_salesman.show_route(cities, route)[source]¶
Plot the route.
Cities are plotted as a scatter plot in the given coordinates and the route as city connecting lines. The route is drawn as a closed loop that returns to the starting city in the end.
- Parameters:
cities (array) – city coordinates as N x 2 array
route (array) – integer sequence telling the order in which the cities are visited
- A3_salesman.write_route_to_file(route, filename)[source]¶
Write the route to a file.
The route is written as a sequence of integers. These integers are the indices of all the cities and their order represents the order in which the cities are visited.
The written file will have each integer on a separate line.
- Parameters:
route (array) – integer sequence telling the order in which the cities are visited
filename (str) – name of the file to write