It’s all downhill¶
Topic: nonlinear optimization, gradient and conjugate gradient search
- Task A: Implement the deepest descent algorithm. Check where you end up with different initial guesses. Also set alternative = True to change the target function you are optimizing.
- Task B: Implement the conjugate gradients algorithm. Again, try different starting points and both target functions. You may also try the more accurate mode or the scipy version.
- Task C: Test the simulated annealing algorithm.
Template: gradients.py
gradients.py¶
- gradients.animate(history, minx, miny, maxx, maxy)[source]¶
Animates the simulation.
- Parameters:
history (list) – steps of the optimization process
minx (float) – x axis minimum
maxx (float) – x axis maximum
miny (float) – y axis minimum
maxy (float) – y axis maximum
- gradients.conjugate_gradients_algorithm(initial_guess, max_gradient=0.1, max_steps=1000)[source]¶
Searches for the minimum of the target() function \(f\). The conjugate gradients algorithm starts from the initial guess and repeats the following:
1. Calculate the gradient() \(\vec{g}_i = \nabla f\) at the current position.
2. Check the length of the gradient vector \(g_i = | \vec{g}_i |\) and stop if it is below a given threshold. If not, continue.
3. Calculate a mixing parameter. There are many ways to do this; this function uses the Polak-Ribière formula
\[\beta_i = \frac{ \vec{g}_i \cdot (\vec{g}_i - \vec{g}_{i-1}) } { \vec{g}_{i-1} \cdot \vec{g}_{i-1}}\]
4. Do a linear_search() in the direction \(\vec{d}_i = - \vec{g}_i + \beta_i \vec{d}_{i-1}\) and move to the lowest point found. (For the first step, \(\vec{d}_i = - \vec{g}_i\).)
Once finished, the algorithm returns the coordinates where the minimum was found, the minimum target value and the path it took as a list of coordinates. (A sketch of one possible implementation follows this entry.)
Note
This function is incomplete!
- Parameters:
initial_guess (array) – starting point (x,y)
max_gradient (float) – stop if \(g\) is less than this
max_steps (int) – stop if the iteration takes more than this many steps
- Returns:
final coordinates, final target value, list of coordinates
- Return type:
array, float, array
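A minimal sketch of how the documented loop could be assembled, assuming gradient() and linear_search() behave as described above. This follows the recipe in the docstring, but it is not the official solution:

    import numpy as np
    from gradients import gradient, linear_search, target

    def cg_sketch(initial_guess, max_gradient=0.1, max_steps=1000):
        x = np.array(initial_guess, dtype=float)
        history = [x.copy()]
        g_old, d = None, None
        for _ in range(max_steps):
            g = gradient(x)
            if np.linalg.norm(g) < max_gradient:
                break  # gradient is short enough: near a stationary point
            if d is None:
                d = -g  # first step: plain downhill direction
            else:
                beta = g @ (g - g_old) / (g_old @ g_old)  # Polak-Ribiere mixing
                d = -g + beta * d
            x, value = linear_search(x, d)  # move to the minimum along d
            x = np.asarray(x)
            history.append(x.copy())
            g_old = g
        return x, target(x), history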
- gradients.deepest_descent_algorithm(initial_guess, max_gradient=0.1, max_steps=1000)[source]¶
Searches for the minimum of the target() function \(f\). The deepest descent algorithm starts from the initial guess and repeats the following:
1. Calculate the gradient() \(\vec{g} = \nabla f\) at the current position.
2. Check the length of the gradient vector \(g = | \vec{g} |\) and stop if it is below a given threshold. If not, continue.
3. Do a linear_search() in the direction \(\vec{d} = - \vec{g}\) and move to the lowest point found.
Once finished, the algorithm returns the coordinates where the minimum was found, the minimum target value and the path it took as a list of coordinates. (A sketch of one possible implementation follows this entry.)
Note
This function is incomplete!
- Parameters:
initial_guess (array) – starting point (x,y)
max_gradient (float) – stop if \(g\) is less than this
max_steps (int) – stop if the iteration takes more than this many steps
- Returns:
final coordinates, final target value, list of coordinates
- Return type:
array, float, array
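A minimal sketch under the same assumptions as above (gradient() and linear_search() as documented); again, this is an illustration, not the official solution:

    import numpy as np
    from gradients import gradient, linear_search, target

    def dd_sketch(initial_guess, max_gradient=0.1, max_steps=1000):
        x = np.array(initial_guess, dtype=float)
        history = [x.copy()]
        for _ in range(max_steps):
            g = gradient(x)
            if np.linalg.norm(g) < max_gradient:
                break  # converged: gradient below the threshold
            x, value = linear_search(x, -g)  # straight downhill to the line minimum
            x = np.asarray(x)
            history.append(x.copy())
        return x, target(x), history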
- gradients.draw(frame, history, x, y)[source]¶
Draws a snapshot of the optimization process.
Used for animation.
- Parameters:
frame (int) – index of the frame to be drawn
history (list) – steps of the optimization process
x (array) – x coordinates
y (array) – y coordinates
- gradients.gradient(coordinates)[source]¶
Gradient of the target() function. (A finite-difference cross-check is sketched after this entry.)
- Parameters:
coordinates (array) – (x,y) coordinates
- Returns:
gradient \(\vec{g} = \nabla f\) as a vector [gx, gy].
- Return type:
array
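The module presumably evaluates this analytically. As a quick cross-check, a central finite-difference approximation can be compared against it; the helper name and step size here are illustrative:

    import numpy as np
    from gradients import target

    def numerical_gradient(coordinates, h=1e-6):
        # central differences: g_i = (f(x + h e_i) - f(x - h e_i)) / (2h)
        coordinates = np.array(coordinates, dtype=float)
        g = np.zeros_like(coordinates)
        for i in range(coordinates.size):
            e = np.zeros_like(coordinates)
            e[i] = h
            g[i] = (target(coordinates + e) - target(coordinates - e)) / (2 * h)
        return g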
- gradients.linear_search(coordinates, direction, stepsize=1, accuracy=5)[source]¶
Searches for the minimum of the target() function on a line. Starting from the given coordinates, the function takes equally spaced steps in the given direction. The function continues as long as the target function decreases. (As a safety measure, there is a maximum number of steps allowed, after which the process is aborted.)
When the target starts to increase again, the step size is reduced and a new search is performed near the minimum of the previous search. This allows the algorithm to find the minimum with better accuracy. (A sketch of the documented behaviour follows this entry.)
- Parameters:
coordinates (array) – starting position (x,y)
direction (array) – vector pointing the search direction
stepsize (float) – scaling for the initial step size
accuracy (int) – how many times the step length is shortened
- Returns:
(x,y) coordinates at the minimum, minimum target value
- Return type:
array, float
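A sketch of the documented behaviour. The unit-direction normalization, the shrink factor of 10 and the step cap are assumptions, since the docstring does not fix them:

    import numpy as np
    from gradients import target

    def linear_search_sketch(coordinates, direction, stepsize=1, accuracy=5):
        x = np.array(coordinates, dtype=float)
        d = np.array(direction, dtype=float)
        d = d / np.linalg.norm(d)  # assumption: steps taken along the unit direction
        step = stepsize
        for level in range(accuracy):
            value = target(x)
            for _ in range(1000):  # safety cap on the number of steps
                trial = x + step * d
                trial_value = target(trial)
                if trial_value >= value:
                    break  # target started to increase: passed the line minimum
                x, value = trial, trial_value
            if level < accuracy - 1:
                x = x - step * d  # back up so the finer search brackets the minimum
                step = step / 10  # assumption: the actual shrink factor is unspecified
        return x, target(x)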
- gradients.main(guess, algorithm='dd')[source]¶
Finds a minimum of the target() function. The function also visualizes the optimization process. (A usage sketch follows this entry.)
- Parameters:
guess (array) – Initial coordinates (x,y) as array.
algorithm (str) – One of: “dd”: deepest descent gradient search, “cg”: conjugate gradients search, “acg”: more accurate conjugate gradients search, “scipy”: built-in scipy conjugate gradients search, “sa”: simulated annealing
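For example, once the template is complete, a run could look like this (the starting point is arbitrary):

    import numpy as np
    from gradients import main

    # deepest descent from (0, 0); swap "dd" for "cg", "acg", "scipy" or "sa"
    main(np.array([0.0, 0.0]), algorithm="dd")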
- gradients.random_displacement(coords, length)[source]¶
Randomly displaces the given coordinate vector.
Each coordinate is changed by a uniformly distributed random variable \(\Delta x \in [-L, L]\).
The function returns the new coordinates as a new vector. The original vector is not affected.
- Parameters:
coords (array) – coordinates as a vector
length (float) – maximum change \(L\)
- Returns:
new coordinates
- Return type:
array
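A minimal sketch of the documented behaviour, using numpy's random number generator:

    import numpy as np

    rng = np.random.default_rng()

    def random_displacement_sketch(coords, length):
        coords = np.asarray(coords, dtype=float)
        # each component gets an independent shift drawn from U(-length, length);
        # the addition creates a new array, so the original is left untouched
        return coords + rng.uniform(-length, length, size=coords.shape)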
- gradients.show_convergence(steps, minx=-5, maxx=5, miny=-5, maxy=5)[source]¶
Plots the target() function and the optimization path.
- Parameters:
steps (array) – sequence of coordinates representing optimization progress
minx (float) – x axis minimum
maxx (float) – x axis maximum
miny (float) – y axis minimum
maxy (float) – y axis maximum
- gradients.simulated_annealing(initial_guess, initial_temperature, final_temperature, n_steps, max_step_length, recording_interval=10)[source]¶
Searches for the minimum of the target() function \(f\). The simulated annealing algorithm starts from the initial guess and searches for better solutions using a random walk. The process is controlled by a temperature-like quantity \(T\). The algorithm repeats the following:
1. Make a small random change in the current coordinates.
2. Calculate the change in the target function due to this change, \(\Delta f\).
3. If \(\Delta f \le 0\) (the new coordinates are better than the old ones), accept the change.
4. If the new solution is the best one yet, record it.
5. If \(\Delta f > 0\) (the new coordinates are worse than the old ones), accept the change with probability \(P = e^{- \Delta f / T}\).
6. Lower the temperature by a small amount.
The algorithm has no specific stopping criterion. Instead, it continues searching for a predetermined period. Once finished, the algorithm returns the coordinates where the minimum was found, the minimum target value and the path it took as a list of coordinates. (A sketch of one possible implementation follows this entry.)
- Parameters:
initial_guess (array) – starting point (x,y)
initial_temperature (float) – value of \(T\) in the beginning
final_temperature (float) – value of \(T\) at the end
n_steps (int) – number of steps
max_step_length (float) – the maximum allowed change in each coordinate in one step
recording_interval (int) – records the current coordinates after this many steps
- Returns:
final coordinates, final target value, list of coordinates
- Return type:
array, float, array
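A sketch of the documented loop, assuming target() and random_displacement() behave as described above. The linear cooling schedule is an assumption, since the docstring only says the temperature is lowered "by a small amount":

    import numpy as np
    from gradients import target, random_displacement

    def sa_sketch(initial_guess, initial_temperature, final_temperature,
                  n_steps, max_step_length, recording_interval=10):
        rng = np.random.default_rng()
        x = np.array(initial_guess, dtype=float)
        f = target(x)
        best_x, best_f = x.copy(), f
        history = [x.copy()]
        # assumption: T drops linearly from the initial to the final value
        for i, T in enumerate(np.linspace(initial_temperature, final_temperature, n_steps)):
            trial = random_displacement(x, max_step_length)
            delta = target(trial) - f
            # Metropolis rule: always accept downhill, uphill with P = exp(-delta/T)
            if delta <= 0 or (T > 0 and rng.random() < np.exp(-delta / T)):
                x, f = trial, f + delta
                if f < best_f:
                    best_x, best_f = x.copy(), f
            if (i + 1) % recording_interval == 0:
                history.append(x.copy())
        return best_x, best_f, history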
- gradients.target(coordinates)[source]¶
The target function.
For this exercise, the function is chosen to be
\[f(x,y) = -0.9 e^{ -2.5 (x-2.2)^2 - 0.3(y-0.8)^2 } - 1.1 e^{ -2.1(x-1.0)^2 - 0.4(y-2.1)^2 } - 0.1 x\]
or alternatively (if the global variable alternative = True)
\[f(x,y) = (\cos 3y - 2 e^{x/2} + 2)^2 + (x-y^3)^2\]
Our task is to find the minimum of this function. (The formulas are written out in code after this entry.)
- Parameters:
coordinates (array) – (x,y) coordinates
- Returns:
function value \(f(x,y)\)
- Return type:
float
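Written out in code, the two formulas could look like this (alternative mirrors the module's global flag):

    import numpy as np

    alternative = False  # mirrors the module's global flag

    def target_sketch(coordinates):
        x, y = coordinates
        if alternative:
            return (np.cos(3 * y) - 2 * np.exp(x / 2) + 2)**2 + (x - y**3)**2
        return (-0.9 * np.exp(-2.5 * (x - 2.2)**2 - 0.3 * (y - 0.8)**2)
                - 1.1 * np.exp(-2.1 * (x - 1.0)**2 - 0.4 * (y - 2.1)**2)
                - 0.1 * x)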