10 Roots and Optima Introduction
This week our goals are to be able to:
- Identify problem statements that are asking to identify roots or optima of equations.
- Outline the algorithm for solving root and optima problems.
- Correctly set up equations and solve for roots and optima.
- Visualize functions to assist in identifying search ranges for roots and optima.
Read and/or watch: - Read https://en.wikipedia.org/wiki/Newton%27s_method (through Description section) and https://en.wikipedia.org/wiki/Newton%27s_method_in_optimization (through Geometric interpretation section) - Watch https://www.youtube.com/watch?v=lBfqvBJaFmc first 10 min.
Then review the following examples in R.
10.1 Roots
10.1.1 Example 1
Suppose you have a model that represents the velocity of a bungee jumper (\(v\)) that depends on the mass of the person (\(m\)), the drag force (\(c_d\)), the time in free fall (\(t\)), and the gravitational constant (\(g\)). Medical studies have shown that if the velocity increases above 36 m/s after 4 seconds of free fall, the health risk to the jumper is elevated. What is the critical mass of a person given the following:
\[v(t) = \sqrt{\frac{gm}{c_d}} \tanh \left( t\sqrt{\frac{gc_d}{m}} \right)\]
(This equation is written with in-line LaTeX code in Rmarkdown).
\(c_d = 0.25\) kg/m; \(g = 9.81\) m/s2
How would you solve?
- graph
- root(s)
10.1.1.1 Approach
- Rearrange equation
- Plot equation
- Run uniroot
10.1.1.1.1 Rearrange equation
All we need to do here is subtract over \(v\) to the right side.
\[0 = \sqrt{\frac{gm}{c_d}} \tanh \left( t\sqrt{\frac{gc_d}{m}} \right) - v(t)\]
Now we can set up this equation along with all of the variable and constant values in our R session.
10.1.1.1.2 Plot Equation
c_d <- 0.25 # drag coefficient
g <- 9.81 # gravitational constant
v <- 36 # velocity (m/s)
t <- 4 # time (s)
mp <- seq(0,200) # mass predictions,
# a range of values where the root might be.
# You might need to play with this range.
f <- function(mp = 1,c_d = 1,g = 9.8,v = 36,t = 4){ # all variables in the equation are arguments
sqrt(g*mp/c_d)*tanh(sqrt(g*c_d/mp)*t)-v # what the function does
# by default our output is the result of the above line
}
f()
## [1] -32.8695
## [1] 0.08490453
10.1.1.1.3 Solve equation using uniroot
Now we can solve for the root of this equation, i.e. where the plot crosses the x-axis. We will use the uniroot
function.
“The function uniroot searches the interval from lower to upper for a root (i.e., zero) of the function f with respect to its first argument.”
# Don't run this, this is from the help documentation of uniroot
uniroot(f, interval, ...,
lower = min(interval), upper = max(interval),
f.lower = f(lower, ...), f.upper = f(upper, ...),
extendInt = c("no", "yes", "downX", "upX"), check.conv = FALSE,
tol = .Machine$double.eps^0.25, maxiter = 1000, trace = 0)
You can set extendInt = "yes"
to extend the interval if you didn’t see a root in your plot.
Now we can use uniroot
to solve.
## $root
## [1] 142.7376
##
## $f.root
## [1] 1.377941e-08
##
## $iter
## [1] 8
##
## $init.it
## [1] 2
##
## $estim.prec
## [1] 6.103516e-05
# c_d, g, v, t are `... =` additional named or unnamed arguments to be passed
# to f. I've named c_d and g, v and t are not named, so they need to be in the
# same order as our function definition in lines 45-47.
So the root of this equation, the mass at which this maximum velocity without spinal damage is 142.7376338. It took 8 iterations to reach this root, and the function has a value of 1.3779413^{-8} as this value. If we change the tol
argument to be lower we could get a more accurate number, but to be safe let’s just round down our root a bit to 142; nobody wants spinal damage.
10.1.2 Example 2
You buy a $25,000 piece of equipment for nothing down at $5,500 per year for 6 years. What interest rate are you paying? The formula relating present worth \(P\), annual payments \(A\), number of years \(n\), and interest rate \(i\) is
\[ A = P \frac{i(1 + i)^n}{(1 + i)^n - 1}\]
Add your own comments to the code chunk below, describing what each line is doing.
A <- 25000
p <- 6500
n <- 5
i <- seq(0,10)
f <- function(i,A,p,n) (p*i*(1+i)^n)/((1+i)^n-1)-A
plot(i,f(i,A,p,n))
## $root
## [1] 3.844713
##
## $f.root
## [1] -0.001068101
##
## $iter
## [1] 4
##
## $init.it
## [1] NA
##
## $estim.prec
## [1] 6.103516e-05
10.1.3 Example 3 more than 1 root
In this case we will need to use uniroot.all
to find the multiple roots.
x <- seq(3,6,.01)
f <- function(x) sin(10*x)+cos(3*x)
fx <- f(x)
d <- data.frame(x,fx)
ggplot(d,aes(x,fx)) + geom_line()
## $root
## [1] 5.679042
##
## $f.root
## [1] 0.0001150379
##
## $iter
## [1] 7
##
## $init.it
## [1] NA
##
## $estim.prec
## [1] 6.103516e-05
# uniroot.all is in the rootSolve package
library(rootSolve)
rootSolve::uniroot.all(f, c(3,6)) -> roots
roots
## [1] 3.262388 3.366010 3.745617 4.229124 4.263600 4.712411 5.161005 5.195734
## [9] 5.678858
10.2 Optima/Optimization
##
## Attaching package: 'pracma'
## The following objects are masked from 'package:rootSolve':
##
## gradient, hessian
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
10.3 Optimization, 1-Dimensional
Definition: The optimize
R function performs one dimensional optimization.
You can find the basic R programming syntax of the optimize
function below.
#from
?optimize
#(You can also spell it `optimise`)
# Usage
optimize(f, interval, ..., lower = min(interval), upper = max(interval),
maximum = FALSE,
tol = .Machine$double.eps^0.25)
# f
# the function to be optimized. The function is either minimized or maximized over its first argument depending on the value of maximum.
# interval
# a vector containing the end-points of the interval to be searched for the minimum.
# ...
# additional named or unnamed arguments to be passed to f.
This looks very similar to uniroot
, right? Our approach to solving these will also be similar.
10.3.1 Approach:
- Write function.
- Plot function.
- Apply optimize command. Set maximum=TRUE if you want to find maximum.
10.3.2 Example: Optimizing User-Defined Function Using optimize() in R
10.3.2.1 Write function
The following R programming syntax illustrates how to use the optimize
function in R. First, we have to create our own function that we want to optimize:
10.3.2.2 Plot
You can use geom_function(fun = f)
now to plot functions
# which is much better than the old way
x <- seq(0,4,0.1) # create x vector, you can change the limits
fx <- f(x) # evaluate using x vector
d <- data.frame(x,fx) # create data frame
ggplot(d,aes(x,fx))+geom_line() # plot
10.3.2.3 Find optimum
Finally we can find the local optima of this function. From our plot it is between 1 and 2.
## $minimum
## [1] 1.427552
##
## $objective
## [1] -1.775726
The interval range for plotting provides the opportunity to hone in on the minimum. Here, I started larger and then chose to plot between 0 and 4. This allowed me to select a reasonable interval for the optimize function. The minimum value occurs at an x-value of 1.43 and the minimum is -1.78.
10.3.3 Example: Finding a maximum
10.3.3.1 Write function
In the example above, we found the minimum. Suppose we want to find the maximum value of a function. Let’s imagine we have a soccer ball that we kick into the air at a starting velocity, and we want to compute how high the ball gets. The model equation is below: \[ z = z_0 + \frac{m}{c} \left( v_0 + \frac{m g}{c}\right)\left( 1 - e^{(-c/mt)}\right) - \frac{mg}{c}t \] where \(z =\) height above surface [m] \(z_0 =\) initial height [m] \(m =\) mass [kg] \(c =\) drag coefficient [kg/s] \(v_0 =\) initial velocity [m/s] \(t =\) time [s]
Given the following values:
10.3.3.2 Plot
t <- seq(0,5,.01)
fx <- f(t = t, g = g, z_0 = z_0, c, m, v_0)
d <- data.frame(t,fx)
ggplot(d,aes(t,fx)) +
geom_line() +
xlab("t [s]") +
ylab("Height [m]")
# OR
ggplot() +
geom_function(fun = f, args = list(g = g, z_0 = z_0,
c = c,m = m, v_0 = v_0)) + xlim(0,5) +
xlab("t [s]") +
ylab("Height [m]")
10.4 Optimization, 2-Dimensional
Now let’s look at a 2-D example.
Here we will do our plotting and function writing a bit backwards, as plotting in 3D, can be cumbersome. You can create contour or heatmap plots where you define a grid of x and y values using meshgrid
to the calculate the z value, z(x,y). Once you plot, you can play around with the start and end value under contours to zoom in on the optimum you are interested in. The plot allows you to identify the location in the x and y vector, and from there you can identify the x and y value. We will use plotly
package to make these plots. Plotly conveniently allows you to see the values of a plot as you hover over it.
Plotly makes interactive HTML plots, which are really cool but don’t play well with PDFs, unfortunately. So you will not see your plot_ly
plots if you knit to PDF. You can knit to HTML and then print the page to a PDF if you want to see these plots.
10.4.1 Example: 2-D
Here is a purely mathematical example where we want to solve for the optima of the following function \[ f(x, y) = 2 + x-y+2x^2+2xy+y^2\]
10.4.1.1 Plot equation
library(plotly)
x <- seq(-3,3,0.01) # define range in x values
y <- seq(-3,3,0.1) # define range in y values
FF <- pracma::meshgrid(x,y) # create grid of x,y pairs to evaluate
# NOTE: meshgrid takes two vectors and makes a matrix of all pairs of these values and renames these values X and Y
Z <- 2 + FF$X-FF$Y+2*FF$X^2+2*FF$X*FF$Y+FF$Y^2 # use grid to create Z values (3rd dimension)
fig <- plot_ly(z = Z, x = x, y = y, type = "contour", contours = list(
start = 0,
end = 10,
size = 0.5
))
fig
# we can also add labels to the contour
fig <- plot_ly(z = Z, x = x, y = y, type = "contour", contours = list(showlabels = TRUE))
fig
Try hovering over the plot to find a good guess for the minimum.
We will need this starting value for the fminsearch
function. You’ll also have to rewrite the function as below where x[1]
is x
, and x[2]
is y
. This helps the code run more efficiently by passing a single x
object with 2 values, instead of 2 objects x
and y
. Then use fminsearch
, where x0
is the starting value for the search. If we wanted to find the maximum, set minimize=FALSE
.
10.4.1.3 Find optimum
## $xmin
## [1] -1.0 1.5
##
## $fmin
## [1] 0.75
##
## $count
## [1] 107
##
## $convergence
## [1] 0
##
## $info
## $info$solver
## [1] "Nelder-Mead"
##
## $info$restarts
## [1] 0
## $xmin
## [1] -1.0 1.5
##
## $fmin
## [1] 0.75
##
## $count
## [1] 343
##
## $convergence
## [1] 0
##
## $info
## $info$solver
## [1] "Hooke-Jeeves"
##
## $info$iterations
## [1] 26
10.4.2 Example: Another simple example
# Write Function
f <- function(c) -2*c/(4+0.8*c+c^2+0.2*c^3)
# Plot
c <- seq(0,10,0.1) # create x vector, you can change the limits
fc <- f(c) # evaluate using x vector
d <- data.frame(c,fc) # create data frame
ggplot(d,aes(c,fc))+geom_line() # plot
## $minimum
## [1] 1.567899
##
## $objective
## [1] -0.3696349