Optimization of the function f
¶
It was shown in chapter Theory [see Eq. (6)] that
the contact function was defined as the maximum for 0 ≤ λ ≤ 1
of the function
f
discussed in chapter Implementation of the function f.
Given that the first and second derivatives of f
can be computed
explicitely (see section Implementation #1: using Cholesky decompositions in chapter
Implementation of the function f) it would be tempting to use the Newton–Raphson method to
solve f’(λ)
iteratively. However, our experiments show that this method
performs very poorly in the present case, because the variations of f
can
be quite sharp in the neighborhood of λ = 0
or λ = 1
. To carry out the
otpimization of f
, we therefore proceed in two steps.
In the first step, we use a robust optimization algorithm. We selected here Brent’s method, as implemented in the Boost::Math library. However, this method delivers a relatively low accuracy of the maxmimizer and the maximum.
Therefore, in the second step, we use a few Newton–Raphson iterations to refine
the previously obtained estimates of the minimizer and minimum of f
. In the
remainder of this chapter, we describe how these Newton–Raphson iterations are
performed.
Our starting point is Eqs. (9) and (13) in chapter Theory, from which it results that for a given
value of λ
we can define two values of μ²
: one is provided by
Eq. (9a), the other one is given by Eq. (9b) (both in chapter Theory):
(1a) μ₁² = [x₀(λ₀)-c₁]ᵀ⋅Q₁⁻¹⋅[x₀(λ₀)-c₁] = (1-λ)²sᵀ⋅Q₁⋅s,
(1b) μ₂² = [x₀(λ₀)-c₂]ᵀ⋅Q₂⁻¹⋅[x₀(λ₀)-c₂] = λ²sᵀ⋅Q₂⋅s,
where we have introduced s = Q⁻¹⋅r₁₂
. We further define the matrix Q₁₂ =
Q₂-Q₁
, so that:
(2) Q₁ = Q - λQ₁₂ and Q₂ = Q + (1-λ)Q₁₂.
Combining Eqs. (1) and (2)
and recalling that Q⋅s = r
then delivers the following expressions:
(3a) μ₁² = (1-λ)²rᵀ⋅s - λ(1-λ)²sᵀ⋅u,
(3b) μ₂² = λ²rᵀ⋅s + λ²(1-λ)sᵀ⋅u,
where we have introduced u = Q₁₂⋅s
.
The above expressions seem to behave slightly better from a numerical point of
view. Our problem is now to find λ
such that μ₁² = μ₂²
. We therefore
define the following residual:
(4) g(λ) = μ₂² - μ₁² = (2λ-1)rᵀ⋅s + λ(1-λ)sᵀ⋅u,
and we need to find λ
such that g(λ) = 0
. In order to implement
Newton–Raphson iterations, we need the expression of the derivative of the
residual. Using results that are presented in section
Implementation #1: using Cholesky decompositions, we readily find that:
(5) g’(λ) = 2rᵀ⋅s + 2(1-2λ)sᵀ⋅u - 2λ(1-λ)uᵀ⋅v.
Eqs. (4) and (5) are then
used for the final, refinement step of determination of λ
.