Vector-valued functions, $f:R \rightarrow R^n$

We discuss functions of a single variable that return a vector in $R^n$. There are many parallels to univariate functions (when $n=1$) and differences.

Before beginning, we load our CalculusWithJulia package, which will provide a few functions used in the following.

using CalculusWithJulia
using Plots


A function $\vec{f}: R \rightarrow R^n$, $n > 1$ is called a vector valued function. Some examples:

\[ ~ \vec{f}(t) = \langle \sin(t), 2\cos(t) \rangle, \quad \vec{g}(t) = \langle \sin(t), \cos(t), t \rangle, \quad \vec{h}(t) = \langle 2, 3 \rangle + t \cdot \langle 1, 2 \rangle. ~ \]

The components themselves are also functions of $t$, in this case univariate functions. Depending on the context, it can be useful to view vector-valued functions as a function that returns a vector, or a vector of the component functions.

The above example functions have $n$ equal $2$, $3$, and $2$ respectively. We will see that many concepts of calculus for univariate functions ($n=1$) have direct counterparts.

Representation in Julia

In Julia, the representation of a vector-valued function is straightforward: we define a function of a single variable that returns a vector. For example, the three functions above would be represented by:

f(t) = [sin(t), 2*cos(t)]
g(t) = [sin(t), cos(t), t]
h(t) = [2, 3] + t * [1, 2]
h (generic function with 1 method)

For a given t, these evaluate to a vector. For example:

2-element Array{Int64,1}:

We can create a vector of functions, e.g., F = [cos, sin, identify], but calling this object, as in F(t), would require some work, such as t = 1; [f(t) for f in F].

Space curves

A vector-valued function is typically visualized as a curve. That is, for some range, $a \leq t \leq b$ the set of points $\{\vec{f}(t): a \leq t \leq b\}$ are plotted. If, say in $n=2$, we have $x(t)$ and $y(t)$ as the component functions, then the graph would also be the parametric plot of $x$ and $y$. The term planar curve is common for the $n=2$ case and space curve for the $n \geq 3$ case.

This plot represents the vectors with their tails at the origin.

There is a convention for plotting the component functions to yield a parametric plot within the Plots package (e.g., plot(x, y, a, b)). This can be used to make polar plots, where x is t -> r(t)*cos(t) and y is t -> r(t)*sin(t).

However, we will use a different approach, as the component functions are not naturally produced from the vector-valued function.

In Plots, the command plot(xs, ys), where, say, xs=[x1, x2, ..., xn] and ys=[y1, y2, ..., yn], will make a connect-the-dot plot between corresponding pairs of points. This can be used as an alternative to plotting a function through plot(f, a, b): first make a set of $x$ values, say xs=range(a, b, length=100); then the corresponding $y$ values, say ys = f.(xs); and then plotting through plot(xs, ys). (If using Julia 1.0 the second argument to range should be named, as in stop=b.)

Similarly, were a third vector, zs, for $z$ components used, plot(xs, ys, zs) will make a 3-dimensional connect the dot plot

However, our representation of vector-valued functions naturally generates a vector of points: [[x1,y1], [x2, y2], ..., [xn, yn]], as this comes from broadcasting f over some time values. That is, for a collection of time values, ts, typically generated, as above, by range, the command f.(ts) will produce the vector of points.

To get the xs and ys from this, is conceptually easy: just iterate over all the points and extract the corresponding component. For example, to get xs we would have a command like [p[1] for p in f.(ts)]. Similarly, the ys would use p[2] in place of p[1]. The following function, from the CalculusWithJulia package and employed previously, does this for us, returning the vectors in a tuple:

unzip(vs) = Tuple(eltype(first(vs))[xyz[j] for xyz in vs] for j in eachindex(first(vs)))

The name comes from how the zip function in base Julia takes two vectors and returns a vector of the values paired off. This is the reverse.

It was noted in vectors that the unzip function turns a vector of points (or vectors) into a tuple of vectors collecting the $x$ values, the $y$ values, and, if present, the $z$ values:

[[x1, y1, z1],         (⌈x1⌉,  ⌈y1⌉, ⌈z1⌉,
 [x2, y2, z2],          |x2|, |y2|, |z2|,
 [x3, y3, z3],   -->    |x3|, |y3|, |z3|,
     ⋮                         ⋮
 [xn, yn, zn]]          ⌊xn⌋,  ⌊yn⌋, ⌊zn⌋ )

To turn a tuple of vectors into separate arguments for a function, splatting (the ...) is used.

Finally, with these definitions, we can visualize the three functions we have defined.

Here we show the plot of f over the values between $0$ and $2\pi$ and also add a vector anchored at the origin defined by f(1).

ts = range(0, 2pi, length=200)
arrow!([0, 0], f(1))

The trace of the plot is an ellipse. If we describe the components as $\vec{f}(t) = \langle x(t), y(t) \rangle$, then we have $x(t)^2 + y(t)^2/4 = 1$. That is, for any value of $t$, the resulting point satisfies the equation $x^2 + y^2/4 =1$ for an ellipse.

The plot of $g$ needs 3-dimensions to render. For most plotting backends, the following should work with no differences, save the additional vector is anchored in 3 dimensions now:

ts = range(0, 6pi, length=200)
arrow!([0, 0, 0], g(2pi))

Here the graph is a helix; three turns are plotted. If we write $g(t) = \langle x(t), y(t), z(t) \rangle$, as the $x$ and $y$ values trace out a circle, the $z$ value increases. When the graph is viewed from above, as below, we see only $x$ and $y$ components, and the view is circular.

plot(unzip(g.(ts))..., camera=(0, 90))

The graph of $h$ shows that this function parameterizes a line in space. The line segment for $-2 \leq t \leq 2$ is shown below:

ts = range(-2, 2, length=200)

The plot_parametric_curve function

While the unzip function is easy to understand as a function that reshapes data from one format into one that plot can use, it's usage is a bit cumbersome. The CalculusWithJulia package provides a function plot_parametric_curve which hides the use of unzip and the splatting within a function definition. It expects a vector-valued function and a range of $t$ values specified by two values. For example, the last plot can be produced with

plot_parametric_curve(h, -2, 2)

Familiarity with equations for lines, circles, and ellipses is important, as these fundamental geometric shapes are often building blocks in the description of other more complicated things.

The point-slope equation of a line, $y = y_0 + m \cdot (x - x_0)$ finds an analog. The slope, $m$, is replaced with a vector $\vec{v}$ and the point, $(x_0, y_0)$ is replaced with a vector $\vec{p}$ identified with a point in the plane. A parameterization would then be $f(t) = \vec{p} + (t - t_0) \vec{v}$. From this, we have $f(t_0) = \vec{p}$.

The unit circle is instrumental in introducing the trigonometric functions though the identification of and angle $t$ with a point on the unit circle $(x,y)$ through $y = \sin(t)$ and $x=\cos(t)$. With this identification certain properties of the trigonometric functions are immediately seen, such as the period of $\sin$ and $\cos$ being $2\pi$, or the angles for which $\sin$ and $\cos$ are positive or even increasing. Further, this gives a natural parameterization for a vector-valued function whose plot yields the unit circle, namely $\vec{f}(t) = \langle \cos(t), \sin(t) \rangle$. This parameterization starts (at $t=0$) at the point $(1, 0)$. More generally, we might have additional parameters $\vec{f}(t) = \vec{p} + R \cdot \langle \cos(\omega(t-t_0)), \sin(\omega(t-t_0)) \rangle$ to change the origin, $\vec{p}$; the radius, $R$; the starting angle, $t_0$; and the rotational frequency, $\omega$.

An ellipse has a slightly more general equation than a circle and in simplest forms may satisfy the equation $x^2/a^2 + y^2/b^2 = 1$, where when $a=b$ a circle is being described. A vector-valued function of the form $\vec{f}(t) = \langle a\cdot\cos(t), b\cdot\sin(t) \rangle$ will trace out an ellipse.

The above description of an ellipse is useful, but it can also be useful to re-express the ellipse so that one of the foci is at the origin. With this, the ellipse can be given in polar coordinates through a description of the radius:

\[ ~ r(\theta) = \frac{a (1 - e^2)}{1 + e \cos(\theta)}. ~ \]

Here, $a$ is the semi-major axis ($a > b$); $e$ is the eccentricity given by $b = a \sqrt{1 - e^2}$; and $\theta$ a polar angle.

Using the conversion to Cartesian equations, we have $\vec{x}(\theta) = \langle r(\theta) \cos(\theta), r(\theta) \sin(\theta)\rangle$.

For example:

a, ecc = 20, 3/4
f(t) = a*(1-ecc^2)/(1 + ecc*cos(t)) * [cos(t), sin(t)]
plot_parametric_curve(f, 0, 2pi, legend=false)
scatter!([0],[0], markersize=4)

The Spirograph is "... a geometric drawing toy that produces mathematical roulette curves of the variety technically known as hypotrochoids and epitrochoids. It was developed by British engineer Denys Fisher and first sold in 1965." These can be used to make interesting geometrical curves.

Following Wikipedia: Consider a fixed outer circle $C_o$ of radius $R$ centered at the origin. A smaller inner circle $C_i$ of radius $r < R$ rolling inside $C_o$ and is continuously tangent to it. $C_i$ will be assumed never to slip on $C_o$ (in a real Spirograph, teeth on both circles prevent such slippage). Now assume that a point $A$ lying somewhere inside $C_{i}$is located a distance $\rho < r$ from $C_i$'s center.

The center of the inner circle will move in a circular manner with radius $R-r$. The fixed point on the inner circle will rotate about this center. The accumulated angle may be described by the angle the point of contact of the inner circle with the outer circle. Call this angle $t$.

Suppose the outer circle is centered at the origin and the inner circle starts ($t=0$) with center $(R-r, 0)$ and rotates around counterclockwise. Then if the point of contact makes angle $t$, the arc length along the outer circle is $Rt$. The inner circle will have moved a distance $r t'$ in the opposite direction, so $Rt =-r t'$ and solving the angle will be $t' = -(R/r)t$.

If the initial position of the fixed point is at $(\rho, 0)$ relative to the origin, then the following function will describe the motion:

\[ ~ \vec{s}(t) = (R-r) \cdot \langle \cos(t), \sin(t) \rangle + \rho \cdot \langle \cos(-\frac{R}{r}t), \sin(-\frac{R}{r}t) \rangle. ~ \]

To visualize this we first define a helper function to draw a circle at point $P$ with radius $R$:

circle!(P, R; kwargs...) = plot_parametric_curve!(t -> P + R*[cos(t), sin(t)], 0, 2pi;kwargs...)
circle! (generic function with 1 method)

Then we have this function to visualize the spirograph for different $t$ values:

function spiro(t; r=2, R=5, rho=0.8*r)

    cent(t) = (R-r) * [cos(t), sin(t)]

    p = plot(legend=false, aspect_ratio=:equal)
    circle!([0,0], R, color=:blue)
    circle!(cent(t), r, color=:black)

    tp(t) = -R/r * t

    s(t) = cent(t) + rho * [cos(tp(t)), sin(tp(t))]
    plot_parametric_curve!(s, 0, t, n=1000, color=:red)

spiro (generic function with 1 method)

And we can see the trace for $t=\pi$:


The point of contact is at $(-R, 0)$, as expected. Carrying this forward to a full circle's worth is done through:


The curve does not match up at the start. For that, a second time around the outer circle is needed:


Whether the curve will have a period or not is decided by the ratio of $R/r$ being rational or irrational.


Ivars Peterson described the carnival ride "tilt-a-whirl" as a chaotic system, whose equations of motion are presented in American Journal of Physics by Kautz and Huggard. The tilt-a-whirl has a platform that moves in a circle that also moves up and down. To describe the motion of a point on the platform assuming it has radius $R$ and period $T$ and rises twice in that period could be done with the function:

\[ ~ \vec{u}(t) = \langle R \sin(2\pi t/T), R \cos(2\pi t/T), h + h \cdot \sin(2\pi t/ T) \rangle. ~ \]

A passenger sits on a circular platform with radius $r$ attached at some point on the larger platform. The dynamics of the person on the tilt-a-whirl depend on physics, but for simplicity, let's assume the platform moves at a constant rate with period $S$ and has no relative $z$ component. The motion of the platform in relation to the point it is attached would be modeled by:

\[ ~ \vec{v}(t) = \langle r \sin(2\pi t/S), r \sin(2\pi t/S), 0 \rangle. ~ \]

And the motion relative to the origin would be the vector sum, or superposition:

\[ ~ \vec{f}(t) = \vec{u}(t) + \vec{v}(t). ~ \]

To visualize for some parameters, we have:

M, m = 25, 5
height = 5
S, T = 8, 2
outer(t) = [M * sin(2pi*t/T), M * cos(2pi*t/T), height*(1 +sin(2pi * (t-pi/2)/T))]
inner(t) = [m * sin(2pi*t/S), m * cos(2pi*t/S), 0]
f(t) = outer(t) + inner(t)
plot_parametric_curve(f, 0, 8, n=1000)

Limits and continuity

The definition of a limit for a univariate function is: For every $\epsilon > 0$ there exists a $\delta > 0$ such that if $0 \leq |x-c| < \delta$ then $|f(x) - L | < \epsilon$.

If the notion of "$f$ is close to $L$" is replaced by close in the sense of a norm, or vector distance, then the same limit definition can be used, with the new wording "... $\| \vec{f}(x) - L \| < \epsilon$".

The notion of continuity is identical: $\vec{f}(t)$ is continuous at $t_0$ if $\lim_{t \rightarrow t_0}\ve{f}(t) = \vec{f}(t_0)$.

A consequence of the triangle inequality is that a vector-valued function is continuous or has a limit if and only it its component functions do.


If $\vec{f}(t)$ is vector valued, and $\Delta t > 0$ then we can consider the vector:

\[ ~ \vec{f}(t + \Delta t) - \vec{f}(t) ~ \]

For example, if $\vec{f}(t) = \langle 3\cos(t), 2\sin(t) \rangle$ and $t=\pi/4$ and $\Delta t = \pi/16$ we have this picture:

gr()  # better arrows
f(t) = [3cos(t), 2sin(t)]
t, Deltat = pi/4, pi/16
df = f(t + Deltat) - f(t)

arrow!([0,0], f(t))
arrow!([0,0], f(t+Deltat))
arrow!(f(t), df)

The length of the difference appears to be related to the length of $\Delta t$, in a similar manner as the univariate derivative. The following limit defines the derivative of a vector-valued function:

\[ ~ \vec{f}'(t) = \lim_{\Delta t \rightarrow 0} \frac{f(t + \Delta t) - f(t)}{\Delta t}. ~ \]

The limit exists if the component limits do. The component limits are just the derivatives of the component functions. So, if $\vec{f}(t) = \langle x(t), y(t) \rangle$, then $\vec{f}'(t) = \langle x'(t), y'(t) \rangle$.

If the derivative is never $\vec{0}$, the curve is called regular. For a regular curve the derivative is a tangent vector to the parameterized curve, akin to the case for a univariate function. We can use ForwardDiff to compute the derivative in the exact same manner as was done for univariate functions:

using ForwardDiff
D(f,n=1) = n > 1 ? D(D(f),n-1) : x -> ForwardDiff.derivative(f, float(x))
Base.adjoint(f::Function) = D(f)         # allow f' to compute derivative

(This is already done by the `CalculusWithJulia package.)

We can visualize the tangential property through a graph:

f(t) = [3cos(t), 2sin(t)]
p = plot_parametric_curve(f, 0, 2pi, legend=false, aspect_ratio=:equal)
for t in [1,2,3]
    arrow!(f(t), f'(t))   # add arrow with tail on curve, in direction of derivative

Symbolic representation

Were symbolic expressions used in place of functions, the vector-valued function would naturally be represented as a vector of expressions:

using SymPy
@vars t
vvf = [cos(t), sin(t), t]
\[ \left[ \begin{array}{r}\cos{\left(t \right)}\\\sin{\left(t \right)}\\t\end{array} \right] \]

We will see working with these expressions is not identical to working with a vector-valued function.

To plot, we can avail ourselves of the the parametric plot syntax. The following expands to plot(cos(t), sin(t), t, 0, 2pi):

plot(vvf..., 0, 2pi)

The unzip usage, as was done above, could be used, but it would be more trouble in this case.

To evaluate the function at a given value, say $t=2$, we can use subs with broadcasting to substitute into each component:

subs.(vvf, t.=>2)
\[ \left[ \begin{array}{r}\cos{\left(2 \right)}\\\sin{\left(2 \right)}\\2\end{array} \right] \]

The notation is a bit subtle: subs.( will broadcast over the 3 components in vvf. This is familiar, but broadcasting tries to match up sizes of collections. Without a "dot", vvf would have size 3 and t=>2 has size 2 as an iterable object; and this causes a mismatch. We need to make t=>2 appear to have just length 1. The t.=>2 expresses this. (The "dot" syntactically indicates that t .=> 2 will be a scalar.). Alternatively, would wrapping the pair in a tuple, as in (t=>2, ) or using Ref, as in Ref(t=>2), will ensure broadcasting treats the pair as a scalar value. (This will be unnecessary as version 1.3 of Julia, as pairs will not be considered iterable for broadcasting purposes.)

Limits are performed component by component, and can also be defined by broadcasting, again with the need to adjust the values:

@syms delta
limit.((subs.(vvf, t .=> t + delta) - vvf) / delta, delta .=> 0)
\[ \left[ \begin{array}{r}- \sin{\left(t \right)}\\\cos{\left(t \right)}\\1\end{array} \right] \]

Derivatives, as was just done through a limit, are a bit more straightforward than evaluation or limit taking, as we won't bump into the shape mismatch when broadcasting:

diff.(vvf, t)
\[ \left[ \begin{array}{r}- \sin{\left(t \right)}\\\cos{\left(t \right)}\\1\end{array} \right] \]

The second derivative, can be found through:

diff.(vvf, t, t)
\[ \left[ \begin{array}{r}- \cos{\left(t \right)}\\- \sin{\left(t \right)}\\0\end{array} \right] \]

Applications of the derivative

Here are some sample applications of the derivative.

Example: equation of the tangent line

The derivative of a vector-valued function is similar to that of a univariate function, in that it indicates a direction tangent to a curve. The point-slope form offers a straightforward parameterization. We have a point given through the vector-valued function and a direction given by its derivative. (After identifying a vector with its tail at the origin with the point that is the head of the vector.)

With this, the equation is simply $\vec{tl}(t) = \vec{f}(t_0) + \vec{f}'(t_0) \cdot (t - t_0)$, where the dot indicates scalar multiplication.

Example: parabolic motion

In physics, we learn that the equation $F=ma$ can be used to derive a formula for postion, when acceleration, $a$, is a constant. The resulting equation of motion is $x = x_0 + v_0t + (1/2) at^2$. Similarly, if $x(t)$ is a vector-valued postion vector, and the second derivative, $x''(t) =\vec{a}$, a constant, then we have: $x(t) = \vec{x_0} + \vec{v_0}t + (1/2) \vec{a} t^2$.

For two dimensions, we have the force due to gravity acts downward, only in the $y$ direction. The acceleration is then $\vec{a} = \langle 0, -g \rangle$. If we start at the origin, with initial velocity $\vec{v_0} = \langle 2, 3\rangle$, then we can plot the trajectory until the object returns to ground ($y=0$) as follows:

gravity = 9.8
x0, v0, a = [0,0], [2, 3], [0, -gravity]
xpos(t) = x0 + v0*t + (1/2)*a*t^2

using Roots
t_0 = fzero(t -> xpos(t)[2], 1/10, 100)  # find when y=0

plot_parametric_curve(xpos, 0, t_0)
Example: tractrix

A tractrix, studied by Perrault, Newton, Huygens, and many others, is the curve along which an object moves when pulled in a horizontal plane by a line segment attached to a pulling point (Wikipedia). If the object is placed at $(a,0)$ and the puller at the origin, and the puller moves along the positive $x$ axis, then the line will always be tangent to the curve and of fixed length, so determinable from the motion of the puller. In this example $dy/dx = -\sqrt{a^2-x^2}/x$.

This is the key property: "Due to the geometrical way it was defined, the tractrix has the property that the segment of its tangent, between the asymptote and the point of tangency, has constant length $a$."

The tracks made by the front and rear bicycle wheels also have this same property and similarly afford a mathematical description. We follow Dunbar, Bosman, and Nooij from The Track of a Bicycle Back Tire below, though Levi and Tabachnikov and Foote, Levi, and Tabachnikov were also consulted. Let $a$ be the distance between the front and back wheels, whose positions are parameterized by $\vec{F}(t)$ and $\vec{B}(t)$, respectively. The key property is the distance between the two is always $a$, and, as the back wheel is always moving in the direction of the front wheel, we have $\vec{B}'(t)$ is in the direction of $\vec{F}(t) - \vec{B}(t)$, that is the vector $(\vec{F}(t)-\vec{B}(t))/a$ is a unit vector in the direction of the derivative of $\vec{B}$. How long is the derivative vector? That would be answered by the speed of the back wheel, which is related to the velocity of the front wheel. But only the component of the velocity in the direction of $\vec{F}(t)-\vec{B}(t)$, so the speed of the back wheel is the length of the projection of $\vec{F}'(t)$ onto the unit vector $(\vec{F}(t)-\vec{B}(t))/a$, which is identified through the dot product.

Combined, this gives the following equations relating $\vec{F}(t)$ to $\vec{B}(t)$:

\[ ~ s_B(t) = \vec{F}'(t) \cdot \frac{\vec{F}(t)-\vec{B}(t)}{a}, \quad \vec{B}'(t) = s_B(t) \frac{\vec{F}(t)-\vec{B}(t)}{a}. ~ \]

This is a differential equation describing the motion of the back wheel in terms of the front wheel.

If the back wheel trajectory is known, the relationship is much easier, as the two differ by a vector of length $a$ in the direction of $\vec{B}'(t)$, or:

\[ ~ F(t) = \vec{B}(t) + a \frac{\vec{B'(t)}}{\|\vec{B}'(t)\|}. ~ \]

We don't discuss when a differential equation has a solution, or if it is unique when it does, but note that the differential equation above may be solved numerically, in a manner somewhat similar to what was discussed in ODEs. Though here we will use the DifferentialEquations package for finding the numeric solution.

We can define our equation as follows, using p to pass in the two parameters: the wheel-base length $a$, and $F(t)$, the parameterization of the front wheel in time:

using DifferentialEquations, LinearAlgebra

function bicycle(dB, B, p, t)

  a, F = p   # unpack parameters

  speed =  F'(t)  (F(t) - B) / a
  dB[1], dB[2] = speed * (F(t) - B) / a

bicycle (generic function with 1 method)

Let's consider a few simple cases first. We suppose $a=1$ and the front wheel moves in a circle of radius $3$. Here is how we can plot two loops:

t0, t1 = 0.0, 4pi
tspan = (t0, t1)  # time span to consider

a = 1
F(t) = 3 * [cos(t), sin(t)]
p = (a, F)      # combine parameters

B0 = F(0) - [0, a]  # some initial position for the back
prob = ODEProblem(bicycle, B0, tspan, p)

out = DifferentialEquations.solve(prob, reltol=1e-6)

We qualify the solve function, as SymPy also has a solve function. (We would also need to qualify that as SymPy.solve once DifferentialEquations is loaded.) The object out holds the answer. This object is callable, in that out(t) will return the numerically computed value for the answer to our equation at time point t.

To plot the two trajectories, we could use that out.u holds the $x$ and $y$ components of the computed trajectory, but more simply, we call out like a function.

plt = plot_parametric_curve(F, t0, t1, legend=false)
plot_parametric_curve!(out,  t0, t1, linewidth=3)

## add the bicycle as a line segment at a few times along the path
for t in range(t0, t1, length=11)
    plot!(unzip([out(t), F(t)])..., linewidth=3, color=:black)

That the rear wheel track appears shorter, despite the rear wheel starting outside the circle, is typical of bicycle tracks and also a reason to rotate tires on car, as the front ones move a bit more than the rear, so presumably wear faster.

Let's look what happens if the front wheel wobbles back and forth following a sine curve. Repeating the above, only with $F$ redefined, we have:

a = 1
F(t) = [t, 2sin(t)]
p = (a, F)

B0 = F(0) - [0, a]  # some initial position for the back
prob = ODEProblem(bicycle, B0, tspan, p)

out = DifferentialEquations.solve(prob, reltol=1e-6)
plt = plot_parametric_curve(F, t0, t1, legend=false)
plot_parametric_curve!(t->out(t),  t0, t1, linewidth=3)

Again, the back wheel moves less than the front.

The motion of the back wheel need not be smooth, even if the motion of the front wheel is, as this curve illustrates:

a = 1
F(t) = [cos(t), sin(t)] + [cos(2t), sin(2t)]
p = (a, F)

B0 = F(0) - [0,a]
prob = ODEProblem(bicycle, B0, tspan, p)

out = DifferentialEquations.solve(prob, reltol=1e-6)
plt = plot_parametric_curve(F, t0, t1, legend=false)
plot_parametric_curve!(t->out(t),  t0, t1, linewidth=3)

The back wheel is moving backwards for part of the above trajectory.

This effect can happen even for a front wheel motion as simple as a circle when the front wheel radius is less than the wheelbase:

a = 1
F(t) = a/3 * [cos(t), sin(t)]
p = (a, F)

t0, t1 = 0.0, 25pi
tspan = (t0, t1)

B0 = F(0) - [0,a]
prob = ODEProblem(bicycle, B0, tspan, p)

out = DifferentialEquations.solve(prob, reltol=1e-6)
plt = plot_parametric_curve(F, t0, t1, legend=false, aspect_ratio=:equal)
plot_parametric_curve!(t->out(t),  t0, t1, linewidth=3)

Later we will characterize when there are cusps in the rear-wheel trajectory.

Derivative rules.

From the definition, as it is for univariate functions, for vector-valued functions $\vec{f}, \vec{g}: R \rightarrow R^n$:

\[ ~ [\vec{f} + \vec{g}]'(t) = \vec{f}'(t) + \vec{g}'(t), \quad\text{and } [a\vec{f}]'(t) = a \vec{f}'(t). ~ \]

If $a(t)$ is a univariate (scalar) function of $t$, then a product rule holds:

\[ ~ [a(t) \vec{f}(t)]' = a'(t)\vec{f}(t) + a(t)\vec{f}'(t). ~ \]

If $s$ is a univariate function, then the composition $\vec{f}(s(t))$ can be differentiated. Each component would satisfy the chain rule, and consequently:

\[ ~ \frac{d}{dt}\left(\vec{f}(s(t))\right) = \vec{f}'(s(t)) \cdot s'(t), ~ \]

The dot being scalar multiplication by the derivative of the univariate function $s$.

vector-valued functions do not have multiplication or division defined for them, so there are no ready analogues of the product and quotient rule. However, the dot product and the cross product produce new functions that may have derivative rules available.

For the dot product, the combination $\vec{f}(t) \cdot \vec{g}(t)$ we have a univariate function of $t$, so we know a derivative is well defined. Can it be represented in terms of the vector-valued functions? In terms of the component functions, we have this calculation specific to $n=2$, but that which can be generalized:

\[ ~ \frac{d}{dt}(\vec{f}(t) \cdot \vec{g}(t)) = \frac{d}{dt}(f_1(t) g_1(t) + f_2(t) g_2(t)) = f_1'(t) g_1(t) + f_1(t) g_1'(t) + f_2'(t) g_2(t) + f_2(t) g_2'(t) = f_1'(t) g_1(t) + f_2'(t) g_2(t) + f_1(t) g_1'(t) + f_2(t) g_2'(t) = \vec{f}'(t)\cdot \vec{g}(t) + \vec{f}(t) \cdot \vec{g}'(t). ~ \]

Suggesting the that a product rule like formula applies for dot products.

For the cross product, we let SymPy derive a formula for us.

u1, u2, u3, v1, v2, v3 = SymFunction("u1, u2, u3, v1, v2, v3")
@vars t
u = [u1(t), u2(t), u3(t)]
v = [v1(t), v2(t), v3(t)]
\[ \left[ \begin{array}{r}\operatorname{v_{1}}{\left(t \right)}\\\operatorname{v_{2}}{\left(t \right)}\\\operatorname{v_{3}}{\left(t \right)}\end{array} \right] \]

Then the cross product a derivative:

using LinearAlgebra
diff.(u × v, t)
\[ \left[ \begin{array}{r}\operatorname{u_{2}}{\left(t \right)} \frac{d}{d t} \operatorname{v_{3}}{\left(t \right)} - \operatorname{u_{3}}{\left(t \right)} \frac{d}{d t} \operatorname{v_{2}}{\left(t \right)} - \operatorname{v_{2}}{\left(t \right)} \frac{d}{d t} \operatorname{u_{3}}{\left(t \right)} + \operatorname{v_{3}}{\left(t \right)} \frac{d}{d t} \operatorname{u_{2}}{\left(t \right)}\\- \operatorname{u_{1}}{\left(t \right)} \frac{d}{d t} \operatorname{v_{3}}{\left(t \right)} + \operatorname{u_{3}}{\left(t \right)} \frac{d}{d t} \operatorname{v_{1}}{\left(t \right)} + \operatorname{v_{1}}{\left(t \right)} \frac{d}{d t} \operatorname{u_{3}}{\left(t \right)} - \operatorname{v_{3}}{\left(t \right)} \frac{d}{d t} \operatorname{u_{1}}{\left(t \right)}\\\operatorname{u_{1}}{\left(t \right)} \frac{d}{d t} \operatorname{v_{2}}{\left(t \right)} - \operatorname{u_{2}}{\left(t \right)} \frac{d}{d t} \operatorname{v_{1}}{\left(t \right)} - \operatorname{v_{1}}{\left(t \right)} \frac{d}{d t} \operatorname{u_{2}}{\left(t \right)} + \operatorname{v_{2}}{\left(t \right)} \frac{d}{d t} \operatorname{u_{1}}{\left(t \right)}\end{array} \right] \]

Admittedly, that isn't very clear. With a peek at the answer, we show that the derivative is the same as the product rule would suggest ($\vec{u}' \times \vec{v} + \vec{u} \times \vec{v}'$):

diff.(u × v, t) - (diff.(u,t) × v + u × diff.(v,t))
\[ \left[ \begin{array}{r}0\\0\\0\end{array} \right] \]

In summary, these two derivative formulas hold for vector-valued functions $R \rightarrow R^n$:

\[ ~ \begin{align} (\vec{u} \cdot \vec{v})' &= \vec{u}' \cdot \vec{v} + \vec{u} \cdot \vec{v}',\\ (\vec{u} \times \vec{v})' &= \vec{u}' \times \vec{v} + \vec{u} \times \vec{v}'. \end{align} ~ \]

Application. Circular motion and the tangent vector.

The parameterization $\vec{r}(t) = \langle \cos(t), \sin(t) \rangle$ describes a circle. Characteristic of this motion is a constant radius, or in terms of a norm: $\| \vec{r}(t) \| = c$. The norm squared, can be expressed in terms of the dot product:

\[ ~ \| \vec{r}(t) \|^2 = \vec{r}(t) \cdot \vec{r}(t). ~ \]

Differentiating this for the case of a constant radius yields the equation $0 = [\vec{r}\cdot\vec{r}]'(t)$, which simplifies through the product rule and commutativity of the dot product to $0 = 2 \vec{r}(t) \cdot \vec{r}'(t)$. That is, the two vectors are orthogonal to each other. This observation proves to be very useful, as will be seen.

Example: Kepler's laws

Kepler's laws of planetary motion are summarized by:

Kepler was a careful astronomer, and derived these laws empirically. We show next how to derive these laws using vector calculus assuming some facts on Newtonian motion, as postulated by Newton. This approach is borrowed from Joyce.

We adopt a sun-centered view of the universe, placing the sun at the origin and letting $\vec{x}(t)$ be the position of a planet relative to this origin. We can express this in terms of a magnitude and direction through $r(t) \hat{x}(t)$.

Newton's law of gravitational force between the sun and this planet is then expressed by:

\[ ~ \vec{F} = -\frac{G M m}{r^2} \hat{x}(t). ~ \]

Newton's famous law relating force and acceleration is

\[ ~ \vec{F} = m \vec{a} = m \ddot{\vec{x}}. ~ \]

Combining, Newton states $\vec{a} = -(GM/r^2) \hat{x}$.

Now to show the first law. Consider $\vec{x} \times \vec{v}$. It is constant, as:

\[ ~ \begin{align} (\vec{x} \times \vec{v})' &= \vec{x}' \times \vec{v} + \vec{x} \times \vec{v}'\\ &= \vec{v} \times \vec{v} + \vec{x} \times \vec{a}. \end{align} ~ \]

Both terms are $\vec{0}$, as $\vec{a}$ is parallel to $\vec{x}$ by the above, and clearly $\vec{v}$ is parallel to itself.

This says, $\vec{x} \times \vec{v} = \vec{c}$ is a constant vector, meaning, the motion of $\vec{x}$ must lie in a plane, as $\vec{x}$ is always orthogonal to the fixed vector $\vec{c}$.

Now, by differentiating $\vec{x} = r \hat{x}$ we have:

\[ ~ \begin{align} \vec{v} &= \vec{x}'\\ &= (r\hat{x})'\\ &= r' \hat{x} + r \hat{x}', \end{align} ~ \]

and so

\[ ~ \begin{align} \vec{c} &= \vec{x} \times \vec{v}\\ &= (r\hat{x}) \times (r'\hat{x} + r \hat{x}')\\ &= r^2 (\hat{x} \times \hat{x}'). \end{align} ~ \]

From this, we can compute $\vec{a} \times \vec{c}$:

\[ ~ \begin{align} \vec{a} \times \vec{c} &= (-\frac{GM}{r^2})\hat{x} \times r^2(\hat{x} \times \hat{x}')\\ &= -GM \hat{x} \times (\hat{x} \times \hat{x}') \\ &= GM (\hat{x} \times \hat{x}')\times \hat{x}. \end{align} ~ \]

The last line by anti-commutativity.

But, the triple cross product can be simplified through the identify $(\vec{u}\times\vec{v})\times\vec{w} = (\vec{u}\cdot\vec{w})\vec{v} - (\vec{v}\cdot\vec{w})\vec{u}$. So, the above becomes:

\[ ~ \begin{align} \vec{a} \times \vec{c} &= GM ((\hat{x}\cdot\hat{x})\hat{x}' - (\hat{x} \cdot \hat{x}')\hat{x})\\ &= GM (1 \hat{x}' - 0 \hat{x}). \end{align} ~ \]

Now, since $\vec{c}$ is constant, we have:

\[ ~ \begin{align} (\vec{v} \times \vec{c})' &= (\vec{a} \times \vec{c})\\ &= GM \hat{x}'\\ &= (GM\hat{x})'. \end{align} ~ \]

The two sides have the same derivative, hence differ by a constant:

\[ ~ \vec{v} \times \vec{c} = GM \hat{x} + \vec{d}. ~ \]

As $\vec{u}$ and $\vec{v}\times\vec{c}$ lie in the same plane - orthogonal to $\vec{c}$ - so does $\vec{d}$. With a suitable re-orientation, so that $\vec{d}$ is along the $x$ axis, $\vec{c}$ is along the $z$-axis, then we have $\vec{c} = \langle 0,0,c\rangle$ and $\vec{d} = \langle d ,0,0 \rangle$, and $\vec{x} = \langle x, y, 0 \rangle$. Set $\theta$ to be the angle, then $\hat{x} = \langle \cos(\theta), \sin(\theta), 0\rangle$.


\[ ~ \begin{align} c^2 &= \|\vec{c}\|^2 \\ &= \vec{c} \cdot \vec{c}\\ &= (\vec{x} \times \vec{v}) \cdot \vec{c}\\ &= \vec{x} \cdot (\vec{v} \times \vec{c})\\ &= r\hat{x} \cdot (GM\hat{x} + \vec{d})\\ &= GMr + r \hat{x} \cdot \vec{d}\\ &= GMr + rd \cos(\theta). \end{align} ~ \]

Solving, this gives the radial distance in the form of an ellipse:

\[ ~ r = \frac{c^2}{GM + d\cos(\theta)} = \frac{c^2/(GM)}{1 + (d/GM) \cos(\theta)}. ~ \]

Kepler's second law can also be derived from vector calculus. This derivation follows that given at MIT OpenCourseWare OpenCourseWare.

The second law states that the area being swept out during a time duration only depends on the duration of time, not the time. Let $\Delta t$ be this duration. Then if $\vec{x}(t)$ is the position vector, as above, we have the area swept out between $t$ and $t + \Delta t$ is visualized along the lines of:

x1(t) = [cos(t), 2 * sin(t)]
t0, t1, Delta = 1.0, 2.0, 1/10
plot_parametric_curve(x1, 0, pi/2)

arrow!([0,0], x1(t0)); arrow!([0,0], x1(t0 + Delta))
arrow!(x1(t0), x1(t0+Delta)- x1(t0), linewidth=5)

The area swept out, is basically the half the area of the parallelogram formed by $\vec{x}(t)$ and $\Delta \vec{x}(t) = \vec{x}(t + \Delta t) - \vec{x}(t))$. This area is $(1/2) (\vec{x} \times \Delta\vec{x}(t))$.

If we divide through by $\Delta t$, and take a limit we have:

\[ ~ \frac{dA}{dt} = \| \frac{1}{2}\lim_{\Delta t \rightarrow 0} (\vec{x} \times \frac{\vec{x}(t + \Delta t) - \vec{x}(t)}{\Delta t})\| = \frac{1}{2}\|\vec{x} \times \vec{v}\|. ~ \]

But we saw above, that for the motion of a planet, that $\vec{x} \times \vec{v} = \vec{c}$, a constant. This says, $dA$ is a constant independent of $t$, and consequently, the area swept out over a duration of time will not depend on the particular times involved, just the duration.

The third law relates the period to a parameter of the ellipse. We have from the above a strong suggestion that area of the ellipse can be found by integrating $dA$ over the period, say $T$. Assuming that is the case and letting $a$ be the semi-major axis length, and $b$ the semi-minor axis length, then

\[ ~ \pi a b = \int_0^T dA = \int_0^T (1/2) \|\vec{x} \times \vec{v}\| dt = \| \vec{x} \times \vec{v}\| \frac{T}{2}. ~ \]

As $c = \|\vec{x} \times \vec{v}\|$ is a constant, this allows us to express $c$ by: $2\pi a b/T$.

But, we have

\[ ~ r(\theta) = \frac{c^2}{GM + d\cos(\theta)} = \frac{c^2/(GM)}{1 + d/(GM) \cos(\theta)}. ~ \]

So, $e = d/(GM)$ and $a (1 - e^2) = c^2/(GM)$. Using $b = a \sqrt{1-e^2}$ we have:

\[ ~ a(1-e^2) = c^2/(GM) = (\frac{2\pi a b}{T})^2 \frac{1}{GM} = \frac{(2\pi)^2}{GM} \frac{a^2 (a^2(1-e^2))}{T^2}, ~ \]

or after cancelling $(1-e^2)$ from each side:

\[ ~ T^2 = \frac{(2\pi)^2}{GM} \frac{a^4}{a} = \frac{(2\pi)^2}{GM} a^3. ~ \]

The above shows how Newton might have derived Kepler's observational facts. Next we show, that assuming the laws of Kepler can anticipate Newton's equation for gravitational force. This follows Wikipedia.

Now let $\vec{r}(t)$ be the position of the planet relative to the Sun at the origin, in two dimensions (we used $\vec{x}(t)$ above). Assume $\vec{r}(0)$ points in the $x$ direction. Write $\vec{r} = r \hat{r}$. Define $\hat{\theta(t)}$ to be the mapping from time $t$ to the angle defined by $\hat{r}$ through the unit circle.

Then we express the velocity ($\dot{\vec{r}}$) and acceleration ($\ddot{\vec{r}}$) in terms of the orthogonal vectors $\hat{r}$ and $\hat{\theta}$, as follows:

\[ ~ \frac{d}{dt}(r \hat{r}) = \dot{r} \hat{r} + r \dot{\hat{r}} = \dot{r} \hat{r} + r \dot{\theta}\hat{\theta}. ~ \]

The last equality from expressing $\hat{r}(t) = \hat{r}(\theta(t))$ and using the chain rule, noting $d(\hat{r}(\theta))/d\theta = \hat{\theta}$.


\[ ~ \frac{d^2}{dt^2}(r \hat{r}) = (\ddot{r} \hat{r} + \dot{r} \dot{\hat{r}}) + (\dot{r} \dot{\theta}\hat{\theta} + r \ddot{\theta}\hat{\theta} + r \dot{\theta}\dot{\hat{\theta}}). ~ \]

Noting, similar to above, $\dot{\hat{\theta}} = d\hat{\theta}/dt = d\hat{\theta}/d\theta \cdot d\theta/dt = -\dot{\theta} \hat{r}$ we can express the above in terms of $\hat{r}$ and $\hat{\theta}$ as:

\[ ~ \vec{a} = \frac{d^2}{dt^2}(r \hat{r}) = (\ddot{r} - r (\dot{\theta})^2) \hat{r} + (r\ddot{\theta} + 2\dot{r}\dot{\theta}) \hat{\theta}. ~ \]

That is, in general, the acceleration has a radial component and a transversal component.

Kepler's second law says that the area increment over time is constant ($dA/dt$), but this area increment is approximated by the following wedge in polar coordinates: $dA = (1/2) r \cdot rd\theta$. We have then $dA/dt = r^2 \dot{\theta}$ is constant.

Differentiating, we have:

\[ ~ 0 = \frac{d(r^2 \dot{\theta})}{dt} = 2r\dot{r}\dot{\theta} + r^2 \ddot{\theta}, ~ \]

which is the tranversal component of the acceleration times $r$, as decomposed above. This means, that the acceleration of the planet is completely towards the Sun at the origin.

Kepler's first law, relates $r$ and $\theta$ through the polar equation of an ellipse:

\[ ~ r = \frac{p}{1 + \epsilon \cos(\theta)}. ~ \]

Expressing in terms of $p/r$ and differentiating in $t$ gives:

\[ ~ -\frac{p \dot{r}}{r^2} = -\epsilon\sin(\theta) \dot{\theta}. ~ \]


\[ ~ p\dot{r} = \epsilon\sin(\theta) r^2 \dot{\theta} = \epsilon \sin(\theta) C, ~ \]

For a constant $C$, used above, as the second laws implies $r^2 \dot{\theta}$ is constant. (This constant can be expressed in terms of parameters describing the ellipse.)

Differentiating again in $t$, gives:

\[ ~ p \ddot{r} = C\epsilon \cos(\theta) \dot{\theta} = C\epsilon \cos(\theta)\frac{C}{r^2}. ~ \]

So $\ddot{r} = (C^2 \epsilon / p) \cos{\theta} (1/r^2)$.

The radial acceleration from above is:

\[ ~ \ddot{r} - r (\dot{\theta})^2 = (C^2 \epsilon/p) \cos{\theta} \frac{1}{r^2} - r\frac{C^2}{r^4} = \frac{C^2}{pr^2}(\epsilon \cos(\theta) - \frac{p}{r}). ~ \]

Using $p/r = 1 + \epsilon\cos(\theta)$, we have the radial acceleration is $C^2/p \cdot (1/r^2)$. That is the acceleration, is proportional to the inverse square of the position, and using the relation between $F$, force, and acceleration, we see the force on the planet follows the inverse-square law of Newton.

Moving frames of reference

In the last example, it proved useful to represent vectors in terms of other unit vectors, in that case $\hat{r}$ and $\hat{\theta}$. Here we discuss a coordinate system defined intrinsically by the motion along the trajectory of a curve.

Let $r(t)$ be a smooth vector-valued function in $R^3$. It gives rise to a space curve, through its graph. This curve has tangent vector $\vec{r}'(t)$, indicating the direction of travel along $\vec{r}$ as $t$ increases. The length of $\vec{r}'(t)$ depends on the parameterization of $\vec{r}$, as for any increasing, differentiable function $s(t)$, the composition $\vec{r}(s(t))$ will have derivative, $\vec{r}'(s(t)) s'(t)$, having the same direction as $\vec{r}'(t)$ (at suitably calibrated points), but not the same magnitude, the factor of $s(t)$ being involved.

To discuss properties intrinsic to the curve, the unit vector is considered:

\[ ~ \hat{T}(t) = \frac{\vec{r}'(t)}{\|\vec{r}'(t)\|}. ~ \]

The function $\hat{T}(t)$ is the unit tangent vector. Now define the unit normal, $\hat{N}(t)$, by:

\[ ~ \hat{N}(t) = \frac{\hat{T}'(t)}{\| \hat{T}'(t) \|}. ~ \]

Since $\|\hat{T}(t)\| = 1$, a constant, it must be that $\hat{T}'(t) \cdot \hat{T}(t) = 0$, that is, the $\hat{N}$ and $\hat{T}$ are orthogonal.

Finally, define the binormal, $\hat{B}(t) = \hat{T}(t) \times \hat{N}(t)$. At each time $t$, the three unit vectors are orthogonal to each other. They form a moving coordinate system for the motion along the curve that does not depend on the parameterization.

We can visualize this, for example along a Viviani curve, as is done in a Wikipedia animation:

function viviani(t, a=1)
    [a*(1-cos(t)), a*sin(t), 2a*sin(t/2)]

Tangent(t) = viviani'(t)/norm(viviani'(t))
Normal(t) = Tangent'(t)/norm(Tangent'(t))
Binormal(t) = Tangent(t) × Normal(t)

p = plot(legend=false, aspect_ratio=:equal)
plot_parametric_curve!(viviani, -2pi, 2pi)

t0, t1 = -pi/3, pi/2 + 2pi/5
r0, r1 = viviani(t0), viviani(t1)
arrow!(r0, Tangent(t0)); arrow!(r0, Binormal(t0)); arrow!(r0, Normal(t0))
arrow!(r1, Tangent(t1)); arrow!(r1, Binormal(t1)); arrow!(r1, Normal(t1))

The curvature of a 3-dimensional space curve is defined by:

\[ ~ \kappa = \frac{\| r'(t) \times r''(t) \|}{\| r'(t) \|^3} ~ \]

For $2$ dimensional space curves, the same formula applies after embedding a $0$ third component. It can also be expressed directly as $(x'y''-x''y')/\|r'\|^3$.

This can also be defined as derivative of the tangent vector, $\hat{T}$, when the curve is parameterized by arc length, a topic still to be taken up. The vector $\vec{r}'(t)$ is the direction of motion, whereas $\vec{r}''(t)$ indicates how fast and in what direction this is changing. For curves with little curve in them, the two will be nearly parallel and the cross product small (reflecting the presence of $\cos(\theta)$ in the definition). For "curvy" curves, $\vec{r}''$ will be in a direction opposite of $\vec{r}'$ to the $\cos(\theta)$ term in the cross product will be closer to $1$.

Let $\vec{r}(t) = k \cdot \langle \cos(t), \sin(t), 0 \rangle$. This will have curvature:

@vars k positive=true
@vars t real=true
r1 = k * [cos(t), sin(t),0]
norm(diff.(r1,t) × diff.(r1,t,t)) / norm(diff.(r1,t))^3 |> simplify
ERROR: UndefVarError: simplify not defined

For larger circles (bigger $\|k\|$) there is less curvature. The limit being a line with curvature $0$.

If a curve is imagined to have a tangent "circle" (second order Taylor series approximation), then the curvature of that circle matches the curvature of the curve.

The torsion, $\tau$, of a space curve ($n=3$), is a measure of how sharply the curve is twisting out of the plane of curvature.

The torsion is defined for smooth curves by

\[ ~ \tau = \frac{(\vec{r}' \times \vec{r}'') \cdot \vec{r}'''}{\|\vec{r}' \times \vec{r}''\|^2}. ~ \]

For the torsion to be defined, the cross product $\vec{r}' \times \vec{r}''$ must be non zero, that is the two must not be parallel or zero.

Arc length

In Arc length there is a discussion of how to find the arc length of a parameterized curve in 2 dimensions. The general case is discussed by Destafano who shows if the curve $C$ is parameterized by a smooth function $\vec{r}(t)$ over an interval $I$, that the arc length of $C$ is $\int_I \| \vec{r}'(t) \| dt$.

If we associate $\vec{r}'(t)$ with the velocity, then this is the integral of the speed (the magnitude of the velocity).

Let $I=[a,b]$ and $s(t): [v,w] \rightarrow [a,b]$ such that $s$ is increasing and differentiable. Then $\vec{\phi} = \vec{r} \circ s$ will have have

\[ ~ \text{arc length} = \int_v^w \| \vec{\phi}'(t)\| dt = \int_v^w \| \vec{r}'(s(t))\| s'(t) dt = \int_a^b \| \vec{r}'(u) \| du, ~ \]

by a change of variable $u=s(t)$. As such the arc length is a property of the curve and not the parameterization of the curve.

For some parameterization, we can define

\[ ~ s(t) = \int_0^t \| \vec{r}'(u) \| du ~ \]

Then by the fundamental theorem of calculus, $s(t)$ is non-decreasing. If $\vec{r}'$ is assumed to be non-zero and continuous (regular), then $s(t)$ has a derivative and an inverse which is monotonic. Using the inverse function $s^{-1}$ to change variables ($\vec{\phi} = \vec{r} \circ s^{-1}$) has

\[ ~ \int_0^c \| \phi'(t) \| dt = \int_{s^{-1}(0)}^{s^{-1}(c)} \| \vec{r}'(u) \| du = s(s^{-1}(c)) - s(s^{-1}(0)) = c ~ \]

That is, the arc length from $[0,c]$ for $\phi$ is just $c$; the curve $C$ is parameterized by arc length.


Viviani's curve is the intersection of sphere of radius $a$ with a cylinder of radius $a$. A parameterization was given previously by:

function viviani(t, a=1)
    [a*(1-cos(t)), a*sin(t), 2a*sin(t/2)]
viviani (generic function with 2 methods)

The curve is traced out over the interval $[0, 4\pi]$. We try to find the arc-length:

@vars t a positive=true
speed = simplify(norm(diff.(viviani(t, a), t)))
integrate(speed, (t, 0, 4*PI))
ERROR: UndefVarError: simplify not defined

We see that the answer depends linearly on $a$, but otherwise is a constant expressed as an integral. We use QuadGk to provide a numeric answer for the case $a=1$:

using QuadGK
quadgk(t -> norm(viviani'(t)), 0, 4pi)
(15.280791156110851, 3.750801136348514e-8)

Very few parameterized curves admit a closed-form expression for parameterization by arc-length. Let's consider the helix expressed by $\langle a\cos(t), a\sin(t), bt\rangle$, as this does allow such a parameterization.

@vars a b t al positive=true
helix = [a*cos(t), a*sin(t), b*t]
speed = simplify( norm(diff.(helix, t)) )
s = integrate(speed, (t, 0, al))
ERROR: UndefVarError: simplify not defined

So s is a linear function. We can re-parameterize by:

eqn = subs.(helix, t.=> al/sqrt(a^2 + b^2))
\[ \left[ \begin{array}{r}a \cos{\left(\frac{al}{\sqrt{a^{2} + b^{2}}} \right)}\\a \sin{\left(\frac{al}{\sqrt{a^{2} + b^{2}}} \right)}\\\frac{al b}{\sqrt{a^{2} + b^{2}}}\end{array} \right] \]

To see that the speed, $\| \vec{\phi}' \|$, is constantly $1$:

ERROR: UndefVarError: simplify not defined

From this, we have the arc length is:

\[ ~ \int_0^t \| \vec{\phi}'(u) \| du = \int_0^t 1 du = t ~ \]

Parameterizing by arc-length is only explicitly possible for a few examples, however knowing it can be done in theory, is important. Some formulas are simplified, such as the tangent, normal, and binormal. Let $\vec{r}(s)$ be parameterized by arc length, then:

\[ ~ \hat{T}(s)= \vec{r}'(s) / \| \vec{r}'(s) \| = \vec{r}'(s),\quad \hat{N}(s) = \hat{T}'(s) / \| \hat{T}'(s)\| = \hat{T}'(s)/\kappa,\quad \hat{B} = \hat{T} \times \hat{N}, ~ \]

As before, but further, we have if $\kappa$ is the curvature and $\tau$ the torsion, these relationships expressing the derivatives with respect to $s$ in terms of the components in the frame:

\[ ~ \begin{array}{} \hat{T}'(s) &= &\kappa \hat{N}(s) &\\ \hat{N}'(s) &= -\kappa \hat{T}(s) & &+ \tau \hat{B}(s)\\ \hat{B}'(s) &= &-\tau \hat{N}(s) & \end{array} ~ \]

These are the Frenet-Serret formulas.


Continuing with our parameterization of a helix by arc length, we can compute the curvature and torsion by differentiation:

gamma = subs.(helix, t.=> al/sqrt(a^2 + b^2))   # gamma parameterized by arc length
@vars u positive=true
gamma = subs.(gamma, al .=> u)                  # u is arc-length parameterization
\[ \left[ \begin{array}{r}a \cos{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}\\a \sin{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}\\\frac{b u}{\sqrt{a^{2} + b^{2}}}\end{array} \right] \]
T = diff.(gamma, u)
\begin{equation*}\sqrt{\frac{a^{2} \sin^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{a^{2} + b^{2}} + \frac{a^{2} \cos^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{a^{2} + b^{2}} + \frac{b^{2}}{a^{2} + b^{2}}}\end{equation*}

The length is one, as the speed of a curve parameterized by arc-length is 1.

out = diff.(T, u)
\[ \left[ \begin{array}{r}- \frac{a \cos{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{a^{2} + b^{2}}\\- \frac{a \sin{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{a^{2} + b^{2}}\\0\end{array} \right] \]

This should be $\kappa \hat{N}$, so we do:

kappa = norm(out)
Norm = out/kappa
kappa |> simplify
ERROR: UndefVarError: simplify not defined

Interpreting, $a$ is the radius of the circle and $b$ how tight the coils are. If $a$ gets much larger than $b$, then the curvature is like $1/a$, just as with a circle. If $b$ gets very big, then the trajectory looks more stretched out and the curvature gets smaller.

To find the torsion, we find, $\hat{B}$ then differentiate:

B = T × Norm
out = diff.(B, u)
\[ \left[ \begin{array}{r}\frac{a b \cos{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2} \sqrt{\frac{a^{2} \sin^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2}} + \frac{a^{2} \cos^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2}}}}\\\frac{a b \sin{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2} \sqrt{\frac{a^{2} \sin^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2}} + \frac{a^{2} \cos^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2}}}}\\0\end{array} \right] \]

This looks complicated, as does Norm:

\[ \left[ \begin{array}{r}- \frac{a \cos{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right) \sqrt{\frac{a^{2} \sin^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2}} + \frac{a^{2} \cos^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2}}}}\\- \frac{a \sin{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right) \sqrt{\frac{a^{2} \sin^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2}} + \frac{a^{2} \cos^{2}{\left(\frac{u}{\sqrt{a^{2} + b^{2}}} \right)}}{\left(a^{2} + b^{2}\right)^{2}}}}\\0\end{array} \right] \]

However, the torsion, up to a sign, simplifies nicely:

norm(out) |> simplify
ERROR: UndefVarError: simplify not defined

Here, when $b$ gets large, the curve looks more and more "straight" and the torsion decreases. Similarly, if $a$ gets big, the torsion decreases.


Levi and Tabachnikov consider the trajectories of the front and rear bicycle wheels. Recall the notation previously used: $\vec{F}(t)$ for the front wheel, and $\vec{B}(t)$ for the rear wheel trajectories. Consider now their parameterization by arc length, using $u$ for the arc-length parameter for $\vec{F}$ and $v$ for $\vec{B}$. We define $\alpha(u)$ to be the steering angle of the bicycle. This can be found as the angle between the tangent vector of the path of $\vec{F}$ with the vector $\vec{B} - \vec{F}$. Let $\kappa$ be the curvature of the front wheel and $k$ the curvature of the back wheel.

Levi and Tabachnikov prove in their Proposition 2.4:

\[ ~ \begin{align} \kappa(u) &= \frac{d\alpha(u)}{du} + \frac{\sin(\alpha(u))}{a},\\ |\frac{du}{dv}| &= |\cos(\alpha)|, \quad \text{and}\\ k &= \frac{\tan(\alpha)}{a}. \end{align} ~ \]

The first equation relates the steering angle with the curvature. If the steering angle is not changed ($d\alpha/du=0$) then the curvature is constant and the motion is circular. It will be greater for larger angles (up to $\pi/2$). As the curvature is the reciprocal of the radius, this means the radius of the circular trajectory will be smaller. For the same constant steering angle, the curvature will be smaller for longer wheelbases, meaning the circular trajectory will have a larger radius. For cars, which have similar dynamics, this means longer wheelbase cars will take more room to make a U turn.

The second equation may be interpreted in ratio of arc lengths. The infinitesimal arc length of the rear wheel is proportional to that of the front wheel only scaled down by $\cos(\alpha)$. When $\alpha=0$ - the bike is moving in a straight line - and the two are the same. At the other extreme - when $\alpha=\pi/2$ - the bike must be pivoting on its rear wheel and the rear wheel has no arc length. This cosine, is related to the speed of the back wheel relative to the speed of the front wheel, which was used in the initial differential equation.

The last equation, relates the curvature of the back wheel track to the steering angle of the front wheel. When $\alpha=\pm\pi/2$, the rear-wheel curvature, $k$, is infinite, resulting in a cusp (no circle with non-zero radius will approximate the trajectory). This occurs when the front wheel is steered orthogonal to the direction of motion. As was seen in previous graphs of the trajectories, a cusp can happen for quite regular front wheel trajectories.

To derive the first one, we have previously noted that when a curve is parameterized by arc length, the curvature is more directly computed: it is the magnitude of the derivative of the tangent vector. The tangent vector is of unit length, when parametrized by arc length. This implies it's derivative will be orthogonal. If $\vec{r}(t)$ is a parameterization by arc length, then the curvature formula simplifies as:

\[ ~ \kappa(s) = \frac{\| \vec{r}'(s) \times \vec{r}''(s) \|}{\|\vec{r}'(s)\|^3} = \frac{\| \vec{r}'(s) \times \vec{r}''(s) \|}{1} = \| \vec{r}'(s) \| \| \vec{r}''(s) \| \sin(\theta) = 1 \| \vec{r}''(s) \| 1 = \| \vec{r}''(s) \|. ~ \]

So in the above, the curvature is $\kappa = \| \vec{F}''(u) \|$ and $k = \|\vec{B}''(v)\|$.

On the figure, the tangent vector $\vec{F}'(u)$ is drawn, along with this unit vector rotated by $\pi/2$. We call these, for convenience, $\vec{U}$ and $\vec{V}$. We have $\vec{U} = \vec{F}'(u)$ and $\vec{V} = -(1/\kappa) \vec{F}''(u)$.

The key decomposition, is to express a unit vector in the direction of the line segment, as the vector $\vec{U}$ rotated by $\alpha$ degrees. Mathematically, this is usually expressed in matrix notation, but more explicitly by

\[ ~ \langle \cos(\alpha) \vec{U}_1 - \sin(\alpha) \vec{U}_2, \sin(\alpha) \vec{U}_1 + \cos(\alpha) \vec{U}_2 = \vec{U} \cos(\alpha) - \vec{V} \sin(\alpha). ~ \]

With this, the mathematical relationship between $F$ and $B$ is just a multiple of this unit vector:

\[ ~ \vec{B}(u) = \vec{F}(u) - a \vec{U} \cos(\alpha) + a \vec{V} \sin(\alpha). ~ \]

It must be that the tangent line of $\vec{B}$ is parallel to $\vec{U} \cos(\alpha) + \vec{V} \sin(\alpha)$. To utilize this, we differentiate $\vec{B}$ using the facts that $\vec{U}' = \kappa \vec{V}$ and $\vec{V}' = -\kappa \vec{U}$. These coming from $\vec{U} = \vec{F}'$ and so it's derivative in $u$ has magnitude yielding the curvature, $\kappa$, and direction orthogonal to $\vec{U}$.

\[ ~ \begin{align} \vec{B}'(u) &= \vec{F}'(u) -a \vec{U}' \cos(\alpha) -a \vec{U} (-\sin(\alpha)) \alpha' +a \vec{V}' \sin(\alpha) + a \vec{V} \cos(\alpha) \alpha'\\ & = \vec{U} -a (\kappa) \vec{V} \cos(\alpha) + a \vec{U} \sin(\alpha) \alpha' + a (-\kappa) \vec{U} \sin(\alpha) + a \vec{V} \cos(\alpha) \alpha' \\ &= \vec{U} + a(\alpha' - \kappa) \sin(\alpha) \vec{U} + a(\alpha' - \kappa) \cos(\alpha)\vec{V}. \end{align} ~ \]

Extend the 2-dimensional vectors to 3-d, by adding a zero $z$ component, then:

\[ ~ \begin{align} \vec{0} &= (\vec{U} + a(\alpha' - \kappa) \sin(\alpha) \vec{U} + a(\alpha' - \kappa) \cos(\alpha)\vec{V}) \times (-\vec{U} \cos(\alpha) + \vec{V} \sin(\alpha)) \\ &= (\vec{U} \times \vec{V}) \sin(\alpha) + a(\alpha' - \kappa) \sin(\alpha) \vec{U} \times \vec{V} \sin(\alpha)) - a(\alpha' - \kappa) \cos(\alpha)\vec{V} \times \vec{U} \cos(\alpha) \\ &= (\sin(\alpha) + a(\alpha'-\kappa) \sin^2(\alpha) + a(\alpha'-\kappa) \cos^2(\alpha)) \vec{U} \times \vec{V} \\ &= (\sin(\alpha) + a (\alpha' - \kappa)) \vec{U} \times \vec{V}. \end{align} ~ \]

The terms $\vec{U} \times\vec{U}$ and $\vec{V}\times\vec{V}$ being $\vec{0}$, due to properties of the cross product. This says the scalar part must be $0$, or

\[ ~ \frac{\sin(\alpha)}{a} + \alpha' = \kappa. ~ \]

As for the second equation, from the expression for $\vec{B}'(u)$, after setting $a(\alpha'-\kappa) = -\sin(\alpha)$:

\[ ~ \begin{align} \|\vec{B}'(u)\|^2 &= \| (1 -\sin(\alpha)\sin(\alpha)) \vec{U} -\sin(\alpha)\cos(\alpha) \vec{V} \|^2\\ &= \| \cos^2(\alpha) \vec{U} -\sin(\alpha)\cos(\alpha) \vec{V} \|^2\\ &= (\cos^2(\alpha))^2 + (\sin(\alpha)\cos(\alpha))^2\quad\text{using } \vec{U}\cdot\vec{V}=0\\ &= \cos^2(\alpha)(\cos^2(\alpha) + \sin^2(\alpha))\\ &= \cos^2(\alpha). \end{align} ~ \]

From this $\|\vec{B}(u)\| = |\cos(\alpha)\|$. But $1 = \|d\vec{B}/dv\| = \|d\vec{B}/du \| \cdot |du/dv|$ and $|dv/du|=|\cos(\alpha)|$ follows.

Example: evolutes and involutes

Following Fuchs we discuss a geometric phenomenon known and explored by Huygens, and likely earlier. We stick to the two-dimensional case, Fuchs extends this to three dimensions. The following figure