Before defining the derivative of a function, let's begin with two motivating examples.
Imagine motoring along down Highway 61, leaving Minnesota on the way to New Orleans; though lost in listening to music, you are still mindful of the speedometer and odometer, both prominently placed on the dashboard of the car.
The speedometer reads 60 miles per hour. What is the odometer doing? Besides recording total distance traveled, it is incrementing dutifully every hour by 60 miles. Why? Well, the well-known formula relating distance, time, and rate of travel is
\[ ~ \text{distance} = \text{ rate } \times \text{ time.} ~ \]
If the rate is a constant $60$ miles/hour, then in one hour the distance traveled is 60 miles.
Of course, the odometer isn't just incrementing once per hour, it is incrementing once every 1/10th of a mile. How much time does that take? Well, we would need to solve $1/10=60 \cdot t$ which means $t=1/600$ hours, better known as once every 6 seconds.
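The arithmetic can be checked with Julia's exact rational numbers. This is only a small sketch; the variable names `rate`, `tick`, `t_hours`, and `t_seconds` are chosen here for illustration:

```julia
# Solve 1/10 = 60 * t for t, using exact rationals (`//`) to avoid rounding
rate = 60                    # miles per hour
tick = 1 // 10               # miles per odometer increment
t_hours = tick / rate        # 1//600 of an hour
t_seconds = t_hours * 3600   # convert hours to seconds
println(t_seconds)
```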
In mathematical notation, this is $x(t) = v\cdot t$, where $x$ is the position at time $t$, $v$ is the velocity, and $t$ is the time traveled in hours. A simple graph of the first three hours of travel would show:
using CalculusWithJulia
using Plots
position(t) = 60 * t
plot(position, 0, 3)
Oh no, we hit traffic. In the next 30 minutes we only traveled 15 miles. We were so busy looking out for traffic, the speedometer was not checked. What would the average speed have been? Though in the 30 minutes, the displayed speed may have varied, the average speed would simply be the change in distance over the change in time, or $\Delta x / \Delta t$. That is
15/(1/2)
30.0
Now suppose that after $6$ hours of travel the GPS in the car gives us a readout of distance traveled as a function of time. The graph looks like this:
We can see with some effort that the slope is steady for the first three hours, is slightly less between $3$ and $3.5$ hours, then is a bit steeper for the next half hour. After that, it is flat for about half an hour, then the slope continues on with the same value as in the first 3 hours. What does that say about our speed during our trip?
Based on the graph, what was the average speed over the first three hours? Well, we traveled 180 miles, and took 3 hours:
180/3
60.0
What about the next half hour? Squinting shows the amount traveled was 15 miles (195 - 180) and it took 1/2 an hour:
15/(1/2)
30.0
And the half hour after that? The average speed is found from the distance traveled, 37.5 miles, divided by the time, 1/2 hour:
37.5 / (1/2)
75.0
Okay, so there was some speeding involved.
The next half hour the car did not move. What was the average speed? Well the change in position was 0, but the time was 1/2 hour, so the average was 0.
Perhaps a graph of the speed is a bit more clear. We can do this based on the above:
function speed(t)
    if 0 < t <= 3
        60
    elseif 3 < t <= 3.5
        30
    elseif 3.5 < t <= 4
        75
    elseif 4 < t <= 4.5
        0
    else
        60
    end
end
plot(speed, 0, 6)
The jumps, as discussed before, are artifacts of the graphing algorithm. What is interesting is that we could have derived the graph of `speed` from that of `x` by just finding the slopes of the line segments, and we could have derived the graph of `x` from that of `speed`, just using the simple formula relating distance, rate, and time.
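Both directions can be sketched numerically. The waypoints below are an assumption pieced together from the averages worked out above (180 miles at 3 hours, 15 more by 3.5 hours, 37.5 more by 4 hours, a half-hour stop, then 60 miles per hour until hour 6); the names `ts`, `xs`, `speeds`, and `positions` are illustrative only:

```julia
# Waypoints read off the trip described above: time in hours, position in miles
ts = [0, 3, 3.5, 4, 4.5, 6]
xs = [0, 180, 195, 232.5, 232.5, 322.5]

# Speed on each segment is a slope: change in position over change in time
speeds = diff(xs) ./ diff(ts)
println(speeds)

# Going the other way: distance = rate * time on each segment, accumulated
positions = xs[1] .+ cumsum(speeds .* diff(ts))
println(positions)
```

The first computation recovers the five speeds used in the `speed` function; the second rebuilds the positions at the end of each segment.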
We were pretty loose with some key terms. There is a distinction between "speed" and "velocity": speed is the absolute value of velocity. Velocity incorporates a direction as well as a magnitude. Similarly, distance traveled and change in position are not the same thing when there is backtracking involved. The total distance traveled is computed with the speed; the change in position is computed with the velocity. When there is no change of sign, it is a bit more natural, perhaps, to use the language of speed and distance.
One of history's most famous experiments was performed by Galileo where he rolled balls down inclined ramps, making note of distance traveled with respect to time. As Galileo had no ultra-accurate measuring device, he needed to slow movement down by controlling the angle of the ramp. With this, he could measure units of distance per units of time. (Click through to Galileo and Perspective Dauben.)
Suppose that, no matter what the incline was, Galileo observed the following: measured in units of the distance traveled in the first second, the distance traveled in each subsequent second was $3$ times, then $5$ times, then $7$ times, ... as much. This table summarizes.
| t | delta | distance |
|---|-------|----------|
| 0 | 0 | 0 |
| 1 | 1 | 1 |
| 2 | 3 | 4 |
| 3 | 5 | 9 |
| 4 | 7 | 16 |
| 5 | 9 | 25 |
A graph of distance versus time could be found by interpolating between the measured points:
ts = [0, 1, 2, 3, 4, 5]
xs = [0, 1, 4, 9, 16, 25]
plot(ts, xs)
The graph looks almost quadratic. What would the following questions have yielded?
What is the average speed between $0$ and $3$?
(9-0) / (3-0) # (xs[4] - xs[1]) / (ts[4] - ts[1])
3.0
What is the average speed between $2$ and $3$?
(9-4) / (3-2) # (xs[4] - xs[3]) / (ts[4] - ts[3])
5.0
From the graph, we can tell that the slope of the line connecting $(2,4)$ and $(3,9)$ will be greater than that connecting $(0,0)$ and $(3,9)$. In fact, given the shape of the graph (concave up), the line connecting $(0,0)$ with any point on the graph will have a slope less than or equal to that of the line segment ending at that point.
The average speed between $k$ and $k+1$ for this graph is:
xs[2] - xs[1], xs[3] - xs[2], xs[4] - xs[3], xs[5] - xs[4]
(1, 3, 5, 7)
We see it increments by $2$. The acceleration is the rate of change of speed. We see the rate of change of speed is constant, as the speed increments by 2 each time unit.
Based on this, and given Galileo's insight, it appears the acceleration will be a constant and the position as a function of time will be quadratic.
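The pattern in the table can be checked with repeated differences. This is a minimal sketch using Julia's built-in `diff` on vectors; `speeds` and `accels` are names introduced here:

```julia
ts = [0, 1, 2, 3, 4, 5]
xs = [0, 1, 4, 9, 16, 25]

speeds = diff(xs)        # average speed over each one-second interval
accels = diff(speeds)    # change in speed from one interval to the next

println(speeds)          # [1, 3, 5, 7, 9]
println(accels)          # [2, 2, 2, 2]
println(xs == ts .^ 2)   # true: the data match x(t) = t^2
```

Constant second differences are what one would expect from a quadratic position function.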
In the above examples, we see that the average speed is computed using the slope formula. This can be generalized for any univariate function $f(x)$:
The average rate of change between $a$ and $b$ is $(f(b) - f(a)) / (b - a)$. It is typical to express this as $\Delta y/ \Delta x$, where $\Delta$ means "change".
Geometrically, this is the slope of the line connecting the points $(a, f(a))$ and $(b, f(b))$. This line is called a secant line, which is just a line intersecting two specified points on a curve.
Rather than parameterize this problem using $a$ and $b$, we let $c$ and $c+h$ represent the two values for $x$; then the secant-line-slope formula becomes
\[ ~ m = \frac{f(c+h) - f(c)}{h}. ~ \]
The slope of the secant line represents the average rate of change over a given period, $h$. What if this rate is so variable, that it makes sense to take smaller and smaller periods $h$? In fact, what if $h$ goes to $0$?
The graphic suggests that the slopes of the secant line converge to the slope of a "tangent" line. That is, for a given $c$, this limit exists:
\[ ~ \lim_{h \rightarrow 0} \frac{f(c+h) - f(c)}{h}. ~ \]
We'll define the tangent line at $(c, f(c))$ to be the line through the point with the slope from the limit above, provided that limit exists. Informally, the tangent line is the line through the point that best approximates the function.
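A quick numeric experiment can make the convergence of secant slopes concrete. The sketch below assumes the sample function $f(x) = x^2$ and the point $c = 1$, neither of which comes from the text above; `secant_slope` is a helper named here for illustration. Algebraically the slope is $2 + h$, so the values should settle toward $2$:

```julia
# Slope of the secant line through (c, f(c)) and (c + h, f(c + h))
secant_slope(f, c, h) = (f(c + h) - f(c)) / h

f(x) = x^2
c = 1

# Shrink h and watch the slopes settle toward 2
for h in (0.5, 0.1, 0.01, 0.001)
    println("h = ", h, "  slope = ", secant_slope(f, c, h))
end
```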
The tangent line is not just a line that intersects the graph in one point, nor does it need to intersect the curve in just one point.
This last point was certainly not obvious at first. Barrow, who had Newton as a pupil and was the first to sketch a proof of part of the Fundamental Theorem of Calculus, understood a tangent line to be a line that intersects a curve at only one point.
What is the slope of the tangent line to $f(x) = \sin(x)$ at $c=0$?
We need to compute the limit of $(\sin(c+h) - \sin(c))/h$, which is the limit as $h$ goes to $0$ of $\sin(h)/h$. We know this to be 1.
f(x) = sin(x)
c = 0
tl(x) = f(c) + 1 * (x - c)
plot(f, -pi/2, pi/2)
plot!(tl, -pi/2, pi/2)
The limit of the slope of the secant line gives an operation: for each $c$ in the domain of $f$ there is a number (the slope of the tangent line) or it does not exist. That is, there is a derived function from $f$. Call this function the derivative of $f$. There are many notations for this; here we use the "prime" notation:
\[ ~ f'(x) = \lim_{h \rightarrow 0} \frac{f(x+h) - f(x)}{h}. ~ \]
The limit above is identical, only it uses $x$ instead of $c$ to emphasize that we are thinking of a function now, and not just a value at a point.
The power rule. What is the derivative of the monomial $f(x) = x^n$? We need to look at $(x+h)^n - x^n$ for positive, integer-valued $n$. Let's look at a case, $n=5$:
@vars x h real=true
n = 5
ex = expand((x+h)^n - x^n)
All terms have an `h` in them, so we cancel it out:
cancel(ex/h, h)
We see the lone term `5x^4` without an $h$, so as we let $h$ go to $0$, this will be the limit. That is, $f'(x) = 5x^4$.
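For reference, the same expansion and cancellation can be written out by hand using the binomial coefficients $1, 5, 10, 10, 5, 1$ for $n=5$:

\[ ~ \frac{(x+h)^5 - x^5}{h} = \frac{5x^4 h + 10x^3 h^2 + 10x^2 h^3 + 5x h^4 + h^5}{h} = 5x^4 + 10x^3 h + 10x^2 h^2 + 5x h^3 + h^4. ~ \]

Every term after the first still carries a factor of $h$, so only $5x^4$ survives as $h$ goes to $0$.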
For general integer-valued, positive $n$, the binomial theorem gives an expansion $(x+h)^n = x^n + nx^{n-1}\cdot h^1 + \frac{n\cdot(n-1)}{2}x^{n-2}\cdot h^2 + \cdots$. Subtracting $x^n$ and then dividing by $h$ leaves just the term $nx^{n-1}$ without a power of $h$, so the limit, in general, is just this term. That is, $[x^n]' = nx^{n-1}$.
The case $n=0$ is not special: the above formula also applies, as $x^0$ is the constant $1$, and all constant functions have a derivative of $0$ at all $x$. We will see that, in general, the power rule applies for any $n$ where $x^n$ is defined.
What is the derivative of $f(x) = \sin(x)$? We know that $f'(0)= 1$ by an earlier example; here we solve in general.
We need to consider the difference $\sin(x+h) - \sin(x)$:
ex = sympy.expand_trig(sin(x+h) - sin(x)) # expand_trig is not exposed in `SymPy`
That used the formula $\sin(x+h) = \sin(x)\cos(h) + \sin(h)\cos(x)$.
We could then rearrange the secant line slope formula to become:
\[ ~ \cos(x) \cdot \frac{\sin(h)}{h} + \sin(x) \cdot \frac{\cos(h) - 1}{h} ~ \]
and take a limit. If the answer isn't clear, we can let `SymPy` do this work:
limit((sin(x+h) - sin(x))/ h, h=>0)
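Since $\sin(h)/h$ goes to $1$ and $(\cos(h) - 1)/h$ goes to $0$, the rearranged formula points to $\cos(x)$ as the limit. A numeric spot check, independent of SymPy, might look like the following; the sample point `x0` and the step `h0` are arbitrary choices made here:

```julia
# Difference quotient of sin at x0, compared with cos(x0)
x0 = 0.7                       # an arbitrary sample point
for h in (1e-1, 1e-3, 1e-5)
    dq = (sin(x0 + h) - sin(x0)) / h
    println("h = ", h, "  quotient = ", dq, "  cos(x0) = ", cos(x0))
end

# The two limits used in the rearranged formula
h0 = 1e-5
println(sin(h0) / h0)          # close to 1
println((cos(h0) - 1) / h0)    # close to 0
```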
Let's see what the derivative of $\log(x)$ is. We have
\[ ~ \frac{\log(x+h) - \log(x)}{h} = \frac{1}{h}\log(\frac{x+h}{x}) = \log((1+h/x)^{1/h}). ~ \]
Earlier we saw that the limit as $u$ goes to $0$ of $f(u) = (1 + u)^{1/u}$ is $e$. Re-expressing the above, we can get $1/x \cdot \log(f(h/x))$. The limit as $h$ goes to $0$ of this is found from the composition rules for limits: as $\lim_{h \rightarrow 0} f(h/x) = e$, and since $\log(e)$ is $1$, we get that this expression has a limit of $1/x$.
We verify through:
limit((log(x+h) - log(x))/h, h=>0)
There are several different notations for derivatives. Some are historical, some just add flexibility. We use the prime notation of Lagrange: $f'(x)$, $u'$ and $[\text{expr}]'$, where the first emphasizes that the derivative is a function with a value at $x$, the second emphasizes that the derivative operates on functions, and the last emphasizes that we are taking the derivative of some expression (the "rule" part of an unnamed function).
There are many other notations:
The Leibniz notation uses the infinitesimals: $dy/dx$ to relate to $\Delta y/\Delta x$. This notation is very common, and especially useful when more than one variable is involved. `SymPy` uses Leibniz notation in some of its output, expressing something such as:
The notation $\big|$ on the right-hand side separates the tasks of finding the derivative and evaluating the derivative at a specific value.
Euler used `D` for the operator `D(f)`. This notation was initially used by Arbogast, and it is appropriated by an operator in the `CalculusWithJulia` package.
Newton used a "dot" $\dot{x}(t)$, which is still widely used in physics to indicate a derivative in time.
We could proceed in a similar manner to find other derivatives, but let's not. If we have a function $f(x) = x^5 \sin(x)$, it would be nice to leverage our previous work on the derivatives of $f(x) =x^5$ and $g(x) = \sin(x)$, rather than derive an answer from scratch.
As with limits and continuity, it proves very useful to consider rules that make the process of finding derivatives of combinations of functions a matter of combining derivatives of the individual functions in some manner.
Let's consider $k(x) = a\cdot f(x) + b\cdot g(x)$, what is its derivative? That is, in terms of $f$, $g$ and their derivatives, can we express $k'(x)$?
We can rearrange $(k(x+h) - k(x))$ as follows:
\[ ~ (a\cdot f(x+h) + b\cdot g(x+h)) - (a\cdot f(x) + b \cdot g(x)) = a\cdot (f(x+h) - f(x)) + b \cdot (g(x+h) - g(x)). ~ \]
Dividing by $h$, we see that this becomes
\[ ~ a\cdot \frac{f(x+h) - f(x)}{h} + b \cdot \frac{g(x+h) - g(x)}{h} \rightarrow a\cdot f'(x) + b\cdot g'(x). ~ \]
This gives two rules: the derivative of a constant times a function is the constant times the derivative of the function; and the derivative of a sum of functions is the sum of the derivatives of the functions.
Other rules can be similarly derived. `SymPy` can give us them as well. Here we define two symbolic functions `u` and `v` and let `SymPy` derive a formula for the derivative of a product of functions:
u,v = SymFunction("u,v") # make symbolic functions
f(x) = u(x) * v(x)
limit((f(x+h) - f(x))/h, h=>0)
The output uses some new notation to represent that the derivative of $u(x) \cdot v(x)$ is $u$ times the derivative of $v$ plus $v$ times the derivative of $u$. A common shorthand is $[uv]' = u'v + uv'$.
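The product rule can also be derived by hand, in the same way as the sum rule: add and subtract $u(x)v(x+h)$ in the numerator of the difference quotient, then regroup.

\[ ~ \frac{u(x+h)v(x+h) - u(x)v(x)}{h} = \frac{u(x+h) - u(x)}{h} \cdot v(x+h) + u(x) \cdot \frac{v(x+h) - v(x)}{h} \rightarrow u'(x) v(x) + u(x) v'(x). ~ \]

The limit uses the fact that $v(x+h) \rightarrow v(x)$, which holds because a differentiable function is continuous.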
The derivative of $f(x) = u(x)/v(x)$, a ratio of functions, can be similarly computed. The result will be $[u/v]' = (u'v - uv')/v^2$.
Finally, the derivative of a composition of functions can be computed. This gives a rule called the chain rule. Before deriving, let's give a slight motivation.
Consider the output of a factory for some widget. It depends on two steps: an initial manufacturing step and a finishing step. The number of employees is important in how much is initially manufactured. Suppose $x$ is the number of employees and $g(x)$ is the amount initially manufactured. Adding more employees increases the amount made by the made-up rule $g(x) = \sqrt{x}$. The finishing step depends on how much is made by the employees. If $y$ is the amount made, then $f(y)$ is the number of widgets finished. Suppose for some reason that $f(y) = y^2.$
How many widgets are made as a function of employees? The composition $u(x) = f(g(x))$ would provide that.
What is the effect of adding employees on the rate of output of widgets? In this specific case we know the answer, as $(f \circ g)(x) = x$, so the rate is just $1$.
In general, we want to express $\Delta f / \Delta x$ in a form so that we can take a limit.
But what do we know? We know $\Delta g / \Delta x$ and $\Delta f/\Delta y$. Using $y=g(x)$, this suggests that we might have luck with the right side of this equation:
\[ ~ \frac{\Delta f}{\Delta x} = \frac{\Delta f}{\Delta y} \cdot \frac{\Delta y}{\Delta x}. ~ \]
Interpreting this, we get the average rate of change in the composition can be thought of as a product: The average rate of change of the initial step ($\Delta y/ \Delta x$) times the average rate of the change of the second step evaluated not at $x$, but at $y$, $\Delta f/ \Delta y$.
Re-expressing using derivative notation with $h$ would be:
\[ ~ \frac{f(g(x+h)) - f(g(x))}{h} = \frac{f(g(x+h)) - f(g(x))}{g(x+h) - g(x)} \cdot \frac{g(x+h) - g(x)}{h}. ~ \]
The left hand side will converge to the derivative of $u(x)$ or $[f(g(x))]'$.
The right most part of the right side would have a limit $g'(x)$, were we to let $h$ go to $0$.
It isn't obvious, but the left part of the right side has the limit $f'(g(x))$. This would be clear if only $g(x+h) = g(x) + h$, for then the expression would be exactly the limit expression with $c=g(x)$. But, alas, except to some hopeful students and in some special cases, it is definitely not the case in general that $g(x+h) = g(x) + h$; that right parenthesis actually means something. However, it is nearly the case that $g(x+h) = g(x) + kh$ for some $k$, and this can be used to formulate a proof.
We can verify this using `SymPy`:
limit(( u(v(x+h)) - u(v(x)) ) / (v(x+h) - v(x)), h=>0)
Combined, we would end up with:
The chain rule: $[f(g(x))]' = f'(g(x)) \cdot g'(x)$. That is the derivative of the outer function evaluated at the inner function times the derivative of the inner function.
To see that this works in our specific case, we assume the general power rule that $[x^n]' = n x^{n-1}$ to get: $g'(x) = (1/2)x^{-1/2}$, $f'(x)=2x$, and $f'(g(x)) = 2(\sqrt{x})$. Together, the product is:
\[ ~ 2\sqrt{x} \cdot (1/2) \cdot 1/\sqrt{x} = 1 ~ \]
Which is the same as the derivative of $x$ found by first evaluating the composition.
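The widget example can be spot-checked numerically. In this sketch, `gp` and `fp` are hand-coded derivatives named here for illustration, and `chain` forms the product $f'(g(x)) \cdot g'(x)$ from the chain rule:

```julia
g(x) = sqrt(x)                # amount initially manufactured by x employees
f(y) = y^2                    # widgets finished from an amount y

gp(x) = 1 / (2 * sqrt(x))     # g'(x), from the power rule with n = 1/2
fp(y) = 2 * y                 # f'(y)

chain(x) = fp(g(x)) * gp(x)   # f'(g(x)) * g'(x)

println(chain(4.0))           # 1.0
println(chain(9.0))
```

Any positive number of employees should give a rate of $1$, up to floating-point rounding.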
A function is differentiable at $a$ if the following limit exists: $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$. This can be re-expressed as $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$, where $\epsilon_f(h) \rightarrow 0$ as $h\rightarrow 0$. Then, we have:
\[ ~ g(a+h) = g(a) + g'(a)h + \epsilon_g(h) h = g(a) + h', ~ \]
Where $h' = (g'(a) + \epsilon_g(h))h \rightarrow 0$ as $h \rightarrow 0$ will be used to simplify the following:
\[ ~ \begin{align} f(g(a+h)) - f(g(a)) &= f(g(a) + g'(a)h + \epsilon_g(h)h) - f(g(a)) \\ &= f(g(a)) + f'(g(a)) (g'(a)h + \epsilon_g(h)h) + \epsilon_f(h')(h') - f(g(a))\\ &= f'(g(a)) g'(a)h + f'(g(a))(\epsilon_g(h)h) + \epsilon_f(h')(h'). \end{align} ~ \]
Rearranging:
\[ ~ f(g(a+h)) - f(g(a)) - f'(g(a)) g'(a) h = f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h') = (f'(g(a)) \epsilon_g(h) + \epsilon_f(h')(g'(a) + \epsilon_g(h)))h = \epsilon(h)h, ~ \]
where $\epsilon(h)$ combines the above terms which go to zero as $h\rightarrow 0$ into one. This is the alternative definition of the derivative, showing $(f\circ g)'(a) = f'(g(a)) g'(a)$ when $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$.
Find the derivative of $x^5 \cdot \sin(x)$.
This is a product of functions, using $[u\cdot v]' = u'v + uv'$ we get:
\[ ~ 5x^4 \cdot \sin(x) + x^5 \cdot \cos(x) ~ \]
Find the derivative of $x^5 / \sin(x)$.
This is a quotient of functions. Using $[u/v]' = (u'v - uv')/v^2$ we get
\[ ~ (5x^4 \cdot \sin(x) - x^5 \cdot \cos(x)) / (\sin(x))^2. ~ \]
Find the derivative of $\sin(x^5)$. This is a composition of functions $u(v(x))$ with $v(x) = x^5$. The chain rule says find the derivative of $u$ ($\cos(x)$) and evaluate at $v(x)$ ($\cos(x^5)$) then multiply by the derivative of $v$:
\[ ~ \cos(x^5) \cdot 5x^4. ~ \]
Similarly, but differently, find the derivative of $\sin(x)^5$. Now $v(x) = \sin(x)$, so the derivative of $u(x)$ ($5x^4$) evaluated at $v(x)$ is $5(\sin(x))^4$ so multiplying by $v'$ gives:
\[ ~ 5(\sin(x))^4 \cdot \cos(x) ~ \]
We can verify these with `SymPy`. Rather than take a limit, we will use `SymPy`'s `diff` function to compute derivatives.
diff(x^5 * sin(x))
diff(x^5/sin(x))
diff(sin(x^5))
and finally,
diff(sin(x)^5)
The `diff` function can be called as `diff(ex)` when there is just one free variable, as in the above examples; as `diff(ex, var)` when there are parameters in the expression; or as `diff(f)`, where `f` is the name of a univariate function.
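The four hand-computed derivatives can also be sanity-checked numerically, without any symbolic machinery. The sketch below uses a central-difference approximation; `numderiv`, the sample point `x0`, and the `checks` list are all names and choices introduced here, not part of `SymPy`:

```julia
# Central-difference approximation to f'(x); a rough numeric check only
numderiv(f, x; h = 1e-6) = (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.2   # arbitrary sample point where sin(x0) is not zero

# Pairs of (function, derivative found by the rules above)
checks = [
    (x -> x^5 * sin(x), x -> 5 * x^4 * sin(x) + x^5 * cos(x)),
    (x -> x^5 / sin(x), x -> (5 * x^4 * sin(x) - x^5 * cos(x)) / sin(x)^2),
    (x -> sin(x^5),     x -> cos(x^5) * 5 * x^4),
    (x -> sin(x)^5,     x -> 5 * sin(x)^4 * cos(x)),
]

for (fn, dfn) in checks
    println(isapprox(numderiv(fn, x0), dfn(x0); rtol = 1e-6))
end
```

Agreement at one point is not a proof, but a mismatch would flag an algebra slip.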
Let's see that the derivative of $e^x$ is just itself.
\[ ~ \frac{e^{x+h} - e^x}{h} = \frac{e^x \cdot(e^h - 1)}{h}. ~ \]
If we know that $\lim_{h \rightarrow 0}(e^h - 1)/h = 1$, we get $[e^x]' = e^x$; that is, it is a function satisfying $f'=f$.
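The limit of $(e^h - 1)/h$ is assumed rather than shown here. A few numeric values, computed in the sketch below for some arbitrarily chosen step sizes, are at least consistent with a limit of $1$:

```julia
# Evaluate (e^h - 1)/h for shrinking h; the values approach 1
for h in (1e-1, 1e-2, 1e-4, 1e-6)
    println("h = ", h, "  ratio = ", (exp(h) - 1) / h)
end
```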
Suppose we knew that $\log(x)$ had derivative of $1/x$, but didn't know the derivative of $e^x$. From their inverse relation, we have: $x=\log(e^x)$, so taking derivatives of both sides would yield:
\[ ~ 1 = (\frac{1}{e^x}) \cdot [e^x]'. ~ \]
Or solving, $[e^x]' = e^x$. This is a general strategy to find the derivative of an inverse function.
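Written generally, and using $g$ here to denote the inverse function of $f$ (so that $f(g(x)) = x$), the same steps give:

\[ ~ f'(g(x)) \cdot g'(x) = 1, \quad\text{so}\quad g'(x) = \frac{1}{f'(g(x))}, ~ \]

provided $f'(g(x)) \neq 0$. With $f(y) = \log(y)$ and $g(x) = e^x$, we have $f'(y) = 1/y$, and the formula returns $g'(x) = e^x$ as before.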
The general power rule: For any $n$, not just integer values, we can re-express $x^n$ using $e$: $x^n = e^{n \log(x)}$. Now the chain rule can be applied:
\[ ~ [x^n]' = [e^{n\log(x)}]' = e^{n\log(x)} \cdot (n \frac{1}{x}) = n x^n \cdot \frac{1}{x} = n x^{n-1}. ~ \]
Find the derivative of $f(x) = x^3 (1-x)^2$ using either the product rule or the sum rule.
The product rule applies when $f$ is expressed as $f=u\cdot v$. With $u(x)=x^3$ and $v(x)=(1-x)^2$ we get:
\[ ~ u'(x) = 3x^2, \quad v'(x) = 2 \cdot (1-x)^1 \cdot (-1), ~ \]
the last by the chain rule. Combining with $u' v + u v'$ we get: $f'(x) = (3x^2)\cdot (1-x)^2 + x^3 \cdot (-2) \cdot (1-x)$.
Otherwise, the polynomial can be expanded to give $f(x)=x^5-2x^4+x^3$, which has derivative $f'(x) = 5x^4 - 8x^3 + 3x^2$.
Find the derivative of $f(x) = (x-1)(x-2)(x-3)(x-4)(x-5)$.
We could expand this, as above, but instead will use the product rule. Let's work symbolically first and treat the case of $3$ products:
\[ ~ [u\cdot v\cdot w]' =[ u \cdot (vw)]' = u' (vw) + u [vw]' = u'(vw) + u[v' w + v w'] = u' vw + u v' w + uvw'. ~ \]
This pattern generalizes, clearly, to $~ [f_1\cdot f_2 \cdots f_n]' = f_1' f_2 \cdots f_n + f_1 \cdot f_2' \cdot (f_3 \cdots f_n) + \dots + f_1 \cdots f_{n-1} \cdot f_n'. ~$
There are $n$ terms, and in each exactly one of the $f_i$ is differentiated. Were we to multiply the top and bottom of the $i$th term by $f_i$, each term would look like $f \cdot f_i'/f_i$.
With this, we can proceed. Each factor $x-i$ has derivative $1$, so, with $f$ as above, the answer is $f'(x) = f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3) + f(x)/(x-4) + f(x)/(x-5)$.
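As a cross-check, the sum-of-quotients form can be compared with the derivative of the expanded polynomial. In this sketch `fp` and `fp_expanded` are names introduced here, and the expanded coefficients were worked out by hand for this comparison:

```julia
# f(x) = (x-1)(x-2)(x-3)(x-4)(x-5)
f(x) = prod(x - i for i in 1:5)

# Derivative from the generalized product rule: sum of f(x)/(x - i)
# (usable away from the roots x = 1, ..., 5, where it divides by zero)
fp(x) = sum(f(x) / (x - i) for i in 1:5)

# Derivative of the expanded polynomial
# x^5 - 15x^4 + 85x^3 - 225x^2 + 274x - 120
fp_expanded(x) = 5x^4 - 60x^3 + 255x^2 - 450x + 274

println(fp(0.0), "  ", fp_expanded(0.0))
println(isapprox(fp(2.5), fp_expanded(2.5)))
```

The quotient form is convenient for the derivation, though the expanded form is the safer one to evaluate near a root.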
Find the derivative of $f(x) = x \cdot e^{-x^2}$.
Using the product rule and then the chain rule, we have:
\[ ~ \begin{align} f'(x) &= [x \cdot e^{-x^2}]'\\ &= [x]' \cdot e^{-x^2} + x \cdot [e^{-x^2}]'\\ &= 1 \cdot e^{-x^2} + x \cdot (e^{-x^2}) \cdot [-x^2]'\\ &= e^{-x^2} + x \cdot e^{-x^2} \cdot (-2x)\\ &= e^{-x^2} (1 - 2x^2). \end{align} ~ \]
Find the derivative of $f(x) = e^{-ax} \cdot \sin(x)$.
Using the product rule and then the chain rule, we have:
\[ ~ \begin{align} f'(x) &= [e^{-ax} \cdot \sin(x)]'\\ &= [e^{-ax}]' \cdot \sin(x) + e^{-ax} \cdot [\sin(x)]'\\ &= e^{-ax} \cdot [-ax]' \cdot \sin(x) + e^{-ax} \cdot \cos(x)\\ &= e^{-ax} \cdot (-a) \cdot \sin(x) + e^{-ax} \cos(x)\\ &= e^{-ax}(\cos(x) - a\sin(x)). \end{align} ~ \]
This table summarizes the rules of derivatives that allow derivatives of more complicated expressions to be computed with the derivatives of their pieces.
| Name | Rule |
|:-----|:-----|
| Power rule | $[x^n]' = n\cdot x^{n-1}$ |
| constant | $[cf(x)]' = c \cdot f'(x)$ |
| sum/difference | $[f(x) \pm g(x)]' = f'(x) \pm g'(x)$ |
| product | $[f(x) \cdot g(x)]' = f'(x)\cdot g(x) + f(x) \cdot g'(x)$ |
| quotient | $[f(x)/g(x)]' = (f'(x) \cdot g(x) - f(x) \cdot g'(x)) / g(x)^2$ |
| chain | $[f(g(x))]' = f'(g(x)) \cdot g'(x)$ |

This table gives some useful derivatives:
| Function | Derivative |
|:---------|:-----------|
| $x^n \text{ all } n$ | $nx^{n-1}$ |
| $e^x$ | $e^x$ |
| $\log(x)$ | $1/x$ |
| $\sin(x)$ | $\cos(x)$ |
| $\cos(x)$ | $-\sin(x)$ |

The derivative of a function is an operator: it takes a function and returns a new, derived, function. We could repeat this operation. The result is called a higher-order derivative. The Lagrange notation uses additional "primes" to indicate how many. So $f''(x)$ is the second derivative and $f'''(x)$ the third. For even higher orders, sometimes the notation is $f^{(n)}(x)$ to indicate an $n$th derivative.
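As a small illustration of repeating the operation, take $x^5$ (an example chosen here): the power rule gives $5x^4$, and applying it again gives $20x^3$. The sketch below compares that with a centered second-difference approximation; `second_diff` is a helper named here, not a library function:

```julia
# Centered second difference: a rough numeric stand-in for f''(x)
second_diff(f, x; h = 1e-4) = (f(x + h) - 2 * f(x) + f(x - h)) / h^2

f(x) = x^5
fpp(x) = 20 * x^3   # power rule applied twice: [x^5]' = 5x^4, [5x^4]' = 20x^3

println(second_diff(f, 1.0), "  vs  ", fpp(1.0))
```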
Find the second derivative of $e^{-x^2}$.
We need the chain rule and the product rule:
\[ ~ [e^{-x^2}]'' = [e^{-x^2} \cdot (-2x)]' = (e^{-x^2} \cdot (-2x) \cdot(-2x) + e^{-x^2} \cdot (-2)) = e^{-x^2}(4x^2 - 2). ~ \]
This can be verified:
diff(diff(exp(-x^2))) |> simplify
Having to iterate the use of `diff` is cumbersome. An alternate notation is either specifying the variable twice, `diff(ex, x, x)`, or using a number after the variable, `diff(ex, x, 2)`:
diff(exp(-x^2), x, x) |> simplify
The ability to break down an expression into operations and their arguments is necessary when trying to apply the differentiation rules. Such rules are applied from the outside in. Identifying the proper "outside" function is usually most of the battle when finding derivatives.
The `SymPy` program has a function that can identify the outside function of a symbolic expression. Basic math expressions can be parsed as a function being called with one or more arguments. The function `SymPy.Introspection.funcname` will return the function, and `SymPy.Introspection.args` will return the arguments.
To illustrate, we define some variables and look at the operations:
args, funcname = SymPy.Introspection.args, SymPy.Introspection.funcname # args, funcname are not exported
@vars x y z
(x, y, z)
This shows how addition is identified:
ex = x + y + z
funcname(ex)
"Add"
The arguments are all three variables being added:
args(ex)
(x, y, z)
Similarly, with expressions of a single variable
ex = x^2 + sin(x) + sqrt(x^2 + 2)
funcname(ex)
"Add"
and now the arguments are the expressions being added:
args(ex)
(x^2, sqrt(x^2 + 2), sin(x))
When addition is the "outside" function, the next step would be to apply the sum rule.
What about multiplication?
ex = x * y * z
funcname(ex)
"Mul"
and
args(ex)
(x, y, z)
Like addition, multiplication is not just a binary operation, as it is typically viewed mathematically. The output of `args` can be 2 or more expressions.
The power rule is an easy to use derivative rule. SymPy recognizes when the "outer" function is a power:
ex = x^4
funcname(ex)
"Pow"
The power rule would be used if the variable is not in the exponent, of course, as here.
Exploring the quotient rule offers a surprise:
ex = x/y
funcname(ex)
"Mul"
The surprise? We may have expected division, but see multiplication. SymPy uses a different form to represent division, which can be gleaned from looking at the resulting arguments:
args(ex)
(x, 1/y)
And the expression `1/y` is really a power:
ex = 1/y
funcname(ex)
"Pow"
So the quotient rule would be implemented here as a combination of the product rule and the power rule.
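Written out by hand, that combination (the product rule, then the power rule with exponent $-1$ together with the chain rule) reproduces the usual quotient rule:

\[ ~ [u \cdot v^{-1}]' = u' \cdot v^{-1} + u \cdot [v^{-1}]' = \frac{u'}{v} + u \cdot (-1) \cdot v^{-2} \cdot v' = \frac{u'v - uv'}{v^2}. ~ \]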
To differentiate, we need to recursively apply derivative rules, peeling back one level at a time. When would you stop? When a number or a free symbol is encountered, as with:
funcname(x)
"Symbol"
or
funcname(Sym(3))
"Integer"
In each of these cases, `args` returns an empty container:
args(x), args(Sym(3))
((), ())
We can put all this together to create a function to differentiate an expression. This function will be called `diffex` (to distinguish it from the more powerful, built-in `diff` function). The first task for `diffex` is to identify the wrapping, outside function, if any, and apply the corresponding rule to the arguments. Julia's `Val` types are used in the following to specialize based on the type of "outside" function. This is technical, but is exactly what is needed, as different types of functions need different rules of differentiation.
function diffex(ex)
    n = length(free_symbols(ex))
    n == 0 && return Sym(0)
    n > 1 && error("This is a simple example, use diff")
    _diff(Val{Symbol(funcname(ex))}, args(ex)...)
end
diffex (generic function with 1 method)
This function identifies the number of symbols in the expression to differentiate. If none, $0$ is returned; if more than one, an error is thrown, and if one, the outside function is identified.
For sums, we want the sum rule. This is the sum of the individual derivatives, simply implemented as:
_diff(::Type{Val{:Add}}, xs...) = sum(diffex(ex) for ex in xs)
_diff (generic function with 1 method)
The product rule is a bit more advanced. We use a formula derived above: if $f = f_1 \cdot f_2 \cdots f_n$ then $f' = \sum f'_i/f_i \cdot f$:
function _diff(::Type{Val{:Mul}}, fs...)
    f = prod(fs)
    sum(diffex(fi)/fi * f for fi in fs)
end
_diff (generic function with 2 methods)
The power rule has three cases that could be considered depending on where the variable exists (in the base, exponent, or both). The following avoids this by re-expressing $a^b = \exp(b \ln(a))$ and then applying the chain rule:
_diff(::Type{Val{:Pow}}, a, b) = sympy.powsimp(a^b * diffex(b * log(a)))
_diff (generic function with 3 methods)
The `powsimp` function is used to encourage SymPy to simplify $a^b/a$ into $a^{b-1}$, which may not otherwise occur. (It is not exposed, so it must be qualified by the underlying `sympy` Python object.)
Now to catch some special cases. First, when we encounter the symbol, we want to return `1`, as $d/dx[x] = 1$:
_diff(::Type{Val{:Symbol}}) = Sym(1)
_diff (generic function with 4 methods)
For numbers, the derivative is $0$. As there are quite a few number types, we define a catch-all, distinguishing them by the fact that the `args` call is empty.
_diff(::Any) = Sym(0)
_diff (generic function with 5 methods)
Now for some specific functions. Here is where the chain rule comes in. For example, the derivative when the `log` function is encountered has an extra product of `diffex(ex)`:
_diff(::Type{Val{:log}}, ex) = 1/ex * diffex(ex)
_diff (generic function with 6 methods)
Similarly, here is the rule for `exp`:
_diff(::Type{Val{:exp}}, ex) = exp(ex) * diffex(ex)
_diff (generic function with 7 methods)
Finally, we define values for some trigonometric functions. Many more special cases, as these, would be needed to fully implement this approach:
_diff(::Type{Val{:sin}}, ex) = cos(ex) * diffex(ex)
_diff(::Type{Val{:cos}}, ex) = -sin(ex) * diffex(ex)
_diff(::Type{Val{:tan}}, ex) = sec(ex)^2 * diffex(ex)
_diff (generic function with 10 methods)
(The uses of the chain rule fall into a pattern that could be exploited to make the code depend only on a mapping of terms like `:sin => cos`.)
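To illustrate that table-driven pattern without relying on SymPy, here is a rough, self-contained sketch that differentiates plain Julia `Expr` objects. It is not the `diffex` implementation above: the names `RULES` and `d` are hypothetical, and only sums, two-factor products, and the four functions in the table are handled.

```julia
# Map each function name to a builder for its derivative, evaluated at u
RULES = Dict(
    :sin => u -> :(cos($u)),
    :cos => u -> :(-sin($u)),
    :exp => u -> :(exp($u)),
    :log => u -> :(1 / $u),
)

d(ex::Number, x::Symbol) = 0                    # constants
d(ex::Symbol, x::Symbol) = ex == x ? 1 : 0      # the variable, or a parameter

function d(ex::Expr, x::Symbol)
    ex.head == :call || error("unsupported expression: $ex")
    op, as = ex.args[1], ex.args[2:end]
    if op == :+                                  # sum rule
        return Expr(:call, :+, (d(a, x) for a in as)...)
    elseif op == :* && length(as) == 2           # product rule
        u, v = as
        return :($(d(u, x)) * $v + $u * $(d(v, x)))
    elseif haskey(RULES, op) && length(as) == 1  # chain rule via the table
        u = as[1]
        return :($(RULES[op](u)) * $(d(u, x)))
    end
    error("no rule for $op")
end

# Differentiate sin(x*x) + exp(x), then evaluate the result at x = 0.5
dex = d(:(sin(x * x) + exp(x)), :x)
value = eval(quote
    let x = 0.5
        $dex
    end
end)
println(dex)
println(value)
```

The single chain-rule branch covers every entry in `RULES`, so supporting another function would only mean adding a line to the table.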
Here we try it out:
diffex(x^5 + x^4 + x^3 + x^2 + x + 1)
diffex(sin(x^2) + cos(1/x))
And
diffex(x^x * exp(x))
The derivative at $c$ is the slope of the tangent line at $x=c$. Answer the following based on this graph:
fn = x -> x*exp(x)*sin(pi*x)
plot(fn, 0, 2)
At which of these points $c= 1/2, 1, 3/2$ is the derivative negative?
Which value looks bigger from reading the graph:
At $0.708 \dots$ and $1.65\dots$ the derivative has a common value. What is it?
Consider the graph of the `airyai` function (from `SpecialFunctions`) over $[-5, 5]$.
At $x = -2.5$, is the derivative positive or negative?
At $x=0$, is the derivative positive or negative?
At $x = 2.5$, is the derivative positive or negative?
Compute the derivative of $e^x$ using `limit`. What do you get?
Compute the derivative of $x^e$ using `limit`. What do you get?
Compute the derivative of $e^{e\cdot x}$ using `limit`. What do you get?
In the derivation of the derivative of $\sin(x)$, the following limit is needed:
\[ ~ L = \lim_{h \rightarrow 0} \frac{\cos(h) - 1}{h}. ~ \]
This is
Let $f(x) = (e^x + e^{-x})/2$ and $g(x) = (e^x - e^{-x})/2$. Which is true?
Let $f(x) = (e^x + e^{-x})/2$ and $g(x) = (e^x - e^{-x})/2$. Which is true?
Consider the function $f$ and its transformation $g(x) = a + f(x)$ (shift up by $a$). Do $f$ and $g$ have the same derivative?
Consider the function $f$ and its transformation $g(x) = f(x - a)$ (shift right by $a$). Do $f$ and $g$ have the same derivative?
Consider the function $f$ and its transformation $g(x) = f(x - a)$ (shift right by $a$). Is $f'$ at $x$ equal to $g'$ at $x-a$?
Consider the function $f$ and its transformation $g(x) = c f(x)$, $c > 1$. Do $f$ and $g$ have the same derivative?
Consider the function $f$ and its transformation $g(x) = f(x/c)$, $c > 1$. Do $f$ and $g$ have the same derivative?
Which of the following is true?
The rate of change of volume with respect to height is $3h$. The rate of change of height with respect to time is $2t$. If at $t=3$ the height is $h=14$, what is the rate of change of volume with respect to time when $t=3$?
Which equation below is $f(x) = \sin(k\cdot x)$ a solution of ($k > 1$)?
Let $f(x) = e^{k\cdot x}$, $k > 1$. Which equation below is $f(x)$ a solution of?