We come now to an examination, from the modern infinitesimal perspective, of the cornerstone concept of the calculus.
8.1 The Derivative
Newton called the derivative the fluxion ẏ of a fluent quantity y, thinking
of it as the "speed" with which the quantity flows. In more modern parlance,
the derivative of a function f at a real number x is the real number
f'(x) that represents the rate of change of the function as it varies near x.
Alternatively, it is the slope of the tangent to the graph of f at x. Formally
it is defined as the number
    lim_{h→0} (f(x + h) − f(x)) / h.
Theorem 8.1.1  If f is defined at x ∈ ℝ, then the real number L ∈ ℝ is
the derivative of f at x if and only if for every nonzero infinitesimal ε,
f(x + ε) is defined and

    (f(x + ε) − f(x)) / ε ≃ L.    (i)
Proof  Let g(h) = (f(x + h) − f(x))/h and apply the characterisation of
"lim_{h→0} g(h) = L" given in Section 7.3.  □
Thus when f is differentiable (i.e., has a derivative) at x, we have

    f'(x) = sh( (f(x + ε) − f(x)) / ε )

for all infinitesimal ε ≠ 0.
If (i) holds only for all positive infinitesimal ε, then L is the right-hand
derivative of f at x, defined classically as

    lim_{h→0⁺} (f(x + h) − f(x)) / h.

Similarly, if (i) holds for all negative infinitesimal ε, then L is the
left-hand derivative given by the limit as h → 0⁻.
Exercise 8.1.2
Use the characterisation of Theorem 8.1.1 to prove that the derivative of
sin x is cos x at real x (cf. Section 7.2 and Exercise 5.7(2)).
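Theorem 8.1.1 cannot literally be run on a computer, since floating-point numbers contain no infinitesimals, but the behaviour of the Newton quotient can be previewed numerically with small finite h. The following Python sketch (the helper name `newton_quotient` is ours, purely illustrative) shows the quotient for sin approaching cos x as h shrinks:

```python
import math

def newton_quotient(f, x, h):
    # Finite-difference stand-in for (f(x + ε) − f(x)) / ε
    return (f(x + h) - f(x)) / h

x = 1.0
for h in [1e-2, 1e-4, 1e-6]:
    q = newton_quotient(math.sin, x, h)
    print(h, q, abs(q - math.cos(x)))   # error shrinks with h
```

This is evidence, not proof; the theorem's content is that the quotient is infinitely close to cos x for every nonzero infinitesimal increment.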
8.2 Increments and Differentials
Let Δx denote an arbitrary nonzero infinitesimal representing a change or
increment in the value of the variable x. The corresponding increment in the
value of the function f at x is

    Δf = f(x + Δx) − f(x).

To be quite explicit we should denote this increment by Δf(x, Δx), since
its value depends both on the value of x and the choice of the infinitesimal
Δx. The more abbreviated notation is, however, convenient and suggestive.
If f is differentiable at x ∈ ℝ, Theorem 8.1.1 implies that

    Δf / Δx ≃ f'(x),

so the Newton quotient Δf/Δx is limited. Hence as

    Δf = (Δf/Δx) · Δx,

it follows that the increment Δf in f is infinitesimal. Thus f(x + Δx) ≃
f(x) for all infinitesimal Δx, and this proves
Theorem 8.2.1  If f is differentiable at x ∈ ℝ, then f is continuous at x.
□
The differential of f at x corresponding to Δx is defined to be

    df = f'(x)Δx.

Thus whereas Δf represents the increment of the "y-coordinate" along the
graph of f at x, df represents the increment along the tangent line to this
graph at x. Writing dx for Δx, the definition of df yields

    df / dx = f'(x).
Now, since f'(x) is limited and Δx is infinitesimal, it follows that df is
infinitesimal. Hence df and Δf are infinitely close to each other. In fact,
their difference is infinitely smaller than Δx, for if

    ε = Δf/Δx − f'(x),

then ε is infinitesimal, because Δf/Δx ≃ f'(x), and

    Δf − df = Δf − f'(x)Δx = εΔx,

which is also infinitesimal (being a product of infinitesimals). But

    (Δf − df) / Δx = εΔx / Δx = ε ≃ 0,

and in this sense Δf − df is infinitesimal compared to Δx (equivalently,
Δx/(Δf − df) is unlimited). These relationships are summarised in
Theorem 8.2.2 (Incremental Equation)  If f'(x) exists at real x and
Δx = dx is infinitesimal, then Δf and df are infinitesimal, and there is an
infinitesimal ε, dependent on x and Δx, such that

    Δf = f'(x)Δx + εΔx = df + εdx,

and so

    f(x + Δx) = f(x) + f'(x)Δx + εΔx.
This last equation elucidates the role of the derivative f' as providing the
best linear approximation to the function f at x. For the graph of the linear
function

    l(Δx) = f(x) + f'(x)Δx

gives the tangent to f at x when the origin is translated to the point (x, 0),
and l(Δx) differs from f(x + Δx) by the amount εΔx, which we saw above
is itself infinitely smaller than Δx when Δx is infinitesimal, and in that
sense is "negligible".
8.3 Rules for Derivatives
If f and g are differentiable at x ∈ ℝ, then so too are f + g, fg, and f/g,
provided that g(x) ≠ 0. Moreover,

(1)  (f + g)'(x) = f'(x) + g'(x),

(2)  (fg)'(x) = f'(x)g(x) + f(x)g'(x),

(3)  (f/g)'(x) = (f'(x)g(x) − f(x)g'(x)) / g(x)².
Proof  We prove Leibniz's rule (2), and leave the others as exercises.
If Δx ≠ 0 is infinitesimal, then, by Theorem 8.1.1, f(x + Δx) and g(x +
Δx) are both defined, and hence so is

    (fg)(x + Δx) = f(x + Δx)g(x + Δx).

Then the increment of fg at x corresponding to Δx is

    Δ(fg) = f(x + Δx)g(x + Δx) − f(x)g(x)
          = (f(x) + Δf)(g(x) + Δg) − f(x)g(x)
          = (Δf)g(x) + f(x)Δg + ΔfΔg

(compare this to Leibniz's reasoning as discussed in Section 1.2). It follows
that

    Δ(fg)/Δx = (Δf/Δx)g(x) + f(x)(Δg/Δx) + Δf(Δg/Δx)
             ≃ f'(x)g(x) + f(x)g'(x) + 0,

since Δf/Δx ≃ f'(x), Δg/Δx ≃ g'(x), Δf ≃ 0, and all the quantities involved
are limited.
Hence by Theorem 8.1.1, f'(x)g(x) + f(x)g'(x) is the derivative of fg
at x.  □
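The Leibniz rule admits a quick finite-increment check. The sketch below (our own choice of f = sin, g = exp at x = 0.5) compares the Newton quotient of fg against f'(x)g(x) + f(x)g'(x):

```python
import math

x, dx = 0.5, 1e-6
# Newton quotient of the product fg with a small finite increment
lhs = (math.sin(x + dx) * math.exp(x + dx) - math.sin(x) * math.exp(x)) / dx
# Leibniz rule: f'g + fg'
rhs = math.cos(x) * math.exp(x) + math.sin(x) * math.exp(x)
print(lhs, rhs, abs(lhs - rhs))
```

The residual ΔfΔg/Δx in the proof is exactly the term that makes the two sides differ only by a quantity comparable to Δx.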
8.4 Chain Rule
If f is differentiable at x ∈ ℝ, and g is differentiable at f(x), then g∘f is
differentiable at x with derivative g'(f(x))f'(x).
Proof  Let Δx be a nonzero infinitesimal. Then f(x + Δx) is defined and
f(x + Δx) ≃ f(x), as we saw in Section 8.2. But g is defined at all points
infinitely close to f(x), since g'(f(x)) exists, so (g∘f)(x + Δx) = g(f(x +
Δx)) is defined.
Now let

    Δf = f(x + Δx) − f(x),
    Δ(g∘f) = g(f(x + Δx)) − g(f(x))

be the increments of f and g∘f at x corresponding to Δx. Then Δf is
infinitesimal, and

    Δ(g∘f) = g(f(x) + Δf) − g(f(x)),

which shows, crucially, that

    Δ(g∘f) is also the increment of g at f(x) corresponding to Δf.

In the full incremental notation, this reads

    Δ(g∘f)(x, Δx) = Δg(f(x), Δf).

By the incremental equation (Theorem 8.2.2) for g, it then follows that
there exists an infinitesimal ε such that

    Δ(g∘f) = g'(f(x))Δf + εΔf.

Hence

    Δ(g∘f)/Δx = (g'(f(x)) + ε)(Δf/Δx) ≃ g'(f(x))f'(x) + 0,

establishing that g'(f(x))f'(x) is the derivative of g∘f at x.  □
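As with the product rule, the chain rule can be checked against a finite-increment quotient. A sketch with our own illustrative choice f(x) = x² and g = sin at x = 1.2:

```python
import math

def f(x):
    return x ** 2

def g(y):
    return math.sin(y)

x, dx = 1.2, 1e-6
# Newton quotient of the composite g∘f
lhs = (g(f(x + dx)) - g(f(x))) / dx
# Chain rule value: g'(f(x)) · f'(x)
rhs = math.cos(f(x)) * (2 * x)
print(lhs, rhs, abs(lhs - rhs))
```

The key step in the proof, that Δ(g∘f) is the increment of g at f(x) corresponding to Δf, is what licenses multiplying the two quotients.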
8.5 Critical Point Theorem
Let f have a maximum or a minimum at x on some real interval (a, b). If
f is differentiable at x, then f'(x) = 0.
Proof  Suppose f has a maximum at x. By transfer,

    f(x + Δx) ≤ f(x)

for all infinitesimal Δx. Hence if ε is a positive infinitesimal and δ is a
negative infinitesimal,

    f'(x) ≃ (f(x + ε) − f(x))/ε ≤ 0 ≤ (f(x + δ) − f(x))/δ ≃ f'(x),

and so, as f'(x) is real, it must be equal to 0.
The case of f having a minimum at x is similar.  □
Using the critical point and extreme value theorems, the following results
can be successively derived about a function f that is continuous on
[a, b] ⊆ ℝ and differentiable on (a, b). The proofs do not require any further
reasoning about infinitesimals or limits.

• Rolle's Theorem: if f(a) = f(b) = 0, then f'(x) = 0 for some x ∈
(a, b).

• Mean Value Theorem: for some x ∈ (a, b),

    f'(x) = (f(b) − f(a)) / (b − a).

• If f' is zero/positive/negative on (a, b), then f is constant/increasing/
decreasing on [a, b].
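The mean value theorem guarantees a point where the derivative matches the secant slope; for simple functions that point can be exhibited explicitly. A sketch with our own choice f(x) = x³ on [0, 2], where the secant slope 4 is attained by f'(x) = 3x² at x = √(4/3):

```python
def f(x):
    return x ** 3

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)   # secant slope = 4.0
x = (slope / 3) ** 0.5            # solve 3x² = slope
print(slope, x, a < x < b)        # x ≈ 1.1547 lies in (0, 2)
```

Here the witness is computed in closed form; in general the theorem only asserts existence.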
8.6 Inverse Function Theorem
Let f be continuous and strictly monotone (increasing or decreasing) on
(a, b), and suppose g is the inverse function of f. If f is differentiable at x
in (a, b), with f'(x) ≠ 0, then g is differentiable at y = f(x), with g'(y) =
1/f'(x).
Proof  Using the intermediate value theorem and monotonicity of f it can
be shown that g is defined on some real open interval around y. The result
g'(f(x)) = 1/f'(x) would follow easily by the chain rule applied to the
equation g(f(x)) = x if we knew that g was differentiable at f(x). But
that is what we have to prove!
Now let Δy be a nonzero infinitesimal. We need to show that

    (g(y + Δy) − g(y)) / Δy ≃ 1/f'(x).

Now, if g(y + Δy) were not infinitely close to g(y), then there would be a
real number r strictly between them. But then, by monotonicity of f, f(r)
would be a real number strictly between y + Δy and y. Since y is real, this
would mean that y + Δy and y were an appreciable distance apart, which
is not so. Hence

    Δx = g(y + Δy) − g(y)

is infinitesimal and is nonzero. (Thus the argument so far establishes that
g is continuous at y.) Observe that Δx is, by definition, the increment
Δg(y, Δy) of g at y corresponding to Δy.
Since g(y) = x, the last equation gives g(y + Δy) = x + Δx, so

    f(x + Δx) = f(g(y + Δy)) = y + Δy.

Hence

    Δy = f(x + Δx) − f(x) = Δf,

the increment of f at x corresponding to Δx.
Altogether we have

    Δf(x, Δx)/Δx = Δy/Δx

and

    Δg(y, Δy)/Δy = Δx/Δy = Δx/Δf.

Put more briefly, we have shown that

    Δg/Δy = 1 / (Δf/Δx).

To derive from this the conclusion g'(y) = 1/f'(x) we invoke the hypothesis
that f'(x) ≠ 0 (which is essential: consider what happens at x = 0
when f(x) = x³). Since sh(Δf/Δx) = f'(x), it follows that Δf/Δx is
appreciable. But then

    sh(Δg/Δy) = sh(1/(Δf/Δx)) = 1/sh(Δf/Δx) = 1/f'(x)

by 5.6.2(3). Therefore,

    Δg(y, Δy)/Δy = Δx/Δy ≃ 1/f'(x).

Because Δy is an arbitrary nonzero infinitesimal, this establishes that the
real number 1/f'(x) is the derivative of g at y, as desired.  □
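The conclusion g'(y) = 1/f'(x) is easy to probe numerically with a concrete inverse pair. A sketch using our own choice f = exp, g = log at x = 0.7, so y = e^0.7 and 1/f'(x) = e^(−0.7):

```python
import math

x = 0.7
y = math.exp(x)          # y = f(x)
dy = 1e-6
# Newton quotient of the inverse g = log at y
g_quotient = (math.log(y + dy) - math.log(y)) / dy
print(g_quotient, 1 / math.exp(x))   # both ≈ 1/f'(x)
```

The hypothesis f'(x) ≠ 0 matters: for f(x) = x³ at x = 0 the inverse cube root has a vertical tangent and the quotient above would blow up.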
8.7 Partial Derivatives
Let z = f(x, y) be a real-valued function of two variables, with partial
derivatives denoted by f_x and f_y. At a real point (a, b), f_x(a, b) is the
derivative of the function x ↦ f(x, b) at a, while f_y(a, b) is the derivative
of y ↦ f(a, y) at b. Thus for nonzero infinitesimals Δx, Δy,

    f_x(a, b) ≃ (f(a + Δx, b) − f(a, b)) / Δx,
    f_y(a, b) ≃ (f(a, b + Δy) − f(a, b)) / Δy.

Points (x₁, y₁) and (x₂, y₂) in the hyperreal plane *ℝ² are infinitely close
if both x₁ ≃ x₂ and y₁ ≃ y₂, which is equivalent to requiring that their
Euclidean distance apart,

    √((x₁ − x₂)² + (y₁ − y₂)²),

be infinitesimal.
The function f is continuous at the real point (a, b) if (x, y) ≃ (a, b)
implies f(x, y) ≃ f(a, b) for all hyperreal x, y. For this to hold it is necessary
that f be defined on some open disk about (a, b) in the real plane.
We say that f is smooth at (a, b) if f_x and f_y both exist and are continuous
at (a, b).
The increment of f at a point (a, b) corresponding to Δx, Δy is defined to
be

    Δf = f(a + Δx, b + Δy) − f(a, b),

while the total differential is

    df = f_x(a, b)Δx + f_y(a, b)Δy.

The graph of z = f(x, y) is a surface in three-dimensional space, and Δf is
the change in z-value on this surface in moving from the point (a, b) to
the point (a + Δx, b + Δy). The total differential df is the corresponding
change on the tangent plane to the surface at (a, b).
Theorem 8.7.1 (Incremental Equation for Two Variables)  If f is
smooth at the real point (a, b) and Δx and Δy are infinitesimal, then

    Δf = df + εΔx + δΔy

for some infinitesimals ε and δ.
Proof  The increment of f at (a, b) corresponding to Δx, Δy can be written
as

    Δf = [f(a + Δx, b + Δy) − f(a + Δx, b)] + [f(a + Δx, b) − f(a, b)].  (ii)

The second summand of (ii) is the increment at a corresponding to
Δx of the one-variable function x ↦ f(x, b), whose derivative f_x(a, b) is
assumed to exist. Applying the one-variable incremental equation (Theorem
8.2.2) thus gives

    f(a + Δx, b) − f(a, b) = f_x(a, b)Δx + εΔx  (iii)

for some infinitesimal ε.
Similarly, for the first summand we need to show that

    f(a + Δx, b + Δy) − f(a + Δx, b) = f_y(a, b)Δy + δΔy  (iv)

for some infinitesimal δ. Then combining (ii)-(iv) will give

    Δf = f_x(a, b)Δx + f_y(a, b)Δy + εΔx + δΔy,

which is the desired result.
Now the left side of equation (iv) could be described as the increment in
the function y ↦ f(a + Δx, y) at b corresponding to the infinitesimal Δy.
This is not a real function, because of the hyperreal parameter a + Δx, so
the incremental equation 8.2.2 does not apply directly to it. To overcome
this we will examine the family of functions y ↦ f(a + x₀, y) for real x₀,
and consider their increments corresponding to real increments y₀ in y.
This will give a statement about x₀ and y₀ to which we can apply transfer
and then replace x₀ and y₀ by Δx and Δy.
The technical details of this are as follows. Since f_x and f_y are continuous
at (a, b), f must be defined on an open disk D around (a, b) of some real
radius r. Then if x₀, y₀ are real numbers such that (a + x₀, b + y₀) ∈ D,
the function y ↦ f(a + x₀, y) is defined on the interval [b, b + y₀] and is
subject to the one-variable mean value theorem. Hence there is some real c₀
between b and b + y₀ such that the derivative of this one-variable function
at c₀ is given as

    f_y(a + x₀, c₀) = (f(a + x₀, b + y₀) − f(a + x₀, b)) / (b + y₀ − b),

and so

    f(a + x₀, b + y₀) − f(a + x₀, b) = f_y(a + x₀, c₀)y₀.  (v)

This obtains for all real x₀, y₀ such that (a + x₀, b + y₀) is within r of (a, b).
That is, for all such x₀, y₀ there exists c₀ ∈ [b, b + y₀] such that (v) holds.
Symbolically,

    (∀x₀, y₀ ∈ ℝ)( √(x₀² + y₀²) < r → (∃c₀ ∈ ℝ)[ b ≤ c₀ ≤ b + y₀ and (v) holds ] ).

But (a + Δx, b + Δy) is within r of (a, b), since Δx, Δy are infinitesimal, so
by transfer there exists some hyperreal c between b and b + Δy such that

    f(a + Δx, b + Δy) − f(a + Δx, b) = f_y(a + Δx, c)Δy.  (vi)

Then c ≃ b, so (a + Δx, c) ≃ (a, b), and hence by continuity of f_y at (a, b),

    f_y(a + Δx, c) ≃ f_y(a, b).

Therefore the difference

    δ = f_y(a + Δx, c) − f_y(a, b)

is infinitesimal, with f_y(a + Δx, c) = f_y(a, b) + δ. Applying this to (vi)
yields (iv) and completes the proof.  □
8.8 Exercises on Partial Derivatives
(1) Show that if f is smooth at (a, b), then it is continuous at (a, b).
(2) Let f be smooth at (a, b). Given infinitesimals Δx, Δy, show that
the difference between Δf and df is itself infinitely smaller than the
infinitesimal distance Δl = √(Δx² + Δy²) between (a, b) and (a + Δx,
b + Δy), in the sense that

    (Δf − df) / Δl ≃ 0.
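The phenomenon in Exercise (2) can be seen with finite increments. A sketch with our own choice f(x, y) = xy² at (a, b) = (1, 2), where f_x = b² = 4 and f_y = 2ab = 4:

```python
import math

def f(x, y):
    return x * y ** 2

a, b = 1.0, 2.0
fx, fy = b ** 2, 2 * a * b                 # partial derivatives at (1, 2)
for t in [1e-1, 1e-3, 1e-5]:
    dx = dy = t
    inc = f(a + dx, b + dy) - f(a, b)      # Δf on the surface
    diff = fx * dx + fy * dy               # df on the tangent plane
    dl = math.sqrt(dx * dx + dy * dy)      # Δl
    print(t, (inc - diff) / dl)            # shrinks with t
```

Expanding by hand, Δf − df = a·Δy² + 2b·ΔxΔy + ΔxΔy², so the quotient is comparable to t and tends to 0.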
8.9 Taylor Series
Let f be a real function and a a real number. The Taylor series of f at
x ∈ ℝ, centred on a, is the series

    f(a) + f'(a)(x − a) + (f''(a)/2!)(x − a)² + ··· ,

or more briefly, Σ_{k=0}^∞ (f^(k)(a)/k!)(x − a)^k, where f^(k) is the kth
derivative of f.
For this to be defined, f must be differentiable infinitely often at a, but even
if f^(k)(a) exists for all k ∈ ℕ, the series need not converge. Even if it does
converge, the sum need not be equal to f(x). A well-known example is the
function f(x) = e^(−1/x²) with f(0) = 0. This is so "flat" at the centre a = 0
that all its derivatives f^(k)(0) there are equal to 0. Hence the associated
Taylor series converges at all real x, but converges to f(x) only when x = 0.
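The flatness of e^(−1/x²) at 0 is striking numerically: the function vanishes faster than any power of x, so every Maclaurin partial sum (all of which are identically 0) misses f(x) for x ≠ 0. A small sketch:

```python
import math

def f(x):
    # The classic flat function: all derivatives at 0 are 0
    return 0.0 if x == 0 else math.exp(-1.0 / x ** 2)

# f(x)/x**10 → 0 as x → 0: f is flatter than any polynomial order
for x in [0.5, 0.2, 0.1]:
    print(x, f(x), f(x) / x ** 10)
```

At x = 0.1, f(x) = e^(−100) ≈ 3.7 × 10⁻⁴⁴, yet it is strictly positive, so the zero Taylor series fails to represent f there.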
The partial sums of a Taylor series are the Taylor polynomials. The nth
polynomial is

    P_n(x) = Σ_{k=0}^n (f^(k)(a)/k!)(x − a)^k
           = f(a) + f'(a)(x − a) + (f''(a)/2!)(x − a)² + ··· + (f^(n)(a)/n!)(x − a)^n.
For any given x, the sequence ⟨P_n(x) : n ∈ ℕ⟩ extends to a hypersequence,
so P_n(x) is defined for all n ∈ *ℕ. Then from our earlier work on sequences
and series (Chapter 6) we see that

• the Taylor series for f at x converges to a real number L if and only
if P_n(x) ≃ L for all unlimited n.

The difference between f(x) and P_n(x) is the nth remainder at x:

    R_n(x) = f(x) − P_n(x).  (vii)

If f is infinitely differentiable at a, then (vii) defines R_n(x) for all n ∈ ℕ.
The sequence ⟨R_n(x) : n ∈ ℕ⟩ then extends to a hypersequence, and by
transfer (vii) holds for all hypernatural n. Then:
• the Taylor series for f at x converges to f(x) if and only if R_n(x) is
infinitesimal for all unlimited n.

If the derivatives f^(n) exist for all n ∈ ℕ on some open interval J containing
a, then the sequence of functions ⟨f^(n) : n ∈ ℕ⟩ extends to a hypersequence
⟨f^(n) : n ∈ *ℕ⟩ of functions defined on *J in the manner described in Section
7.12. Formally, we put F(n, x) = f^(n)(x) for n ∈ ℕ and x ∈ J and then by
extension get f^(n)(x) = *F(n, x) for n ∈ *ℕ and x ∈ *J. Then results like

    P_n(x) − P_{n−1}(x) = (f^(n)(a)/n!)(x − a)^n

continue to hold for unlimited n, by transfer.
Now, the Lagrange form of the remainder stipulates that if f can be
differentiated n + 1 times on some open interval containing a, then for each
x in that interval there is some real number c between a and x such that

    R_n(x) = (f^(n+1)(c)/(n + 1)!)(x − a)^(n+1).

Thus if f is infinitely differentiable on some open interval J containing a,
then for every n ∈ ℕ and every x ∈ J we have

    f(x) − P_n(x) = (f^(n+1)(c)/(n + 1)!)(x − a)^(n+1)  (viii)

for some c between a and x. Hence by transfer, for every n ∈ *ℕ and x ∈ *J,
the Taylor formula (viii) holds for some hyperreal c between a and x (c
may no longer be real). If we can show for a real x that the right side of
(viii) is infinitesimal whenever n is unlimited, it will follow that the Taylor
series of f at x converges to f(x).
Let us illustrate this with the case of the function f(x) = cos x, analysing
its Maclaurin series, which is the Taylor series at the centre a = 0. For any
x ∈ *ℝ and n ∈ *ℕ we have

    R_n(x) = (cos^(n+1)(c)/(n + 1)!) x^(n+1)

for some c with |c| ≤ |x|. Now, if n ∈ ℕ and c ∈ ℝ, then cos^(n+1)(c) is ±sin c
or ±cos c, and so in all cases lies between −1 and 1. This fact then holds
by transfer for any n ∈ *ℕ and c ∈ *ℝ, so cos^(n+1)(c) is always limited. But
if x ∈ ℝ and n is unlimited,

    x^(n+1)/(n + 1)!

is infinitesimal (Exercise 6.11(10)). It follows in this case that R_n(x) is
infinitesimal, and therefore (cf. (vii)) f(x) ≃ P_n(x). This shows that the
Maclaurin series for the cosine function converges to cos x at all real x.
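The factorial decay of the remainder is visible in finite computation: the Maclaurin partial sums for cos close in on cos x rapidly once the terms start shrinking. A sketch (the helper `cos_partial_sum` is our own name):

```python
import math

def cos_partial_sum(x, n):
    # P_n for cos: sum of (-1)^k x^(2k)/(2k)! over all 2k <= n
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k)
               for k in range(n // 2 + 1))

x = 3.0
for n in [4, 8, 16]:
    p = cos_partial_sum(x, n)
    print(n, p, abs(p - math.cos(x)))   # error collapses factorially
```

The error bound |x|^(n+1)/(n+1)! from the Lagrange form explains the collapse: for fixed x it dies factorially in n, mirroring the infinitesimality of R_n(x) at unlimited n.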
Exercise 8.9.1
Verify that the Maclaurin series for e^x converges to e^x at any x ∈ ℝ by
proving that the remainder R_n(x) is infinitesimal when n is unlimited.
8.10 Incremental Approximation by Taylor's
Formula
The incremental equation of Theorem 8.2.2 approximates the value f(x +
Δx) by a linear function f(x) + f'(x)Δx of the increment Δx, with an
error εΔx that is infinitely smaller than Δx. We will now see that there
are similar approximations by higher-order polynomials in Δx (quadratics,
cubics, quartics, etc.).
Fix a real number x and a positive integer n ∈ ℕ. Consider polynomials
centred at x itself. If the nth derivative f^(n) exists on an open interval
J containing x, then the Taylor formula with Lagrange remainder (viii)
stipulates that for real numbers of the form x + Δx in J,

    f(x + Δx) = P_{n−1}(x + Δx) + R_{n−1}(x + Δx)
              = Σ_{k=0}^{n−1} (f^(k)(x)/k!) Δx^k + (f^(n)(c)/n!) Δx^n  (ix)

for some c between x and x + Δx. By transfer this holds whenever x + Δx
belongs to *J, and so it holds for any infinitesimal Δx, in which case c ≃ x.
Now (ix) can be modified to give

    f(x + Δx) = Σ_{k=0}^{n} (f^(k)(x)/k!) Δx^k + ((f^(n)(c) − f^(n)(x))/n!) Δx^n,

and if f^(n) is continuous at x, then from c ≃ x we infer f^(n)(c) ≃ f^(n)(x),
implying that the number

    ε = (f^(n)(c) − f^(n)(x))/n!

is infinitesimal. Altogether:
Theorem 8.10.1  If the nth derivative f^(n) exists on an open interval
containing the real number x, and f^(n) is continuous at x, then for any
infinitesimal Δx,

    f(x + Δx) = f(x) + f'(x)Δx + (f''(x)/2!) Δx² + ··· + (f^(n)(x)/n!) Δx^n + εΔx^n

for some infinitesimal ε.  □
In other words, the difference between f(x + Δx) and the nth-order
polynomial

    f(x) + f'(x)Δx + (f''(x)/2!) Δx² + ··· + (f^(n)(x)/n!) Δx^n

in Δx is the infinitesimal εΔx^n, which is, as Leibniz would put it (Section
1.2), infinitely small in comparison with Δx^n.
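Theorem 8.10.1 predicts that the error of the nth-order polynomial, divided by Δxⁿ, still tends to 0. A sketch with n = 2 and our own choice f = exp at x = 0, whose quadratic approximation is 1 + Δx + Δx²/2:

```python
import math

# Error of the quadratic Taylor approximation of exp at 0,
# measured in units of Δx²: plays the role of ε in Theorem 8.10.1.
for dx in [1e-1, 1e-2, 1e-3]:
    quad = 1 + dx + dx ** 2 / 2
    eps = (math.exp(dx) - quad) / dx ** 2
    print(dx, eps)   # roughly dx/6, shrinking with dx
```

The leading discarded term Δx³/3! explains the observed behaviour ε ≈ Δx/6.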
Exercise 8.10.2
There are forms for the Taylor remainder other than Lagrange's. One of
these (the Cauchy form) is

    R_n(x) = (f^(n+1)(c)/n!)(x − c)^n (x − a)

for some c between a and x, when f^(n+1) exists between a and x.
Apply this form for R_{n−1}(x) to show that Theorem 8.10.1 holds without
the hypothesis of continuity of f^(n).
8.11 Extending the Incremental Equation
The equation

    f(x + Δx) = f(x) + f'(x)Δx + εΔx

holds for any real number x at which f is differentiable. It is natural to ask
whether a similar formula holds for nonreal x, and it turns out that this is
intimately connected with the question of the continuity of the derivative
function f'.
Let us say that a hyperreal x is well inside an interval (y, z) if
y < x < z but x is not infinitely close to either of the end points y and z.
Equivalently, this means that the halo of x is included in the interval, so
that y < x + Δx < z for all infinitesimals Δx.
Theorem 8.11.1  Let f be differentiable on an interval (a, b) in ℝ. Then
the derivative f' is continuous on (a, b) if and only if for each hyperreal x
that is well inside *(a, b) and each infinitesimal Δx,

    f(x + Δx) = f(x) + f'(x)Δx + εΔx

for some infinitesimal ε.
Proof  Assume that the incremental equation holds at points well inside
*(a, b). To prove continuity of f', let c be a real point in (a, b) and suppose
x ≃ c. We want f'(x) ≃ f'(c).
Now, if Δ = x − c ≃ 0, then using Theorem 8.2.2 we get

    f(x) = f(c + Δ) = f(c) + f'(c)Δ + εΔ
for some ε ≃ 0. But x is well inside *(a, b), since a < c