8 Differentiation

We come now to an examination, from the modern infinitesimal perspective, of the cornerstone concept of the calculus.

8.1 The Derivative

Newton called the derivative the fluxion $\dot{y}$ of a fluent quantity $y$, thinking of it as the "speed" with which the quantity flows. In more modern parlance, the derivative of a function $f$ at a real number $x$ is the real number $f'(x)$ that represents the rate of change of the function as it varies near $x$. Alternatively, it is the slope of the tangent to the graph of $f$ at $x$. Formally it is defined as the number
$$\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}.$$

Theorem 8.1.1  If $f$ is defined at $x \in \mathbb{R}$, then the real number $L \in \mathbb{R}$ is the derivative of $f$ at $x$ if and only if for every nonzero infinitesimal $\varepsilon$, $f(x+\varepsilon)$ is defined and
$$\frac{f(x+\varepsilon)-f(x)}{\varepsilon} \simeq L. \tag{i}$$

Proof  Let $g(h) = \frac{f(x+h)-f(x)}{h}$ and apply the characterisation of "$\lim_{h\to 0} g(h) = L$" given in Section 7.3. □

Thus when $f$ is differentiable (i.e., has a derivative) at $x$, we have
$$f'(x) = \mathrm{sh}\!\left(\frac{f(x+\varepsilon)-f(x)}{\varepsilon}\right)$$
for all infinitesimal $\varepsilon \neq 0$.

If (i) holds only for all positive infinitesimal $\varepsilon$, then $L$ is the right-hand derivative of $f$ at $x$, defined classically as
$$\lim_{h\to 0^{+}}\frac{f(x+h)-f(x)}{h}.$$
Similarly, if (i) holds for all negative $\varepsilon \neq 0$, then $L$ is the left-hand derivative given by the limit as $h \to 0^{-}$.

Exercise 8.1.2  Use the characterisation of Theorem 8.1.1 to prove that the derivative of $\sin x$ is $\cos x$ at real $x$ (cf. Section 7.2 and Exercise 5.7(2)).

8.2 Increments and Differentials

Let $\Delta x$ denote an arbitrary nonzero infinitesimal representing a change or increment in the value of a variable $x$. The corresponding increment in the value of the function $f$ at $x$ is
$$\Delta f = f(x+\Delta x) - f(x).$$
To be quite explicit we should denote this increment by $\Delta f(x, \Delta x)$, since its value depends both on the value of $x$ and the choice of the infinitesimal $\Delta x$. The more abbreviated notation is, however, convenient and suggestive.
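Over the reals one cannot pick an actual infinitesimal, but the single quotient of Theorem 8.1.1 can be mimicked with a small real increment $h$. A minimal Python sketch (all names are our own choices), checking Exercise 8.1.2 numerically:

```python
import math

def newton_quotient(f, x, h):
    """The quotient (f(x + h) - f(x)) / h of Theorem 8.1.1,
    with the infinitesimal replaced by a small real h."""
    return (f(x + h) - f(x)) / h

# As h shrinks, the quotient for sin settles on cos x (Exercise 8.1.2).
x = 1.0
for h in (1e-2, 1e-4, 1e-6):
    q = newton_quotient(math.sin, x, h)
    print(h, q, abs(q - math.cos(x)))
```

This is only suggestive, of course: no real $h$ is infinitesimal, and the hyperreal argument needs no limit process at all.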
If $f$ is differentiable at $x \in \mathbb{R}$, Theorem 8.1.1 implies that $\frac{\Delta f}{\Delta x} \simeq f'(x)$, so the Newton quotient $\frac{\Delta f}{\Delta x}$ is limited. Hence as
$$\Delta f = \frac{\Delta f}{\Delta x}\,\Delta x,$$
it follows that the increment $\Delta f$ in $f$ is infinitesimal. Thus $f(x+\Delta x) \simeq f(x)$ for all infinitesimal $\Delta x$, and this proves

Theorem 8.2.1  If $f$ is differentiable at $x \in \mathbb{R}$, then $f$ is continuous at $x$. □

The differential of $f$ at $x$ corresponding to $\Delta x$ is defined to be
$$df = f'(x)\,\Delta x.$$
Thus whereas $\Delta f$ represents the increment of the "y-coordinate" along the graph of $f$ at $x$, $df$ represents the increment along the tangent line to this graph at $x$. Writing $dx$ for $\Delta x$, the definition of $df$ yields
$$\frac{df}{dx} = f'(x).$$
Now, since $f'(x)$ is limited and $\Delta x$ is infinitesimal, it follows that $df$ is infinitesimal. Hence $df$ and $\Delta f$ are infinitely close to each other. In fact, their difference is infinitely smaller than $\Delta x$, for if
$$\varepsilon = \frac{\Delta f}{\Delta x} - f'(x),$$
then $\varepsilon$ is infinitesimal, because $\frac{\Delta f}{\Delta x} \simeq f'(x)$, and
$$\Delta f - df = \Delta f - f'(x)\,\Delta x = \varepsilon\,\Delta x,$$
which is also infinitesimal (being a product of infinitesimals). But
$$\frac{\Delta f - df}{\Delta x} = \frac{\varepsilon\,\Delta x}{\Delta x} = \varepsilon \simeq 0,$$
and in this sense $\Delta f - df$ is infinitesimal compared to $\Delta x$ (equivalently, $\frac{\Delta x}{\Delta f - df}$ is unlimited). These relationships are summarised in

Theorem 8.2.2 (Incremental Equation)  If $f'(x)$ exists at real $x$ and $\Delta x = dx$ is infinitesimal, then $\Delta f$ and $df$ are infinitesimal, and there is an infinitesimal $\varepsilon$, dependent on $x$ and $\Delta x$, such that
$$\Delta f = f'(x)\,\Delta x + \varepsilon\,\Delta x = df + \varepsilon\,dx,$$
and so
$$f(x+\Delta x) = f(x) + f'(x)\,\Delta x + \varepsilon\,\Delta x. \qquad\Box$$

This last equation elucidates the role of the derivative function $f'$ as the best linear approximation to the function $f$ at $x$. For the graph of the linear function $l(\Delta x) = f(x) + f'(x)\,\Delta x$ gives the tangent to $f$ at $x$ when the origin is translated to the point $(x, 0)$, and $l(\Delta x)$ differs from $f(x+\Delta x)$ by the amount $\varepsilon\,\Delta x$, which we saw above is itself infinitely smaller than $\Delta x$ when $\Delta x$ is infinitesimal, and in that sense is "negligible".
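The content of Theorem 8.2.2 is that $\varepsilon = \Delta f/\Delta x - f'(x)$ is infinitesimal. Over the reals we can only watch this $\varepsilon$ shrink as the increment does; a small Python sketch, taking $f = \exp$ as our own example (exp being its own derivative):

```python
import math

def increment(f, x, dx):
    """The increment Δf = f(x + Δx) - f(x)."""
    return f(x + dx) - f(x)

# ε = (Δf - df)/Δx should shrink with Δx (Theorem 8.2.2); here f = exp, x = 0.5.
f = fprime = math.exp   # exp is its own derivative
x = 0.5
for dx in (1e-2, 1e-4, 1e-6):
    eps = (increment(f, x, dx) - fprime(x) * dx) / dx
    print(dx, eps)
```

Each printed `eps` is the real stand-in for the infinitesimal $\varepsilon$ at that increment size.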
8.3 Rules for Derivatives

If $f$ and $g$ are differentiable at $x \in \mathbb{R}$, then so too are $f+g$, $fg$, and $f/g$, the last provided that $g(x) \neq 0$. Moreover,

(1) $(f+g)'(x) = f'(x) + g'(x)$,
(2) $(fg)'(x) = f'(x)g(x) + f(x)g'(x)$,
(3) $(f/g)'(x) = \dfrac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}$.

Proof  We prove Leibniz's rule (2), and leave the others as exercises. If $\Delta x \neq 0$ is infinitesimal, then, by Theorem 8.1.1, $f(x+\Delta x)$ and $g(x+\Delta x)$ are both defined, and hence so is $(fg)(x+\Delta x) = f(x+\Delta x)g(x+\Delta x)$. Then the increment of $fg$ at $x$ corresponding to $\Delta x$ is
$$\begin{aligned}
\Delta(fg) &= f(x+\Delta x)g(x+\Delta x) - f(x)g(x)\\
&= (f(x)+\Delta f)(g(x)+\Delta g) - f(x)g(x)\\
&= (\Delta f)g(x) + f(x)\,\Delta g + \Delta f\,\Delta g
\end{aligned}$$
(compare this to Leibniz's reasoning as discussed in Section 1.2). It follows that
$$\frac{\Delta(fg)}{\Delta x} = \frac{\Delta f}{\Delta x}\,g(x) + f(x)\,\frac{\Delta g}{\Delta x} + \Delta f\,\frac{\Delta g}{\Delta x} \simeq f'(x)g(x) + f(x)g'(x) + 0,$$
since $\frac{\Delta f}{\Delta x} \simeq f'(x)$, $\frac{\Delta g}{\Delta x} \simeq g'(x)$, $\Delta f \simeq 0$, and all quantities involved are limited. Hence by Theorem 8.1.1, $f'(x)g(x) + f(x)g'(x)$ is the derivative of $fg$ at $x$. □

8.4 Chain Rule

If $f$ is differentiable at $x \in \mathbb{R}$, and $g$ is differentiable at $f(x)$, then $g \circ f$ is differentiable at $x$ with derivative $g'(f(x))f'(x)$.

Proof  Let $\Delta x$ be a nonzero infinitesimal. Then $f(x+\Delta x)$ is defined and $f(x+\Delta x) \simeq f(x)$, as we saw in Section 8.2. But $g$ is defined at all points infinitely close to $f(x)$, since $g'(f(x))$ exists, so $(g\circ f)(x+\Delta x) = g(f(x+\Delta x))$ is defined.

Now let
$$\Delta f = f(x+\Delta x) - f(x), \qquad \Delta(g\circ f) = g(f(x+\Delta x)) - g(f(x))$$
be the increments of $f$ and $g\circ f$ at $x$ corresponding to $\Delta x$. Then $\Delta f$ is infinitesimal, and
$$\Delta(g\circ f) = g(f(x)+\Delta f) - g(f(x)),$$
which shows, crucially, that $\Delta(g\circ f)$ is also the increment of $g$ at $f(x)$ corresponding to $\Delta f$. In the full incremental notation, this reads
$$\Delta(g\circ f)(x, \Delta x) = \Delta g(f(x), \Delta f).$$
By the incremental equation (Theorem 8.2.2) for $g$, it then follows that there exists an infinitesimal $\varepsilon$ such that
$$\Delta(g\circ f) = g'(f(x))\,\Delta f + \varepsilon\,\Delta f.$$
Hence
$$\frac{\Delta(g\circ f)}{\Delta x} = g'(f(x))\,\frac{\Delta f}{\Delta x} + \varepsilon\,\frac{\Delta f}{\Delta x} \simeq g'(f(x))f'(x) + 0,$$
establishing that $g'(f(x))f'(x)$ is the derivative of $g\circ f$ at $x$. □

8.5 Critical Point Theorem

Let $f$ have a maximum or a minimum at $x$ on some real interval $(a, b)$. If $f$ is differentiable at $x$, then $f'(x) = 0$.

Proof  Suppose $f$ has a maximum at $x$. By transfer, $f(x+\Delta x) \leq f(x)$ for all infinitesimal $\Delta x$. Hence if $\varepsilon$ is a positive infinitesimal and $\delta$ is a negative infinitesimal,
$$f'(x) \simeq \frac{f(x+\varepsilon)-f(x)}{\varepsilon} \leq 0 \leq \frac{f(x+\delta)-f(x)}{\delta} \simeq f'(x),$$
and so, as $f'(x)$ is real, it must be equal to 0. The case of $f$ having a minimum at $x$ is similar. □

Using the critical point and extreme value theorems, the following results can be successively derived about a function $f$ that is continuous on $[a,b] \subseteq \mathbb{R}$ and differentiable on $(a,b)$. The proofs do not require any further reasoning about infinitesimals or limits.

• Rolle's Theorem: if $f(a) = f(b) = 0$, then $f'(x) = 0$ for some $x \in (a,b)$.

• Mean Value Theorem: for some $x \in (a,b)$, $f'(x) = \dfrac{f(b)-f(a)}{b-a}$.

• If $f'$ is zero/positive/negative on $(a,b)$, then $f$ is constant/increasing/decreasing on $[a,b]$.

8.6 Inverse Function Theorem

Let $f$ be continuous and strictly monotone (increasing or decreasing) on $(a,b)$, and suppose $g$ is the inverse function of $f$. If $f$ is differentiable at $x$ in $(a,b)$, with $f'(x) \neq 0$, then $g$ is differentiable at $y = f(x)$, with $g'(y) = 1/f'(x)$.

Proof  Using the intermediate value theorem and monotonicity of $f$ it can be shown that $g$ is defined on some real open interval around $y$. The result $g'(f(x)) = 1/f'(x)$ would follow easily by the chain rule applied to the equation $g(f(x)) = x$ if we knew that $g$ was differentiable at $f(x)$. But that is what we have to prove! Now let $\Delta y$ be a nonzero infinitesimal. We need to show that
$$\frac{g(y+\Delta y) - g(y)}{\Delta y} \simeq \frac{1}{f'(x)}.$$
Now, if $g(y+\Delta y)$ were not infinitely close to $g(y)$, then there would be a real number $r$ strictly between them. But then, by monotonicity of $f$, $f(r)$ would be a real number strictly between $y + \Delta y$ and $y$.
Since $y$ is real, this would mean that $y+\Delta y$ and $y$ were an appreciable distance apart, which is not so. Hence $\Delta x = g(y+\Delta y) - g(y)$ is infinitesimal and is nonzero. (Thus the argument so far establishes that $g$ is continuous at $y$.) Observe that $\Delta x$ is, by definition, the increment $\Delta g(y, \Delta y)$ of $g$ at $y$ corresponding to $\Delta y$. Since $g(y) = x$, the last equation gives $g(y+\Delta y) = x + \Delta x$, so
$$f(x+\Delta x) = f(g(y+\Delta y)) = y + \Delta y.$$
Hence
$$\Delta y = f(x+\Delta x) - f(x) = \Delta f,$$
the increment of $f$ at $x$ corresponding to $\Delta x$.

Altogether we have
$$\frac{\Delta y}{\Delta x} = \frac{\Delta f(x,\Delta x)}{\Delta x} \quad\text{and}\quad \frac{\Delta x}{\Delta y} = \frac{\Delta g(y,\Delta y)}{\Delta y}.$$
Put more briefly, we have shown that
$$\frac{\Delta g}{\Delta y} = \frac{1}{\Delta f/\Delta x}.$$
To derive from this the conclusion $g'(y) = 1/f'(x)$ we invoke the hypothesis that $f'(x) \neq 0$ (which is essential: consider what happens at $x = 0$ when $f(x) = x^3$). Since $\mathrm{sh}(\Delta f/\Delta x) = f'(x)$, it follows that $\Delta f/\Delta x$ is appreciable. But then by 5.6.2(3),
$$\mathrm{sh}\!\left(\frac{1}{\Delta f/\Delta x}\right) = \frac{1}{\mathrm{sh}(\Delta f/\Delta x)} = \frac{1}{f'(x)}.$$
Therefore
$$\frac{\Delta g(y,\Delta y)}{\Delta y} \simeq \frac{1}{f'(x)}.$$
Because $\Delta y$ is an arbitrary nonzero infinitesimal, this establishes that the real number $1/f'(x)$ is the derivative of $g$ at $y$, as desired. □

8.7 Partial Derivatives

Let $z = f(x,y)$ be a real-valued function of two variables, with partial derivatives denoted by $f_x$ and $f_y$. At a real point $(a,b)$, $f_x(a,b)$ is the derivative of the function $x \mapsto f(x,b)$ at $a$, while $f_y(a,b)$ is the derivative of $y \mapsto f(a,y)$ at $b$. Thus for nonzero infinitesimals $\Delta x$, $\Delta y$,
$$f_x(a,b) \simeq \frac{f(a+\Delta x,\, b) - f(a,b)}{\Delta x}, \qquad f_y(a,b) \simeq \frac{f(a,\, b+\Delta y) - f(a,b)}{\Delta y}.$$

Points $(x_1, y_1)$ and $(x_2, y_2)$ in the hyperreal plane $^*\mathbb{R}^2$ are infinitely close if both $x_1 \simeq x_2$ and $y_1 \simeq y_2$, which is equivalent to requiring that their Euclidean distance apart,
$$\sqrt{(x_1-x_2)^2 + (y_1-y_2)^2},$$
be infinitesimal. The function $f$ is continuous at the real point $(a,b)$ if $(x,y) \simeq (a,b)$ implies $f(x,y) \simeq f(a,b)$ for all hyperreal $x, y$.
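The two partial-derivative quotients above can be tried numerically for a concrete surface, again with the infinitesimals replaced by a small real $h$. A Python sketch, where the surface $f(x,y) = x^2 y + y^3$ and all names are our own choices:

```python
def f(x, y):
    return x * x * y + y ** 3        # sample surface (our own choice)

def fx(a, b, h=1e-6):
    """Quotient (f(a + Δx, b) - f(a, b)) / Δx with Δx = h."""
    return (f(a + h, b) - f(a, b)) / h

def fy(a, b, h=1e-6):
    """Quotient (f(a, b + Δy) - f(a, b)) / Δy with Δy = h."""
    return (f(a, b + h) - f(a, b)) / h

a, b = 1.0, 2.0
# Exact partials here: f_x = 2xy = 4, f_y = x^2 + 3y^2 = 13.
print(fx(a, b), fy(a, b))
```

Each quotient freezes one variable and differentiates in the other, exactly as in the definitions of $f_x(a,b)$ and $f_y(a,b)$.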
For this to hold it is necessary that $f$ be defined on some open disk about $(a,b)$ in the real plane. We say that $f$ is smooth at $(a,b)$ if $f_x$ and $f_y$ both exist and are continuous at $(a,b)$. The increment of $f$ at a point $(a,b)$ corresponding to $\Delta x, \Delta y$ is defined to be
$$\Delta f = f(a+\Delta x,\, b+\Delta y) - f(a,b),$$
while the total differential is
$$df = f_x(a,b)\,\Delta x + f_y(a,b)\,\Delta y.$$
The graph of $z = f(x,y)$ is a surface in three-dimensional space, and $\Delta f$ is the change in $z$-value on this surface in moving from the point $(a,b)$ to the point $(a+\Delta x,\, b+\Delta y)$. The total differential $df$ is the corresponding change on the tangent plane to the surface at $(a,b)$.

Theorem 8.7.1 (Incremental Equation for Two Variables)  If $f$ is smooth at the real point $(a,b)$ and $\Delta x$ and $\Delta y$ are infinitesimal, then
$$\Delta f = df + \varepsilon\,\Delta x + \delta\,\Delta y$$
for some infinitesimals $\varepsilon$ and $\delta$.

Proof  The increment of $f$ at $(a,b)$ corresponding to $\Delta x, \Delta y$ can be written as
$$\Delta f = [f(a+\Delta x,\, b+\Delta y) - f(a+\Delta x,\, b)] + [f(a+\Delta x,\, b) - f(a,b)]. \tag{ii}$$
The second summand of (ii) is the increment at $a$ corresponding to $\Delta x$ of the one-variable function $x \mapsto f(x,b)$, whose derivative $f_x(a,b)$ is assumed to exist. Applying the one-variable incremental equation (Theorem 8.2.2) thus gives
$$f(a+\Delta x,\, b) - f(a,b) = f_x(a,b)\,\Delta x + \varepsilon\,\Delta x \tag{iii}$$
for some infinitesimal $\varepsilon$. Similarly, for the first summand we need to show that
$$f(a+\Delta x,\, b+\Delta y) - f(a+\Delta x,\, b) = f_y(a,b)\,\Delta y + \delta\,\Delta y \tag{iv}$$
for some infinitesimal $\delta$. Then combining (ii)-(iv) will give
$$\Delta f = f_x(a,b)\,\Delta x + f_y(a,b)\,\Delta y + \varepsilon\,\Delta x + \delta\,\Delta y,$$
which is the desired result.

Now the left side of equation (iv) could be described as the increment in the function $y \mapsto f(a+\Delta x,\, y)$ at $b$ corresponding to the infinitesimal $\Delta y$. This is not a real function, because of the hyperreal parameter $a + \Delta x$, so the incremental equation 8.2.2 does not apply directly to it.
To overcome this we will examine the family of functions $y \mapsto f(a+x_0,\, y)$ for real $x_0$, and consider their increments corresponding to real increments $y_0$ in $y$. This will give a statement about $x_0$ and $y_0$ to which we can apply transfer and then replace $x_0$ and $y_0$ by $\Delta x$ and $\Delta y$.

The technical details of this are as follows. Since $f_x$ and $f_y$ are continuous at $(a,b)$, $f$ must be defined on an open disk $D$ around $(a,b)$ of some real radius $r$. Then if $x_0, y_0$ are real numbers such that $(a+x_0,\, b+y_0) \in D$, the function $y \mapsto f(a+x_0,\, y)$ is defined on the interval $[b,\, b+y_0]$ and is subject to the one-variable mean value theorem. Hence there is some real $c_0$ between $b$ and $b+y_0$ such that the derivative of this one-variable function at $c_0$ is given as
$$f_y(a+x_0,\, c_0) = \frac{f(a+x_0,\, b+y_0) - f(a+x_0,\, b)}{b+y_0-b},$$
and so
$$f(a+x_0,\, b+y_0) - f(a+x_0,\, b) = f_y(a+x_0,\, c_0)\,y_0. \tag{v}$$
This obtains for all real $x_0, y_0$ such that $(a+x_0,\, b+y_0)$ is within $r$ of $(a,b)$. That is, for all such $x_0, y_0$ there exists $c_0 \in [b,\, b+y_0]$ such that (v) holds. Symbolically,
$$(\forall x_0, y_0 \in \mathbb{R})\Bigl(\sqrt{x_0^2+y_0^2} < r \rightarrow (\exists c_0 \in \mathbb{R})\,[\,b \leq c_0 \leq b+y_0 \text{ and (v) holds}\,]\Bigr).$$
But $(a+\Delta x,\, b+\Delta y)$ is within $r$ of $(a,b)$, since $\Delta x, \Delta y$ are infinitesimal, so by transfer there exists some hyperreal $c$ between $b$ and $b+\Delta y$ such that
$$f(a+\Delta x,\, b+\Delta y) - f(a+\Delta x,\, b) = f_y(a+\Delta x,\, c)\,\Delta y. \tag{vi}$$
Then $c \simeq b$, so $(a+\Delta x,\, c) \simeq (a,b)$, and hence by continuity of $f_y$ at $(a,b)$,
$$f_y(a+\Delta x,\, c) \simeq f_y(a,b).$$
Therefore the difference $\delta = f_y(a+\Delta x,\, c) - f_y(a,b)$ is infinitesimal, with $f_y(a+\Delta x,\, c) = f_y(a,b) + \delta$. Applying this to (vi) yields (iv) and completes the proof. □

8.8 Exercises on Partial Derivatives

(1) Show that if $f$ is smooth at $(a,b)$, then it is continuous at $(a,b)$.

(2) Let $f$ be smooth at $(a,b)$. Given infinitesimals $\Delta x, \Delta y$, show that the difference between $\Delta f$ and $df$ is itself infinitely smaller than the infinitesimal distance $\Delta l = \sqrt{\Delta x^2 + \Delta y^2}$ between $(a,b)$ and $(a+\Delta x,\, b+\Delta y)$, in the sense that
$$\frac{\Delta f - df}{\Delta l} \simeq 0.$$
8.9 Taylor Series

Let $f$ be a real function and $a$ a real number. The Taylor series of $f$ at $x \in \mathbb{R}$, centred on $a$, is the series
$$f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots,$$
or more briefly, $\sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x-a)^k$, where $f^{(k)}$ is the $k$th derivative of $f$. For this to be defined, $f$ must be differentiable infinitely often at $a$, but even if $f^{(k)}(a)$ exists for all $k \in \mathbb{N}$, the series need not converge. Even if it does converge, the sum need not be equal to $f(x)$. A well-known example is the function $f(x) = e^{-1/x^2}$ with $f(0) = 0$. This is so "flat" at the centre $a = 0$ that all its derivatives $f^{(k)}(0)$ there are equal to 0. Hence the associated Taylor series converges at all real $x$, but converges to $f(x)$ only when $x = 0$.

The partial sums of a Taylor series are the Taylor polynomials. The $n$th polynomial is
$$P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n.$$
For any given $x$, the sequence $(P_n(x) : n \in \mathbb{N})$ extends to a hypersequence, so $P_n(x)$ is defined for all $n \in {}^*\mathbb{N}$. Then from our earlier work on sequences and series (Chapter 6) we see that

• the Taylor series for $f$ at $x$ converges to a real number $L$ if and only if $P_n(x) \simeq L$ for all unlimited $n$.

The difference between $f(x)$ and $P_n(x)$ is the $n$th remainder at $x$:
$$R_n(x) = f(x) - P_n(x). \tag{vii}$$
If $f$ is infinitely differentiable at $a$, then (vii) defines $R_n(x)$ for all $n \in \mathbb{N}$. The sequence $(R_n(x) : n \in \mathbb{N})$ then extends to a hypersequence, and by transfer (vii) holds for all hypernatural $n$. Then:

• the Taylor series for $f$ at $x$ converges to $f(x)$ if and only if $R_n(x)$ is infinitesimal for all unlimited $n$.

If the derivatives $f^{(n)}$ exist for all $n \in \mathbb{N}$ on some open interval $J$ containing $a$, then the sequence of functions $(f^{(n)} : n \in \mathbb{N})$ extends to a hypersequence $(f^{(n)} : n \in {}^*\mathbb{N})$ of functions defined on $^*J$ in the manner described in Section 7.12. Formally, we put $F(n,x) = f^{(n)}(x)$ for $n \in \mathbb{N}$ and $x \in J$, and then by extension get $f^{(n)}(x) = {}^*F(n,x)$ for $n \in {}^*\mathbb{N}$ and $x \in {}^*J$. Then results like
$$P_n(x) - P_{n-1}(x) = \frac{f^{(n)}(a)}{n!}(x-a)^n$$
continue to hold for unlimited $n$, by transfer.

Now, the Lagrange form of the remainder stipulates that if $f$ can be differentiated $n+1$ times on some open interval containing $a$, then for each $x$ in that interval there is some real number $c$ between $a$ and $x$ such that
$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}.$$
Thus if $f$ is infinitely differentiable on some open interval $J$ containing $a$, then for every $n \in \mathbb{N}$ and every $x \in J$ we have
$$f(x) - P_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1} \tag{viii}$$
for some $c$ between $a$ and $x$. Hence by transfer, for every $n \in {}^*\mathbb{N}$ and $x \in {}^*J$, the Taylor formula (viii) holds for some hyperreal $c$ between $a$ and $x$ ($c$ may no longer be real). If we can show for a real $x$ that the right side of (viii) is infinitesimal whenever $n$ is unlimited, it will follow that the Taylor series of $f$ at $x$ converges to $f(x)$.

Let us illustrate this with the case of the function $f(x) = \cos x$, analysing its Maclaurin series, which is the Taylor series at the centre $a = 0$. For any $x \in {}^*\mathbb{R}$ and $n \in {}^*\mathbb{N}$ we have
$$R_n(x) = \frac{\cos^{(n+1)} c}{(n+1)!}\,x^{n+1}$$
for some $c$ with $|c| \leq |x|$. Now, if $n \in \mathbb{N}$ and $c \in \mathbb{R}$, then $\cos^{(n+1)} c$ is $\pm\sin c$ or $\pm\cos c$, and so in all cases lies between $-1$ and $1$. This fact then holds by transfer for any $n \in {}^*\mathbb{N}$ and $c \in {}^*\mathbb{R}$, so $\cos^{(n+1)} c$ is always limited. But if $x \in \mathbb{R}$ and $n$ is unlimited, then $x^{n+1}/(n+1)!$ is infinitesimal (Exercise 6.11(10)). It follows in this case that $R_n(x)$ is infinitesimal, and therefore (cf. (vii)) $f(x) \simeq P_n(x)$. This shows that the Maclaurin series for the cosine function converges to $\cos x$ at all real $x$.

Exercise 8.9.1  Verify that the Maclaurin series for $e^x$ converges to $e^x$ at any $x \in \mathbb{R}$ by proving that the remainder $R_n(x)$ is infinitesimal when $n$ is unlimited.

8.10 Incremental Approximation by Taylor's Formula

The incremental equation of Theorem 8.2.2 approximates the value $f(x+\Delta x)$ by a linear function $f(x) + f'(x)\,\Delta x$ of the increment $\Delta x$, with an error $\varepsilon\,\Delta x$ that is infinitely smaller than $\Delta x$.
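The convergence just established for the cosine Maclaurin series can be watched numerically: the error $|\cos x - P_n(x)|$ collapses as $n$ grows. A Python sketch (all names our own), generating each series term from the previous one:

```python
import math

def cos_taylor(x, n):
    """Maclaurin polynomial P_n(x) for cos, summing the terms up to degree n."""
    total, term = 0.0, 1.0
    for k in range(0, n + 1, 2):
        total += term
        term *= -x * x / ((k + 1) * (k + 2))   # next term (-1)^j x^(2j)/(2j)!
    return total

x = 2.0
for n in (4, 8, 16):
    print(n, abs(cos_taylor(x, n) - math.cos(x)))
```

By the Lagrange bound above, the error at degree $n$ is at most $|x|^{n+1}/(n+1)!$, which is what the printed values track.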
We will now see that there are similar approximations by higher-order polynomials in $\Delta x$ (quadratics, cubics, quartics, etc.). Fix a real number $x$ and a positive integer $n \in \mathbb{N}$, and consider polynomials centred at $x$ itself. If the $n$th derivative $f^{(n)}$ exists on an open interval $J$ containing $x$, then the Taylor formula with Lagrange remainder (viii) stipulates that for real numbers of the form $x+\Delta x$ in $J$,
$$f(x+\Delta x) = P_{n-1}(x+\Delta x) + R_{n-1}(x+\Delta x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(x)}{k!}\,\Delta x^k + \frac{f^{(n)}(c)}{n!}\,\Delta x^n \tag{ix}$$
for some $c$ between $x$ and $x+\Delta x$. By transfer this holds whenever $x+\Delta x$ belongs to $^*J$, and so it holds for any infinitesimal $\Delta x$, in which case $c \simeq x$. Now (ix) can be modified to give
$$f(x+\Delta x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(x)}{k!}\,\Delta x^k + \frac{f^{(n)}(x)}{n!}\,\Delta x^n + \frac{f^{(n)}(c) - f^{(n)}(x)}{n!}\,\Delta x^n,$$
and if $f^{(n)}$ is continuous at $x$, then from $c \simeq x$ we infer $f^{(n)}(c) \simeq f^{(n)}(x)$, implying that the number
$$\varepsilon = \frac{f^{(n)}(c) - f^{(n)}(x)}{n!}$$
is infinitesimal. Altogether:

Theorem 8.10.1  If the $n$th derivative $f^{(n)}$ exists on an open interval containing the real number $x$, and $f^{(n)}$ is continuous at $x$, then for any infinitesimal $\Delta x$,
$$f(x+\Delta x) = f(x) + f'(x)\,\Delta x + \frac{f''(x)}{2!}\,\Delta x^2 + \cdots + \frac{f^{(n)}(x)}{n!}\,\Delta x^n + \varepsilon\,\Delta x^n$$
for some infinitesimal $\varepsilon$. □

In other words, the difference between $f(x+\Delta x)$ and the $n$th-order polynomial
$$f(x) + f'(x)\,\Delta x + \frac{f''(x)}{2!}\,\Delta x^2 + \cdots + \frac{f^{(n)}(x)}{n!}\,\Delta x^n$$
in $\Delta x$ is the infinitesimal $\varepsilon\,\Delta x^n$, which is, as Leibniz would put it (Section 1.2), infinitely small in comparison with $\Delta x^n$.

Exercise 8.10.2  There are forms for the Taylor remainder other than Lagrange's. One of these is the Cauchy form
$$R_n(x) = \frac{f^{(n+1)}(c)}{n!}\,(x-c)^n(x-a)$$
for some $c$ between $a$ and $x$, when $f^{(n+1)}$ exists between $a$ and $x$. Apply this form for $R_{n-1}(x)$ to show that Theorem 8.10.1 holds without the hypothesis of continuity of $f^{(n)}$.

8.11 Extending the Incremental Equation

The equation
$$f(x+\Delta x) = f(x) + f'(x)\,\Delta x + \varepsilon\,\Delta x$$
holds for any real number $x$ at which $f$ is differentiable.
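Theorem 8.10.1 says the error after the $\Delta x^n$ term is $\varepsilon\,\Delta x^n$ with $\varepsilon$ infinitesimal; numerically, the ratio $\text{error}/\Delta x^n$ should therefore shrink with $\Delta x$. A Python sketch for $f = \exp$ at $x = 0$ with $n = 3$ (our own choice of example):

```python
import math

def eps_ratio(dx, n=3):
    """(f(x+Δx) - P_n) / Δx^n for f = exp at x = 0: the 'ε' of Theorem 8.10.1."""
    p = sum(dx ** k / math.factorial(k) for k in range(n + 1))
    return (math.exp(dx) - p) / dx ** n

for dx in (1e-1, 1e-2):
    print(dx, eps_ratio(dx))
```

For exp at 0 the leading part of this ratio is $\Delta x/4!$, so halving the increment roughly halves the printed $\varepsilon$.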
It is natural to ask whether a similar formula holds for nonreal $x$, and it turns out that this is intimately connected with the question of the continuity of the derivative function $f'$.

Let us say that a hyperreal $x$ is well inside an interval $(y, z)$ if $y < x < z$ but $x$ is not infinitely close to either of the end points $y$ and $z$. Equivalently, this means that the halo of $x$ is included in the interval, so that $y < x + \Delta x < z$ for all infinitesimals $\Delta x$.

Theorem 8.11.1  Let $f$ be differentiable on an interval $(a,b)$ in $\mathbb{R}$. Then the derivative $f'$ is continuous on $(a,b)$ if and only if for each hyperreal $x$ that is well inside $^*(a,b)$ and each infinitesimal $\Delta x$,
$$f(x+\Delta x) = f(x) + f'(x)\,\Delta x + \varepsilon\,\Delta x$$
for some infinitesimal $\varepsilon$.

Proof  Assume that the incremental equation holds at points well inside $^*(a,b)$. To prove continuity of $f'$, let $c$ be a real point in $(a,b)$ and suppose $x \simeq c$. We want $f'(x) \simeq f'(c)$. Now, if $\Delta = x - c \simeq 0$, then using Theorem 8.2.2 we get
$$f(x) = f(c+\Delta) = f(c) + f'(c)\,\Delta + \varepsilon\,\Delta$$
for some $\varepsilon \simeq 0$. But $x$ is well inside $^*(a,b)$, since $a < c$