Consider a function $f: U \to \mathbb{R}^m$, where $U \subseteq \mathbb{R}^n$ is open.
Take a point $a \in U$, and a linear transformation $L: \mathbb{R}^n \to \mathbb{R}^m$.
Then, we say that $f$ is differentiable at $a$ with derivative $L$, if:
$$\lim_{h \to 0} \frac{\| f(a + h) - f(a) - L(h) \|}{\| h \|} = 0.$$
(Note: this is a special case of Fréchet differentiability for functions between Banach spaces)
In a sense, this means that $x \mapsto f(a) + L(x - a)$ is the best linear approximation of $f$ near $a$.
Let us define the error term $E(h)$:
$$E(h) = f(a + h) - f(a) - L(h),$$
where we can think of $h = x - a$ as a small displacement from $a$.
Then, the definition of differentiability can be rewritten as
$$\lim_{h \to 0} \frac{\| E(h) \|}{\| h \|} = 0,$$
which gives
$$f(a + h) = f(a) + L(h) + E(h).$$
Now, given that $f$ is differentiable at $a$, the linear transformation $L$ is unique if it exists, and we call it the (total) derivative of $f$ at $a$, and denote it as $Df(a)$, which can be represented as a matrix (discussed later).
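As a quick numerical sanity check of the limit definition, the sketch below picks an illustrative function and point (these are my own choices, not from the text): $f(x, y) = (xy,\ x + y^2)$ at $a = (1, 2)$, whose hand-computed Jacobian is the candidate linear map $L$. The error ratio $\|f(a+h) - f(a) - L(h)\| / \|h\|$ should shrink toward $0$ as $h \to 0$.

```python
import math

# Illustrative example (assumed, not from the text): f(x, y) = (x*y, x + y**2)
# at a = (1.0, 2.0), with hand-computed Jacobian J playing the role of L.
def f(x, y):
    return (x * y, x + y * y)

a = (1.0, 2.0)
J = [[2.0, 1.0],   # d(xy)/dx = y = 2,   d(xy)/dy = x = 1
     [1.0, 4.0]]   # d(x+y^2)/dx = 1,    d(x+y^2)/dy = 2y = 4

def error_ratio(h):
    """||f(a+h) - f(a) - L(h)|| / ||h||, straight from the definition."""
    fa = f(*a)
    fah = f(a[0] + h[0], a[1] + h[1])
    Lh = (J[0][0] * h[0] + J[0][1] * h[1],
          J[1][0] * h[0] + J[1][1] * h[1])
    num = math.hypot(fah[0] - fa[0] - Lh[0], fah[1] - fa[1] - Lh[1])
    return num / math.hypot(*h)

# The ratio should shrink (here it is roughly proportional to s) as h -> 0.
for s in (1e-1, 1e-2, 1e-3):
    print(s, error_ratio((s, s)))
```

For this particular $f$ the error term is exactly quadratic in $h$, so the ratio decays linearly in $\|h\|$.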
Proof
Assume that $L_1, L_2$ are two distinct derivatives of $f$ at $a$.
Then, we have a vector $v \neq 0$ such that $L_1(v) \neq L_2(v)$.
This means that, by linearity,
$$L_1(tv) \neq L_2(tv)$$
for any $t \neq 0$.
Now, we let $h = tv$;
taking the norm of the difference and dividing by $\|h\| = |t| \, \|v\|$, we have
$$\frac{\| L_1(h) - L_2(h) \|}{\| h \|} = \frac{|t| \, \| L_1(v) - L_2(v) \|}{|t| \, \| v \|} = \frac{\| L_1(v) - L_2(v) \|}{\| v \|}.$$
Since both numerator and denominator are Euclidean norms of nonzero vectors, they are both positive, and we have a positive constant $c > 0$ independent of $t$.
Taking the limit as $t \to 0$ (so $h \to 0$), because $L_1$ and $L_2$ are both derivatives of $f$ at $a$, the triangle inequality gives
$$c = \frac{\| L_1(h) - L_2(h) \|}{\| h \|} \le \frac{\| f(a+h) - f(a) - L_1(h) \|}{\| h \|} + \frac{\| f(a+h) - f(a) - L_2(h) \|}{\| h \|} \to 0,$$
which is a contradiction. Therefore, the derivative of $f$ at $a$ is unique if it exists.
Directional Derivative
On the other hand, we have a different notion of derivative, called the directional derivative.
Given a function $f: U \subseteq \mathbb{R}^n \to \mathbb{R}^m$, a point $a \in U$, and a vector $v \in \mathbb{R}^n$,
we say that $f$ is differentiable in the direction of $v$ at $a$ if the following limit exists:
$$D_v f(a) = \lim_{t \to 0} \frac{f(a + tv) - f(a)}{t},$$
where $a + tv$ is in $U$ for sufficiently small $|t|$.
The directional derivative can be written using the total derivative:
$$D_v f(a) = Df(a)(v).$$
Proof
Assume that $f$ is differentiable at $a$ with derivative $Df(a)$, and take a vector $v \neq 0$.
Since $f$ is differentiable at $a$, we have
$$\lim_{h \to 0} \frac{\| f(a + h) - f(a) - Df(a)(h) \|}{\| h \|} = 0.$$
Let $h = tv$; we have
$$\lim_{t \to 0} \frac{\| f(a + tv) - f(a) - Df(a)(tv) \|}{\| tv \|} = 0.$$
Since $v$ is a fixed vector, $\|v\|$ is a positive constant which can be factored out of the denominator; moving the remaining $1/|t|$ inside the norm and using linearity $Df(a)(tv) = t \, Df(a)(v)$, this becomes
$$\lim_{t \to 0} \left\| \frac{f(a + tv) - f(a)}{t} - Df(a)(v) \right\| = 0.$$
The inside of the norm is a vector in $\mathbb{R}^m$, and the limit of its norm is $0$, which means that the vector tends to the zero vector:
$$\lim_{t \to 0} \left( \frac{f(a + tv) - f(a)}{t} - Df(a)(v) \right) = 0;$$
by term-wise limit, since $Df(a)(v)$ is constant in $t$,
$$\lim_{t \to 0} \frac{f(a + tv) - f(a)}{t} = Df(a)(v).$$
Since this limit exists, and the LHS is exactly the definition of the directional derivative $D_v f(a)$, we have
$$D_v f(a) = Df(a)(v).$$
Thus, differentiability is sufficient for the existence of directional derivatives. (Note that the converse is not true.)
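The identity $D_v f(a) = Df(a)(v)$ can be checked numerically. The sketch below uses an illustrative scalar function of my own choosing (not from the text), $f(x, y) = x^2 y$ at $a = (1, 3)$ with direction $v = (0.6, 0.8)$; its hand-computed gradient at $a$ is $(2xy, x^2) = (6, 1)$, so the prediction is $Df(a)(v) = 6 \cdot 0.6 + 1 \cdot 0.8 = 4.4$.

```python
# Illustrative example (assumed): f(x, y) = x**2 * y, a = (1.0, 3.0),
# direction v = (0.6, 0.8), hand-computed gradient (6.0, 1.0) at a.
def f(x, y):
    return x * x * y

a = (1.0, 3.0)
v = (0.6, 0.8)
grad = (6.0, 1.0)
jvp = grad[0] * v[0] + grad[1] * v[1]   # Df(a)(v) = 4.4

def directional(t):
    """(f(a + t v) - f(a)) / t, the difference quotient from the definition."""
    return (f(a[0] + t * v[0], a[1] + t * v[1]) - f(*a)) / t

# The difference quotient should approach Df(a)(v) = 4.4 as t -> 0.
for t in (1e-2, 1e-4, 1e-6):
    print(t, directional(t), jvp)
```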
Partial Derivative
A special case of the directional derivative is the partial derivative.
Remember that the directional derivative can be written as follows:
$$D_v f(a) = \lim_{t \to 0} \frac{f(a + tv) - f(a)}{t}.$$
If we take $v = e_i$, a standard Euclidean basis vector, the direction is along the $i$-th coordinate axis.
So, given a function $f: U \subseteq \mathbb{R}^n \to \mathbb{R}^m$, a point $a \in U$, and a standard Euclidean basis vector $e_i$, we say that $f$ is differentiable in the direction of $e_i$ at $a$ if the following limit exists:
$$\lim_{t \to 0} \frac{f(a + t e_i) - f(a)}{t} = \lim_{t \to 0} \frac{f(a_1, \dots, a_i + t, \dots, a_n) - f(a_1, \dots, a_n)}{t},$$
in which case, the limit is called the partial derivative of $f$ with respect to $x_i$ at $a$, denoted $\frac{\partial f}{\partial x_i}(a)$.
This is useful because now we can write directional derivatives as a linear combination of partial derivatives:
$$D_v f(a) = Df(a)(v) = Df(a)\left( \sum_{i=1}^n v_i e_i \right) = \sum_{i=1}^n v_i \, Df(a)(e_i) = \sum_{i=1}^n v_i \frac{\partial f}{\partial x_i}(a).$$
(Note: this will be important for tangent spaces of manifolds)
For a scalar function $f: U \subseteq \mathbb{R}^n \to \mathbb{R}$, the Jacobian matrix is a $1 \times n$ matrix, which can be identified with a row vector, called the gradient of $f$ at $a$, sometimes denoted as $\nabla f(a)$:
$$\nabla f(a) = \left( \frac{\partial f}{\partial x_1}(a), \dots, \frac{\partial f}{\partial x_n}(a) \right),$$
which makes the directional derivative a matrix multiplication of the gradient and the vector $v$:
$$D_v f(a) = \nabla f(a) \, v.$$
If $f$ is vector-valued, i.e. $f = (f_1, \dots, f_m): U \subseteq \mathbb{R}^n \to \mathbb{R}^m$, the total derivative can be represented by the Jacobian matrix (or derivative matrix), which is an $m \times n$ matrix:
$$Df(a) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(a) & \cdots & \frac{\partial f_1}{\partial x_n}(a) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(a) & \cdots & \frac{\partial f_m}{\partial x_n}(a) \end{pmatrix}.$$
Let us check that it in fact satisfies $Df(a)(v) = D_v f(a)$, given $v = \sum_{i=1}^n v_i e_i$:
$$Df(a)\, v = \begin{pmatrix} \sum_i \frac{\partial f_1}{\partial x_i}(a)\, v_i \\ \vdots \\ \sum_i \frac{\partial f_m}{\partial x_i}(a)\, v_i \end{pmatrix} = \sum_{i=1}^n v_i \begin{pmatrix} \frac{\partial f_1}{\partial x_i}(a) \\ \vdots \\ \frac{\partial f_m}{\partial x_i}(a) \end{pmatrix} = \sum_{i=1}^n v_i \frac{\partial f}{\partial x_i}(a) = D_v f(a).$$
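The column-by-column structure of the Jacobian (one column per basis vector $e_i$) can be sketched numerically: approximate each partial derivative by a central difference and assemble the matrix. The function $f(x, y) = (x y^2, \sin x)$ and the point $a = (1, 2)$ below are illustrative choices of mine, with the hand-computed Jacobian for comparison.

```python
import math

# Illustrative example (assumed): f(x, y) = (x*y**2, sin x), a = (1.0, 2.0).
# Exact Jacobian at a: [[y**2, 2*x*y], [cos(x), 0]] = [[4, 4], [cos 1, 0]].
def f(x, y):
    return (x * y * y, math.sin(x))

def jacobian(f, a, eps=1e-6):
    """m x n Jacobian by central differences; column i = partial w.r.t. x_i."""
    n = len(a)
    m = len(f(*a))
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        ap = list(a); ap[i] += eps
        am = list(a); am[i] -= eps
        fp, fm = f(*ap), f(*am)
        for r in range(m):
            J[r][i] = (fp[r] - fm[r]) / (2 * eps)
    return J

a = (1.0, 2.0)
exact = [[4.0, 4.0],
         [math.cos(1.0), 0.0]]
approx = jacobian(f, a)
print(approx)
```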
Properties of Derivatives
Chain Rule
Consider functions $f: \mathbb{R}^n \to \mathbb{R}^m$ and $g: \mathbb{R}^m \to \mathbb{R}^p$, and a point $a \in \mathbb{R}^n$.
Assume that $f$ is differentiable at $a$, and let $b = f(a)$, at which $g$ is differentiable.
Then, the composition $g \circ f$ is differentiable at $a$, and its derivative is given by the composition of the derivatives:
$$D(g \circ f)(a) = Dg(f(a)) \circ Df(a).$$
This is called the chain rule.
Note that this follows the same idea as the chain rule for single-variable functions.
Proof
Since $f$ is differentiable at $a$,
$$f(a + h) = f(a) + Df(a)(h) + E_f(h),$$
where $E_f$ is a function such that $\lim_{h \to 0} \| E_f(h) \| / \| h \| = 0$.
Now, since $g$ is differentiable at $b = f(a)$, we have
$$g(b + k) = g(b) + Dg(b)(k) + E_g(k),$$
where $E_g$ is a function such that $\lim_{k \to 0} \| E_g(k) \| / \| k \| = 0$.
Take $k = f(a + h) - f(a) = Df(a)(h) + E_f(h)$, so that $b + k = f(a + h)$; we have
$$g(f(a + h)) = g(b) + Dg(b)\big( Df(a)(h) + E_f(h) \big) + E_g(k).$$
Thus,
$$g(f(a + h)) = g(f(a)) + \big( Dg(b) \circ Df(a) \big)(h) + Dg(b)(E_f(h)) + E_g(k).$$
Thus, to show that the total derivative of $g \circ f$ at $a$ is $Dg(b) \circ Df(a)$, we need to show that
$$\lim_{h \to 0} \frac{\| Dg(b)(E_f(h)) + E_g(k) \|}{\| h \|} = 0.$$
The norm in the numerator can be bounded by the triangle inequality:
$$\| Dg(b)(E_f(h)) + E_g(k) \| \le \| Dg(b)(E_f(h)) \| + \| E_g(k) \|.$$
For the second term (when $k \neq 0$; when $k = 0$ it vanishes),
$$\| E_g(k) \| = \frac{\| E_g(k) \|}{\| k \|} \, \| k \| \le \frac{\| E_g(k) \|}{\| k \|} \big( \| Df(a)(h) \| + \| E_f(h) \| \big).$$
Notice that the coefficient is of the form
$$\frac{\| E_g(k) \|}{\| k \|},$$
which goes to $0$ as $k \to 0$, and $k \to 0$ as $h \to 0$ (since $f$ is continuous at $a$).
Substituting back into the limit,
$$\lim_{h \to 0} \frac{\| Dg(b)(E_f(h)) + E_g(k) \|}{\| h \|} \le \lim_{h \to 0} \left( \| Dg(b) \|_{\mathrm{op}} \frac{\| E_f(h) \|}{\| h \|} + \frac{\| E_g(k) \|}{\| k \|} \left( \frac{\| Df(a)(h) \|}{\| h \|} + \frac{\| E_f(h) \|}{\| h \|} \right) \right).$$
The $\| E_f(h) \| / \| h \|$ terms go to $0$ by the definition of $E_f$; the term $\| Df(a)(h) \| / \| h \|$ is bounded by the operator norm $\| Df(a) \|_{\mathrm{op}}$, so its product with the coefficient $\| E_g(k) \| / \| k \| \to 0$ also goes to $0$. Thus we have
$$\lim_{h \to 0} \frac{\| g(f(a+h)) - g(f(a)) - \big( Dg(b) \circ Df(a) \big)(h) \|}{\| h \|} = 0,$$
which shows the chain rule.
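The chain rule can also be checked numerically on a concrete case. The functions below are illustrative choices of mine: $f(x, y) = (x + y,\ xy)$, $g(u, w) = uw$, at $a = (2, 3)$, so $f(a) = (5, 6)$ and the hand-computed Jacobians are $Df(a) = \begin{pmatrix} 1 & 1 \\ 3 & 2 \end{pmatrix}$, $Dg(f(a)) = (6, 5)$, whose product is $(21, 16)$.

```python
# Illustrative chain-rule check (assumed example, not from the text):
#   f(x, y) = (x + y, x*y),  g(u, w) = u*w,  a = (2.0, 3.0),  f(a) = (5, 6).
Df = [[1.0, 1.0],
      [3.0, 2.0]]      # Jacobian of f at a
Dg = [6.0, 5.0]        # gradient of g at f(a) = (5, 6)
chain = [Dg[0] * Df[0][0] + Dg[1] * Df[1][0],   # row vector times matrix
         Dg[0] * Df[0][1] + Dg[1] * Df[1][1]]

# Compare with central differences on the composite (g.f)(x, y) = (x+y)*x*y.
def gf(x, y):
    return (x + y) * x * y

eps = 1e-6
num = [(gf(2.0 + eps, 3.0) - gf(2.0 - eps, 3.0)) / (2 * eps),
       (gf(2.0, 3.0 + eps) - gf(2.0, 3.0 - eps)) / (2 * eps)]
print(chain, num)   # both approximately [21, 16]
```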
Leibniz / Product Rule
Now, there is an interesting corollary of the chain rule.
Take two functions $f, g: \mathbb{R}^n \to \mathbb{R}$, and combine them to make a vector-valued function $F: \mathbb{R}^n \to \mathbb{R}^2$ defined as $F(x) = (f(x), g(x))$.
The Jacobian matrix of $F$ is a $2 \times n$ matrix:
$$DF(a) = \begin{pmatrix} \nabla f(a) \\ \nabla g(a) \end{pmatrix}.$$
Consider a multiplication map $m: \mathbb{R}^2 \to \mathbb{R}$ defined as $m(u, w) = uw$.
More explicitly, we call the composition $m \circ F$ the pointwise product of $f$ and $g$:
$$(m \circ F)(x) = f(x) \, g(x).$$
Here, we notice that $m$ is differentiable at any point $(u, w) \in \mathbb{R}^2$, and its derivative is a $1 \times 2$ matrix (a row vector):
$$Dm(u, w) = \begin{pmatrix} w & u \end{pmatrix}.$$
Thus by the chain rule,
$$D(fg)(a) = Dm(F(a)) \, DF(a) = \begin{pmatrix} g(a) & f(a) \end{pmatrix} \begin{pmatrix} \nabla f(a) \\ \nabla g(a) \end{pmatrix} = g(a) \, \nabla f(a) + f(a) \, \nabla g(a).$$
This is called the Leibniz rule or product rule of derivatives.
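The Leibniz rule can be sanity-checked on an illustrative pair of functions (my own choice, not from the text): $f(x, y) = xy$ and $g(x, y) = x + 2y$ at $a = (1, 2)$, where $\nabla f(a) = (2, 1)$, $\nabla g(a) = (1, 2)$, $f(a) = 2$, $g(a) = 5$, so the rule predicts $\nabla(fg)(a) = 5 \cdot (2, 1) + 2 \cdot (1, 2) = (12, 9)$.

```python
# Illustrative Leibniz-rule check (assumed example):
#   f(x, y) = x*y, g(x, y) = x + 2*y, a = (1.0, 2.0).
fa, ga = 2.0, 5.0                        # f(a), g(a)
grad_f, grad_g = (2.0, 1.0), (1.0, 2.0)  # hand-computed gradients at a
leibniz = tuple(ga * df + fa * dg for df, dg in zip(grad_f, grad_g))

# Compare against central differences on (fg)(x, y) = x*y*(x + 2*y).
def fg(x, y):
    return x * y * (x + 2 * y)

eps = 1e-6
num = ((fg(1 + eps, 2) - fg(1 - eps, 2)) / (2 * eps),
       (fg(1, 2 + eps) - fg(1, 2 - eps)) / (2 * eps))
print(leibniz, num)   # both approximately (12, 9)
```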
$C^k$-class Functions and Higher Order Derivatives
$C^k$-class Functions
Now, coming back to the total derivative of a vector-valued function $f: U \subseteq \mathbb{R}^n \to \mathbb{R}^m$, we have the total derivative at each interior point $a \in U$, which is a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$:
$$Df(a): \mathbb{R}^n \to \mathbb{R}^m.$$
Notice that the domain of $Df(a)$ is not technically the original $\mathbb{R}^n$, but the tangent space of $\mathbb{R}^n$ at $a$.
To articulate, remember that $f(a + h)$ can be written as
$$f(a + h) = f(a) + Df(a)(h) + E(h).$$
$h$ is a vector that does not live in the original $\mathbb{R}^n$ where $a$ lives; it is rather a difference between two nearby points $a + h$ and $a$ that are in the original $\mathbb{R}^n$, which happens to be a vector in $\mathbb{R}^n$.
We should express this distinction by writing $h \in T_a \mathbb{R}^n$, where $T_a \mathbb{R}^n$ is the tangent space of $\mathbb{R}^n$ at $a$ (explained in the manifold section later).
Also, in a similar way, while $Df(a)(h)$ looks like a vector in the original range $\mathbb{R}^m$, it is in essence a difference between $f(a + h)$ and $f(a)$, which should also live in the tangent space of $\mathbb{R}^m$ at $f(a)$, denoted as $T_{f(a)} \mathbb{R}^m$:
$$Df(a): T_a \mathbb{R}^n \to T_{f(a)} \mathbb{R}^m.$$
For reasons I explain later in the manifold section, the tangent space of (subsets of) Euclidean space is canonically isomorphic to the original space, so we regard $T_a \mathbb{R}^n$ and $T_{f(a)} \mathbb{R}^m$ as $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively, and write
$$Df(a): \mathbb{R}^n \to \mathbb{R}^m.$$
This is a linear transformation between two vector spaces, but it also gives another map $Df: U \to L(\mathbb{R}^n, \mathbb{R}^m)$, that takes a point $a \in U$ and produces a linear transformation $Df(a)$, where $L(\mathbb{R}^n, \mathbb{R}^m)$ is the set of all linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$.
In short, we now have a (crudely speaking) matrix-valued map $Df$:
$$Df: U \to L(\mathbb{R}^n, \mathbb{R}^m) \cong \mathbb{R}^{m \times n}.$$
Now, we define a class of functions, called $C^k$-class functions.
Given a function $f: U \subseteq \mathbb{R}^n \to \mathbb{R}^m$:
$C^0$-class function
if it is continuous at every point in $U$.
$C^k$-class function ($k \geq 1$)
if the derivative $Df: U \to L(\mathbb{R}^n, \mathbb{R}^m)$ exists and is a $C^{k-1}$-class function.
While we have the definition, it would not be as useful without the ability to compute with concrete functions and show that they are $C^k$-class functions.
Higher Order Derivatives
Now, speaking of canonical isomorphism, there is actually a canonical isomorphism between $L(\mathbb{R}^n, \mathbb{R}^m)$ and $\mathbb{R}^{mn}$ (refer here).
Thus we can identify $Df$ as a map from $U$ to $\mathbb{R}^{mn}$, and consider its derivative $D(Df)(a) = D^2 f(a)$, which is a map from $\mathbb{R}^n$ to $\mathbb{R}^{mn}$.
Since $\mathbb{R}^{mn}$ is canonically isomorphic to $L(\mathbb{R}^n, \mathbb{R}^m)$, we can identify $D^2 f(a)$ as a map:
$$D^2 f(a): \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R}^m).$$
Since $D^2 f: U \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}^m)) \cong \mathbb{R}^{m n^2}$ is again an ordinary vector-valued map, we can take its derivative again (assuming $D^2 f$ is differentiable) to get $D^3 f(a)$:
$$D^3 f(a): \mathbb{R}^n \to \mathbb{R}^{m n^2},$$
and in a similar manner, we identify the domain and range as
$$D^3 f(a): \mathbb{R}^n \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}^m)).$$
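The identification of $Df$ with a vector-valued map, followed by differentiating again, can be sketched numerically. The example below is an illustrative choice of mine: for $f(x, y) = (x^2 y,\ x + y)$, the Jacobian $\begin{pmatrix} 2xy & x^2 \\ 1 & 1 \end{pmatrix}$ is flattened row by row into a vector in $\mathbb{R}^4$ (identifying $L(\mathbb{R}^2, \mathbb{R}^2)$ with $\mathbb{R}^{2 \cdot 2}$), and that vector-valued map is differentiated by central differences to approximate $D^2 f(a)$.

```python
# Illustrative sketch (assumed example): second derivative of
# f(x, y) = (x**2 * y, x + y) via the flattened Jacobian.
def Df_flat(x, y):
    # Jacobian [[2xy, x^2], [1, 1]] read row by row as a vector in R^4.
    return (2 * x * y, x * x, 1.0, 1.0)

def D2f(a, eps=1e-5):
    """4x2 matrix for D^2 f(a): column j = partial of Df_flat w.r.t. x_j."""
    cols = []
    for j in range(2):
        ap = list(a); ap[j] += eps
        am = list(a); am[j] -= eps
        fp, fm = Df_flat(*ap), Df_flat(*am)
        cols.append([(p - q) / (2 * eps) for p, q in zip(fp, fm)])
    return [[cols[j][i] for j in range(2)] for i in range(4)]

a = (1.0, 2.0)
print(D2f(a))
# Expected entries at a = (1, 2): d(2xy)/dx = 2y = 4, d(2xy)/dy = 2x = 2,
# d(x^2)/dx = 2x = 2, d(x^2)/dy = 0, and the constant rows are 0.
```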