Linear and Matrix Algebra for Multivariable Calculus

Section 4 Linear algebra in multivariable calculus

Subsection 4.1 Differentiability

A function $$f\colon \R^n\to \R^m$$ is differentiable at the point $$\mathbf{x}_0$$ if there exists a linear transformation $$L\colon \R^n\to \R^m$$ such that
$$\lim_{\mathbf{h}\to \mathbf{0}} \frac{f(\mathbf{x}_0+ \mathbf{h}) - f(\mathbf{x}_0)- L\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0}.\tag{4.1}$$
If $$L$$ exists, it is called the derivative of $$f$$ at $$\mathbf{x}_0$$ and is denoted $$Df(\mathbf{x}_0)\text{.}$$
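As a numerical illustration of (4.1), and not a proof, the following sketch evaluates the norm of the quotient for shrinking increments $$\mathbf{h}$$ along one fixed direction. The map `f`, the candidate matrix `candidate_L`, the point `x0`, and the direction are all hypothetical choices made for this example; they do not come from the text. A true limit must hold for every way $$\mathbf{h}$$ approaches $$\mathbf{0}\text{,}$$ so testing a single direction only shows behavior consistent with differentiability.

```python
import numpy as np

def f(x):
    """Hypothetical example map f: R^2 -> R^2, f(x, y) = (x^2 * y, x + sin(y))."""
    return np.array([x[0] ** 2 * x[1], x[0] + np.sin(x[1])])

def candidate_L(x):
    """Matrix of the candidate linear map L at x (partials of f computed by hand)."""
    return np.array([
        [2 * x[0] * x[1], x[0] ** 2],
        [1.0, np.cos(x[1])],
    ])

def quotient_norm(x0, h):
    """Norm of the quotient in (4.1) for one increment h."""
    remainder = f(x0 + h) - f(x0) - candidate_L(x0) @ h
    return np.linalg.norm(remainder) / np.linalg.norm(h)

x0 = np.array([1.0, 2.0])            # assumed base point
direction = np.array([0.6, -0.8])    # a fixed unit vector, chosen arbitrarily
scales = [1e-1, 1e-2, 1e-3, 1e-4]    # values of ||h||

ratios = [quotient_norm(x0, s * direction) for s in scales]
for s, r in zip(scales, ratios):
    print(f"||h|| = {s:.0e}   quotient norm = {r:.3e}")
```

For this smooth example the printed quotient norms should shrink roughly in proportion to $$\|\mathbf{h}\|\text{,}$$ which is what one expects when the remainder is of second order in $$\mathbf{h}\text{.}$$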
To understand the definition of the derivative, start with the case $$n=m=1\text{.}$$ The derivative of $$f$$ at $$x_0$$ is a number $$f'(x_0)$$ such that
\begin{equation*} f(x_0 + h) -f(x_0) \approx f'(x_0)h \end{equation*}
for $$h$$ near $$0\text{.}$$ The meaning of "approximately equals $$\ldots$$ for $$h$$ near $$0$$" is made precise by using a limit. To generalize to higher dimensions, interpret $$f'(x_0)h$$ as the value of a linear transformation that sends $$h$$ to $$f'(x_0)h\text{.}$$ The derivative $$Df(\mathbf{x}_0)$$ of $$f$$ at $$\mathbf{x}_0$$ is a linear transformation such that
\begin{equation*} f(\mathbf{x}_0 + \mathbf{h}) -f(\mathbf{x}_0) \approx Df(\mathbf{x}_0)\mathbf{h} \end{equation*}
for $$\mathbf{h}$$ near $$\mathbf{0}\text{.}$$ Putting $$\mathbf{h} = t\mathbf{e}_j\text{,}$$ this reads
\begin{equation*} f(\mathbf{x}_0 + t\mathbf{e}_j) -f(\mathbf{x}_0) \approx Df(\mathbf{x}_0)t\mathbf{e}_j \end{equation*}
for $$t$$ near $$0\text{.}$$ Dividing both sides by $$t$$ and taking a limit, we get an expression for $$Df(\mathbf{x}_0)\mathbf{e}_j\text{.}$$
$$Df(\mathbf{x}_0)\mathbf{e}_j = \lim_{t\to 0} \frac{f(\mathbf{x}_0 + t\mathbf{e}_j) -f(\mathbf{x}_0)}{t} = \left(\frac{\partial y_1}{\partial x_j}, \frac{\partial y_2}{\partial x_j},\ldots, \frac{\partial y_m}{\partial x_j}\right)\tag{4.2}$$
where $$\mathbf{y}=(y_1,y_2,\ldots,y_m) = f(x_1,x_2,\ldots,x_n) = f(\mathbf{x})$$ and the partial derivatives are evaluated at $$\mathbf{x}_0\text{.}$$ From this it follows that $$Df(\mathbf{x}_0)\text{,}$$ if it exists, is represented with respect to the standard bases by the $$m\times n$$ matrix $$\left[\frac{\partial y_i}{\partial x_j}\right]\text{.}$$
$$[Df(\mathbf{x}_0)] = \left[\frac{\partial y_i}{\partial x_j}\right]\tag{4.3}$$
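The column-by-column description in (4.2) and (4.3) can be sketched numerically. In the code below, the map `f`, the point `x0`, the step size `t`, and the helper names `jacobian_fd` and `jacobian_exact` are assumptions introduced only for illustration. The helper `jacobian_fd` replaces the limit in (4.2) by the difference quotient at one small fixed $$t\text{,}$$ so it gives an approximation to the matrix, not its exact value.

```python
import numpy as np

def f(x):
    """Hypothetical example map f: R^3 -> R^2."""
    return np.array([x[0] * x[1] + x[2], np.exp(x[0]) * x[2]])

def jacobian_fd(func, x0, t=1e-7):
    """Approximate [Df(x0)] one column at a time.

    Column j is the difference quotient (f(x0 + t e_j) - f(x0)) / t from (4.2),
    evaluated at a small fixed t instead of taking the limit.
    """
    x0 = np.asarray(x0, dtype=float)
    fx0 = func(x0)
    columns = []
    for j in range(x0.size):
        e_j = np.zeros_like(x0)
        e_j[j] = 1.0                      # j-th standard basis vector
        columns.append((func(x0 + t * e_j) - fx0) / t)
    return np.column_stack(columns)       # shape (m, n)

def jacobian_exact(x):
    """Matrix [dy_i/dx_j] for the example f, with partials computed by hand."""
    return np.array([
        [x[1], x[0], 1.0],
        [np.exp(x[0]) * x[2], 0.0, np.exp(x[0])],
    ])

x0 = np.array([0.5, -1.0, 2.0])           # assumed base point
J_fd = jacobian_fd(f, x0)
J_exact = jacobian_exact(x0)

print("difference quotients:\n", J_fd)
print("partial derivatives:\n", J_exact)
```

The two printed matrices should agree to several decimal places. Note that the result has $$m=2$$ rows and $$n=3$$ columns, one column for each basis vector $$\mathbf{e}_j\text{.}$$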

Subsection 4.2 The Chain Rule

Consider the composition of functions
\begin{equation*} \R^p \stackrel{g}{\longrightarrow} \R^n \stackrel{f}{\longrightarrow} \R^m \end{equation*}
and suppose $$g$$ is differentiable at $$\mathbf{t}_0$$ and $$f$$ is differentiable at $$\mathbf{x}_0=g(\mathbf{t}_0)\text{.}$$ The chain rule says that $$f\circ g$$ is differentiable at $$\mathbf{t}_0\text{,}$$ and that the derivative of the composition is the composition of the derivatives.
$$D(f\circ g)(\mathbf{t}_0) = Df(\mathbf{x}_0) Dg(\mathbf{t}_0)\tag{4.4}$$
This explains the "tree diagram rule" given in most multivariate calculus texts. The partial derivative $$\frac{\partial y_i}{\partial t_j}$$ is just the $$i,j$$ entry of the product of the derivative matrices for $$f$$ and $$g\text{.}$$
$$\frac{\partial y_i}{\partial t_j} = \sum_{k=1}^n \frac{\partial y_i}{\partial x_k}\frac{\partial x_k}{\partial t_j}\tag{4.5}$$
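A small numerical check can make the relationship between (4.4) and (4.5) concrete. The maps `g` and `f`, their hand-computed derivative matrices `Dg` and `Df`, the point `t0`, and the helper `numeric_jacobian` are hypothetical choices for this sketch, with $$p=2\text{,}$$ $$n=3\text{,}$$ and $$m=2\text{.}$$ The code forms the matrix product in (4.4), rebuilds each entry from the sum in (4.5), and compares both against central difference quotients of $$f\circ g\text{.}$$

```python
import numpy as np

def g(t):
    """Hypothetical inner map g: R^2 -> R^3."""
    return np.array([t[0] * t[1], np.sin(t[0]), t[1] ** 2])

def Dg(t):
    """3 x 2 matrix [dx_k/dt_j] for g, computed by hand."""
    return np.array([
        [t[1], t[0]],
        [np.cos(t[0]), 0.0],
        [0.0, 2 * t[1]],
    ])

def f(x):
    """Hypothetical outer map f: R^3 -> R^2."""
    return np.array([x[0] + x[1] * x[2], x[0] * x[2]])

def Df(x):
    """2 x 3 matrix [dy_i/dx_k] for f, computed by hand."""
    return np.array([
        [1.0, x[2], x[1]],
        [x[2], 0.0, x[0]],
    ])

def numeric_jacobian(func, t, eps=1e-6):
    """Central-difference approximation, used only as an independent check."""
    t = np.asarray(t, dtype=float)
    cols = []
    for j in range(t.size):
        e_j = np.zeros_like(t)
        e_j[j] = 1.0
        cols.append((func(t + eps * e_j) - func(t - eps * e_j)) / (2 * eps))
    return np.column_stack(cols)

t0 = np.array([0.7, -1.3])   # assumed base point
x0 = g(t0)
A, B = Df(x0), Dg(t0)

# Right-hand side of (4.4): product of the derivative matrices.
chain = A @ B

# Entry (i, j) rebuilt from the sum over k in (4.5).
m, n = A.shape
p = B.shape[1]
by_sum = np.array([[sum(A[i, k] * B[k, j] for k in range(n))
                    for j in range(p)] for i in range(m)])

# Difference quotients of the composition f(g(t)).
numeric = numeric_jacobian(lambda t: f(g(t)), t0)

print("Df(x0) Dg(t0):\n", chain)
print("sum formula (4.5):\n", by_sum)
print("difference quotients of f o g:\n", numeric)
```

The first two matrices agree because (4.5) is the definition of the $$i,j$$ entry of a matrix product. The third should match them up to the error of the difference approximation.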

Exercises 4.3 Exercises

1.

Verify that the definition of a differentiable function given in (4.1) is equivalent to the usual definition from Calculus 1 when $$n=m=1\text{.}$$

2.

Explain equation (4.2). Why does the limit on the left equal the vector on the right?
\begin{equation*} \lim_{t\to 0} \frac{f(\mathbf{x}_0 + t\mathbf{e}_j) -f(\mathbf{x}_0)}{t}= \left(\frac{\partial y_1}{\partial x_j}, \frac{\partial y_2}{\partial x_j},\ldots, \frac{\partial y_m}{\partial x_j}\right) \end{equation*}

3.

Explain equation (4.3). How does this equation follow from equation (4.2)?

4.

Explain equation (4.5). How is it the same as the chain rule (4.4)?