[diderot] Annotation of /trunk/doc/math/derivs.tex

Original Path: trunk/math/derivs.tex

\documentclass[11pt]{article}

\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{array}
\usepackage{amssymb}
\usepackage{bm}

\newcommand{\bx}{\mathbf{x}}
\newcommand{\bH}{\mathbf{H}}
\newcommand{\ie}{{\em i.e.}}
\newcommand{\eg}{{\em e.g.}}

\begin{document}

\title{Math for Diderot}

In the following,
\begin{itemize}
\item $\bx$ is a vector (an element of some vector space $W$)
\item $\alpha$ is a constant scalar
\item $\phi$ is a scalar function of a scalar (\eg\ $\phi(x) = x^2$)
\item $f$ and $g$ are scalar functions on $W$
\item $\mathbf{u} \otimes \mathbf{v}$ is the tensor product of two vectors,
  computed as the outer product of their vectors of coefficients in
  some basis.
\item $\nabla f$ is the gradient (first derivative) of $f$,
  computed in 3-D as:
  \[
  \nabla f = \begin{bmatrix}
    \frac{\partial}{\partial x} \\
    \frac{\partial}{\partial y} \\
    \frac{\partial}{\partial z}
  \end{bmatrix} f
  = \begin{bmatrix}
    \frac{\partial f}{\partial x} \\
    \frac{\partial f}{\partial y} \\
    \frac{\partial f}{\partial z}
  \end{bmatrix}
  \]
  where $x$, $y$, $z$ are the coordinates in $W$ (\ie\ some basis
  is assumed). The formulae below don't assume a particular dimension.
  Note that $\nabla$ can also be used to define the divergence and curl
  of vector fields, but for the time being these are not Diderot's concern.
\item $\bH f = \nabla \otimes \nabla f$ is the Hessian (second derivative)
  of $f$, computed in 3-D as:
  \[
  \bH f = \begin{bmatrix}
    \frac{\partial}{\partial x} \\
    \frac{\partial}{\partial y} \\
    \frac{\partial}{\partial z}
  \end{bmatrix} \begin{bmatrix}
    \frac{\partial}{\partial x} &
    \frac{\partial}{\partial y} &
    \frac{\partial}{\partial z}
  \end{bmatrix} f
  = \begin{bmatrix}
    \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial x \partial z} \\
    \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial y^2} & \frac{\partial^2 f}{\partial y \partial z} \\
    \frac{\partial^2 f}{\partial x \partial z} & \frac{\partial^2 f}{\partial y \partial z} & \frac{\partial^2 f}{\partial z^2}
  \end{bmatrix}
  \]
\end{itemize}

Basic rules for the gradient:
\begin{align}
\nabla (f + g) &= \nabla f + \nabla g \\
\nabla (f g) &= f \nabla g + g \nabla f \label{eq:grad-prod} \\
\nabla (\alpha f) &= \alpha \nabla f \label{eq:grad-scale} \\
\nabla (\phi(f)) &= \phi'(f) \nabla f \label{eq:grad-chain} \\
\nabla (f^n) &= n f^{n-1} \nabla f \label{eq:grad-pow} \\
\nabla \left(\frac{f}{g}\right)
  &= \frac{\nabla f}{g} - \frac{f \nabla g}{g^2} \label{eq:grad-frac}
\end{align}

(\ref{eq:grad-scale}) follows from (\ref{eq:grad-prod}) with $\nabla
\alpha = 0$. (\ref{eq:grad-frac}) follows from (\ref{eq:grad-prod})
and (\ref{eq:grad-pow}) with $n = -1$.
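
As a sketch of how (\ref{eq:grad-frac}) can be recovered from the other
rules, assume $g \neq 0$ wherever the quotient is evaluated (an assumption
not stated above), write $f/g = f \, g^{-1}$, and apply
(\ref{eq:grad-prod}) followed by (\ref{eq:grad-pow}) with $n = -1$:
\begin{align*}
\nabla \left(\frac{f}{g}\right)
  &= \nabla \left(f \, g^{-1}\right) \\
  &= f \, \nabla \left(g^{-1}\right) + g^{-1} \, \nabla f \\
  &= f \left(-g^{-2} \, \nabla g\right) + \frac{\nabla f}{g} \\
  &= \frac{\nabla f}{g} - \frac{f \, \nabla g}{g^2}
\end{align*}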

Basic rules for the Hessian:
\begin{align}
\bH (f + g) &= \bH f + \bH g \\
\bH (\alpha f) &= \alpha \bH f \\
\bH (f g) &= f \bH g + \nabla f \otimes \nabla g + \nabla g \otimes \nabla f + g \bH f \\
\bH \left(\frac{f}{g}\right) &=
  \frac{\bH f}{g}
  - \frac{\nabla f \otimes \nabla g + \nabla g \otimes \nabla f + f \bH g}{g^2}
  + \frac{2 f \nabla g \otimes \nabla g}{g^3} \label{eq:hess-quot} \\
\bH (f^n) &= n f^{n-2} \left( (n-1) \nabla f \otimes \nabla f + f \bH f \right) \label{eq:hess-pow}
\end{align}

All of these can be derived from $\bH f = \nabla \otimes \nabla f$ and
the gradient rules above. Someone may want to double-check
(\ref{eq:hess-quot}). I didn't include the Hessian of the chain rule
(analogous to (\ref{eq:grad-chain})) because the process of starting
to derive it inspired me to start learning enough ML to write a program
that would derive it for me\ldots

\end{document}