% [diderot] /trunk/doc/math/derivs.tex, revision 11
% Original path: trunk/math/derivs.tex

\documentclass[11pt]{article}

\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{array}
\usepackage{amssymb}
\usepackage{bm}

\newcommand{\bx}{\mathbf{x}}
\newcommand{\bH}{\mathbf{H}}
\newcommand{\ie}{\emph{i.e.}}
\newcommand{\eg}{\emph{e.g.}}

\begin{document}

\title{Math for Diderot}
\maketitle

In the following,
\begin{itemize}
\item $\bx$ is a vector (an element of some vector space $W$)
\item $\alpha$ is a constant scalar
\item $\phi$ is a scalar function of a scalar (\eg, $\phi(x) = x^2$)
\item $f$ and $g$ are scalar functions on $W$
\item $\mathbf{u} \otimes \mathbf{v}$ is the tensor product of two vectors,
  computed as the outer product of their vectors of coefficients in
  some basis.
\item $\nabla f$ is the gradient (first derivative) of $f$,
  computed in 3-D as:
  \begin{equation}
  \nabla f = \begin{bmatrix}
  \frac{\partial}{\partial x} \\
  \frac{\partial}{\partial y} \\
  \frac{\partial}{\partial z}
  \end{bmatrix} f
  = \begin{bmatrix}
  \frac{\partial f}{\partial x} \\
  \frac{\partial f}{\partial y} \\
  \frac{\partial f}{\partial z}
  \end{bmatrix}
  \end{equation}
  where $x$, $y$, $z$ are the coordinates in $W$ (\ie, some basis
  is assumed). The formulae below do not assume a particular dimension.
  Note that $\nabla$ can also be used to define the divergence and curl
  of vector fields, but for the time being these are not Diderot's concern.
\item $\bH f = \nabla \otimes \nabla f$ is the Hessian (second derivative)
  of $f$, computed in 3-D as:
  \begin{equation}
  \bH f = \begin{bmatrix}
  \frac{\partial}{\partial x} \\
  \frac{\partial}{\partial y} \\
  \frac{\partial}{\partial z}
  \end{bmatrix} \begin{bmatrix}
  \frac{\partial}{\partial x} &
  \frac{\partial}{\partial y} &
  \frac{\partial}{\partial z}
  \end{bmatrix} f
  = \begin{bmatrix}
  \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial x \partial z} \\
  \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial y^2} & \frac{\partial^2 f}{\partial y \partial z} \\
  \frac{\partial^2 f}{\partial x \partial z} & \frac{\partial^2 f}{\partial y \partial z} & \frac{\partial^2 f}{\partial z^2}
  \end{bmatrix}
  \end{equation}
\end{itemize}
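
As a small worked example of the two definitions above (the function is
chosen here purely for illustration and does not come from Diderot), take
$f(x, y, z) = x^2 y + z^3$. Differentiating once and then twice gives
\begin{equation*}
\nabla f = \begin{bmatrix}
2 x y \\
x^2 \\
3 z^2
\end{bmatrix},
\qquad
\bH f = \begin{bmatrix}
2 y & 2 x & 0 \\
2 x & 0 & 0 \\
0 & 0 & 6 z
\end{bmatrix} .
\end{equation*}
Each row of $\bH f$ is the gradient of the corresponding component of
$\nabla f$, and the matrix is symmetric because the mixed partial
derivatives of this smooth $f$ commute.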

Basic rules for the gradient:
\begin{align}
\nabla (f + g) &= \nabla f + \nabla g \\
\nabla (f g) &= f \nabla g + g \nabla f \label{eq:grad-prod} \\
\nabla (\alpha f) &= \alpha \nabla f \label{eq:grad-scale} \\
\nabla (\phi(f)) &= \phi'(f) \nabla f \label{eq:grad-chain} \\
\nabla (f^n) &= n f^{n-1} \nabla f \label{eq:grad-pow} \\
\nabla \left(\frac{f}{g}\right)
  &= \frac{\nabla f}{g} - \frac{f \nabla g}{g^2} \label{eq:grad-frac}
\end{align}

Rule (\ref{eq:grad-scale}) follows from (\ref{eq:grad-prod}) with $\nabla
\alpha = 0$. Rule (\ref{eq:grad-pow}) is (\ref{eq:grad-chain}) with
$\phi(t) = t^n$. Rule (\ref{eq:grad-frac}) follows from
(\ref{eq:grad-prod}) and (\ref{eq:grad-pow}), writing $f/g = f g^{-1}$
and taking $n = -1$.
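
To make the last step explicit, here is a short sketch of how
(\ref{eq:grad-frac}) can be obtained, assuming $g \neq 0$ wherever the
quotient is evaluated. The product rule (\ref{eq:grad-prod}) is applied
to $f \cdot g^{-1}$, and the power rule (\ref{eq:grad-pow}) with $n = -1$
supplies $\nabla (g^{-1}) = -g^{-2} \nabla g$:
\begin{align*}
\nabla \left(\frac{f}{g}\right)
  = \nabla \left(f g^{-1}\right)
  &= f \nabla \left(g^{-1}\right) + g^{-1} \nabla f \\
  &= -\frac{f \nabla g}{g^2} + \frac{\nabla f}{g} .
\end{align*}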

Basic rules for the Hessian:
\begin{align}
\bH (f + g) &= \bH f + \bH g \\
\bH (\alpha f) &= \alpha \bH f \\
\bH (f g) &= f \bH g + \nabla f \otimes \nabla g + \nabla g \otimes \nabla f + g \bH f \\
\bH \left(\frac{f}{g}\right) &=
  \frac{\bH f}{g}
  - \frac{\nabla f \otimes \nabla g + \nabla g \otimes \nabla f + f \bH g}{g^2}
  + \frac{2 f \nabla g \otimes \nabla g}{g^3} \label{eq:hess-quot} \\
\bH (f^n) &= n f^{n-2} \left( (n-1) \nabla f \otimes \nabla f + f \bH f \right) \label{eq:hess-pow}
\end{align}
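
As an illustration of where these rules come from, the product rule for
the Hessian can be sketched as follows. The derivation assumes one
auxiliary identity that is not stated elsewhere in these notes: for a
scalar function $s$ and a vector field $\mathbf{v}$,
$\nabla \otimes (s \mathbf{v}) = \nabla s \otimes \mathbf{v}
+ s \, \nabla \otimes \mathbf{v}$, which is the ordinary product rule
applied to each component $\partial (s v_j) / \partial x_i$. Starting
from $\bH (f g) = \nabla \otimes \nabla (f g)$ and inserting
(\ref{eq:grad-prod}):
\begin{align*}
\bH (f g)
  &= \nabla \otimes \left( f \nabla g + g \nabla f \right) \\
  &= \nabla f \otimes \nabla g + f \, \nabla \otimes \nabla g
   + \nabla g \otimes \nabla f + g \, \nabla \otimes \nabla f \\
  &= f \bH g + \nabla f \otimes \nabla g + \nabla g \otimes \nabla f + g \bH f .
\end{align*}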

All of these can be derived from $\bH f = \nabla \otimes \nabla f$ and
the gradient rules above. Rule (\ref{eq:hess-quot}) is worth
double-checking; one way is to apply the product rule for the Hessian to
$f g^{-1}$, using (\ref{eq:hess-pow}) with $n = -1$ for $\bH (g^{-1})$.
I didn't include the Hessian of the chain rule (analogous to
(\ref{eq:grad-chain})) because the process of starting to derive it
inspired me to start learning enough ML to write a program that would
derive it for me\ldots
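
For reference, a hand derivation of the omitted rule is sketched here.
It is not part of the original notes and assumes that $\phi$ is twice
differentiable and that $\nabla \otimes (s \mathbf{v}) = \nabla s \otimes
\mathbf{v} + s \, \nabla \otimes \mathbf{v}$ holds for a scalar function
$s$ and a vector field $\mathbf{v}$. Applying (\ref{eq:grad-chain})
once to $\phi$ and once to $\phi'$ gives
\begin{align*}
\bH (\phi(f))
  &= \nabla \otimes \left( \phi'(f) \nabla f \right) \\
  &= \nabla \left( \phi'(f) \right) \otimes \nabla f
   + \phi'(f) \, \nabla \otimes \nabla f \\
  &= \phi''(f) \, \nabla f \otimes \nabla f + \phi'(f) \, \bH f .
\end{align*}
As a consistency check, $\phi(t) = t^n$ yields
$n (n-1) f^{n-2} \, \nabla f \otimes \nabla f + n f^{n-1} \bH f$,
which agrees with (\ref{eq:hess-pow}).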

\end{document}
