MathJax

Monday, June 20, 2011

Exercise: Objects and Attributes

There are \(M\) objects, each with a number of attributes or labels. There are \(L\) possible labels and each object is associated with some subset of these labels. The indicator variables
\[
\alpha_{ml} = \left\{\begin{array}{cl}
1 & \textrm{if label } l \textrm{ is associated with object } m \\
0 & \textrm{otherwise}
\end{array}\right.
\]
define the association between objects and labels. The true values of the \(\alpha_{ml}\) are unknown. To infer the values of the \(\alpha_{ml}\), we ask \(N\) people to indicate which labels they believe are associated with each object. The outcomes of these trials are defined by another set of indicator variables,
\[
\beta_{nml} = \left\{\begin{array}{cl}
1 & \textrm{if person } n \textrm{ associated label } l \textrm{ with object } m \\
0 & \textrm{otherwise}
\end{array}\right.
\]
People can make mistakes, and we need to account for two types of error: false positives and false negatives. A false positive occurs when a person associates a label with an object that is not truly associated with it, i.e. \(\beta_{nml} = 1\) while \(\alpha_{ml} = 0\). A false negative is the opposite, \(\beta_{nml} = 0\) while \(\alpha_{ml} = 1\). We will assume that each person makes each type of error with some fixed but unknown probability,
\begin{eqnarray}
e_n^{\textrm{pos}} &=& P(\beta_{nml}=1 | \alpha_{ml}=0) \quad\textrm{ and}\\
e_n^{\textrm{neg}} &=& P(\beta_{nml}=0 | \alpha_{ml}=1)
\end{eqnarray}
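To make the observation model concrete, here is a small simulation sketch in Python. The sizes and error rates are toy values chosen for illustration (they are not the linked data file), and all names are our own:

```python
# Sketch of the observation model above with toy sizes (not the linked data file).
import random

random.seed(0)
N, M, L = 5, 4, 3                 # people, objects, labels (toy values)
e_pos = [0.1] * N                 # per-person false-positive rates (assumed known here)
e_neg = [0.2] * N                 # per-person false-negative rates

# True (hidden) associations alpha_ml, drawn uniformly at random for illustration.
alpha = [[random.random() < 0.5 for _ in range(L)] for _ in range(M)]

def observe(n, m, l):
    """One trial beta_nml: flip alpha_ml with person n's error probability."""
    if alpha[m][l]:
        return 0 if random.random() < e_neg[n] else 1   # false negative with prob e_neg
    return 1 if random.random() < e_pos[n] else 0       # false positive with prob e_pos

beta = [[[observe(n, m, l) for l in range(L)] for m in range(M)] for n in range(N)]
```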

Questions

  1. Come up with an appropriate prior over \(\left\{\alpha_{ml}\right\}\) and derive the posterior after observing \(\left\{\beta_{nml}\right\}\). You will also need priors over \(\left\{e^{\textrm{pos}}_n\right\}\) and \(\left\{e^{\textrm{neg}}_n\right\}\) and should derive their posteriors too.
  2. Use the data file (LINK: \(N\approx20\), \(M\approx100\), \(L\approx100\)) and infer posteriors over \(\alpha_{ml}\), \(\left\{e^{\textrm{pos}}_n\right\}\) and \(\left\{e^{\textrm{neg}}_n\right\}\). Visualise the posteriors over \(\left\{e^{\textrm{pos}}_n\right\}\) and \(\left\{e^{\textrm{neg}}_n\right\}\) and note your observations. Visualise the posterior over \(\left\{\alpha_{ml}\right\}\) for each object.
  3. Comment on whether there are enough data in the file or whether more measurements should be made.
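As a concrete starting point on question 1, here is a sketch of the conditional posterior over a single \(\alpha_{ml}\) when the error rates are held fixed. A full answer would also place (for example) Beta priors on \(e^{\textrm{pos}}_n\) and \(e^{\textrm{neg}}_n\) and integrate or sample over them; the function name and the Bernoulli prior parameter \(p\) below are our own:

```python
# With the error rates fixed and a Bernoulli(p) prior on each alpha_ml,
# the posterior over alpha_ml factorises across (m, l) pairs.
import math

def posterior_alpha(obs, e_pos, e_neg, p=0.5):
    """obs: list of beta_nml values (one per person) for a single (m, l) pair.
    e_pos, e_neg: per-person error rates. Returns P(alpha_ml = 1 | obs)."""
    log_odds = math.log(p) - math.log(1 - p)
    for n, b in enumerate(obs):
        if b:   # person n said "yes": true positive vs. false positive
            log_odds += math.log(1 - e_neg[n]) - math.log(e_pos[n])
        else:   # person n said "no": false negative vs. true negative
            log_odds += math.log(e_neg[n]) - math.log(1 - e_pos[n])
    return 1.0 / (1.0 + math.exp(-log_odds))
```

With moderately reliable people, unanimous "yes" answers push the posterior close to 1 and unanimous "no" answers push it close to 0, as one would expect.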

Wednesday, June 15, 2011

Conjugate Inference: Multivariate Gaussian Likelihood

Domain: \[\vec{x}\in\mathbb{R}^d\]

Parameters: The Gaussian likelihood is parametrised by its mean vector, \(\vec{\mu}\), and precision matrix, \(\Lambda\).
\[\Theta = \{\vec{\mu}, \Lambda\}\]

Likelihood: \[P(\vec{x}|\Theta) = N(\vec{x}|\vec{\mu},\Lambda^{-1})\]

Prior: A normal–Wishart distribution.
\[P(\Theta) = N(\vec{\mu}|\vec{\eta}_0,(\tau_0\Lambda)^{-1})\ W(\Lambda|V_0,\nu_0)\]
The probability density function of the Wishart distribution is
\[W(\Lambda|V,\nu) = \frac{|\Lambda|^{(\nu-d-1)/2}\exp\left(-\frac{1}{2}\textrm{Trace}(V^{-1}\Lambda)\right)}{2^{\nu d/2}|V|^{\nu/2}\Gamma_d(\nu/2)}\]
where
\[\Gamma_d(\nu/2) = \pi^{d(d-1)/4}\ \prod_{i=1}^d \Gamma\left(\frac{\nu-i+1}{2}\right)\]
is the multivariate gamma function.

Posterior:
\[P(\Theta|D) = N(\vec{\mu}|\vec{\eta}_1,(\tau_1\Lambda)^{-1})\ W(\Lambda|V_1,\nu_1)\]
with
\begin{eqnarray}
\vec{\eta}_1 &=& \frac{\tau_0\vec{\eta}_0 + \vec{S}^{(1)}}{\tau_1} \\
\tau_1 &=& \tau_0 + S^{(0)} \\
\nu_1 &=& \nu_0 + S^{(0)} \\
V_1^{-1} &=& V_0^{-1} + S^{(2)} + \tau_0\vec{\eta}_0^{I\!I} - \tau_1\vec{\eta}_1^{I\!I}
\end{eqnarray}
where
\begin{eqnarray}
S^{(0)} &=& |D| \\
\vec{S}^{(1)} &=& \sum_{\vec{x}\in D} \vec{x} \\
S^{(2)} &=& \sum_{\vec{x}\in D} \vec{x}^{I\!I}
\end{eqnarray}
and \(\vec{x}^{I\!I} \equiv \vec{x}\vec{x}^T\) is the outer product of a vector with itself.
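The updates above transcribe directly into code. Here is a dependency-free Python sketch (the helper names are ours); it returns \(V_1^{-1}\) rather than \(V_1\), since that is what the update computes:

```python
# Direct transcription of the normal-Wishart posterior updates above,
# using plain lists of lists so it stays dependency-free (d is small).
def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * a for a in row] for row in A]

def posterior_normal_wishart(data, eta0, tau0, V0_inv, nu0):
    d = len(eta0)
    S0 = len(data)                                       # S^(0) = |D|
    S1 = [sum(x[j] for x in data) for j in range(d)]     # S^(1) = sum of x
    S2 = [[sum(x[i] * x[j] for x in data)                # S^(2) = sum of outer products
           for j in range(d)] for i in range(d)]
    tau1 = tau0 + S0
    nu1 = nu0 + S0
    eta1 = [(tau0 * eta0[j] + S1[j]) / tau1 for j in range(d)]
    V1_inv = mat_add(mat_add(V0_inv, S2),
                     mat_add(mat_scale(tau0, outer(eta0, eta0)),
                             mat_scale(-tau1, outer(eta1, eta1))))
    return eta1, tau1, V1_inv, nu1
```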

Marginal likelihood:
\[P(D) =
\pi^{-S^{(0)}d/2}
\left(\frac{\tau_0}{\tau_1}\right)^{d/2}
\frac{|V_1|^{\nu_1/2}}{|V_0|^{\nu_0/2}}
\ \prod_{i=1}^d \frac{\Gamma\left((\nu_1-i+1)/2\right)}{\Gamma\left((\nu_0-i+1)/2\right)}
\]
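For numerical work the marginal likelihood is best computed in log space. A sketch for \(d = 2\), so the determinant stays a one-liner (the function names are ours; a sanity check is that \(S^{(0)} = 0\) with posterior equal to prior gives \(P(D) = 1\)):

```python
# Log of the marginal likelihood formula above, specialised to d = 2.
import math

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def log_marginal(S0, V0, nu0, V1, nu1, tau0, tau1, d=2):
    lm = -0.5 * S0 * d * math.log(math.pi)
    lm += 0.5 * d * (math.log(tau0) - math.log(tau1))
    lm += 0.5 * nu1 * math.log(det2(V1)) - 0.5 * nu0 * math.log(det2(V0))
    for i in range(1, d + 1):   # ratio of multivariate gamma functions, term by term
        lm += math.lgamma((nu1 - i + 1) / 2) - math.lgamma((nu0 - i + 1) / 2)
    return lm
```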

Monday, May 9, 2011

Product of Dirichlet Distributions

The Dirichlet distribution is
\[
D(\vec{x}|\vec{\alpha}) =
\frac{\Gamma\left(\sum_{j=1}^d\alpha_j\right)}{\prod_{j=1}^d\Gamma(\alpha_j)}
\prod_{j=1}^d x_j^{\alpha_j-1}
\]
where \(\Gamma(\cdot)\) is the gamma function and \(d\) is the dimensionality of \(\vec{x}\).

A product of Dirichlet distributions is proportional to another Dirichlet distribution.
\[
\prod_{i=1}^n D(\vec{x}|\vec{\alpha}_i) =
Z\times D(\vec{x}|\vec{\alpha}')
\]
where
\[
\vec{\alpha}' - 1 = \sum_{i=1}^n \left[\vec{\alpha}_i - 1\right]
\]
and
\[
Z = \frac
{\prod_{j=1}^d\Gamma(\alpha'_j)}
{\Gamma\left(\sum_{j=1}^d\alpha'_j\right)}
\ \prod_{i=1}^n\left[ \frac
{\Gamma\left(\sum_{j=1}^d\alpha_{ij}\right)}
{\prod_{j=1}^d\Gamma(\alpha_{ij})} \right]
\]
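The identity is easy to check numerically: evaluating \(\log Z\) as the difference of log densities at two different points \(\vec{x}\) must give the same answer, since \(Z\) does not depend on \(\vec{x}\). A Python sketch (all names ours):

```python
# Numerical check of the product-of-Dirichlets identity above.
import math

def log_dirichlet(x, a):
    return (math.lgamma(sum(a)) - sum(math.lgamma(aj) for aj in a)
            + sum((aj - 1) * math.log(xj) for aj, xj in zip(a, x)))

alphas = [[2.0, 3.0, 1.5], [1.2, 0.8, 2.0]]
# alpha'_j - 1 = sum_i (alpha_ij - 1), so alpha'_j = sum_i alpha_ij - (n - 1)
a_merged = [sum(a[j] for a in alphas) - (len(alphas) - 1) for j in range(3)]

x = [0.2, 0.5, 0.3]
log_Z = sum(log_dirichlet(x, a) for a in alphas) - log_dirichlet(x, a_merged)
# log Z must be independent of the evaluation point:
y = [0.1, 0.3, 0.6]
log_Z2 = sum(log_dirichlet(y, a) for a in alphas) - log_dirichlet(y, a_merged)
assert abs(log_Z - log_Z2) < 1e-9
```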

Product of Normal–Gamma Distributions

The normal–gamma distribution is
\begin{eqnarray}
NG(\mu,\lambda|\eta,\tau,\alpha,\beta)
& = & N(\mu|\eta,(\tau\lambda)^{-1})\ G(\lambda|\alpha,\beta) \\
& = &
\frac{\beta^{\alpha}\sqrt{\tau}}{\Gamma(\alpha)\sqrt{2\pi}}
\lambda^{\alpha-\frac{1}{2}}
\exp\left( -\beta\lambda - \frac{1}{2}\tau\lambda(\mu-\eta)^2 \right)
\end{eqnarray}
where \(\Gamma(\cdot)\) is the gamma function.

A product of normal–gamma distributions is proportional to another normal–gamma distribution.
\[
\prod_{i=1}^n NG(\mu,\lambda|\eta_i,\tau_i,\alpha_i,\beta_i)
= Z\times NG(\mu,\lambda|\hat{\eta},\hat{\tau},\hat{\alpha},\hat{\beta})
\]
where
\begin{eqnarray}
\hat{\tau} &=& \sum_{i=1}^n \tau_i \\
\hat{\tau}\hat{\eta} &=& \sum_{i=1}^n \tau_i\eta_i \\
2\hat{\beta} + \hat{\tau}\hat{\eta}^2 &=& \sum_{i=1}^n \left[2\beta_i + \tau_i\eta_i^2\right] \\
\hat{\alpha}-\frac{1}{2} &=& \sum_{i=1}^n \left[\alpha_i - \frac{1}{2}\right]
\end{eqnarray}
and
\[
Z =
\frac{\Gamma(\hat{\alpha})\sqrt{2\pi}}{\hat{\beta}^{\hat{\alpha}}\sqrt{\hat{\tau}}}
\prod_{i=1}^n\left[\frac{\beta_i^{\alpha_i}\sqrt{\tau_i}}{\Gamma(\alpha_i)\sqrt{2\pi}}\right]
\]
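The merged parameters follow from the four equations above by simple rearrangement. A Python sketch (function name ours); note that merging a single distribution returns it unchanged:

```python
def merge_normal_gammas(params):
    """params: list of (eta, tau, alpha, beta) tuples. Returns the parameters of
    the normal-gamma the product is proportional to (the updates above)."""
    n = len(params)
    tau_h = sum(t for _, t, _, _ in params)
    eta_h = sum(t * e for e, t, _, _ in params) / tau_h
    alpha_h = sum(a for _, _, a, _ in params) - (n - 1) / 2.0
    beta_h = 0.5 * (sum(2 * b + t * e * e for e, t, _, b in params)
                    - tau_h * eta_h ** 2)
    return eta_h, tau_h, alpha_h, beta_h
```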

Monday, May 2, 2011

Conjugate Inference: Gaussian Likelihood

Domain: Here we consider only univariate normal distributions. \[x\in\mathbb{R}\]

Parameters: The Gaussian likelihood is parametrised by its mean, \(\mu\), and inverse variance, \(\lambda\).
\[\Theta = \{\mu, \lambda\}\]

Likelihood: \[P(x|\Theta) = N(x|\mu,\lambda^{-1})\]

Prior: A normal–gamma distribution.
\[P(\Theta) = N(\mu|\eta_0,(\tau_0\lambda)^{-1})\ G(\lambda|\alpha_0,\beta_0)\]
Note that the following parametrisation of the gamma distribution is used:
\[G(x|\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}\]
(There is another parametrisation, which uses \(\beta^{-1}\) rather than \(\beta\)).

Posterior:
\[P(\Theta|D) = N(\mu|\eta_1,(\tau_1\lambda)^{-1})\ G(\lambda|\alpha_1,\beta_1)\]
with
\begin{eqnarray}
\eta_1 &=& \frac{\eta_0\tau_0 + S^{(1)}}{\tau_1} \\
\tau_1 &=& \tau_0 + S^{(0)} \\
\alpha_1 &=& \alpha_0 + \frac{S^{(0)}}{2} \\
\beta_1 &=& \beta_0 + \frac{1}{2}\left( S^{(2)}+\eta_0^2\tau_0 - \eta_1^2\tau_1 \right)
\end{eqnarray}
where
\begin{eqnarray}
S^{(0)} &=& |D| \\
S^{(1)} &=& \sum_{x\in D} x \\
S^{(2)} &=& \sum_{x\in D} x^2
\end{eqnarray}
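The updates above transcribe into a few lines of Python (function name ours):

```python
def posterior_normal_gamma(data, eta0, tau0, alpha0, beta0):
    """Posterior normal-gamma parameters, from the update equations above."""
    S0 = len(data)                      # S^(0)
    S1 = sum(data)                      # S^(1)
    S2 = sum(x * x for x in data)       # S^(2)
    tau1 = tau0 + S0
    eta1 = (eta0 * tau0 + S1) / tau1
    alpha1 = alpha0 + S0 / 2.0
    beta1 = beta0 + 0.5 * (S2 + eta0 ** 2 * tau0 - eta1 ** 2 * tau1)
    return eta1, tau1, alpha1, beta1
```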

Marginal likelihood:
\[P(D) =
(2\pi)^{-S^{(0)}/2}
\frac{\sqrt{\tau_0} \beta_0^{\alpha_0} \Gamma(\alpha_1)}
{\sqrt{\tau_1} \beta_1^{\alpha_1} \Gamma(\alpha_0)} \]
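In code this is again best done in log space to avoid overflow in \(\beta^{\alpha}\) and \(\Gamma(\alpha)\). A sketch (function name ours; with no data and posterior equal to prior it returns \(\log 1 = 0\)):

```python
# Log of the marginal likelihood formula above.
import math

def log_marginal_likelihood(S0, tau0, alpha0, beta0, tau1, alpha1, beta1):
    return (-0.5 * S0 * math.log(2 * math.pi)
            + 0.5 * (math.log(tau0) - math.log(tau1))
            + alpha0 * math.log(beta0) - alpha1 * math.log(beta1)
            + math.lgamma(alpha1) - math.lgamma(alpha0))
```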

Conjugate Inference: Gaussian Likelihood with Known Variance

Domain: Here we consider only univariate normal distributions.
\[x\in\mathbb{R}\]

Parameters: \[\Theta = \{\mu\}\] where \(\mu\) is the mean of the Gaussian likelihood.

Likelihood: \[P(x|\Theta) = N(x|\mu,v)\] Note that \(v\) is the known variance of the Gaussian likelihood.

Prior: A normal distribution.
\[P(\Theta) = N(\mu|\mu_0,\sigma_0^2)\]

Posterior:
\[P(\Theta|D) = N(\mu|\mu_1,\sigma_1^2)\]
with
\begin{eqnarray}
\mu_1 &=& \sigma_1^2 \left( \frac{\mu_0}{\sigma_0^2} + \frac{S^{(1)}}{v} \right) \\
\sigma_1^2 &=& \left( \frac{1}{\sigma_0^2} + \frac{S^{(0)}}{v} \right)^{-1}
\end{eqnarray}
where
\begin{eqnarray}
S^{(0)} &=& |D| \\
S^{(1)} &=& \sum_{x\in D} x \\
S^{(2)} &=& \sum_{x\in D} x^2
\end{eqnarray}
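The two updates above are the familiar precision-weighted combination of prior and data; in Python (function name ours):

```python
def posterior_known_variance(data, mu0, var0, v):
    """Posterior mean and variance of mu, from the updates above.
    v is the known variance of the Gaussian likelihood."""
    S0, S1 = len(data), sum(data)
    var1 = 1.0 / (1.0 / var0 + S0 / v)          # precisions add
    mu1 = var1 * (mu0 / var0 + S1 / v)          # precision-weighted mean
    return mu1, var1
```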

Marginal likelihood:
\[P(D) =
\left(\frac{1}{\sqrt{2\pi v}}\right)^{S^{(0)}}
\frac{\sigma_1}{\sigma_0}
\exp\left( -\frac{1}{2}\left( \frac{\mu_0^2}{\sigma_0^2} - \frac{\mu_1^2}{\sigma_1^2} + \frac{S^{(2)}}{v} \right) \right)\]

Monday, April 4, 2011

The Greek Alphabet

Α α alpha      Η η eta       Ν ν nu        Τ τ tau
Β β beta       Θ θ theta     Ξ ξ xi        Υ υ upsilon
Γ γ gamma      Ι ι iota      Ο ο omicron   Φ φ phi
Δ δ delta      Κ κ kappa     Π π pi        Χ χ chi
Ε ε epsilon    Λ λ lambda    Ρ ρ rho       Ψ ψ psi
Ζ ζ zeta       Μ μ mu        Σ σ sigma     Ω ω omega

Thursday, March 31, 2011

Expectations Under the Normal Distribution

Here are some of the more exotic or slightly quirky expectations under the normal distribution that you might encounter. I find that I need these now and again and it's annoying to have to re-derive them every time.

With the normal distribution defined as
\[N(x|\mu,\sigma^2) = \frac{1}{\sqrt{2\pi}\sigma} \exp \left( -\frac{(x - \mu)^2}{2\sigma^2}\right)\]
we have the following expectations:
\[E[(x-a)^2] = (\mu - a)^2 + \sigma^2\]
\[E[e^{ax}] = \exp\left(\frac{a^2\sigma^2}{2} + a\mu\right)\]
\[E[\log N(x|m,s^2)] = -\frac{1}{2}\log 2\pi - \log s - \frac{1}{2s^2}\left(\sigma^2 + (\mu-m)^2\right)\]
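Results like these are easy to sanity-check by Monte Carlo. A pure-Python sketch, with arbitrary parameter values of our own choosing:

```python
# Monte Carlo check of the three expectations above.
import math, random

random.seed(1)
mu, sigma, a = 0.5, 1.5, 0.3
xs = [random.gauss(mu, sigma) for _ in range(200000)]

# E[(x - a)^2] = (mu - a)^2 + sigma^2
est = sum((x - a) ** 2 for x in xs) / len(xs)
exact = (mu - a) ** 2 + sigma ** 2
assert abs(est - exact) < 0.05

# E[e^{ax}] = exp(a^2 sigma^2 / 2 + a mu)
est = sum(math.exp(a * x) for x in xs) / len(xs)
exact = math.exp(a * a * sigma * sigma / 2 + a * mu)
assert abs(est - exact) < 0.05

# E[log N(x | m, s^2)]
m, s = 0.2, 1.1
est = sum(-0.5 * math.log(2 * math.pi) - math.log(s)
          - (x - m) ** 2 / (2 * s * s) for x in xs) / len(xs)
exact = (-0.5 * math.log(2 * math.pi) - math.log(s)
         - (sigma ** 2 + (mu - m) ** 2) / (2 * s * s))
assert abs(est - exact) < 0.05
```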


Tuesday, March 8, 2011

Merging multiple PDF files into one

Try
gs -q -sPAPERSIZE=a4 -dNOPAUSE -dBATCH \
   -sDEVICE=pdfwrite -sOutputFile=out.pdf in1.pdf in2.pdf in3.pdf
or, perhaps even better, install pdftk and use
pdftk in1.pdf in2.pdf in3.pdf cat output out.pdf