next up previous
Next: 4 Evaluation Up: Improving Specificity in PDMs Previous: 2 The Point Distribution

3 A Hierarchical PDM

Bregler and Omohundro [2] describe a solution to this problem, whereby a constraint surface is constructed within shape space, using a union of lower dimensional subspaces. We have altered and extended this technique by considering a two-level hierarchical approach, and by substituting the hyperplane subspaces for bounded regions.

The key to the approach is that a complex, non-linear region approaches linearity locally under magnification, and hence can be approximated by a combination of a number of smaller linear subregions. To build the subregions, a k-means cluster analysis is performed on the training data in shape space to find a number of prototypes. For each prototype, a number of nearest neighbours are taken from the training set and a PCA is performed on them. A subregion is produced which is centred on the cluster mean (not necessarily the same as the prototype), and bounded by a hyperellipsoid with a Malhalanobis radius of some specified value M (similar to a standard Linear PDM). The VSR is then represented as the union of these subregions (there are some subtleties; these are discussed later). Figure 1 demonstrates the process.

   figure73
Figure 1: Building a valid shape region from linear patches; (a) training data projected into a 2D shape space, (b) k-means cluster centres, (c) principal axes and (d) the valid shape region.

In the case of the PDM, the shape space generally has upwards of 100 dimensions (twice the number of model landmark points). For this reason it is useful to adopt a two-level hierarchical approach; an initial global PCA is performed on the training data in order to produce a lower dimensional space. The linear subregions are then constructed in this new space instead of the high dimensional shape space. The reduced dimensionality decreases computation times substantially, and noise from outliers is reduced due to the removal of insignificant modes of variation by the global PCA.

For the VSR to be of practical use, it must be possible to apply what is known as the `nearest point' query: ``Given a general point in shape space, where is the nearest point in the VSR?'' This then facilitates the application of constraints to any given shape.

The simplest (but not only) way to apply these constraints is to find the closest (Euclidian distance) cluster mean and constrain the point to be within the associated subregion. The point should ideally be moved to the closest position within the subregion (hyperellipsoid-bounded), however this requires an expensive gradient descent computation, and instead we approximate the subregion as a hypercuboid. The alternative of moving the point directly towards the cluster mean is grossly inaccurate for eccentric hyperellipsoids. Figure 2 illustrates.

   figure82
Figure 2: Constraining a general point to lie within a linear subregion; (a) the correct way, (b) a bad approximate method and (c) a better approximate method.

There are other ways to apply the constraints; Bregler describes a constraint function C on a shape tex2html_wrap_inline483 as follows:

equation89

where tex2html_wrap_inline485 is the shape tex2html_wrap_inline487 as projected into the subregion of cluster i and tex2html_wrap_inline491 is the influence function for cluster i. Setting tex2html_wrap_inline495 if cluster i is closest and tex2html_wrap_inline499 if not results in the simple algorithm described above. Alternatively, tex2html_wrap_inline501 can be a Gaussian, centred on cluster i's mean:

equation98

where tex2html_wrap_inline505 is the cluster mean and tex2html_wrap_inline507 is a scale factor related to the `size' of cluster i. A sensible value for tex2html_wrap_inline511 is the square root of the sum of the eigenvalues from the local PCA. It is important to note that in the above equation, the Euclidian distance (as opposed to the Malhalanobis distance) is used; otherwise the influence function decays too quickly off-axis for eccentric hyperellipsoids.

For a particular tex2html_wrap_inline513 , tex2html_wrap_inline515 will be very small for the majority of i. The calculation of tex2html_wrap_inline519 can be made more efficient by only including terms for which tex2html_wrap_inline521 is significant. A cutoff point of one tenth of the maximum value is suitable.

Using the Gaussian influence functions has the effect of performing an interpolation at positions between neighbouring clusters, giving smoother joins, especially in cases where the subregions don't actually overlap. However, a side effect is that the notion of a concrete divide between valid and invalid shapes is lost, insofar as that if tex2html_wrap_inline523 then it is not necessarily the case that tex2html_wrap_inline525 . It is thus important to only apply the constraint function once each time the shape needs constraining. Also this method is slower than the nearest-cluster method.

The method we describe here is based on Bregler and Omohundro's approach, but differs in two aspects. Firstly, they treated the linear patches as lower-dimensional hyperplanes, whereas we prefer to use hyperellipsoid-bounded regions. The hyperplanes method has the undesirable property that it extends the VSR indefinitely at extremities (see Figure 3). Secondly, we have introduced the idea of a hierarchical framework, whereby a global PCA is performed prior to the constraint process. This increases efficiency and also removes some training data noise.

An alternative approach is described in the statistics literature. The VSR can be modelled as a probability density function, approximated as a Gaussian mixture. Instead of using k-means to determine the Gaussians, the EM algorithm [4] can be used with equal, if not better, success.

   figure117
Figure 3: Constraining shape using (a) hyperplanes and (b) hyperellipsoids.

Choosing k, the and M

k is the number of clusters that are used to build the VSR. tex2html_wrap_inline543 is the number of nearest-neighbour training examples used to build the linear subregion for cluster i. Between them, k and the tex2html_wrap_inline549 determine the degree of cluster overlap.

Our current strategy is to specify a fixed degree of cluster overlap, O, and set tex2html_wrap_inline553 , where tex2html_wrap_inline555 is the number of members in the k-means cluster. The argument for overlap is that it results in smoother transitions between subregions. Initial experimentation suggests that O = 1.5 produces a good balance between accuracy (allowing all valid shapes) and specificity (disallowing invalid shapes).

The choice of k is data-dependent. The more complex and non-linear the VSR, the larger the number of clusters required to model it accurately; however, there is a trade-off between accuracy and speed. If speed is not an issue then k can be increased in the limit to E (but the choice of O must be reconsidered). For noiseless training data this produces the smoothest model; however any noise that is present is liable to be included in the model.

So far we have chosen k manually, based on knowledge about the expected shape of the VSR. It seems likely that it would be possible to find a sensible value for k automatically via some optimisation technique. If unsure, a good first guess would be k = E / 10.

Also important is the choice of M--the Malhalanobis radius of the clusters. We have chosen M=2.0 which, statistically, encompasses over 95% of the cluster members. A larger value generally leads an underconstrained shape space.


next up previous
Next: 4 Evaluation Up: Improving Specificity in PDMs Previous: 2 The Point Distribution

A J Heap
Wed Jul 2 14:34:50 BST 1997