morphops.procrustes¶

Provides procrustes alignment related operations and algorithms.

For geometric morphometrics based studies, after landmark data are collected for each specimen, a typical next step is to remove the position, size and orientation information from the landmark set of each specimen so that what remains is the shape information. This can be achieved by, for example, running Generalized Procrustes Aligment (see gpa()) on the set of landmark sets.

After procrustes alignment, the shapes lie in a high-dimensional non-euclidean manifold but are usually quite close to each other and can be projected to a euclidean tangent space at their shape mean, whereupon they can be subjected to multivariate analysis techniques like Principal Components Analysis, Partial Least Squares, etc.

morphops.procrustes.get_position(lmks)[source]¶

Returns the centroid of the set or sets of landmarks in lmks.

The centroid of a \(p\) landmarks is simply the arithmetic mean of all the landmark positions. That is

\[\mathbf{x_c} = \sum_{i=1}^p \dfrac{\mathbf{x_i}}{p}\]

Parameters:

lmks (array-like) –

One of the following

Single specimen A (p,k) array of p landmarks in k dimensions for one specimen.
n specimens A (n,p,k) array of n landmark sets for n specimens, each having p landmarks in k dimensions.

Returns:

centroid –

If lmks is a (p,k) array, then centroid is a (k,)-shaped array, whose i-th element is the mean of the i-th coordinate in lmks.
If lmks is a (n,p,k) array, then centroid is a (n,k)-shaped array whose i-th element is the (k,)-shaped centroid of the i-th specimen’s landmarks in lmks.

Return type: numpy.ndarray

morphops.procrustes.get_scale(lmks)[source]¶

Returns the euclidean norm of the real matrix or matrices in lmks.

The euclidean norm of the real (p x k) matrix \(X\) is calculated as

\[\|X\| = \sqrt{Tr(X^T X)}\]

Note

lmks is not assumed to have been pre-centered. To pre-center lmks you can call remove_position() on lmks before applying remove_scale.

Parameters:

lmks (array-like) –

One of the following

Single specimen A (p,k) array of p landmarks in k dimensions for one specimen.
n specimens A (n,p,k) array of n landmark sets for n specimens, each having p landmarks in k dimensions.

Returns:

scale –

Single specimen If lmks is (p,k)-shaped, scale is a float representing its euclidean norm.
n specimens If lmks is (n,p,k)-shaped, scale is an (n,) -shaped array such that the i-th element is the euclidean norm of the i-th specimen’s landmarks.

Return type: numpy.float64 or numpy.ndarray

morphops.procrustes.get_ssqd(X)[source]¶: Alias for lmk_util.ssqd(X).

morphops.procrustes.gpa(X, tol=1e-05, max_iters=10, do_project=False, do_scaling=False, no_reflect=False, unitize_mean=False)[source]¶

Performs Generalized Procrustes Alignment to transform all the landmark sets in X such that (a quantity proportional to) the sum of squared norms of pairwise differences between all the landmark sets is minimized.

Say len(X) = n. gpa() tries to find

\[\operatorname*{argmin}_{\beta_i > 0,\ R_i \in O(k),\ \gamma_i \in \mathbb{R}^k } g(X) = \frac{1}{n} \sum_{i=1}^{n-1} { \sum_{j=i+1}^n {\| (\beta_i X_i R_i + \mathbf{1_k} \gamma_i^T) - (\beta_j X_j R_j + \mathbf{1_k} \gamma_j^T) \|^2}}\]

The Generalized (Procrustes) Sum of Squares or G is defined as

\[G(X) = \operatorname*{inf}_{\beta_i > 0,\ R_i \in O(k),\ \gamma_i \in \mathbb{R}^k } g(X)\]

The GPA algorithm, per [drymar], tries to iteratively rotate and scale the landmark sets in X until the sum of squared differences is below tol. While the algorithm should converge quite fast, it can be forced to stop the minimization loop after max_iters number of iterations.

For an explanation of the other parameters, please see the Parameters section.

Note

Re do_project and do_scaling: The projection used here is based on [rohlf] and assumes that the aligned shapes are of unit centroid size, which is not generally true when do_scaling is True. Consequently, if both do_project and do_scaling are True, gpa() will issue a warning, but proceed with the projection.

Note

Generally for opa(), \(OSS(X1, X2) \neq OSS(X2, X1)\).

In contrast to opa(), gpa() is symmetric for the input matrices in that \(G(X1, X2) = G(X2, X1)\).

See also

rotate(), opa()

Parameters:	X (array-like) – A (n,p,k)-shaped set of landmark sets that have to be aligned to each other. tol (float, optional) – The sum of squared differences value that will be considered “low enough” by the iterative rotation and scaling. The iterations will continue until tol has been achieved or max_iters is reached, whichever comes first. max_iters (int, optional) – The maximum number of iterations that the iterative rotation and scaling is allowed to run for. The iterations will continue until tol has been achieved or max_iters is reached, whichever comes first. do_scaling (bool, optional) – If False, \(\beta_i = \frac{1}{\\| X'_i \\|}\), where \(X'_i\) is the mean-centered \(X_i\). Else \(\beta_i\) is calculated as per [tenb]. do_project (bool, optional) – If True, the final aligned landmarks are orthogonally projected to the tangent space at the mean of aligned landmark sets mean, using equation 1 in [rohlf]. no_reflect (bool, optional) – Flag indicating whether the best alignment should exclude reflection (default is False, which means reflection will be used if it achieves better alignment). unitize_mean (bool, optional) – Flag indicating whether the mean of aligned landmark sets mean should be rescaled to have unit centroid size.
Returns:	result – aligned: numpy.ndarray A (n,p,k)-shaped set of aligned landmark sets. mean: numpy.ndarray A (p,k)-shaped array representing the mean of the procrustes aligned landmark sets aligned. b: numpy.ndarray A (n,)-shaped array representing the scaling factor \(\beta_i\) by which the centered \(X'_i\) is scaled. ssq: numpy.float64 This number represents the Generalized (Procrustes) Sum of Squares, which is the infinimum of \(g\). Essentially, the ssq is the result of plugging in the optimal \(\beta_i\), \(R_i\) and \(\gamma_i\) into the \(g\) objective.
Return type:	dict
Warns:	UserWarning – If both do_project and do_scaling are True

References

[drymar]

Dryden, I.L. and Mardia, K.V., 1998. Statistical shape analysis.

[tenb]

Ten Berge, J.M., 1977. Orthogonal Procrustes rotation for two or more matrices. Psychometrika, 42(2), pp.267-276.

[rohlf]

(1, 2) Rohlf, F.J., 1999. Shape statistics: Procrustes superimpositions and tangent spaces. Journal of Classification, 16(2), pp.197-223.

morphops.procrustes.opa(source, target, do_scaling=False, no_reflect=False)[source]¶

Performs Ordinary Procrustes Alignment to transform the landmark set source such that the squared Euclidean distance between source and target is minimized.

Say X=`source` and Y=`target` and do_scaling = True. opa() tries to find

\[\operatorname*{argmin}_{\beta > 0,\ R \in O(k),\ \gamma \in \mathbb{R}^k } D^2_{\mathtt{OPA}}(X, Y) = \| Y - \beta X R - \mathbf{1_k} \gamma^T \|^2\]

If do_scaling = False, \(\beta = 1\). If no_reflect = True, then just as in rotate(), opa() will force \(R \in SO(k)\).

The Ordinary (Procrustes) Sum of Squares or OSS is defined as

\[OSS(X, Y) = \operatorname*{min}_{\beta > 0,\ R \in O(k),\ \gamma \in \mathbb{R}^k } D^2_{\mathtt{OPA}}(X, Y)\]

Note

Generally for opa(), \(OSS(X1, X2) \neq OSS(X2, X1)\).

In contrast to opa(), gpa() is symmetric for the input matrices in that \(G(X1, X2) = G(X2, X1)\).