|
|
|
Measure |
Definition |
Expression(s) |
|
Log |
If the frequency distribution for a dataset is broadly unimodal and right-skewed (a long tail towards high values), the natural log transform (logarithms base e) will compress the tail and make the pattern more symmetric/similar to a Normal distribution. For variates whose values may range from 0 upwards a value of 1 is often added to x before taking the log, since ln(0) is undefined. Back transform with the exp() function |
z=ln(x) or z=ln(x+1); nb: ln(x)=loge(x)=log10(x)/log10(e); back-transform: x=exp(z) or x=exp(z)‑1 |
|
Square root |
A transform that may adjust the dataset to make it more similar to a Normal distribution. For variates whose values may range from 0 upwards a value of 1 is often added to x before taking the root. For 0<=x<=1 (e.g. rate data) the combined form of the transform, the sum of the two square roots, is often used, and is known as the Freeman-Tukey (FT) transform |
z=sqrt(x) or z=sqrt(x+1); FT: z=sqrt(x)+sqrt(x+1) |
|
Logit |
Often used to transform binary response data, such as survival/non-survival or present/absent, to provide a continuous value in the range (‑∞,∞), where p is the proportion of the sample that is 1 (or 0). The inverse or back-transform is shown as p in terms of z. This transform avoids concentration of values at the ends of the range. For samples where proportions p may take the values 0 or 1, at which the logit is undefined, a modified form of the transform may be used. This is typically achieved by adding 1/(2n) to both the numerator and the denominator, where n is the sample size. Often used to correct S-shaped (logistic) relationships between response and explanatory variables |
z=ln(p/(1-p)); p=exp(z)/(1+exp(z)); modified form: z=ln((p+1/(2n))/(1-p+1/(2n))) |
|
Normal, z-transform |
This transform normalises or standardises the distribution so that it has a zero mean and unit variance. If {x_i} is a set of n sample mean values from any probability distribution with mean μ and variance σ² then the z-transform shown here as z2 will be distributed N(0,1) for large n (Central Limit Theorem). The divisor in this instance is the standard error. In both instances the standard deviation must be non-zero |
z1=(x-μ)/σ; z2=(x̄-μ)/(σ/sqrt(n)) |
|
Box-Cox, power transforms |
A family of transforms, defined for positive data values only, that can often make datasets more Normal; k is a parameter. The inverse or back-transform is also shown as x in terms of z |
z=(x^k-1)/k for k≠0, z=ln(x) for k=0; x=(kz+1)^(1/k) for k≠0, x=exp(z) for k=0 |
|
Angular transforms (Freeman-Tukey) |
A transform for proportions, p, designed to spread the set of values near the ends of the range. k is typically 0.5. Often used to correct S-shaped relationships between response and explanatory variables. If p=x/n then the Freeman-Tukey (FT) version of this transform is the averaged version shown. This is a variance-stabilising transform |
z=arcsin(p^k); p=(sin(z))^(1/k); FT: z=(1/2)[arcsin(sqrt(x/(n+1)))+arcsin(sqrt((x+1)/(n+1)))] |
|
|
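
Several of the transforms tabulated above, together with their back-transforms, can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function names are hypothetical, the logit and Box-Cox formulas are the textbook forms (z=ln(p/(1-p)) and z=(x^k-1)/k), and the standardisation step assumes the sample standard deviation (divisor n-1) is wanted.

```python
import math


def log_transform(x, add_one=False):
    """z = ln(x), or z = ln(x + 1) when the data may include zeros."""
    return math.log1p(x) if add_one else math.log(x)


def log_back(z, add_one=False):
    """Back-transform: x = exp(z), or x = exp(z) - 1."""
    return math.expm1(z) if add_one else math.exp(z)


def logit(p, n=None):
    """z = ln(p / (1 - p)).

    If a sample size n is supplied, 1/(2n) is added to the numerator
    and the denominator so that p = 0 and p = 1 remain finite.
    """
    c = 0.0 if n is None else 1.0 / (2 * n)
    return math.log((p + c) / (1.0 - p + c))


def logit_back(z):
    """Inverse of the unmodified logit: p = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def box_cox(x, k):
    """Box-Cox power transform with parameter k; positive x only."""
    if x <= 0:
        raise ValueError("Box-Cox is defined for positive values only")
    return math.log(x) if k == 0 else (x ** k - 1.0) / k


def box_cox_back(z, k):
    """Inverse Box-Cox: x = exp(z) for k = 0, else (k*z + 1)^(1/k)."""
    return math.exp(z) if k == 0 else (k * z + 1.0) ** (1.0 / k)


def standardise(xs):
    """Subtract the mean and divide by the sample standard deviation.

    Assumption: the n - 1 divisor is used for the variance estimate.
    """
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    if sd == 0:
        raise ValueError("standard deviation must be non-zero")
    return [(x - mean) / sd for x in xs]


if __name__ == "__main__":
    counts = [0, 3, 12, 150]  # made-up values, for illustration only
    zs = [log_transform(x, add_one=True) for x in counts]
    print(zs)
    print([log_back(z, add_one=True) for z in zs])
```

Using `math.log1p` and `math.expm1` for the ln(x+1) variant keeps the round trip accurate for values of x close to zero, which is one reason to prefer them over writing `math.log(x + 1)` directly.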