Perform Lambert's W x F transformation and center/scale a vector to attempt normalization via the LambertW package.

lambert(x, type = "s", standardize = TRUE, warn = FALSE, ...)

# S3 method for lambert
predict(object, newdata = NULL, inverse = FALSE, ...)

# S3 method for lambert
print(x, ...)

Arguments

x

A vector to normalize with the Lambert W x F transformation

type

A character string indicating which transformation to perform: "s" (skewed), "h" (heavy-tailed), or "hh" (heavy-tailed with separate left and right tail parameters); see Details

standardize

If TRUE, the transformed values are also centered and scaled, so that the transformation targets a standard normal distribution

warn

Should the function show warnings?

...

Additional arguments that can be passed to the LambertW::Gaussianize function

object

an object of class 'lambert'

newdata

a vector of data to be (reverse) transformed

inverse

if TRUE, performs reverse transformation

Value

A list of class lambert with elements

x.t

transformed original data

x

original data

mean

mean after transformation but prior to standardization

sd

sd after transformation but prior to standardization

tau.mat

estimated parameters of LambertW::Gaussianize

n

number of nonmissing observations

norm_stat

Pearson's P statistic divided by its degrees of freedom

standardize

whether the transformation was standardized

The predict function returns the numeric value of the transformation performed on new data, and allows for the inverse transformation as well.
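As a rough illustration of the returned object, the sketch below fits the transformation and looks at the elements listed above. It assumes the bestNormalize and LambertW packages are installed; the seed and the gamma sample are arbitrary choices made for this example.

```r
library(bestNormalize)  # assumed installed, together with LambertW

set.seed(1)             # arbitrary seed, for reproducibility only
x <- rgamma(100, 1, 1)  # right-skewed sample

fit <- lambert(x, type = "s")

names(fit)     # element names of the returned list
fit$n          # number of nonmissing observations
fit$norm_stat  # Pearson's P / degrees of freedom
fit$tau.mat    # parameters estimated by LambertW::Gaussianize

# With standardize = TRUE (the default), these should be near 0 and 1
c(mean(fit$x.t), sd(fit$x.t))
```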

Details

lambert uses the LambertW package to estimate a normalizing (or "Gaussianizing") transformation. This transformation can be performed on new data, and inverted, via the predict function.
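A minimal sketch of that workflow: estimate the transformation on one sample, apply it to new values, and map them back. The data are simulated for illustration, and the new values are taken from inside the training range so that extrapolation is not involved.

```r
library(bestNormalize)  # assumed installed, together with LambertW

set.seed(2)                  # arbitrary seed
train <- rgamma(200, 2, 1)   # data used to estimate the transformation

# New values inside the observed range of the training data
new <- unname(quantile(train, c(0.1, 0.5, 0.9)))

fit <- lambert(train)

new_t    <- predict(fit, newdata = new)                    # forward transform
new_back <- predict(fit, newdata = new_t, inverse = TRUE)  # back to original scale

# Compare the round trip with the original values
all.equal(new_back, new)
```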

NOTE: type = "s" is the only option that consistently yields a one-to-one transformation, so it is the only one currently used in bestNormalize(). With type = "h" or type = "hh", the estimated transformation is not guaranteed to be one-to-one. These alternative types are effective when the data have exceptionally heavy tails, e.g. the Cauchy distribution.
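One way to see how type = "h" behaves on a given sample is a round-trip check, sketched below. The t-distributed sample with 3 degrees of freedom is an assumption made here as a moderately heavy-tailed stand-in; results will differ for other data, and no particular outcome is implied.

```r
library(bestNormalize)  # assumed installed, together with LambertW

set.seed(3)             # arbitrary seed
x <- rt(200, df = 3)    # heavy-tailed sample (illustrative choice)

fit_h <- lambert(x, type = "h")

p    <- predict(fit_h)                                 # transformed data
back <- predict(fit_h, newdata = p, inverse = TRUE)    # attempt to invert

# If the estimated transformation is one-to-one on this sample,
# the round trip should recover x up to numerical tolerance
all.equal(back, x)
```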

Additionally, sometimes (depending on the distribution) this method will be unable to extrapolate beyond the observed bounds. In these cases, NaN is returned.
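A quick way to detect this is to check the predictions for NaN, as in the sketch below. The probe values are hypothetical points placed well outside the observed range; whether any of them yield NaN depends on the data and the fitted parameters.

```r
library(bestNormalize)  # assumed installed, together with LambertW

set.seed(4)             # arbitrary seed
x <- rgamma(100, 1, 1)

fit <- lambert(x)

# Hypothetical probe values far below and far above the observed range
width <- diff(range(x))
probe <- c(min(x) - 5 * width, max(x) + 5 * width)

p <- predict(fit, newdata = probe)

# Positions (if any) where the transformation could not be extrapolated
which(is.nan(p))
```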

References

Georg M. Goerg (2016). LambertW: An R package for Lambert W x F Random Variables. R package version 0.6.4.

Georg M. Goerg (2011). Lambert W random variables - a new family of generalized skewed distributions with applications to risk estimation. Annals of Applied Statistics 5(3), 2197-2230.

Georg M. Goerg (2014). The Lambert Way to Gaussianize heavy-tailed data with the inverse of Tukey's h transformation as a special case. The Scientific World Journal.

Examples

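if (FALSE) {
# Right-skewed sample from a gamma distribution
x <- rgamma(100, 1, 1)

# Estimate the transformation and print a summary
lambert_obj <- lambert(x)
lambert_obj

# Transform the original data, then invert the transformation
p <- predict(lambert_obj)
x2 <- predict(lambert_obj, newdata = p, inverse = TRUE)

# The round trip should recover the original data
all.equal(x2, x)
}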