Reverse Box-Cox Transformation

June 16, 2022

I am using SciPy's boxcox function to perform a Box-Cox transformation on a continuous variable.

from scipy.stats import boxcox
import numpy as np
y = np.random.random(100)
y_box,

Solution 1:

SciPy has added an inverse Box-Cox transformation.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.inv_boxcox.html

scipy.special.inv_boxcox(y, lmbda)

Compute the inverse of the Box-Cox transformation.

Find x such that:

y = (x**lmbda - 1) / lmbda  if lmbda != 0
    log(x)                  if lmbda == 0

Parameters:
y : array_like

Data to be transformed.

lmbda : array_like

Power parameter of the Box-Cox transform.

Returns:
x : array

Transformed data.

Notes

New in version 0.16.0.

Example:

from scipy.special import boxcox, inv_boxcox
y = boxcox([1, 4, 10], 2.5)
inv_boxcox(y, 2.5)

output: array([1., 4., 10.])
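A round-trip check is a quick way to confirm that inv_boxcox undoes boxcox. The sketch below assumes SciPy 0.16.0 or later and uses made-up sample values. It also covers lmbda == 0, where the transform reduces to log(x).

```python
import numpy as np
from scipy.special import boxcox, inv_boxcox

# Illustrative sample values (assumed; Box-Cox requires positive data).
x = np.array([0.5, 1.0, 4.0, 10.0])

# Round trip for several power parameters, including the log case (0).
for lmbda in (-1.0, 0.0, 0.5, 2.5):
    y = boxcox(x, lmbda)
    x_back = inv_boxcox(y, lmbda)
    assert np.allclose(x_back, x)

# For lmbda == 0 the forward transform is the natural logarithm.
log_case = boxcox(x, 0.0)
```

Comparing with np.allclose rather than == avoids false failures from floating-point rounding.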

Solution 2:

Here is the code. It works and has been tested. SciPy uses the natural (Napierian) logarithm; I checked the Box-Cox transformation paper and it seems they used log10. I kept the natural logarithm because it matches SciPy.

Follow the code:

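# Function
import numpy as np
from scipy import stats

def invboxcox(y, ld):
    # Inverse Box-Cox using the natural logarithm, as SciPy does
    if ld == 0:
        return np.exp(y)
    else:
        return np.exp(np.log(ld * y + 1) / ld)

# Test the code
# Two distinct values: some SciPy versions reject constant input
x = np.array([100.0, 200.0])
ld = 0
y = stats.boxcox(x, ld)
print(invboxcox(y[0], ld))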
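The hand-written inverse can be cross-checked against scipy.special.inv_boxcox. This is a minimal sketch, assuming a SciPy version that ships inv_boxcox; the name invboxcox_vec and the sample values are hypothetical and only for illustration.

```python
import numpy as np
from scipy.special import boxcox, inv_boxcox

def invboxcox_vec(y, ld):
    """Inverse Box-Cox for a scalar lambda; y may be a scalar or an array.

    Hypothetical helper name, used here only for illustration.
    """
    y = np.asarray(y, dtype=float)
    if ld == 0:
        return np.exp(y)
    # (ld*y + 1)**(1/ld), written via log/exp with the natural logarithm.
    return np.exp(np.log(ld * y + 1.0) / ld)

x = np.array([1.0, 4.0, 10.0, 100.0])  # assumed sample values
for ld in (0.0, 0.5, 2.5):
    y = boxcox(x, ld)
    # The manual inverse should agree with SciPy's built-in inverse.
    assert np.allclose(invboxcox_vec(y, ld), inv_boxcox(y, ld))
```

Agreement in the ld == 0 case also confirms that SciPy's transform uses the natural logarithm rather than log10.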

Solution 3:

Thanks to @Warren Weckesser, I've learned that SciPy, at the time of this answer, did not have a function to reverse a Box-Cox transformation, though a future release might add one (see scipy.special.inv_boxcox in Solution 1). Until then, the code I provide in my question may serve others to reverse Box-Cox transformations.

Solution 4:

To invert a transformation produced by scipy.stats.boxcox using scipy.special.inv_boxcox, you need the lambda that was fitted during the transformation.

First apply the transformation and print the lambda (i.e. param).

from scipy import stats

# stats.boxcox returns the transformed data and the fitted lambda
df[feature_boxcox], param = stats.boxcox(df[feature])
print('Optimal lambda', param)

Then, to invert the transformation, pass the fitted lambda to inv_boxcox.

from scipy.special import inv_boxcox

inv_boxcox(df[feature_boxcox], param)
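The same two steps can be run end to end without a DataFrame. The following is a sketch under stated assumptions: the data is a synthetic, strictly positive sample generated just for this example, and the variable names are illustrative.

```python
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

rng = np.random.default_rng(0)
# Assumed sample: strictly positive, non-constant, one-dimensional.
data = rng.lognormal(mean=0.0, sigma=1.0, size=200)

# With lmbda omitted, stats.boxcox returns (transformed data, fitted lambda).
transformed, fitted_lambda = stats.boxcox(data)

# Passing the same lambda back recovers the original values.
restored = inv_boxcox(transformed, fitted_lambda)
print("Fitted lambda:", fitted_lambda)
print("Round trip OK:", np.allclose(restored, data))
```

Store the fitted lambda alongside the transformed data; without it the transformation cannot be inverted later.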

Solution 5:

I recommend looking at the Yeo-Johnson transformation, an analog of Box-Cox that also works with negative values and is implemented in the scikit-learn library with an easy inverse transformation.

I'm using it with the fbprophet library (forecasting):

from sklearn.preprocessing import PowerTransformer

from fbprophet import Prophet
from fbprophet.plot import plot_cross_validation_metric
from fbprophet.diagnostics import cross_validation
from fbprophet.diagnostics import performance_metrics
import numpy as np
import pandas as pd

def inverse_transform(df, pt_instance, features):
    for feature in features:
        df[feature] = pt_instance.inverse_transform(np.array(df[feature]).reshape(-1,1))
    return df

pt = PowerTransformer(method='yeo-johnson')

train_df_transformed = train_df.copy()
train_df_transformed['y'] = pt.fit_transform(np.array(train_df['y']).reshape(-1,1))

model = Prophet(**hyperparams)
model.fit(train_df_transformed)
df_cv = cross_validation(model, initial='14 days', period='3 days', horizon='1 day', parallel="processes")
df_cv = inverse_transform(df_cv, pt, ['yhat','yhat_lower','yhat_upper'])
df_cv = pd.merge(df_cv.drop(columns=['y']),train_df, left_on='ds', right_on='ds')
df_p = performance_metrics(df_cv, metrics=['mae','mape'], rolling_window=1)
fig1 = plot_cross_validation_metric(df_cv, metric='mape')
fig2 = plot_cross_validation_metric(df_cv, metric='mae')
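The Yeo-Johnson round trip can be tried in isolation, without Prophet. This is a minimal sketch that assumes scikit-learn is installed; the sample values are made up and deliberately include negative numbers and zero.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Assumed sample containing negative, zero, and positive values.
values = np.array([-3.0, -1.5, 0.0, 0.5, 2.0, 7.0, 12.0]).reshape(-1, 1)

pt = PowerTransformer(method="yeo-johnson")  # standardize=True by default
transformed = pt.fit_transform(values)

# inverse_transform undoes both the standardization and the power transform.
restored = pt.inverse_transform(transformed)

print("Fitted lambda:", pt.lambdas_[0])
print("Round trip OK:", np.allclose(restored, values))
```

PowerTransformer expects 2-D input, which is why a single column is reshaped with reshape(-1, 1) before fitting.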
