Parallelization

A joblib.Parallel instance can be passed in order to parallelize the bootstrapping process. See the joblib documentation for more information.

from joblib import Parallel

parallel = Parallel(n_jobs=-1)

def some_func(df: pd.DataFrame) -> pd.Series:
    return df.mean(numeric_only=True)

df_samples = df.boot.get_samples(bfunc=some_func, B=100, parallel=parallel)

Be aware that time improvement is not guaranteed. The overhead of parallelization may outweigh the benefits of parallelization. We see that a parallelized version of the function is not always faster than the non-parallelized version. However, this will depend on the machine and the configuration of the joblib.Parallel instance.