datasets
Example datasets to test out the boot attribute.
different_mean_and_sigma(n, groups=2, random_state=None, mu=None, sigma=None)
Generate a dataset with different mean and sigma for each group.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int | Number of samples | required |
groups | int | Number of groups | 2 |
random_state | int \| None | Random state | None |
mu | ArrayLike \| None | Optional mean for each group | None |
sigma | ArrayLike \| None | Optional sigma for each group | None |
Returns:
Type | Description |
---|---|
DataFrame | A DataFrame with two columns: group and x |
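To make the shape of the returned data concrete, here is a minimal sketch of how a dataset like this could be generated with NumPy. The helper `make_grouped_normal` is hypothetical and is not the library's implementation: it assumes rows are assigned to groups uniformly at random and that `x` is drawn from a normal distribution with the group's mean and sigma. Its values will not match the output of `different_mean_and_sigma`.

```python
import numpy as np
import pandas as pd


def make_grouped_normal(n, mu, sigma, random_state=None):
    """Hypothetical stand-in, not the library's code.

    Assign each of n rows to a random group, then draw x from that
    group's normal distribution.
    """
    rng = np.random.default_rng(random_state)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Assumption: groups are chosen uniformly at random, one per row.
    group = rng.integers(0, len(mu), size=n)
    # Index mu and sigma by group so each row uses its own parameters.
    x = rng.normal(loc=mu[group], scale=sigma[group])
    return pd.DataFrame({"group": group, "x": x})


toy = make_grouped_normal(n=100, mu=[1, 2], sigma=[1, 4], random_state=0)
print(toy.groupby("group")["x"].agg(["mean", "std"]))
```

The printed per-group mean and standard deviation should land near the requested `mu` and `sigma`, with sampling noise that shrinks as `n` grows.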
Examples:
Get the confidence intervals for the mean and sigma
import pandas as pd
from bootstrap.datasets import different_mean_and_sigma
n = 100
mu = [1, 2]
sigma = [1, 4]
df = different_mean_and_sigma(n=n, random_state=0, mu=mu, sigma=sigma)
    group         x
0       1  3.429522
1       1 -2.833275
2       1  1.982183
3       0  1.656475
4       0 -0.288361
..    ...       ...
95      1  0.564347
96      0 -0.901635
97      0  0.891085
98      0  0.196268
99      1  6.320654

[100 rows x 2 columns]
Define a statistic function and pass it to the get_samples method.
def mean_and_sigma_by_group(df: pd.DataFrame) -> pd.DataFrame:
    return df.groupby("group")["x"].agg(['mean', 'std'])

B = 1_000
sample_kwargs = {"random_state": 0}
df_boot = df.boot.get_samples(
    mean_and_sigma_by_group,
    B=B,
    sample_kwargs=sample_kwargs
)
                  mean       std
group sample
0     0       1.011128  0.960958
      1       0.995973  0.874010
      2       0.961375  0.941634
      3       0.986562  0.848745
      4       0.749629  0.982982
...                ...       ...
1     995     1.797657  4.103178
      996     2.836222  3.584542
      997     2.587314  3.873845
      998     3.176353  3.444296
      999     2.817353  3.597222

[2000 rows x 2 columns]
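The example stops at the bootstrap replicates, so the following sketch shows one way to turn a frame shaped like `df_boot` into percentile confidence intervals. To stay self-contained it does not use the `boot` accessor: it builds stand-in data with NumPy and runs a manual resampling loop in plain pandas. Whether `get_samples` works this way internally is an assumption, and the numbers will differ from the output above.

```python
import numpy as np
import pandas as pd

# Stand-in data with the same columns as the example (group, x).
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=100)
mu, sigma = np.array([1.0, 2.0]), np.array([1.0, 4.0])
df = pd.DataFrame({"group": group, "x": rng.normal(mu[group], sigma[group])})


def mean_and_sigma_by_group(df: pd.DataFrame) -> pd.DataFrame:
    return df.groupby("group")["x"].agg(["mean", "std"])


# Manual bootstrap (assumed equivalent in spirit to get_samples):
# resample rows with replacement, apply the statistic, and stack the
# results under a "sample" index level.
B = 200
replicates = {
    b: mean_and_sigma_by_group(df.sample(frac=1.0, replace=True, random_state=b))
    for b in range(B)
}
df_boot = (
    pd.concat(replicates, names=["sample", "group"])
    .swaplevel()  # reorder the index to (group, sample)
    .sort_index()
)

# 95% percentile interval for each group and each statistic.
by_group = df_boot.groupby(level="group")
lower = by_group.quantile(0.025)
upper = by_group.quantile(0.975)
ci = lower.join(upper, lsuffix="_lower", rsuffix="_upper")
print(ci)
```

The percentile method is the simplest interval to read off bootstrap replicates. The same `groupby(level="group")` step should apply to the `df_boot` returned by `get_samples`, since it has the same `(group, sample)` index.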
Source code in bootstrap/datasets.py