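The data come from the Journal of the American Medical Association (http://jse.amstat.org/v4n2/datasets.shoemaker.html): clinical measurements of body temperature, gender, and heart rate.
Here we sample the male body temperatures and compute a 95% confidence interval for the population mean.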
1. Read the data
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
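# Read the data (whitespace-separated, no header row)
# Columns: 體溫 = body temperature, 性別 = gender (1 = male, as used below), 心率 = heart rate
df = pd.read_csv('http://jse.amstat.org/datasets/normtemp.dat.txt', header=None, sep=r'\s+', names=['體溫','性別','心率'])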
2. Choose the sample size and inspect the data
np.random.seed(42)
#df.describe()
# Draw a sample of 90 rows and inspect it
df_sam = df.sample(90)
df_sam.head()
3. Compute the mean body temperature of the men in the drawn sample
df3 = df_sam.loc[df_sam['性別']==1]
df3['體溫'].mean()
4. Draw samples repeatedly and compute the mean male body temperature in each one to obtain the sampling distribution
boot_means = []
for _ in range(10000):
    bootsample = df.sample(90, replace=True)
    mean = bootsample[bootsample['性別'] == 1]['體溫'].mean()
    boot_means.append(mean)
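The resampling loop above can also be wrapped in a small reusable function. The following is only a sketch under stated assumptions: the function name `bootstrap_group_means` and its parameters are hypothetical, it uses NumPy arrays instead of the DataFrame, and it runs on synthetic stand-in values rather than the normtemp measurements, so its numbers will not match the result reported in this post.

```python
import numpy as np

def bootstrap_group_means(values, is_group, n_draw, n_boot=10_000, seed=42):
    """Resample rows with replacement; return the group's mean in each resample.

    Hypothetical helper for illustration, not part of pandas or NumPy.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    is_group = np.asarray(is_group, dtype=bool)
    means = np.empty(n_boot)
    for i in range(n_boot):
        # Draw n_draw row indices with replacement from the full data
        idx = rng.integers(0, len(values), size=n_draw)
        # Keep only the rows that belong to the group of interest
        picked = values[idx][is_group[idx]]
        means[i] = picked.mean() if picked.size else np.nan
    return means

# Synthetic stand-in data (assumption: NOT the real normtemp values)
data_rng = np.random.default_rng(0)
temps = data_rng.normal(98.2, 0.7, size=130)
is_male = np.arange(130) < 65  # pretend the first 65 rows are men

boot_means_demo = bootstrap_group_means(temps, is_male, n_draw=90, n_boot=2000)
print(boot_means_demo.mean(), boot_means_demo.std())
```

Passing an explicit seed to `np.random.default_rng` keeps the resampling reproducible without relying on the global `np.random.seed` state.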
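5. Plot the sampling distribution of the mean male body temperature
6. Compute a confidence interval from the sampling distribution to estimate the population mean, at the 95% confidence level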
np.percentile(boot_means, 2.5), np.percentile(boot_means, 97.5)
(97.89249519230768, 98.30741452991455)
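The percentiles 2.5 and 97.5 used above follow from the 95% confidence level: the remaining 5% is split evenly between the two tails. A minimal sketch of that arithmetic, assuming a hypothetical helper named `percentile_interval` and a stand-in array with an easy-to-check answer instead of the actual bootstrap means:

```python
import numpy as np

def percentile_interval(samples, confidence=0.95):
    """Return the (lower, upper) percentile interval for the given confidence level.

    Hypothetical helper for illustration, not a NumPy function.
    """
    alpha = 1.0 - confidence                 # total probability left in the tails
    lower_pct = 100.0 * alpha / 2.0          # 2.5 when confidence is 0.95
    upper_pct = 100.0 * (1.0 - alpha / 2.0)  # 97.5 when confidence is 0.95
    lower, upper = np.percentile(samples, [lower_pct, upper_pct])
    return float(lower), float(upper)

# Stand-in values (assumption: not the bootstrap means computed in this post)
demo = np.arange(101)
low, high = percentile_interval(demo, confidence=0.95)
print(low, high)
```

Calling the same helper with `confidence=0.90` would use the 5th and 95th percentiles instead.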