{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "2e8efb04-a94b-4e03-9278-5cf2cdc0cdd5",
   "metadata": {},
   "source": [
    "# Chapter 13.3 - Confidence Intervals"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d5516087-9b25-409f-94d5-f11dd0c167c0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from datascience import *\n",
    "%matplotlib inline\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plots"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "324be9b7-0e03-4db1-9928-17de96c0a99a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Place the CSV file in the same directory as this notebook\n",
    "ski_resorts = Table.read_table(\"ski_resorts.csv\")\n",
    "ski_resorts.show(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ffd94506-0937-44b2-9e16-20a9e70fe5d1",
   "metadata": {},
   "source": [
    "Let's use the **bootstrap percentile method** from Section 13.2 on the data in `ski_resorts.csv`\n",
    "to construct a *95% confidence interval* for the mean **Total Snowfall** of all ski resorts in North America,\n",
    "not just the ones in the original data set. In this scenario, we don't know the true average!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3501b588-5066-4da4-bcd7-1e5e9b385082",
   "metadata": {},
   "outputs": [],
   "source": [
    "ski_resorts.hist(\"Total Snowfall\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b5ab39e5-3457-4944-a8db-ae60ba6dc37e",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.average(ski_resorts.column(\"Total Snowfall\"))"
   ]
  },
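  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa763892-3f7d-4f8f-a7c6-b6fae3d3649e",
   "metadata": {},
   "outputs": [],
   "source": [
    "def one_bootstrap_mean():\n",
    "    \"\"\"Return the mean Total Snowfall of one bootstrap resample.\"\"\"\n",
    "    # sample() with no arguments draws as many rows as the table has, with replacement\n",
    "    resample = ski_resorts.sample()\n",
    "    return np.average(resample.column('Total Snowfall'))"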
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f86e03ca-2383-43d5-98b0-51baa7bb6f54",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Generate means from 5000 bootstrap samples\n",
    "num_repetitions = 5000\n",
    "bootstrap_means = make_array()\n",
    "for _ in np.arange(num_repetitions):\n",
    "    bootstrap_means = np.append(bootstrap_means, one_bootstrap_mean())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "888a2377-8cbe-4a94-abc6-2847213ab7fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Obtain endpoints of the 95% confidence interval\n",
    "left = percentile(2.5, bootstrap_means)\n",
    "right = percentile(97.5, bootstrap_means)\n",
    "make_array(left, right)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "33f0bc3a-0246-4999-90e1-65bca3da8025",
   "metadata": {},
   "source": [
    "The two values in the array are the endpoints of an approximate 95% confidence interval for the mean Total Snowfall.\n",
    "Here is a histogram of the bootstrap means, with the interval highlighted in yellow:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "123ffa2a-1708-4d49-9934-506dc56ea556",
   "metadata": {},
   "outputs": [],
   "source": [
    "resampled_means = Table().with_column('Bootstrap Sample Mean', bootstrap_means)\n",
    "resampled_means.hist(bins=20, unit=\"Inches\")\n",
    "plots.plot([left, right], [0, 0], color='yellow', lw=8);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b3bc48d-14d3-458c-91ce-68ff1a42b6ef",
   "metadata": {},
   "source": [
    "Note: the empirical histogram of the resampled means is roughly symmetric and bell-shaped, even though the histogram of the\n",
    "Total Snowfall values in the original sample was not. This is explained by the **Central Limit Theorem**, which we will visit later\n",
    "in the semester."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b1c6377-9057-42f2-a136-ef0435f8fd14",
   "metadata": {},
   "source": [
    "**Active Learning**: Remove all rows of the table that have an **Average Base Depth** of 0.\n",
    "Then display a histogram that highlights a 90% confidence interval for the\n",
    "percentage of the remaining North American resorts that\n",
    "have an **Average Base Depth** of at least 12 inches."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "651fc90d-999e-4348-a347-80a360c9113b",
   "metadata": {},
   "source": [
    "*Step One* - Eliminate the ski resorts that have an Average Base Depth of 0."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5abf80f1-9965-48e4-8f39-e911d875d13b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Place answer here."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4b38279-d265-4a76-99e9-96a8fd7f9f3b",
   "metadata": {},
   "source": [
    "*Step Two* - Define a function that returns True if the value of its parameter is at least 12."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c88dcb98-0dc9-4502-817a-d12c992bb27a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Place answer here."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5454c0f3-49b6-4e22-8fe1-1fe180d85326",
   "metadata": {},
   "source": [
    "*Step Three* - Apply the function to the **Average Base Depth** column, adding a column to the table that indicates whether each resort meets the criterion."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ba43ba9d-b765-4f56-af4e-64612659d128",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Place answer here."
   ]
  },
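  {
   "cell_type": "markdown",
   "id": "d32a9d96-bf2f-4c36-b4bb-58930c61a1f2",
   "metadata": {},
   "source": [
    "*Step Four* - Repeat the bootstrap process used above for the mean: resample the table, compute the percentage of resorts that meet the criterion in each resample, then take the 5th and 95th percentiles of those percentages as the endpoints of the 90% confidence interval and highlight them on a histogram."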
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1e471bf3-0681-4a68-ad23-9686d586cb64",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Place answer here."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}