{ "cells": [ { "cell_type": "markdown", "id": "1733987b", "metadata": {}, "source": [ "# Homework 10 - Chapter 15" ] }, { "cell_type": "markdown", "id": "72eba3bb-044b-4508-8a6a-4e3b05121b8a", "metadata": {}, "source": [ "- Due Date: Monday, April 21st, no later than 11:59 p.m.\n", "- Partner Information: You may complete this assignment individually or with exactly one classmate.\n", "- Submission Instructions (working alone): Upload your solution, entitled **YourFirstName-YourLastName-Homework10.ipynb**, to the \n", "BrightSpace Homework 10 Dropbox.\n", "- Submission Instructions (working with one classmate): Upload your solution, entitled \n", "**YourFirstName-YourLastName-PartnerFirstName-PartnerLastName-Homework10.ipynb**, to the BrightSpace Homework 10 Dropbox. Note: If you \n", "work with a partner, only one person needs to submit a solution. If you both submit a solution, the submission that will be graded is the one \n", "from the partner whose last name comes first alphabetically.\n", "- Deadline Reminder: Once the submission deadline passes, BrightSpace will no longer accept your submission and you will no longer be able to earn credit. \n", "Thus, if you are not able to fully complete the assignment, submit whatever you have before the deadline so that partial credit can be earned." ] }, { "cell_type": "markdown", "id": "b3c19ddc", "metadata": {}, "source": [ "## Starting Code" ] }, { "cell_type": "code", "execution_count": null, "id": "df5f022a", "metadata": {}, "outputs": [], "source": [ "from datascience import *\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "id": "23e82818-5d87-46db-b4a3-69bfc7853df1", "metadata": {}, "source": [ "Download the file world_happiness_report.csv into the same directory as this Jupyter notebook. 
You are encouraged to read about each variable to aid in your understanding of the data: https://www.kaggle.com/datasets/jainaru/world-happiness-report-2024-yearly-updated" ] }, { "cell_type": "code", "execution_count": null, "id": "df3eb465-3a37-4488-9b8e-b3ecc8eae06b", "metadata": {}, "outputs": [], "source": [ "# Start out with a pandas DataFrame\n", "happiness = pd.read_csv('world_happiness_report.csv')\n", "happiness.head()" ] }, { "cell_type": "code", "execution_count": null, "id": "708e08ff-1187-488c-a42c-85ae5ad6e5de", "metadata": {}, "outputs": [], "source": [ "# Drop rows with missing values.\n", "cleaned_happiness = happiness.dropna()\n", "# Convert DataFrame to Table\n", "happiness_table = Table.from_df(cleaned_happiness)\n", "happiness_table.show(4)" ] }, { "cell_type": "markdown", "id": "e20a93db", "metadata": {}, "source": [ "## Question 1: 4 points" ] }, { "cell_type": "markdown", "id": "7e11a5c6-0b77-4053-9084-b1edd095b7b4", "metadata": {}, "source": [ "Using three scatterplots, display the relationship between the happiness score and each of the following variables: *Log GDP per capita*, *Perceptions of corruption*, and *Healthy life expectancy*. After visually examining these plots, identify which of the three variables appears to have the weakest correlation with the happiness score. Explain your reasoning based on the patterns you observe." ] }, { "cell_type": "code", "execution_count": null, "id": "0bc41948-6490-41f3-bdf9-63e88b797da0", "metadata": {}, "outputs": [], "source": [ "# Place answer here." ] }, { "cell_type": "markdown", "id": "66faf902", "metadata": {}, "source": [ "## Question 2: 2 points" ] }, { "cell_type": "markdown", "id": "02ecdd71-ac39-49b2-bc54-bb30cbcf7c3e", "metadata": {}, "source": [ "Standardize both the happiness score and *Log GDP per capita* by converting them into standard units. (Standard units are also known \n", "as *z-scores*.) 
Once standardized, add the two standardized columns to the happiness_table and display the first 4 rows." ] }, { "cell_type": "code", "execution_count": null, "id": "15d0f3b7-2278-4afa-9c1b-87b16e28cc5e", "metadata": {}, "outputs": [], "source": [ "# Place answer here." ] }, { "cell_type": "markdown", "id": "d53e306a-59d9-42ce-af63-8200686282e4", "metadata": {}, "source": [ "## Question 3a: 1 point" ] }, { "cell_type": "markdown", "id": "172488ec-5425-4e59-8da8-41f3126a4c82", "metadata": {}, "source": [ "Calculate and display the correlation coefficient, *r*, between the standardized happiness score and the standardized *Log GDP per capita*." ] }, { "cell_type": "code", "execution_count": null, "id": "52cfb220-50ee-4579-84cc-225fb24dc73d", "metadata": {}, "outputs": [], "source": [ "# Place answer here." ] }, { "cell_type": "markdown", "id": "3b98839b-c8d8-4a2d-9a75-8029dd6f5675", "metadata": {}, "source": [ "## Question 3b: 2 points" ] }, { "cell_type": "markdown", "id": "4ef63656-8064-4b76-a0b1-3970652a5930", "metadata": {}, "source": [ "Display a scatterplot of the standardized happiness score vs. the standardized *Log GDP per capita*. On this same graph,\n", "display the regression line (in red) using the *r* value you calculated in Question 3a." ] }, { "cell_type": "code", "execution_count": null, "id": "a3453b12-67b3-46cb-b26b-08df5d69e213", "metadata": {}, "outputs": [], "source": [ "# Place answer here." ] }, { "cell_type": "markdown", "id": "2632d766-72ef-4e96-83f4-df0c082a2041", "metadata": {}, "source": [ "## Question 3c: 1 point" ] }, { "cell_type": "markdown", "id": "0355114b-27eb-48a5-852b-a860b38e93be", "metadata": {}, "source": [ "What does the standardized regression line reveal about the relationship between *Log GDP per capita* and the happiness score (Ladder Score)? Explain how to interpret the slope of the line in the context of standardized units." 
] }, { "cell_type": "markdown", "id": "9e7c524e-90b1-4acf-8490-90bdff0f82ae", "metadata": {}, "source": [ "**Answer** - " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.1" } }, "nbformat": 4, "nbformat_minor": 5 }