{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Homework 7 - Chapter 12" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Due Date: Friday, March 13th no later than 11:59 p.m.\n", "- Partner Information: You may complete this assignment individually or with exactly one classmate.\n", "- Submission Instructions (working alone): Upload your solution, entitled **YourFirstName-YourLastName-Homework7.ipynb** to the \n", "Canvas Homework 7 Dropbox.\n", "- Submission Instructions (working with one classmate): Upload your solution, entitled \n", "**YourFirstName-YourLastName-PartnerFirstName-PartnerLastName-Homework7.ipynb** to the Canvas Homework 7 Dropbox. Note: If you \n", "work with a partner, only one person needs to submit a solution. If you both submit a solution, the submission that will be graded is the one \n", "from the partner whose last name comes alphabetically first.\n", "- Deadline Reminder: Once the submission deadline passes, Canvas will no longer accept your submission and you will no longer be able to earn credit. \n", "Thus, if you are not able to fully complete the assignment, submit whatever you have before the deadline so that partial credit can be earned." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Starting Code" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from datascience import *\n", "%matplotlib inline\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Download the ab_test.csv file and place it\n", "into the same directory as this Jupyter notebook.\n", "The meaning of the columns is as follows:\n", "- user_id: A unique id assigned to a visitor of a data science tutoring web site\n", "- group: the web site experience given to the visitor: either **A** or **B**\n", "- page_views: the number of different pages on the site the user visited\n", "- time_spent: the number of seconds the user spent on the site\n", "- conversion: **1** if the user purchased a tutoring session and **0** otherwise\n", "- device: the type of device the user used to visit the site\n", "- location: the visitor's general location" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
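<table>\n",
"<thead><tr><th>user_id</th><th>group</th><th>page_views</th><th>time_spent</th><th>conversion</th><th>device</th><th>location</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>9032</td><td>B</td><td>1</td><td>50</td><td>0</td><td>mobile</td><td>West Midlands</td></tr>\n",
"<tr><td>3463</td><td>A</td><td>3</td><td>159</td><td>1</td><td>mobile</td><td>South East</td></tr>\n",
"<tr><td>3864</td><td>A</td><td>1</td><td>50</td><td>0</td><td>mobile</td><td>London</td></tr>\n",
"</tbody>\n",
"</table>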
\n", "

... (9997 rows omitted)

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Place the csv file in the same directory as your solution\n", "ab_test = Table.read_table(\"ab_test.csv\")\n", "ab_test.show(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 1 - 2 Points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ellingsen Incorporated, a data science tutoring conglomerate, has developed two different\n", "website experiences for visitors and wants to learn whether one experience\n", "is more effective than the other in fostering the purchase of a tutoring session.\n", "A user in **group A** receives one type of website experience and a user in \n", "**group B** receives a different experience.\n", "A **conversion** of 1 means that the visitor purchased a tutoring session and a **conversion** of 0\n", "means that they did not.\n", "\n", "Create **Null and Alternative Hypotheses** for this study." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Null Hypothesis ($H_0$):**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Alternative Hypothesis ($H_a$):**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 2 - 1 Point" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To begin the study, determine and print the conversion rates for both **group A** and **group B**. \n", "A conversion rate can range from 0.000 (0% conversions) to 1.000 (100% conversions).\n", "Also, determine the difference between the two groups' conversion rates and use this\n", "difference as the **Test Statistic**. Your output should look something like this:\n", "\n", "Group A Conversion Rate: 0.ddd \n", "Group B Conversion Rate: 0.ddd \n", "Test Statistic (Group A - Group B): 0.ddd" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "# Place answer here." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 3 - 2 Points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Perform a permutation test by shuffling the **group** column while keeping the **conversion** column fixed. \n", "For each permutation, calculate the test statistic and store the result in an array. \n", "Repeat this process 1000 times.\n", "Print the resulting array (note: the array will show the first 3 numbers, followed by ..., followed by the final 3 numbers)." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "# Place answer here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 4 - 1 Point" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the array created in question 3, create a histogram that displays the results." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "# Place answer here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 5 - 1 Point" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calculate and display the p-value, then form a conclusion reporting (1) whether the value is statistically significant,\n", "highly statistically significant or neither and (2) whether the null hypothesis or alternative hypothesis is favored." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "# Place p-value calculation and display here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Conclusion:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 6 - 3 Points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Develop an insightful visualization that uses the provided csv file, Chapter 12 knowledge, and any other data science knowledge that you have previously learned. Explain your visualization and what makes it insightful." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Place visualization answer here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Explanation -**" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.14.2" } }, "nbformat": 4, "nbformat_minor": 4 }