Assignment 4: MatPlotLib
Purpose
The purpose of this lab is to introduce you to Python's matplotlib library. Matplotlib is
useful for displaying data: you can extract different parts of a data set and present them in a way that helps your
reader or user understand the data better.
The data used in this lab was taken from kaggle.com. It has information
on video games sold from 1985 to 2020 (there is one entry dated 2020 for a game that has not been released yet). It contains
the number of copies a particular game sold worldwide as well as the number of copies the game sold in
specific regions. Additional information in the data set includes the name of the game, the name of the company that
produced the game, and the platform the game was made for, among other things.
Partners
You are encouraged to work with a partner on this assignment.
If you work with a partner, you must both be in the same lab section.
Preliminaries
Create a folder named VideoGames on your lab computer.
Place the following Python file and CSV file into this VideoGames folder:
Assignment
Your job in this assignment is to produce different types of graphs, all using the same data set.
You will use the matplotlib library to accomplish this and learn about the different graphs you can create.
This assignment is split into several parts. When implementing your solution, you should follow the steps below.
- In the read_file method, read the data in from vgsales.csv.
- Hint: There are some non-ASCII characters in the CSV file. To get around this in IDLE,
include encoding='utf8' in your call to open().
- Create a method which produces a pie chart showing the percentage of global sales in each region
(North America, Europe, Japan, Others).
- Make sure the pie chart is circular, the North American region
is "exploded" up (meaning it sits apart from the others), and there is a shadow underneath it.
- Hint: There are methods and settings that make the pie chart truly circular and
create the "explode" effect and shadow. Look at the documentation at
matplotlib.org if you get stuck.
- Create a method which produces a scatter plot showing the sales over the years with a focus on global sales
of 0.2 and global sales over 29.5.
- Make sure to annotate the most popular game with its name, i.e. "Wii Sports".
- Also make sure the points for overall global sales (the blue ones) are smaller than the points for global sales of 0.2 (the red ones)
and the points for global sales of 29.5 and higher (the violet ones). Refer to the pictures below to view
the expected output.
- Create a method of your choosing that produces a graph from the given data. It should be creative
and make use of the data in an interesting way.
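The encoding hint in the first step can be sketched as follows. This is only a minimal illustration, not the provided read_file method: the function name read_rows is hypothetical, and the sketch assumes the first row of the file holds column names.

```python
import csv


def read_rows(path):
    """Return the data rows of a CSV file as a list of lists of strings."""
    # encoding='utf8' lets Python decode the non-ASCII characters in the file.
    # newline='' is the setting the csv module documentation recommends.
    with open(path, encoding='utf8', newline='') as csv_file:
        reader = csv.reader(csv_file)
        next(reader, None)  # assumption: skip a header row of column names
        return [row for row in reader]
```

Every value comes back as a string, so numeric columns need to be converted with int() or float() before they can be plotted.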
Your output for the pie chart and scatter plot should look like the following graphs:
Your output for the creative method could look something like this graph:
You will note that on the upper right graph the names of the publishers were sometimes shortened to the first
letter in order to have them all fit correctly. As long as you can justify why you created the graph the way you did,
you should be okay with whatever you design.
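The matplotlib settings mentioned in the pie chart and scatter plot steps can be sketched as follows. This is an illustration under stated assumptions, not the expected solution: the function names, the highlight_at parameter, and the sample values in the note afterwards are all hypothetical, and the colors and marker sizes are arbitrary choices.

```python
import matplotlib.pyplot as plt


def draw_pie(labels, totals):
    """Draw a pie chart with the first wedge pulled out and shadows enabled."""
    fig, ax = plt.subplots()
    # One offset per wedge: only the first wedge is moved away from the center.
    explode = [0.1] + [0] * (len(labels) - 1)
    wedges, texts, percents = ax.pie(totals, labels=labels, explode=explode,
                                     shadow=True, autopct='%1.1f%%')
    ax.axis('equal')  # equal axis scaling keeps the pie circular
    return fig, wedges


def draw_scatter(years, sales, names, highlight_at):
    """Scatter sales by year, enlarge high sellers, and label the best seller."""
    fig, ax = plt.subplots()
    ax.scatter(years, sales, s=10, color='blue')  # s sets the marker area
    # Redraw the high sellers with a larger marker in a second color.
    high = [i for i in range(len(sales)) if sales[i] >= highlight_at]
    ax.scatter([years[i] for i in high], [sales[i] for i in high],
               s=60, color='violet')
    # Attach the name of the best seller to its point with an arrow.
    best = max(range(len(sales)), key=lambda i: sales[i])
    label = ax.annotate(names[best], xy=(years[best], sales[best]),
                        xytext=(years[best] + 1, sales[best] + 1),
                        arrowprops={'arrowstyle': '->'})
    ax.set_xlabel('Year')
    ax.set_ylabel('Sales')
    return fig, label
```

For example, draw_pie(['Region A', 'Region B', 'Region C', 'Region D'], [40, 30, 20, 10]) followed by plt.show() displays a pie with the first wedge set apart. The numbers here are made up and are not taken from vgsales.csv.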
Submission
Before lab ends, e-mail a copy of graphs.py to your
lab TA. The subject of the e-mail should be Assignment 4,
your lab time, your name, your partner's name.
To receive credit, this e-mail must be sent before your
lab period finishes. Partial credit can be earned, but late
assignments will not be accepted.
Grading - 100 points possible
- 5 points: Your implementation properly reads the data in from the provided file and stores the data in a list.
- 25 points: Your implementation correctly produces a pie chart that matches the expected output shown above.
- 25 points: Your implementation correctly produces a scatter plot that matches the expected output shown above.
- 35 points: Your implementation of a different, creative graph from the data is clean, has labels,
and displays the information in a way that is easy to interpret. It should utilize the data in an
interesting manner.
- 5 points: Your implementation overall is clean and well formatted.
- 5 points: Your code is well documented. Use documentation strings (""") to comment your functions and block comments (#) for all additional documentation.
Fill in the name, lab description, and date at the top of the graphs.py file.
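As a small illustration of the documentation style described above, the sketch below pairs a documentation string with a block comment. The function percent_of_total is hypothetical and is not part of the provided code.

```python
def percent_of_total(part, total):
    """Return part as a percentage of total.

    Returns 0.0 when total is zero so the caller never divides by zero.
    """
    # Guard against an empty data set, where the total would be zero.
    if total == 0:
        return 0.0
    return 100.0 * part / total
```

The documentation string says what the function does and returns, while the block comment explains why a particular line is there.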