Hello, if you have any need, please feel free to consult us, this is my wechat: wx91due
CSE 6242 / CX 4242: Data and Visual Analytics | Georgia Tech | Spring 2025
HW 2: Tableau, D3 Graphs, and Visualization
Homework Overview
The maximum possible score for this homework is 100 points. Students have the option to complete any 90 points’ worth of work to receive 100% (equivalent to 15 course total grade points) for this assignment. They can earn more than 100% if they submit additional work. For example, a student scoring 100 points will receive 111% for the assignment (equivalent to 16.67 course grade points, as shown on Canvas).
a. Every assignment has a generous 48-hour grace period, allowing students to address unexpected minor issues without facing penalties. You may use it without asking.b. Before the grace period expires, you may resubmit as many times as needed.c. TA assistance is not guaranteed during the grace period.d. Submissions during the grace period will display as "late" but will not incur a penalty.e. We will not accept any submissions executed after the grace period ends.
Submission Notes
a. Question 1 will be manually graded after the final HW due date and Grace Period.b. Questions 2-5 are auto graded at the time of submission.
a. The code is free of syntax errors (by running locally)b. All methods have been implementedc. The correct file was submitted with the correct named. No extra packages or files were imported
Do I need to use the specific version of the software listed?
Goal |
Design a table, a grouped bar chart, and a stacked bar chart with filters in Tableau. |
Technology |
Tableau Desktop |
Deliverables |
Gradescope: After selecting HW2 - Q1, click Submit Images. You will be taken to a list of questions for your assignment. Click Select Images and submit the following four PNG images under the corresponding questions:
● table.png: Image/screenshot of the table in Q1.1
● grouped_barchart.png: Image of the chart in Q1.2
● stacked_barchart_1.png: Image of the chart in Q1.3 after filtering data for Max.Players = 2
● stacked_barchart_2.png: Image of the chart in Q1.3 after filtering data for Max.Players = 4
Q1 will be manually graded after the grace period.
|
Install and activate Tableau Desktop by following “HW2 Instructions” on Canvas. The product activation key is for your use in this course only. Do not share the key with anyone. If you already have Tableau Desktop installed on your machine, you may use this key to reactivate it.
If neither of the above methods work, use Tableau for Students. Follow the link and select "Get Tableau For Free". You should be able to receive an activation key which offers you a one-year use of Tableau Desktop at no cost by providing a valid Georgia Tech email.
a. Open Tableau and connect to a data source. Choose To a File – Text file. Select the popular_board_game.csv file from the skeleton.b. Click on the graph area at the bottom section next to "Data Source" to create worksheets.
Table and Chart Design
1. [5 points] Good table design. Visualize the data contained in popular_board_game.csv as a data table (known as a text table in Tableau). In this part (Q1.1), you can use any tool (e.g., Excel, HTML, Pandas, Tableau) to create the table.
We are interested in grouping popular games into "support solo" (min player = 1) and "not support solo" (min player > 1). Your table should clearly communicate information about these two groups simultaneously. For each group (Solo Supported, Solo Not Supported), show:
f. Save the table as table.png. (If you use Tableau, go to Worksheet/Dashboard → Export → Image). NOTE: Do not take screenshots in Tableau since your image must have high resolution. You can take screenshot If you use HTML, Pandas, etc.
Your learning goal here is to practice good table design, which is not strongly dependent on the tool that you use. Thus, we do not require that you use Tableau in this part. You may decide the most meaningful column names, the number of columns, and the column order. You are not limited to only the techniques described in the lecture. For OMS students, the lecture video on this topic is Week 4 - Fixing Common Visualization Issues - Fixing Bar Charts, Line Charts. For campus students, review lecture slides 42 and 43.
2. [10 points] Grouped bar chart. Visualize popular_board_game.csv as a grouped bar chart in Tableau.
a. NOTE: Do not take screenshots in Tableau since your image must have high resolution.
The main goal here is for you to get familiarized with Tableau. Thus, we kept this open-ended, so you can practice making design decisions. We will accept most designs. We show one possible design in Figure 1.2, based on the tutorial from Tableau.
3. [10 points] Stacked bar chart. Visualize the data.world dataset (or games_detailed_info_filtered.csv if using the local files in the skeleton) as a stacked bar chart. Showcase the count of games in different categories and the relationship between game categories, their mechanics, and max player size.
i. Select "2 Players" only in the filter. Save the resulting chart as 'stacked_barchart_1.png'ii. Select "4 Players" only in the filter. Save the resulting chart as 'stacked_barchart_2.png'iii. Both images must include your GT username in the bottom left. This can be added using a text box. Refer to the tutorial here.https://youtu.be/fRwQenvBJ6Iiv. In each image, the filter must be visible. If you are using Tableau Online, you may need to add your worksheet containing the chart to a dashboard and then download an image of the dashboard that contains both the filter and the chart.
Note: To save a dashboard image, go to Dashboard - Export Image. Do not submit screenshots. An example of a possible design is shown in Figure 1.3.
Optional Reading: The effectiveness of stacked bar charts is often debated—sometimes, they can be confusing, difficult to understand, and may make data series comparisons challenging.
Figure 1.2: Example of a grouped bar chart. Your chart may appear different and can earn full credit if it meets all the stated requirements. Your submitted image should include your GT username in the bottom left.
Figure 1.3: Example of a stacked bar chart after selecting "4 Players" in Max.Players filter. Your chart may appear different and can earn full credit if it meets all the stated requirements. Your submitted image should include your GT username in the bottom left.
Important Points about Developing with D3 in Questions 2–5
It is incorrect to use an absolute path such as:
<script type="text/javascript" src="C:/Users/polo/hw2-skeleton/lib/d3.v5.min.js"></script>
Q2 [15 points] Force-directed graph layout
Goal |
Create a network graph shows relationships between games in D3. Use interactive features like pinning nodes to give the viewer some control over the visualization. |
Technology |
D3 Version 5 (included in the lib folder) Chrome v131.0.0 (or higher): the browser for grading your code Python http server (for local testing) |
Allowed Libraries |
D3 library is provided to you in the lib folder. You must NOT use any D3 libraries (d3*.js) other than the ones provided. On Gradescope, these libraries are provided for you in the auto-grading environment. |
Deliverables |
[Gradescope] Q2.(html/js/css): The HTML, JavaScript, CSS to render the graph. Do not include the D3 libraries or board_games.csv dataset. |
You will experiment with many aspects of D3 for graph visualization. To help you get started, we have provided the Q2.html file (in the Q2 folder) and an undirected graph dataset of boardgames, board_games.csv file (in the Q2 folder). The dataset for this question was inspired by a Reddit post about visualizing boardgames as a network, where the author calculates the similarity between board games based on categories and game mechanics where the edge value between each board game (node) is the total weighted similarity index. This dataset has been modified and simplified for this question and does not fully represent actual data found from the post. The provided Q2.html file will display a graph (network) in a web browser.
1. If the value of the edge is equal to 0 (similar), the edge should be gray, thick, and solid (The dashed line with zero gap is not considered as solid).2. If the value of the edge is equal to 1 (not similar), the edge should be green, thin, and dashed.
Note: Regardless of which scale you decide to use, you should avoid extreme node sizes, which will likely lead to low-quality visualization (e.g., nodes that are mere points, barely visible, or of huge sizes with overlaps).
4. [6 points] Pinning nodes:
Node pinning is an effective interaction technique to help users spatially organize nodes during graph exploration. The D3 API for pinning nodes has evolved over time. We recommend reading this post when you work on this sub-question.
IMPORTANT:
5. [1 points] Add GT username: Add your Georgia Tech username (usually includes a mix of letters and numbers, e.g., gburdell3) to the top right corner of the force-directed graph (see example image). The GT username must be a <text> element having the id: "credit"
Figure 2: Example of Visualization with pinned node (yellow). Your chart may appear different and can earn full credit if it meets all the stated requirements.
Q3 [15 points] Line Charts
Goal |
Explore temporal patterns in the BoardGameGeek data using line charts in D3 to compare how the number of ratings grew. Integrate additional data about board game rankings onto these line charts and explore the effect of axis scale choice. |
Technology |
D3 Version 5 (included in the lib folder) Chrome v131.0.0 (or higher): The browser used for grading your code Python HTTP server (for local testing) |
Allowed Libraries |
D3 library is provided to you in the lib folder. You must NOT use any D3 libraries (d3*.js) other than the ones provided. On Gradescope, these libraries are provided for you in the autograder environment. |
Deliverables |
[Gradescope] Q3.(html / js / css): The HTML, JavaScript, CSS to render the line charts. Do not include the D3 libraries or boardgame_ratings.csv dataset. |
VERY IMPORTANT — Beware of “Silent Date Conversion": Opening the csv file in an application like Excel may silently modify date strings without warning you, e.g., converting hyphen-separated date strings (e.g., 2016-11-01) into slash-separated date strings (e.g., 11/01/16). Impacted students would see a “correct” line chart visualization on their local computers, but when they upload their code to Gradescope, test cases will fail (e.g., tick labels are not found, lines are not drawn) because the x-scale cannot be computed (as the dates are parsed as NaN). To view the content of a csv file, we recommend you only use text editors (e.g., sublime text, notepad) that do not silently modify csv files.
2. [5 points] Adding board game rankings. Create a line chart (Figure 3.2) for this part (append to the same HTML page) whose design is a variant of what you have created in part 1. Start with your chart from part 1. Modify the code to visualize how the rankings of ['Catan', 'Codenames', 'Terraforming Mars', 'Gloomhaven'] change over time by adding a circle marker with the ranking text on their corresponding lines. Show the circle marker for every three months and exactly align with the x-axis ticks in part 1. (See Figure 3.2). Add a legend to explain what this circle marker represents next to your chart (See the Figure 3.2 bottom right).
Hint: You may need to carefully set the scale domain to handle the 0s in data.
● First chart (Figure 3.3-1)
○ Chart title: Number of Ratings 2016-2020 (Square root Scale)○ This chart uses the square root scale for its vertical axis (only)○ Other features should be the same as part 2.
● Second chart (Figure 3.3-2)
○ Chart title: Number of Ratings 2016-2020 (Log Scale)○ This chart uses the log scale for its vertical axis (only). Set the y-scale domain minimum to 1.○ Other features should be the same as part 2.
Figure 3.1: Example line chart. Your chart may appear different and can earn full credit if it meets all stated requirements.
Figure 3.2: Example of a line chart with rankings. Your chart may appear different and can earn full credit if it meets all stated requirements.
Figure 3.3-1: Example of a line chart using square root scale. Your chart may appear different and can earn full credit if it meets all stated requirements.
Figure 3.3-2: Example of a line chart using log scale. Your chart may appear different and can earn full credit if it meets all stated requirements.
Q4 [20 points] Interactive Visualization
Goal |
Create line charts in D3 that use interactive elements to display additional data. Then implement a bar chart that appears when you mouse over a point on the line chart. |
Technology |
D3 Version 5 (included in the lib folder) Chrome v131.0.0 (or higher): the browser for grading your code Python http server (for local testing) |
Allowed Libraries |
D3 library is provided to you in the lib folder. You must NOT use any D3 libraries (d3*.js) other than the ones provided. On Gradescope, these libraries are provided for you in the auto-grading environment. |
Deliverables |
[Gradescope] Q4.(html/js/css): The HTML, JavaScript, CSS to render the visualization in Q4. Do not include the D3 libraries or average-rating.csv dataset. |
Use the dataset average-rating.csv provided in the Q4 folder to create an interactive frequency polygon line chart. This dataset contains a list of games, their ratings and supporting information like the numbers of users who rated a game and the year a game was published. In the data sample below, each row under the header represents a game name, year of publication, average rating, and the number of users who rated the game. Helpful resource to work with nested data in D3: https://gist.github.com/phoebebright/3176159
name,year,average_rating,users_rated
1. [3 points] Create a line chart. Summarize the data by displaying the count of board games by rating for each year. Round each rating down to the nearest integer, using Math.floor(). For example, a rating of 7.71148 becomes 7. For each year, sum the count of board games by rating. Display one plot line for each of the 5 years (2015-2019) in the dataset. Note: the dataset comprises year data from 2011 to 2020; this question asks to plot lines for the years 2015-2019. If some of the datapoints in the chart do not have ratings, generate dummy values (0s) to be displayed on the chart for the required years.
All axes must start at 0, and their upper limits must be automatically adjusted based on the data. Do not hard-code the upper limits. Note: if you are losing points on Gradescope for axis or scale, ensure that you are using the proper margin convention without any additional paddings or translations.
• The vertical axis represents the count of board games for a given rating. Use a linear scale.• The horizontal axis represents the ratings. Use a linear scale.
• For each line, use a different color of your choosing. Display a filled circle for each rating-count data point.• Display a legend on the right-hand portion of the chart to show how line colors map to years.• Display the title "Board games by Rating 2015-2019" at the top of the chart.• Add your GT username (usually includes a mix of lowercase letters and numbers, e.g., gburdell3) beneath the title (see example Figure 4.1).
Figure 4.1 shows an example line chart design. Yours may look different but can earn full credit if it meets all stated requirements.
Note: The data provided in average-rating.csv requires some processing for aggregation. All aggregation must only be performed in JavaScript; you must NOT modify average-rating.csv. That is, your code should first read the data from .csv file as is, then you may process the loaded data using JavaScript. If you are getting a MoveTargetOutOfBoundsException, (a) check that your margin convention is correct, and (b) make sure to check the Dropbox linked screenshot of your graph to get a good idea of how the plot could be Improved compared to the sample graph provided.
Figure 4.1: Line chart representing count of board games by rating for each year. Your chart may appear different but can earn full credit if it meets all stated requirements.
Figure 4.2: Bar chart representing the number of users who rated the top 5 board games with the rating 6 in year 2019. Your chart may appear different but can earn full credit if it meets all stated requirements.
Interactivity and sub-chart. In the next few sub-questions, you will create event handlers to detect mouseover and mouseout events over each circle that you added in Q4.2.
3. [8 points] Create a horizontal bar chart, so that when hovering over a circle, that bar chart will be shown below the line chart. The bar chart displays the top 5 board games that received the highest numbers of user ratings (users_rated), for the hovered year and rating. For example, hovering over the rating-6 circle for 2019 will display the bar chart for the number of users who rated the top 5 board games. If a certain year/rating combination has fewer than 5 entries, it should display as many as there are. Figure 4.2 shows an example design. Show one bar per game. The bar length represents the number of users who rated the game.
Note: No bar chart should be displayed when the count of games is 0 for hovered year and rating.
• The vertical axis represents the board games. Sort the game names in descending order, such that the game with the highest users_rated is at the top, and the game with the smallest users_rated is at the bottom. Some boardgame names are quite long. For each game name, display its first 10 characters (if a name has fewer than 10 characters, display them all). A space counts as a character. The horizontal axis represents the number of users who rated the game (for the hovered year and rating). Use a linear scale.• Set horizontal axis label to 'Number of users' and vertical axis label to 'Games'.
4. [2 points] Bar styling, grid lines and title
• Bars: All bars should have the same color regardless of year or rating. All bars for the specific yearshould have a uniform bar thickness.• Grid lines should be displayed.• Title: Display a title with the format "Top 5 Most Rated Games of <Year> with Rating <Rating>" at thetop of the chart where <Year> and <Rating> are what the user hovers over in the line chart. Forexample, hovering over rating 6 in 2015, the title would read: "Top 5 Most Rated Games of 2015 withRating 6"
• The bar chart and its title should only be displayed during mouseover events for a circle in the line chart.• The circle in the line chart should change to a larger size during mouseover to emphasize that it is the selected point.• When count of games is 0 for hovered year and rating, no bar chart should be displayed. The hoveredover circle on the line graph should still change to a larger size to show it is selected.
Hint: .attr() is generally used for describing the size, shape, location, etc. of an element, whereas .style() is used for other design aspects like color, opacity, etc.
6. [2 points] Mouseout Event Handling
The bar chart and its title should be hidden from view on mouseout and the circle previously mouseover-ed should return to its original size in the line chart.
The graph should exhibit interactivity similar to Figure 4.6 where the mouse is over the larger circle.
Figure 4.6: Line chart and bar chart demonstrating interactivity. Your chart may appear different, but you can earn full credit if it meets all stated requirements.
Q5 [25 points] Choropleth Map of Wildlife Trafficking Incidents
Goal |
Create a choropleth map in D3 to explore the number of wildlife trafficking incidents per country by year. |
Technology |
D3 Version 5 (included in the lib folder) Chrome v131.0.0 (or higher): the browser for grading your code Python http server (for local testing) |
Allowed Libraries |
D3 library is provided to you in the lib folder. You must NOT use any D3 libraries (d3*.js) other than the ones provided. On Gradescope, these libraries are provided for you in the auto-grading environment. |
Deliverables |
[Gradescope] Q5.(html/js/css): Modified file(s) containing all html, javascript, and any css code required to produce the plot. Do not include the D3 libraries or csv files. |
We have provided two files in the Q5 folder, wildlife_trafficking.csv and world_countries.json.
- Each row in wildlife_trafficking.csv represents the number of wildlife trafficking incidents per country in a given year, in the form of <Year,Country,Number of Incidents,Average Fine,Average Imprisonment>, where
- Year: the year in which the wildlife trafficking incidents occurred
- Country: a country in the world, e.g., United States of America.
- Number of Incidents: the number of wildlife trafficking incidents that occurred in Country in Year.
- Average Fine: the average fine in USD for wildlife traffickers caught for incidents occurring in Country in Year.
- Average Imprisonment: the average imprisonment term in years for wildlife traffickers caught for incidents occurring in Country in Year.
- The world_countries.json file is a geoJSON, containing a single geometry collection: countries. You can find examples of map generation using geoJSON here.
• The list options should be obtained from the Year column of the csv file.• Sort the list options in increasing order. Set the default display value to the first option.• Selecting a different year from the dropdown list should update both the choropleth map (see part 2) and the legend (see part 3) accordingly.
Hint: If failing to load map, you may try this method.
b. [10 points] Load the data from wildlife_trafficking.csv and create a choropleth map such that the color of each country in the map corresponds to the number of wildlife trafficking incidents for the year selected in the dropdown in each country. Use a "Natural Earth" projection of the geoJSON file. You MUST name the function for calculating path as 'path', to help the auto-grader locate it. Create a projection first and use it as an input for the path.
Promise.all() is provided within the skeleton code and you can use it to read in both the world json file and game data csv file. Usage example: Making a Map in D3.js v.5.
Many countries have no incidents for some years — these should be colored gray. For countries that do have incidents in the selected year, use a quantile scale to generate the color scheme based on the average rating by country. Color them along a gradient of exactly 4 gradations from a single hue, with darker colors corresponding to higher rating values and lighter colors corresponding to lower values (see gradient examples at Color Brewer).
About Scaling Colormaps: In order to create effective visualizations that highlight patterns of interest, it is important to carefully think about the relationship between the range and distribution of values being displayed (the domain) and the color scale the values are mapped to (the range). Many types of mapping functions are possible, e.g., we could use a linear mapping where the lowest incident count is mapped to the first value in the color scheme, the highest incident count is mapped to the highest value in the color scheme, and intermediate incident counts are mapped to hues in the middle. This article illustrates the value of choosing appropriate endpoints for linear color maps, or log-scaling the domain so that large but relatively infrequent values do not cause differences between smallervalues to be washed out. In our case, we can compute quantiles of the domain data into roughlyequally sized groups. Here, we will get 4 groups, a special case of quantiles called "quartiles" sincethe data are divided into quarters.
Hint: You can verify the correctness of the quartiles generated by using the 'quartile' function in Excel. Open wildlife_trafficking.csv and filter the data for one year (say 2021). Then use the quartile function to get the 0th, 1st, 2nd, 3rd and 4th quartile values from the Incident Count column. Here [0th quartile, 1stquartile), [1st quartile, 2nd quartile), [2nd quartile, 3rd quartile), [3rd quartile, 4th quartile] will represent the 4 groups of values generated by the d3 quantile scale. Use all the countries listed inwildlife_trafficking.csv to generate your quartiles depending on the selection, including ones whichmay not appear in the geoJSON.
2. [5 points] Add a tooltip using the d3-tip.min library (in the lib folder). On hovering over a country, the tooltip should show the following information on separate lines:
• Country name• Year• Number of Incidents in that year in that country• Average Fine in USD for wildlife traffickers for incidents in that year in that country• Average Imprisonment in years for wildlife traffickers for incidents in that year in that country
For countries with no data, the tooltip should display "N/A" for Number of Incidents, Average Fine, and Average Imprisonment.
Note: Please ensure that you only have a single tooltip element defined in your code. You should notcreate new tooltip elements for different countries; rather, update the contents, position, and visibility of asingle tooltip.
Figure 5.1: Reference example for Choropleth Map showing number of incidents in 2021. Your chart may appear different, but you will earn full credit as long as it meets all stated requirements.
Figure 5.2: Reference example for Choropleth Map showing tooltip for USA. Your chart may appear different, but you will earn full credit as long as it meets all stated requirements.
• Countries without data should be colored gray. These countries can be found using a condition thatcompares the country's average rating with 'undefined'.• It is optional for your visualization to show (or not show) Antarctica.• D3-tip warning may be ignored if it does not break the code.• You may consider clearing the SVG and creating a new map when selecting a new game.