CS 439: Introduction to Data Science – Project Description
Recommender System
1. Overview
In this project, you will build and evaluate a recommender system. We provide two kinds of datasets: product review datasets covering multiple product domains from Amazon (e.g., cell phones, clothing, beauty, etc.), where each product domain corresponds to one dataset whose structure is explained in the next section; and a movie recommendation dataset, available as a “small” dataset (1M) and a “full” dataset (265M). You only need to work on one dataset, whichever you like.
After you select a particular dataset to work on, this project mainly consists of three steps: 1) data processing: create the training and testing datasets from the original dataset; 2) rating prediction, evaluated based on MAE and RMSE; 3) top-N recommendation, evaluated based on Precision, Recall, F-measure, and NDCG.
2. Dataset
A complete Amazon dataset is publicly available online (https://nijianmo.github.io/amazon/index.html). This dataset contains user reviews (numerical ratings and textual comments) of Amazon products across 29 product categories, with an independent dataset for each product category. We use the “Small subsets for experiment” (the 5-core datasets) on the website, which can be downloaded directly. You can select one product domain to work on.
The structure of the dataset has been explained on the website with detailed examples. Basically, each entry in a dataset is a user-item interaction record, including the following fields (a loading sketch follows the list):
- user-id: which is denoted as “reviewerID” in the dataset
- product-id: which is denoted as “asin” in the dataset
- rating: an integer star rating from 1 to 5 that the user gave the product, denoted as “overall” in the dataset
- review: the review text that the user wrote about the product, denoted as “reviewText” in the dataset
- title: the title of the review, which is denoted as “summary” in the dataset
- timestamp: time that the user made the rating and review
- helpfulness: contains two numbers, i.e., [#users that think this review is not helpful, #users that think this review is helpful]
- Image: for each product, the dataset provides an image representation in the form of a 4096-dimensional vector learned by a deep convolutional neural network (these vectors are provided in an independent dataset, “Visual Features”, also on the website, which is very large)
- Metadata: metadata for each product, including product title, price, image URL, brand, category, etc. It is also provided as an independent dataset (“Metadata”), which is also very large.
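For instance, here is a minimal Python sketch for loading one of the 5-core review files into a pandas DataFrame, keeping only the fields needed for rating-based algorithms. It assumes the file is gzip-compressed with one JSON object per line, as the files on the website are distributed; the file name in the usage comment is only an example.

import gzip
import json
import pandas as pd

def load_reviews(path):
    """Load a gzipped 5-core review file (one JSON object per line)."""
    records = []
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            r = json.loads(line)
            records.append({
                "user_id": r["reviewerID"],
                "item_id": r["asin"],
                "rating": float(r["overall"]),
            })
    return pd.DataFrame(records)

# Example usage (the file name is illustrative):
# df = load_reviews("Cell_Phones_and_Accessories_5.json.gz")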
The movie recommendation dataset (MovieLens) can be downloaded from here:
https://grouplens.org/datasets/movielens/latest/
It includes the user_ID, item_ID, user-item ratings, timestamps, and user-movie tags, as well as the metadata of the movies, such as title and genre. Details can be found in this README file: http://files.grouplens.org/datasets/movielens/ml-latest-README.html
Depending on the algorithm that you want to design, you may not use all the information available in the dataset. In the simplest case, you may only use the user-id, item-id, and ratings to finish the project, and this information alone is sufficient to design very sophisticated algorithms (most collaborative filtering and matrix factorization algorithms are based only on ratings).
If you want to design more advanced recommendation algorithms that may achieve better prediction and recommendation performance, you may use other information sources such as review text (based on NLP techniques), timestamps (for time-aware recommendation), images (for visual recommendation), or metadata (for content-based recommendation).
3. Required Tasks
First, you need to select a dataset. If you are not very familiar with large-scale data processing, or your computing facility (e.g., your laptop) is not powerful enough to process a very big dataset, you may select a relatively smaller dataset to work on.
After you select a dataset, you need to create a training dataset and a testing dataset from it for the experiment. A recommended standard pre-processing strategy is: for each user, randomly select 80% of his/her ratings as training ratings, and use the remaining 20% as testing ratings. The training ratings from all users then constitute the final training dataset, and the testing ratings from all users constitute the final testing dataset. A minimal sketch of this split is shown below.
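As an illustration, this sketch performs the per-user 80/20 split, assuming the ratings sit in a pandas DataFrame df with columns user_id, item_id, and rating (as produced by the loading sketch above):

import numpy as np
import pandas as pd

def split_per_user(df, train_frac=0.8, seed=42):
    """For each user, randomly place ~80% of their ratings in the
    training set and the remaining ~20% in the testing set."""
    rng = np.random.default_rng(seed)
    train_parts, test_parts = [], []
    for _, group in df.groupby("user_id"):
        idx = rng.permutation(len(group))
        cut = int(round(train_frac * len(group)))
        train_parts.append(group.iloc[idx[:cut]])
        test_parts.append(group.iloc[idx[cut:]])
    return pd.concat(train_parts), pd.concat(test_parts)

# train_df, test_df = split_per_user(df)

Because the 5-core subsets guarantee at least 5 ratings per user, every user contributes at least one testing rating under this split.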
Based on the training dataset, i.e., the information that we treat as already known, you should develop a model/algorithm to conduct rating prediction, i.e., to predict the ratings in the testing set as if we did not know them. You may use any existing popular algorithm (e.g., user-based CF, item-based CF, Slope One, matrix factorization) or develop a new algorithm yourself (a sketch of one baseline follows).
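For concreteness, below is a sketch of one classic baseline: matrix factorization with user and item biases, trained by stochastic gradient descent. This is just one possible choice; the class name and hyperparameter values are illustrative, not prescribed by the project.

import numpy as np

class BiasedMF:
    """Matrix factorization with user/item biases, trained by SGD.
    A sketch with illustrative (untuned) hyperparameters."""

    def __init__(self, n_factors=32, lr=0.01, reg=0.02, n_epochs=20, seed=0):
        self.n_factors = n_factors
        self.lr = lr
        self.reg = reg
        self.n_epochs = n_epochs
        self.rng = np.random.default_rng(seed)

    def fit(self, train_df):
        # Map raw user/item IDs to dense matrix indices.
        self.uidx = {u: i for i, u in enumerate(train_df["user_id"].unique())}
        self.iidx = {v: j for j, v in enumerate(train_df["item_id"].unique())}
        u = train_df["user_id"].map(self.uidx).to_numpy()
        v = train_df["item_id"].map(self.iidx).to_numpy()
        r = train_df["rating"].to_numpy(dtype=float)

        self.mu = r.mean()                  # global mean rating
        self.bu = np.zeros(len(self.uidx))  # user biases
        self.bi = np.zeros(len(self.iidx))  # item biases
        self.P = 0.1 * self.rng.standard_normal((len(self.uidx), self.n_factors))
        self.Q = 0.1 * self.rng.standard_normal((len(self.iidx), self.n_factors))

        for _ in range(self.n_epochs):
            # One SGD pass over the training ratings in random order.
            for k in self.rng.permutation(len(r)):
                a, b = u[k], v[k]
                pred = self.mu + self.bu[a] + self.bi[b] + self.P[a] @ self.Q[b]
                e = r[k] - pred
                self.bu[a] += self.lr * (e - self.reg * self.bu[a])
                self.bi[b] += self.lr * (e - self.reg * self.bi[b])
                pa = self.P[a].copy()
                self.P[a] += self.lr * (e * self.Q[b] - self.reg * self.P[a])
                self.Q[b] += self.lr * (e * pa - self.reg * self.Q[b])
        return self

    def predict(self, user_id, item_id):
        # Fall back to the global mean for unseen users/items.
        a = self.uidx.get(user_id)
        b = self.iidx.get(item_id)
        pred = self.mu
        if a is not None:
            pred += self.bu[a]
        if b is not None:
            pred += self.bi[b]
        if a is not None and b is not None:
            pred += self.P[a] @ self.Q[b]
        return float(np.clip(pred, 1.0, 5.0))

# model = BiasedMF().fit(train_df)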
After predicting the ratings in the testing set, evaluate your predictions by calculating the MAE (mean absolute error) and the RMSE (root mean squared error).
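Both measures compare the predicted rating with the true rating over the testing records: MAE is the average absolute error, and RMSE is the square root of the average squared error. A small sketch, assuming a model exposing a predict(user_id, item_id) method like the baseline above:

import numpy as np

def mae_rmse(model, test_df):
    """Compute MAE and RMSE of predicted vs. true ratings on the test set."""
    errors = np.array([model.predict(u, v) - r
                       for u, v, r in test_df[["user_id", "item_id", "rating"]]
                       .itertuples(index=False)])
    return np.abs(errors).mean(), np.sqrt((errors ** 2).mean())

# mae, rmse = mae_rmse(model, test_df)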
The final step is to create a recommendation list (a ranked list of recommended items) of length 10 for each user. Note that the recommended items should be items that the user did not purchase before, i.e., you should avoid recommending any item that the user has already rated in the training dataset; instead, your algorithm should try its best to recommend the items in the testing set.
A simple strategy to create such a recommendation list for a user is to predict the ratings on all items that the user did not buy before (as in step 2), rank the items in descending order of predicted rating, and take the top 10 items as the recommendation list (a sketch follows below). Of course, you may develop other recommendation algorithms to create the list.
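A sketch of this simple strategy, again assuming the predict(user_id, item_id) interface from above. Note that the brute-force loop over the whole catalog is slow for large domains; in practice you would score all candidate items for a user in one vectorized operation.

def top_n_lists(model, train_df, n=10):
    """For each user, rank all items the user did not rate in training
    by predicted rating and keep the top n."""
    all_items = train_df["item_id"].unique()
    seen = train_df.groupby("user_id")["item_id"].apply(set)
    recs = {}
    for user, rated in seen.items():
        candidates = [v for v in all_items if v not in rated]
        candidates.sort(key=lambda v: model.predict(user, v), reverse=True)
        recs[user] = candidates[:n]
    return recs

# recs = top_n_lists(model, train_df)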
After the recommendation list is created (we call it a top-10 recommendation list), you should evaluate the quality of the recommendation list. Remember that you have already held out 20% of each user’s purchased items as testing items, so you can calculate the following measures for evaluation (a sketch of all four measures follows the list):
- Precision: the fraction of the 10 recommended items that are testing items; calculate the precision for each user first, then average the numbers over all users
- Recall: the fraction of a user’s testing items that appear among the recommended items; likewise, average the recall over all users to get the final recall
- F-measure: F=2*Precision*Recall / (Precision + Recall)
- NDCG: Normalized Discounted Cumulative Gain.
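The following sketch computes all four measures, assuming recs maps each user to a top-10 list (as produced above) and binary relevance for NDCG, i.e., a recommended item counts as relevant iff it appears in the user’s testing set:

import numpy as np

def evaluate_top_n(recs, test_df, n=10):
    """Average Precision, Recall, F-measure, and NDCG@n over all users
    that have at least one testing item."""
    test_items = test_df.groupby("user_id")["item_id"].apply(set)
    precisions, recalls, ndcgs = [], [], []
    for user, relevant in test_items.items():
        rec_list = recs.get(user, [])[:n]
        hits = [1.0 if v in relevant else 0.0 for v in rec_list]
        precisions.append(sum(hits) / n)
        recalls.append(sum(hits) / len(relevant))
        # Binary-relevance DCG with a log2 rank discount; the ideal DCG
        # places all relevant items (up to n of them) at the top.
        dcg = sum(h / np.log2(i + 2) for i, h in enumerate(hits))
        idcg = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), n)))
        ndcgs.append(dcg / idcg)
    precision, recall = float(np.mean(precisions)), float(np.mean(recalls))
    f_measure = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f_measure, float(np.mean(ndcgs))

# precision, recall, f_measure, ndcg = evaluate_top_n(recs, test_df)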
4. Optional Tasks
You may explore the trustworthiness of recommender systems from one or more of the following perspectives:
a. Transparency and Explainability of Recommender Systems
b. Fairness and Unbiasedness of Recommender Systems
c. Controllability of Recommender Systems
d. Privacy Protection for Recommender Systems
e. Robustness and Anti-attacks for Recommender Systems
This is a totally open-ended task: feel free to come up with your own problem, solution, and evaluation methods for any one or more of the above trustworthiness perspectives. Depending on the quality of implementation, you can earn up to 5 bonus points for the optional tasks.
Project Submission
2. A project report written using the provided LaTeX template
Note: if you have done optional tasks, please clearly state that by writing a specific section dedicated to your optional tasks in the project report.
3. The presentation slides of your project
All of the above documents should be put into one single folder and compressed into one .zip file; name the zip file using the NetIDs of all team members, e.g., NetID1_NetID2_NetID3.zip.
References
2. Many research papers use this dataset; some examples are listed below. You may refer to these papers if you want to try something cool and develop a recommendation algorithm yourself.
b. Towards conversational search and recommendation: System ask, user respond (https://dl.acm.org/citation.cfm?id=3271776)
c. Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) (https://arxiv.org/abs/2203.13366)
d. Neural Collaborative Reasoning (https://arxiv.org/abs/2005.08129)
e. Causal Collaborative Filtering (https://arxiv.org/abs/2102.01868)
f. User-oriented Fairness in Recommendation (https://arxiv.org/abs/2104.10671)
g. Personalized Transformer for Explainable Recommendation (https://aclanthology.org/2021.acl-long.383/)