INFS5705 – AI for Business Analytics in Practice
Individual Hands-on Assignment – Using Python or R
In this hands-on assignment you are required to conduct AI-driven business analytics of structured and unstructured data using Python or R and submit a report on the Moodle course site through Turnitin. This assignment is due on Monday 14th of July 2025 at 5:00pm (AEST).
Please note that this assignment is worth 30% of your overall course mark.
1. Requirements
Amazon, a global leader in e-commerce, operates a robust smartphone marketplace offering a wide range of new and refurbished smartphones from top brands and suppliers. With a focus on competitive pricing, fast shipping options, and customer-centric services, Amazon aims to provide seamless shopping experiences worldwide. However, low sales performance for some smartphones, particularly refurbished models, poses challenges, leading to financial losses, operational inefficiencies, and environmental concerns due to e-waste from unsold refurbished smartphones. Furthermore, manual handling (sorting based on visible damage) of used smartphones, usually received in bulk from a variety of sources, is labour-intensive and error-prone, impacting profitability and sustainability goals.
To address these issues, Amazon aims to leverage AI-driven analytics to predict best-selling products and efficiently classify used smartphones, thereby reducing manual handling costs, improving sustainability, and enhancing customer trust.
In a hypothetical scenario, you are working as a Business Analyst at Amazon, and you have been tasked by your manager to conduct AI-driven business analytics (using Python or R). To address the business problem faced by this company, you have decided to:
• Apply Machine Learning to analyse structured data using TWO (2) Machine Learning Algorithms to predict whether products would be best-sellers or not.
• Apply Machine Learning to analyse unstructured data using ONE (1) Machine Learning Algorithm to identify damage in used smartphones from images.
Please note that the task can include data pre-processing before conducting the analysis. The dataset to be analysed includes both structured and unstructured data.
The structured data consists of a large dataset of listings available on Moodle as a CSV file called amazon_phones_sales_dataset.csv under the Assessments Hub section.
The structured dataset includes the following variables:
• ASIN: Amazon Standard Identification Number, a unique product identifier.
• Color: The available color of the phone.
• Ratings: User ratings (out of 5 stars).
• Number of Ratings: Total count of ratings the product has received.
• Price: The price in USD of the phone.
• Discount: The percentage discount offered on the product.
• Brand: The brand name of the phone (e.g., Apple, Samsung).
• OS: The operating system of the phone (e.g., Android, iOS).
• CPU Model: The model of the processor used in the phone.
• Resolution: The screen resolution of the phone.
• Name: The product name as listed on Amazon.
• Wireless Carrier: The supported wireless carrier (e.g., Verizon, AT&T).
• Cellular Technology: The cellular network technology (e.g., 4G, 5G).
• Model: The model number of the phone.
• Amazon Renewed: Indicates whether the product is part of the Amazon Renewed program (refurbished).
• Battery Capacity: The capacity of the phone’s battery (in mAh).
• Battery Power: The power rating of the battery.
• Charging Time: Time taken to charge the phone fully.
• RAM: The amount of RAM in the phone.
• Storage: Internal storage capacity of the phone.
• Screen Size: Size of the display (in inches).
• Connectivity Technologies: Wireless technologies supported by the phone (e.g., Bluetooth, Wi-Fi).
• Wireless Network: Type of wireless networks supported (e.g., Wi-Fi 6).
• CPU Speed: The speed of the phone’s CPU (in GHz).
• Best Seller Last Month: Indicates whether the product has been among the best-selling phones over the past month.
The unstructured data consists of a collection of 820 images collected by an industrial camera scanning each used smartphone to detect scratched screens as defective. Non-defective smartphones are then considered for refurbishment. This dataset is provided on Moodle as a ZIP file called used_smartphone_images.zip under the Assessments Hub section. Please note that the data is not labelled (not sorted).
1.1. Deliverable
In this assignment you are required to submit a Business Report (in Word format) including the discussion of the findings of your analysis along with justifications and actionable insights.
Your Business Report (3000-word limit) must include the following components:
1. Analysis of Structured Data: predicting best-selling items
a) Rationale of the choice of ML algorithms (10 marks, 300 words)
Provide justifications for your choice of TWO (2) relevant Machine Learning algorithms to achieve the business objective of Amazon to predict whether products would be best-sellers or not.
b) Data pre-processing and selection of the key features (10 marks, 300 words)
Discuss the data pre-processing tasks you decide to undertake, along with justifications. For example, how do you handle missing values, outliers, duplicates, data transformations, etc.?
Explain the selection of the most relevant features (key variables) in the dataset required for the ML models, along with justifications.
Discuss the approach you undertake to create the training/testing datasets.
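As an illustration only, the pre-processing and splitting steps named above could be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed method: the sample rows, the Yes/No encoding of the target, and the choices of median imputation and a 25% stratified test split are all invented for the example. It uses only the Python standard library so that it runs without the real CSV file.

```python
import csv
import io
import random
import statistics

# Made-up rows standing in for amazon_phones_sales_dataset.csv.
# Column names follow the variable list; the values and the Yes/No
# target encoding are assumptions for illustration.
SAMPLE_CSV = """ASIN,Price,Ratings,Best Seller Last Month
A1,199.0,4.5,Yes
A2,,4.1,No
A2,,4.1,No
A3,349.0,3.9,No
A4,129.0,4.7,Yes
A5,599.0,4.0,No
A6,249.0,4.4,Yes
A7,179.0,3.5,No
A8,,4.6,Yes
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries (one per listing)."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_duplicates(rows, key="ASIN"):
    """Keep the first row seen for each value of the key column."""
    seen, unique = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def stratified_split(rows, target, test_fraction=0.25, seed=42):
    """Split rows so each class keeps roughly the same share in both sets."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[target], []).append(row)
    train, test = [], []
    for group in by_class.values():
        group = group[:]
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def fit_median(rows, column):
    """Median of the non-missing values in a column."""
    return statistics.median(float(r[column]) for r in rows if r[column] != "")


def fill_missing(rows, column, value):
    """Convert a column to float, replacing empty cells with value."""
    for r in rows:
        r[column] = float(r[column]) if r[column] != "" else value


rows = drop_duplicates(load_rows(SAMPLE_CSV))
train, test = stratified_split(rows, "Best Seller Last Month")

# The median is learned from the training rows only, then applied to both
# sets, so no information from the test rows leaks into pre-processing.
train_median_price = fit_median(train, "Price")
fill_missing(train, "Price", train_median_price)
fill_missing(test, "Price", train_median_price)
```

With pandas and scikit-learn, the same steps are usually written with `DataFrame.drop_duplicates` and `train_test_split(..., stratify=y)`; whichever tools are used, the report should justify each choice rather than simply list it.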
c) Training the TWO (2) ML models (10 marks, 300 words)
Provide an explanation of how each machine learning model is trained. This includes discussing the steps involved, and interpretation of the outputs of the models.
Please include the Python or R code.
d) Testing and evaluating the TWO (2) ML models (15 marks, 350 words)
Provide an explanation of how each machine learning model is tested. Please include the Python or R code.
Evaluate and compare the performance of the TWO (2) ML models in making predictions, along with an explanation of the approach implemented to evaluate their performance. Provide justifications of the selection of the best performing ML model.
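One possible way to compare two classifiers on the same test set is sketched below. The labels and predictions are made up, the names `model_a` and `model_b` are hypothetical, and accuracy, precision, recall and F1 are only one reasonable set of measures, not a required one.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented test labels (1 = best-seller) and predictions from two models.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
model_a = [1, 0, 0, 0, 0, 0, 1, 0, 1, 0]
model_b = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]

metrics_a = classification_metrics(y_true, model_a)
metrics_b = classification_metrics(y_true, model_b)

for name, metrics in (("Model A", metrics_a), ("Model B", metrics_b)):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

In this invented example both models reach the same accuracy, yet Model B misses two of the three best-sellers. When best-sellers are the minority class, recall and F1 can therefore separate models that accuracy alone cannot.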
e) Discuss the findings of your analysis and derive actionable insights (15 marks, 450 words)
Discuss the key findings of your analysis along with plausible explanations.
Derive actionable insights from the key findings and translate them into recommendations on how to address the business problem.
2. Analysis of Unstructured Data: detecting e-waste
a) Rationale of the choice of ML algorithm (10 marks, 300 words)
Provide justifications of your choice of ONE (1) relevant Machine Learning algorithm to achieve the business objective of Amazon to identify damage in used smartphones from images.
b) Data transformation (feature extraction) and selection of model parameters (10 marks, 300 words)
Discuss the process of feature extraction and the selection of model parameters (e.g. optimal number of clusters). Discuss the findings.
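To make "optimal number of clusters" concrete, the sketch below runs a small k-means on made-up two-dimensional feature vectors and records the within-cluster sum of squares (inertia) for several values of k, which is the quantity plotted in an elbow chart. Everything here is an assumption for illustration: k-means is only one candidate algorithm, the points stand in for features that would first have to be extracted from the images, and the farthest-first initialisation is chosen simply to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic start: first point, then repeatedly the farthest point."""
    centroids = [points[0]]
    while len(centroids) < k:
        candidate = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(candidate)
    return centroids


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, inertia)."""
    centroids = farthest_first_init(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    inertia = sum(min(squared_distance(p, c) for c in centroids)
                  for p in points)
    return centroids, inertia


# Hypothetical 2-D features for eight images forming two loose groups.
features = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2),
]

inertia_by_k = {k: kmeans(features, k)[1] for k in range(1, 5)}
for k, inertia in inertia_by_k.items():
    print(f"k={k}: inertia={inertia:.4f}")
```

On these invented points the inertia falls sharply from k=1 to k=2 and barely changes afterwards, which is the "elbow" pattern suggesting two clusters. Real image features are rarely this clean, so the choice of k should be discussed alongside the business meaning of the clusters (defective versus non-defective).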
c) Implementation of the ML model (10 marks, 300 words)
Explain the implementation of the Machine Learning model.
Please include the Python or R code.
d) Discuss the key findings of your analysis and derive actionable insights (10 marks, 300 words)
Discuss the findings of your analysis along with plausible explanations.
Derive actionable insights from the key findings and translate them into recommendations on how to address the business problem.
NB. Throughout the business report, your arguments should be justified and supported with plausible explanations for the key findings and relevant examples.
1.2. Formatting
Word Limit
Each section of the analytics report has a word limit, as indicated in 1.1. Deliverable. The distribution of word count proportionally reflects the complexity and significance of each section, totalling a maximum word length of 3,000 words. There is a (+10%) leeway in word limits for each section.
Please note that tables, figures, diagrams, code and references are excluded from the word count. You should be mindful of the marks awarded to each section, as indicated in 1.1. Deliverable, when allocating the number of words spent on each section.
Please note that material presented in excess of the word limit for each section will not be considered when grading the assignment.
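As a worked example of the per-section leeway, the short sketch below computes the highest word count each section could reach. It assumes the +10% is applied to each section's stated limit and that a count exactly at the ceiling is still acceptable; the section keys (`"1a"`, `"1b"`, ...) and function names are hypothetical labels chosen for the example.

```python
# Per-section word limits as listed in 1.1. Deliverable.
SECTION_LIMITS = {
    "1a": 300, "1b": 300, "1c": 300, "1d": 350, "1e": 450,
    "2a": 300, "2b": 300, "2c": 300, "2d": 300,
}
LEEWAY_PERCENT = 10


def max_words(limit, leeway_percent=LEEWAY_PERCENT):
    """Highest allowed count for a section; integer maths avoids rounding."""
    return limit * (100 + leeway_percent) // 100


def over_limit(section, word_count):
    """True if a section's word count exceeds its limit plus leeway."""
    return word_count > max_words(SECTION_LIMITS[section])


for section, limit in SECTION_LIMITS.items():
    print(f"Section {section}: limit {limit}, ceiling {max_words(limit)}")
```

Under these assumptions a 300-word section could run to 330 words, a 350-word section to 385, and the 450-word section to 495.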
Formatting
The analytics report should be in ‘business report’ style (in Word format) with the following requirements:
• Arial 12-point font
• 1.5 spacing
• Page numbers on each page
• Individual Assignment 1 Cover page included (provided on Moodle)
• All required sections included, as indicated in 1.1. Deliverable
Feel free to make whatever use of tables, figures, and diagrams that you believe appropriate and relevant to support your work. Tables, figures, diagrams, code and references do not count towards the word limit.
1.3. Submission
Upload your Business Report document (in Word format) on Moodle.
• You can only upload one report document.