ITS70504 Data Engineering Assignment 1

ITS70504

Data Engineering

Assignment 1 - Individual

Subject Code: ITS70504

Subject Name: Data Engineering

Weightage: 15%

Assignment: Individual Assignment

Handout Date: 18th June 2024

Submission Date: 1st July 2024

Learning outcomes assessed by this assignment

1. Demonstrate the concept of data engineering.

Instructions

1. This individual assignment carries 15% of the total marks available for the module.

2. The output of this assignment is a written report of between 1,800 and 2,500 words.

3. You are required to submit your report in softcopy (through MyTIMeS-Moodle). Kindly ensure your name and ID are written on the cover sheet.

4. Using AI tools is not recommended. However, if any AI tools are used, the following instructions must be followed:

a. The AI tool must be cited properly.

b. The output of the AI tool must be interpreted.

c. At least 3 improvements to the AI-suggested answers must be discussed.

d. The student must present the report in class, defend the answers, and provide justifications where required.

5. A mark of zero and a bar from sitting the final examination may be imposed on students who do not submit any assignments.

GENTLE REMINDER:

- Plagiarism is a serious offence and plagiarized work will result in an F grade.

- Failure to submit the report by the deadline shall be penalized with zero marks.

Assessment Criteria

Assessment Task: Assignment 1

Weightage: 15%

MLO Assessed: MLO 1

Formative/Summative: Formative

Assessment Instrument: Individual Assignment. Case Study

Topics: 1, 2, 3

Week: 3

MQF2.0: C1, C2, C3A

C1 = Knowledge & Understanding, C2 = Cognitive Skills, C3A = Practical Skills, C3B = Interpersonal Skills, C3C = Communication Skills, C3D = Digital Skills, C3E = Numeracy Skills, C3F = Leadership, Autonomy & Responsibility, C4A = Personal Skills, C4B = Entrepreneurial Skills, C5 = Ethics & Professionalism

Case Study: Data Engineering at a Retail Company

A large retail company, ShopEase, operates both online and offline stores, serving millions of customers globally. The company has accumulated vast amounts of data from various sources, including customer transactions, online browsing behaviours, inventory management systems, and social media interactions. The data is stored in different formats and systems, creating challenges for analysis and decision-making. ShopEase aims to improve its data infrastructure to enable better data-driven decisions, enhance customer experience, and optimise operations.

Answer the following questions:

Criteria 1: Introduction to Data Engineering (20%)

1)   Discuss the concept of data engineering and explain its importance in the context of ShopEase.

a.   Discuss what data engineering is in a general context.      (10 Marks)

b.   Explain how data engineering can help ShopEase manage and utilize its vast amount of data effectively.      (10 Marks)

Criteria 2: Data Preprocessing: Concepts & Techniques (20%)

2)   Identify and describe FOUR (4) data preprocessing techniques that would be critical for preparing ShopEase's data for analysis.

a.   Explain each technique and its relevance to the case study.                         (10 Marks)

b.   Provide examples of how these techniques can be applied to ShopEase's data.    (10 Marks)

Criteria 3: Data Storage Technologies (20%)

3)   Evaluate different data storage technologies suitable for ShopEase’s diverse data types (structured, semi-structured, unstructured).

a.   Compare at least TWO (2) data storage technologies.                                  (10 Marks)

b.   Discuss the advantages and disadvantages of each in the context of ShopEase’s needs.      (10 Marks)

Criteria 4: Big Data Frameworks for Data Engineering (20%)

4)   Recommend a big data framework that ShopEase should adopt for its data processing needs. Justify your choice.

a.   Describe the selected big data framework.                                                   (10 Marks)

b.   Explain why it is suitable for handling ShopEase’s large-scale data processing.    (10 Marks)

Criteria 5: Distributed File System (20%)

5)   Explain the role of a distributed file system in ShopEase’s data architecture.

a.   Define a distributed file system.                                                                    (10 Marks)

b.   Discuss its benefits and potential challenges for ShopEase.                         (10 Marks)

Note 1: Using tables to summarise your points and adding figures to provide some illustrations are highly recommended.

Note 2: If you use figures or tables from external sources, proper citations must be added to their captions and to your description of those figures or tables.

Note 3: The total marks achieved will be capped at 15 marks.

Deliverables

The output should consist of:

1.   Assignment Report (Softcopy in PDF)

2.   Cover page

3.   References (APA Referencing Style: www.apa.org, http://www.apastyle.org/index.aspx, or https://owl.english.purdue.edu/owl/resource/560/01/).

4.   Similarity report (maximum accepted similarity is 20%).
