COMP9313 Big Data Management

Course Details & Outcomes

Course Description

This course introduces the core concepts and technologies involved in managing Big Data. It first introduces the characteristics of big data and big data analysis. We then study Hadoop, the open-source big data management framework, focusing mainly on MapReduce programming; YARN, HDFS, HBase, and Hive are introduced briefly. We also study Spark, an open-source, in-memory distributed computing framework. Another major focus of the course is algorithm design for large-scale data sets on top of big data management frameworks, in domains such as data stream mining, graph data processing, and finding similar items.
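As a rough illustration of the MapReduce programming model mentioned above, the following is a minimal single-machine sketch in plain Python. It does not use Hadoop or Spark, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for this example, not part of any framework API or of the course materials.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data management"]))
```

In a real framework the map and reduce functions would run in parallel on many machines and the shuffle would move data across the network; here everything runs in one process purely to show the data flow.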

Course Aims

This course aims to introduce students to the concepts behind Big Data, the core technologies used in managing large-scale data sets, and a range of technologies for developing solutions to large-scale data analytics problems.

This course is intended for students who want to understand modern large-scale data analytics systems. It covers a wide range of topics and technologies, and prepares students to build such systems and to use them efficiently and effectively to address challenges in big data management.

Course Learning Outcomes

CLO1 : Describe the important characteristics of Big Data
CLO2 : Develop an appropriate storage structure for a Big Data repository
CLO3 : Utilise the map/reduce paradigm and the Spark platform to manipulate Big Data
CLO4 : Use a high-level query language to manipulate Big Data
CLO5 : Develop efficient solutions for analytical problems involving Big Data

Course Learning Outcome	Assessment Items
CLO1 : Describe the important characteristics of Big Data	Final Exam
CLO2 : Develop an appropriate storage structure for a Big Data repository	Coding Project 1, Coding Project 3, Final Exam
CLO3 : Utilise the map/reduce paradigm and the Spark platform to manipulate Big Data	Coding Project 1, Coding Project 2, Coding Project 3, Final Exam
CLO4 : Use a high-level query language to manipulate Big Data	Coding Project 2
CLO5 : Develop efficient solutions for analytical problems involving Big Data	Coding Project 2, Coding Project 3, Final Exam

Learning and Teaching Technologies

Moodle - Learning Management System | Blackboard Collaborate

Assessments

Assessment Structure

Assessment Item	Assessment Format	Weight	Relevant Dates
Coding Project 1	Individual	12%	Start Date: Not Applicable; Due Date: Week 4: 17 June - 23 June
Coding Project 2	Individual	16%	Due Date: Week 7: 8 July - 14 July
Coding Project 3	Individual	22%	Due Date: Week 10: 29 July - 4 August
Final Exam	Individual	50%	Due Date: TBA during Exam Week

Assessment Details

Coding Project 1
Assessment Overview

This coding project assesses the student's MapReduce programming skills. It will be marked manually by course tutors according to a rubric. Feedback will be provided to students in Moodle as comments on their submissions.
Course Learning Outcomes

CLO2 : Develop an appropriate storage structure for a Big Data repository

CLO3 : Utilise the map/reduce paradigm and the Spark platform to manipulate Big Data

Coding Project 2
Assessment Overview

This coding project assesses the student's Spark programming skills. It will be marked manually by course tutors according to a rubric. Feedback will be provided to students in Moodle as comments on their submissions.
Course Learning Outcomes

CLO3 : Utilise the map/reduce paradigm and the Spark platform to manipulate Big Data

CLO4 : Use a high-level query language to manipulate Big Data

CLO5 : Develop efficient solutions for analytical problems involving Big Data

Coding Project 3
Assessment Overview

This coding project assesses the student's Spark programming skills on a real cloud computing platform such as Google Dataproc. It will be marked manually by course tutors according to a rubric. Feedback will be provided to students in Moodle as comments on their submissions.
Course Learning Outcomes

CLO2 : Develop an appropriate storage structure for a Big Data repository

CLO3 : Utilise the map/reduce paradigm and the Spark platform to manipulate Big Data

CLO5 : Develop efficient solutions for analytical problems involving Big Data

Final Exam
Assessment Overview

The final exam assesses the students' MapReduce and Spark programming skills, as well as algorithm design for big data analytics. The exam will be marked manually by course tutors according to a rubric. Feedback is provided on request.
Course Learning Outcomes

CLO1 : Describe the important characteristics of Big Data

CLO2 : Develop an appropriate storage structure for a Big Data repository

CLO3 : Utilise the map/reduce paradigm and the Spark platform to manipulate Big Data

CLO5 : Develop efficient solutions for analytical problems involving Big Data

General Assessment Information

Late Submission Penalties:

5% reduction of your marks for up to 5 days

The final mark is calculated by:

Final Mark = proj1 + proj2 + proj3 + FinalExam
Double Pass: You also need to achieve at least 20 marks in the final exam to pass the course.
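The final-mark formula and the double-pass rule can be sketched as follows. This is only an illustrative sketch under stated assumptions: each component mark is assumed to be already scaled to its weight (out of 12, 16, 22, and 50 respectively), and the overall pass mark of 50 is an assumption that this outline does not state. The function and parameter names are hypothetical.

```python
def final_mark(proj1, proj2, proj3, final_exam):
    """Sum the four component marks, each assumed already scaled to its weight."""
    return proj1 + proj2 + proj3 + final_exam


def passes_course(proj1, proj2, proj3, final_exam,
                  pass_mark=50.0, exam_hurdle=20.0):
    """Apply the double-pass rule.

    exam_hurdle=20 comes from the outline; pass_mark=50 is an assumption
    made for this sketch and is not stated in the outline.
    """
    total = final_mark(proj1, proj2, proj3, final_exam)
    return total >= pass_mark and final_exam >= exam_hurdle


if __name__ == "__main__":
    # Strong project marks do not compensate for missing the exam hurdle.
    print(final_mark(12, 16, 22, 15), passes_course(12, 16, 22, 15))
    print(final_mark(10, 12, 15, 25), passes_course(10, 12, 15, 25))
```

The first call shows a student whose total is above the assumed pass mark but who still fails because the exam mark is below 20; the second shows a student who clears both conditions.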

Grading Basis

Standard

Course Schedule

Teaching Week/Module	Activity Type	Content
Week 1 : 27 May - 2 June	Lecture	Course information + introduction to big data
Week 2 : 3 June - 9 June	Lecture	Hadoop MapReduce 1
Week 3 : 10 June - 16 June	Lecture	Hadoop MapReduce 2
Week 4 : 17 June - 23 June	Lecture	Spark 1
Week 5 : 24 June - 30 June	Lecture	Spark 2
Week 6 : 1 July - 7 July	Lecture	Recess Week
Week 7 : 8 July - 14 July	Lecture	Finding Similar Items
Week 8 : 15 July - 21 July	Lecture	Mining Data Streams
Week 9 : 22 July - 28 July	Lecture	Graph Data Management
Week 10 : 29 July - 4 August	Lecture	NoSQL, HBase, and Hive; revision and exam preparation

Attendance Requirements

Students are strongly encouraged to attend all classes and review lecture recordings.

General Schedule Information

The table summarises the planned weekly activities for the course. These are tentative. Please refer to the relevant sections of the course homepage for the most up-to-date information about the weekly schedule throughout the course delivery period.

Course Resources

Recommended Resources

The textbooks include:

Hadoop: The Definitive Guide. Tom White. 4th Edition, O'Reilly Media.
Data-Intensive Text Processing with MapReduce. Jimmy Lin and Chris Dyer. University of Maryland, College Park.
Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman. 2nd Edition, Cambridge University Press.
Learning Spark. 1st and 2nd Editions, O'Reilly Media.

Other references include:

Course Evaluation and Development

Student feedback indicated a need for more examples. This term, we will revise the slides and lectures accordingly.
