CEG5201 Hardware Technologies, Principles, & Platforms
(Semester I, AY2023/2024)
CA-2 Statement
Release date: 25-27 Sept, 2023
Consultation date: 16th Oct 2023
Submission deadline: 13 rd Nov 2023 11:59pm
Instructions to candidates:
The CA2 Canvas submission deadline is midnight as shown above. No late submission is considered, try NOT to do last-minute submissions to prevent traffic congestion.
I. Objective
In this assignment, we will attempt to benchmark various matrix multiplication algorithms across sequential and multi-processinf methods. To achieve this goal, you need to familiarize yourself with the Python multiprocessing package. Please read through the following materials first, which introduce the basics of the package.
The pool class -https://superfastpython.com/multiprocessing-pool-python/
Multiprocessing pool example -https://superfastpython.com/multiprocessing-pool-example/
Threadpool vs. pool class differences - https://superfastpython.com/threadpool-vs-pool-in- python
You may also access other useful sites, if any, to learn. Do share such new links by posting on the Discussion on Canvas so that it will be useful for your classmates.
II. Project overview
We will attempt to implement matrix multiplication through four different algorithms (Strassen’s algorithm [1], Coppersmith–Winograd algorithm [2], Cannon’s algorithm [3], and Fox’s algorithm [4]) in this project, and each team member focuses on only one algorithm and implement in both sequential and multi-process ways. The project is composed of individual tasks and joint tasks as shown in Table I. Blue shaded parts in the table are individual tasks and yellow shaded parts are joint tasks.
Table I. Overview of the tasks
|
Task ID |
Task description |
|
A |
Matrices generation |
|
B-n |
Sequential implementation |
|
C-n |
Multiprocessing implementation |
|
D-n |
Determining the ultimate speed-up |
|
E |
Comparison of the algorithms |
“n” in Task B, C, and D stands for each team member is calculated as “(group ID + member ID) %4 +1” (“%” here stands for the modulo operator), and the algorithm corresponding to “n” is listed in Table II. Please refer to Appendix A for general ideas of the four algorithms.
Table II. List of the algorithms used
|
S/N (n) |
Algorithm |
|
1 |
Strassen’s algorithm |
|
2 |
Coppersmith–Winograd algorithm |
|
3 |
Cannon’s algorithm |
|
4 |
Fox’s algorithm |
III. Task description
[O] Getting started
As a team, first fix the hardware platform that you are going to use – multi-core CPU / GPU before doing individual coding.
[A] Matrix generation
Randomly generate 8 pairs of matrices (Ai , Bi ), i ∈ [0, 1, … 7] with Mi ×Mi size where M= [16, 32, 64, 128, 256, 512, 1024, 2048]. The value of each element should be between 0 and 255 (integers only). Generate 10 groups of such matrix pairs Gj , j ∈ [0, 1, … 9]. All the team members need to use the same set of G for their subsequent tasks.
[B-n] Sequential Processing
[B-n.1] Perform multiplications for all 8 matrix pairs in G0 sequentially. Record the time taken for processing each pair and the total time. This is the sequential processing time of G0 and please present the obtained results in Table B-n.1 shown below. Please put the table in your report exactly as it appears.
Table B-n.1: Processing time of Go under sequential implementation
|
Pair Index |
Measured Sequential Time |
Measured Cumulative Sequential time |
|
0 |
t0 |
t0 |
|
1 |
t1 |
t0 + t1 |
|
2 |
t2 |
t0 + t1 + t2 |
|
… |
… |
… |
|
7 |
t7 |
t0 + t1 + ⋯ + t7 |
[B-n.2] Repeat the above processes for all the 10 groups. This is the sequential processing time of all groups and please present the results obtained in Table B-n.2 shown below. Please put the table in your report exactly as it appears.
Table B-n.2: Processing time of all groups under sequential implementation
|
Group Index |
Measured Sequential Time |
Measured Cumulative Sequential time |
|
0 |
t0 |
t0 |
|
1 |
t1 |
t0 + t1 |
|
2 |
t2 |
t0 + t1 + t2 |
|
… |
… |
… |
|
9 |
t9 |
t0 + t1 + ⋯ + t9 |
[C-n] Multiprocessing
Identify the bottleneck in the sequential processing and try to accelerate it using multiple processes.
[C-n.1] Record the time it takes to complete the processing of G0 by varying the number of processes. Please present the obtained results in Table C-n.1 shown below. Please put the table in your report exactly as it appears.
Table C-n.1: Processing time of G0 under multiprocessing implementation
|
|
Measured MP time |
Measured Cumulative MP Time |
||||||||||
|
Process → Pair Index↓ |
1 |
2 |
3 |
4 |
… |
… |
1 |
2 |
3 |
4 |
… |
… |
|
0 |
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
|
|
2 |
|
|
|
|
|
|
|
|
|
|
|
|
|
3 |
|
|
|
|
|
|
|
|
|
|
|
|
|
… |
|
|
|
|
|
|
|
|
|
|
|
|
|
7 |
|
|
|
|
|
|
|
|
|
|
|
|
[C-n.2] Record the time it takes to complete the processing of all groups by varying the number of processes. Please present the obtained results in Table C-n.2 shown below. Please put the table in your report exactly as it appears.
Table C-n.2: Processing time of all groups under multiprocessing implementation
[D-n] Determine the speed-up
[D-n.1] Determine the speed-up of processing
0 under multiprocessing implementation (C- n.1) versus sequential implementation (B-n.1).
[D-n.2] Determine the speed-up of processing all groups under multiprocessing implementation (C-n.2) versus sequential implementation (B-n.2).
Plot the speed-up curves w.r.t the number of processes and the number of workloads (pairs/groups) and write a detailed discussion.
[E] Compare the algorithms
Use the results obtained by all team members and present a detailed comparison of the various algorithms in the report.
IV. Submission
What to submit?
Each group should upload only one compressed file (.zip) to Canvas, including three types of files: the report (.pdf), the source code file(s) (.py), and the readme text file (.txt).
Follow the rules below to name your files:
• For the compressed file:
CA2_(Group ID)_(Matriculation number of all team members separated by an underscore).zip
Example: CA2_ 12_A0213331Z_A0010101Y_A0340127X_A0711001X.zip • For the report:
CA2_Report_(Group ID)_(Matriculation number of all team members separated by an underscore).pdf
Example: CA2_Report_ 12_ A0213331Z_A0010101Y_A0340127X_A0711001X.pdf
Report format
Report Formatting Guidelines – Please use single column format. The font of the main body should be Times New Roman 12pt.
Page Limit – Maximum 12 pages (excluding the first page and the last page)
First Page of your report – Title + Full Names + Matriculation numbers (this page is not counted in the report page limit mentioned below).
Last Page of your report – Title: Coding effort (this page is not counted in the report page limit mentioned below). Please see the next subsection on what to write here.
Code file(s) – Please put useful comments generously throughout in your code file for readers to understand the gist of computation taking place.
Coding effort
Attach ONE separate page titled “Coding Effort -” at the end of report. Describe how much is your coding effort (quantify in %). If you had either directly or partly used any CODE(s) from github or from any other site, list those sites and point us exactly where you obtained those codes. If you have used others codes and did not list or cite the reference directly no marks will be awarded. If you have generated your own data, describe how you generated your data needed for your experiments. If you are using any tools for your experiments (not for data), then point to them. These also can be part of your references listing after conclusions.
Readme file
This file must contain clear instructions to facilitate running your program and use of data and parameters. If you are using any specific packages, list them and also indicate the URLs from where the readers can download. If you are using any specific DATA from any URL, indicate the URL. If the readers need to set any input parameters (hard coded way) explicitly state that requirement. Do NOT ASSUME anything and omit details. Step-by-step instructions will be very useful.
V. Assessment
Grading
Your assignment will be graded out of 60 marks, and the final weight of this assignment is 35%. For the joint effort, all members will be awarded identical marks. The mark breakdown is shown in Table III.
Table III. Grade breakdown
Plagiarism Penalty
You are allowed to use only 15% of any written material collectively from all other sources. Anything more than this 15% will be penalized. References cited may show up in your plagiarism checker and you can ignore this percentage. Copying of coding and simulation results directly from others will be awarded 0 marks for both Source and Copier(s).