ECE3073 Project - Milestone 2
Welcome and Getting Started
Welcome to Milestone 2 of the ECE3073 project! Before you read through this document, make sure that you have already gone through and read the Project - Outline. Ensure you have a good understanding of milestone 1 as you will build off the previous milestone. This document outlines the project's second milestone where you will design and create a computer system and subsequent application that performs efficient image processing on the images displayed in Milestone 1 while leveraging concepts from real-time operating systems (RTOS) and hardware and software optimisations.
This milestone has been broken down into several tasks:
● Task 0: Set up the Eclipse project with RTOS libraries
● Task 1: Implement several benchmarking metrics and display them on the HEX displays
● Task 2: Demonstrate your understanding of image manipulation within the designed computer system
● Task 3: Investigate and optimise the current computer system through hardware design improvements and software techniques to improve throughput and performance
● Report: Summarising your project findings in a concise manner, including improvements over the entirety of the project
Similar to milestone 1, to be most effective while attempting this project, we recommend that you first read through the entirety of this document before attempting each section. As you attempt each task, you should regularly go back to the specifications to ensure your design meets the requirements and pick up on anything you missed.
Ensure you keep records of your results for this milestone, as you will compile your results into a final report. You can also include any intermediate results that you obtained from milestone 1.
At the end of this document, you will also find:
- A marking guide to walk you through how your assignment will be marked
- Live demonstration procedure
- A style guide for both C and Verilog.
- A FAQ section to answer any common questions that people have. This will be updated as time goes on to answer any recurring questions.
Revision History
Use this revision history to monitor for any changes we may need to make to this document to clarify requirements or provide hints.
Description | Version | Date
Initial release | v1.0 | 22/04
Removed appendix requirement from report. Added a note for report referencing the marking rubric (excel sheet) | v1.1 | 01/05
Amendment to the marking rubric and distribution within each task. Excel sheet for rubric is now available | v1.2 | 02/05
Updated marking rubric for task 3. Moved “-1 for every 50ms above 150ms for edge detection runtime” from the 8 points criteria, to the 4 points criteria instead. | v1.3 | 09/05
Updated special consideration information. Added clarification regarding demo, interview, and report, all of which must be based on your submission | v1.4 | 10/05
Fixed typo in rubric. Full marks for improvements is 100% not 90% | v1.5 | 14/05
Introduction and Learning Outcomes
Milestone 2 aims to reinforce your knowledge in several key areas of computer systems:
● Understanding the use of real-time operating systems in terms of performance capabilities and completing scheduled tasks (LO4)
● Utilising operating system functions such as task prioritisation to efficiently run two or more processes while maximizing CPU utilization (LO3)
● Measuring and improving system performance through the use of multithreading and/or hardware-based techniques (LO5)
● Documenting and providing relevant information to external stakeholders in a concise manner (LO2, LO4)
Your understanding of this content will be demonstrated through the use of Quartus Platform Designer and the IP Catalog, and by writing both Verilog and C for the Nios II processor. You will also provide documentation through the report submitted at the end of the milestone.
Key Information and Notices:
Estimated Time Commitment: 3-5 hours per week per group member. Note we can only provide general estimates, and you can use M1’s estimated commitment (with your actual commitment) as a benchmark for milestone 2’s commitment.
Academic Integrity: The ECE3073 project is to be completed in pairs. You can and should discuss the questions and concepts with other groups. However, you may not:
● write the code together with other groups,
● show other groups your HDL,
● copy other groups’ work, or
● present work you have found online as your own (i.e. without attribution).
This is plagiarism and/or collusion and is prohibited by Monash’s academic integrity policy. Breaches of this policy attract serious consequences. Note that all code submitted to this unit will be run through the MOSS system to check for plagiarism.
If you research some concept online or re-use HDL from your labs (or the unit) you must cite it (any citation method is acceptable).
Generative AI:
As a reminder, you can use generative artificial intelligence (AI) to troubleshoot your solutions or explain concepts. You must fully understand the work you submit for assessment. Any use of generative AI for this purpose must be appropriately acknowledged. You must not use generative artificial intelligence (AI) to generate any materials or content (e.g. report paragraphs, explanations, or code) for submission.
Submission
Due: Friday, 17th May, 4:30 PM (End Week 11) for the project files (demo). Friday 24th May, 4:30 PM for the final report.
Submission DEMONSTRATION File Structure:
The milestone 2 DEMONSTRATION submission box can be accessed here.
Your submission will need to contain the following files in a .zip archive named “Project_M2_<Group_Number>.zip”:
● Archived Quartus project (.qar file)
● Zip the entire Eclipse project (software folder containing both the project and bsp folder)
● Report in PDF format
Before submitting, ensure your project has all the up-to-date files. We recommend downloading your submission after you have zipped up your project, and have placed the submission on Moodle. After you are satisfied with the submission, you may finalise your submission on Moodle.
Some final checks:
● Up to date Platform Designer (qsys)
● Up to date Verilog and C code
● Up to date Eclipse project - you can
● Student IDs included in the comments of each code file
Live-demonstration:
You will be required to do a live demonstration of your milestone 2 in week 12 during your lab session. There will be an announcement on Moodle/Ed regarding the scheduling for this demonstration. Your live demonstration and interview must be based on the solution submitted at the end of week 11.
Submission REPORT File Structure:
The milestone 2 REPORT submission box can be accessed here.
You will submit a PDF containing your group name. Your report must be based on the solution submitted at the end of week 11.
Late Submissions:
Late submissions will be accepted until your lab class without special consideration. The standard penalty of 10% applies for every day that the assignment is late.
Special Consideration:
Special consideration requests are possible but only apply to a single student. If approved, you will be assigned an alternative assessment.
Mark Distribution:
See mark distribution at the end of the assignment for the detailed distribution.
Coding Style Guides:
HDL and C Style Guides have been provided in Appendix A.
Suggested expectations
The suggested expectations and deadlines can be found in Project - Outline. This lists the week-by-week expectations for the project. You may choose to deviate from the suggested expectations depending on your circumstances.
Before You Start - For Those Without Functional M1 Files
This section provides important information for those without functional M1 files. We will be providing the .sof and .sopcinfo files that the ECE3073 teaching team has developed so that you can continue on M2 rather than going back to M1. However, you will be capped in the maximum obtainable marks, as you will not be able to implement some of the improvements and investigations in task 3 - namely Task 3.2 Self-Guided improvements.
The .sof and .sopcinfo files will provide you with the correct hardware setup, so you will have to transfer your M1’s C code over and adapt it to the settings of the provided files. Important information (ie. base addresses, IRQs etc.) can be found in the system.h file once you have set up the Eclipse project as per Task 0: RTOS Setup. You may choose to complete M2 first up to task 3, before returning to your M1’s Quartus project to fix any outstanding issues you still have to complete task 3 in its entirety.
The files can be found here.
Background
Image Convolution
Convolution is the process of moving through an image in a raster-scan order and applying a function to pixels within a region/kernel. As you can see from the link above, raster-scan is the process of moving one pixel at a time along the row of the image from left to right. Once we get to the end of the row, we start from the next row and repeat the same process as above. When we move one pixel at a time, we can look at the value of that pixel as well as the surrounding ones and perform some sort of operation on them. Examples of such operations are blurring the image by averaging the centre pixel and the pixels around it (a kernel), or applying a set of weights to the pixels within the kernel.
Image Blurring
Blurring an image is one of the most basic operations in image processing. It has been performed since the day of old-school cameras by manipulating the focal length and changing the focus in an image. Blurring an image has many uses, such as:
1. Reducing random noise in an image
2. Covering sensitive information in company documents
3. Creating an artistic effect on images for aesthetic purposes
To blur an image, the simplest method would be to:
1. Convolve through the image (described above) and sum pixel values within a desired window (example 3-pixel by 3-pixel, a.k.a. 3x3 kernel window).
2. After summing the pixel values, we divide it by the number of pixels within the window (9 for a 3x3 kernel window). This essentially generates an average value for that window.
3. This value is then used for the centre pixel position in the new blurred image.
4. This process is repeated in a raster scan order until the entire image is completed.
5. At the end, we should obtain an image of size (m-2) x (n-2), where m is the number of rows in the original image and n is the number of columns in the original image.
6. You may want to pad the image in this case to retain the original image resolution (m * n). Padding involves adding 0’s to the borders of the image.
7. An example of blurring is shown below:
Figure 1: Example of an average filter convolution that achieves image blurring where the centre pixel is labelled red
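The blurring steps above can be sketched in C as follows. This is a minimal illustration rather than the provided skeleton code: it assumes one 8-bit greyscale value per pixel stored row by row, and the function name box_blur_3x3 is hypothetical. It uses zero padding so the output keeps the original m x n resolution.

```c
#include <stdint.h>

/* 3x3 average (box) blur over an image stored row by row.
 * Assumption: one 8-bit greyscale value per pixel.
 * Pixels outside the image are treated as 0 (zero padding), so the
 * output has the same height x width as the input. */
void box_blur_3x3(const uint8_t *in, uint8_t *out, int width, int height)
{
    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            unsigned sum = 0;
            for (int dr = -1; dr <= 1; dr++) {
                for (int dc = -1; dc <= 1; dc++) {
                    int r = row + dr;
                    int c = col + dc;
                    /* Neighbours outside the image contribute 0. */
                    if (r >= 0 && r < height && c >= 0 && c < width) {
                        sum += in[r * width + c];
                    }
                }
            }
            /* Always divide by 9, the number of pixels in the window. */
            out[row * width + col] = (uint8_t)(sum / 9u);
        }
    }
}
```

Note that zero padding darkens the border pixels, because the zeros are included in the average. Skipping the border instead gives the (m-2) x (n-2) result described in step 5.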
Edge Detection
Edge detection is the process of detecting edges or continuous lines in an image and is used in:
1. Medical imaging to diagnose diseases more easily;
Figure 2: Edge detection on an angiographic image. [1]
2. Scene/image segmentation;
Figure 3: The region of a satellite image after edge detection [2]
3. Fingerprint recognition;
Figure 4: Fingerprint recognition using pattern matching from edge detection [3]
4. Satellite imaging;
5. and many more…
The purpose of detecting sharp changes in image brightness is to capture important events and changes in the properties of the world. Edge detection is one of the fundamental steps in image processing, image analysis, image pattern recognition, and computer vision techniques.
One method of performing edge detection would be to use Sobel edge detection filters. These consist of filters in both the horizontal and vertical directions designed to detect edges in these directions. Sobel filters are shown below:
Figure 5: Sobel horizontal and vertical kernel filters
Example multiplication of your image with the X-Direction Kernel:
Figure 6: Example multiplication of a 3x3 image window with the 3x3 Sobel horizontal filter
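The window multiplication shown in Figure 6 could look like the sketch below. The kernel values are the commonly used Sobel coefficients and are an assumption here, since the figure is not reproduced in text; check the signs and orientation against Figure 5. The names SOBEL_X, SOBEL_Y and apply_kernel_3x3 are hypothetical.

```c
#include <stdint.h>

/* Commonly used Sobel kernels (assumed; verify against Figure 5). */
static const int SOBEL_X[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
static const int SOBEL_Y[3][3] = { {-1, -2, -1}, {0, 0, 0}, {1, 2, 1} };

/* Multiply a 3x3 image window element by element with a 3x3 kernel
 * and return the sum of the nine products. The result is signed. */
int apply_kernel_3x3(const uint8_t window[3][3], const int kernel[3][3])
{
    int sum = 0;
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            sum += (int)window[r][c] * kernel[r][c];
        }
    }
    return sum;
}
```

A window with a bright right-hand column gives a large response for the X kernel and zero for the Y kernel, which is how the two filters separate edge directions.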
Milestone 2 Project Overview:
This milestone aims to extend the functionality from milestone 1. There are 3 main components in this milestone:
1. Displays and Benchmarking: You will design a user interface that controls the images being displayed through the VGA port. The user should be able to select between two modes:
a. Quad image mode: Displays 4 images at once, with the ability to swap which image is displayed in which slot.
b. Single image mode: Displays 1 image of the user's choice over the entire screen.
There will be some additional timing functionality that you will implement for each process that runs by using the DE10 switches. There will be additional detail in the subsequent tasks for this.
2. Implementation of Image Processing Techniques: You will process the base image in 3 different ways, which will then be displayed using the above user interface. You will implement
a. Image flipping: Alter your code from milestone one to be used within the RTOS environment.
b. Blurring: Using convolution, you will implement an algorithm to blur the base image from SDRAM.
c. Edge detection: Using convolution, you will implement an algorithm to perform Sobel edge detection on the base image from SDRAM.
3. Improvements and Investigations: You will benchmark your implementations as you go, then implement and compare several optimization techniques.
User Interface
For the user interface, you should keep in mind the following recommended I/O allocations for the functionality:
● KEY[1:0]: should be used for switching between image modes
● SW[9:8]: should be used for swapping which benchmarking results are being displayed on the HEX displays
● SW[7:0]: should be used for controlling which images are currently being displayed (for both modes).
Task 0: RTOS Setup
As you will learn in the following weeks, real-time operating systems (RTOS) are very useful for running parallel tasks on your processor. Throughout this milestone, you will explore process parallelism and different hardware acceleration methods through the context of running and displaying several different image processing algorithms at the same time.
To ensure you have a working foundation, you should begin by creating a new uC/OS-II project in Eclipse and porting over all of your code from Milestone 1. This will also include re-editing the “bsp” for your new project as per the SDRAM setup in Lab 2.
Make sure your code works as expected from milestone 1 (the image displays on VGA and can be flipped) before continuing with the rest of this document.
Note: you can “close” an Eclipse project without deleting it by right-clicking the project directory and selecting “Close Project”. This way, you can remove the old Eclipse project from your workspace. You will need to select the build configuration again after doing this.
Task 1: Displays and Benchmarking
In tasks 1 and 2, you'll delve into applying a range of image processing algorithms to the image stored in SDRAM. However, due to current system constraints, displaying multiple images simultaneously is not feasible. Task 1 is focused on addressing this issue effectively.
Task 1.1: Image Downscaling
There will be a total of 4 different images to be displayed on the screen at any time:
1. Default image
2. Flipped image (in both directions)
3. Blurred image
4. Edge detected image.
Even though you will implement the processing functionality in the later tasks, you can prepare to display them now. Expect these images to be stored in the SDRAM back-to-back starting from address 0x0. For example, if the four images are 10 bytes long each, they will be located at addresses 0x0, then 0xA, 0x14, and so on.
Figure 7: Visual representation of 4 images being displayed at once. “Default” refers to the original image.
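The back-to-back layout can be expressed as a simple offset calculation. The sketch below assumes the images are stored in the order listed above and uses hypothetical names (image_id, image_offset); the result is an offset that you would add to the SDRAM base address from system.h.

```c
#include <stdint.h>

/* Assumed storage order, following the list of four images above. */
enum image_id {
    IMG_DEFAULT = 0,
    IMG_FLIPPED = 1,
    IMG_BLURRED = 2,
    IMG_EDGES   = 3
};

/* Byte offset of image `id` from the start of the image area when the
 * images are stored back-to-back and each occupies image_bytes bytes. */
uint32_t image_offset(enum image_id id, uint32_t image_bytes)
{
    return (uint32_t)id * image_bytes;
}
```

With 10-byte images this reproduces the example addresses 0x0, 0xA and 0x14, with the fourth image at 0x1E.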
Before we can display our images like this, we must create a function that will downscale a 160x120 image to a quarter of the size, i.e. 80x60. There are two ways to do this:
Method 1: Scan through the image stored in SDRAM and extract every second pixel. This will create a correctly sized image, but it will look very jagged in areas as it is not what we call a “faithful” downscaling.
Method 2: Scan through each 2x2 chunk of the image and create a new pixel with a colour value equal to the average of each pixel in the 2x2 image region. This will result in a “faithful” downscaling but it is more computationally expensive and complex to implement.
As this is a key part of the project, you are given the option of either method 1 or 2. More marks will be provided for implementing method 2 than method 1 (see marking rubric). We recommend you start with method 1 and implement method 2 after completing the remainder of the project.
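Both methods can be sketched as below. This assumes a single 8-bit channel per pixel and even image dimensions; if your pixels pack several colour channels into one word, each channel would need to be averaged separately in method 2. The function names are hypothetical.

```c
#include <stdint.h>

/* Method 1: keep every second pixel in each direction. */
void downscale_skip(const uint8_t *in, uint8_t *out, int width, int height)
{
    int out_w = width / 2;
    for (int r = 0; r < height / 2; r++) {
        for (int c = 0; c < out_w; c++) {
            out[r * out_w + c] = in[(2 * r) * width + (2 * c)];
        }
    }
}

/* Method 2: average each 2x2 block into one output pixel. */
void downscale_average(const uint8_t *in, uint8_t *out, int width, int height)
{
    int out_w = width / 2;
    for (int r = 0; r < height / 2; r++) {
        for (int c = 0; c < out_w; c++) {
            const uint8_t *top = &in[(2 * r) * width + (2 * c)];
            const uint8_t *bottom = top + width;   /* same columns, next row */
            unsigned sum = top[0] + top[1] + bottom[0] + bottom[1];
            out[r * out_w + c] = (uint8_t)(sum / 4u);
        }
    }
}
```

Method 2 reads four pixels per output pixel instead of one, which is where the extra computational cost comes from.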
Task 1.2: Advanced image display and control interface design
Now that you can downsize images for display purposes, the next step is to create the base functionality for displaying them so that you can show the processed images when you generate them in the future.
In your RTOS environment, create an individual task for each of the four displays. Each task will select the image corresponding to the processing technique defined by the interface requirements below, downscale it, and subsequently display it in its relative quadrant of the 160x120 display. Each of these tasks should repeat every 250ms.
Your control interface should implement the following while also reserving SW[9:8] for future functionality:
a) Quad Image Mode: The user should be able to individually select which of the 4 images they wish to display on each of the 4 displays. For example, the user may wish to display the default image on all 4 displays (because it’s a cute dog), or maybe they want to see a different processed image on each display.
b) Single Image Mode: The user should be able to toggle the system into a single display mode, which will display a single image of their choice that will take up the entire 160x120 display. This can be a difficult task if you have not yet used semaphores, so we recommend that you leave this until starting Task 3. You will also need to use semaphores to obtain full marks.
For example: If the user has selected the blurred image for display 1 (top right in Figure 7), your program will retrieve the blurred image from SDRAM, downscale it, and then display it. The selected process for the display is dictated by SW[7:0].
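One possible way to decode SW[7:0] is to give each display its own pair of switches. The mapping below is an assumption made for illustration (display d uses SW[2d+1:2d]); the specification only requires that SW[7:0] selects the images, so you are free to choose a different scheme. The function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical mapping: display d (0..3) is controlled by switch bits
 * SW[2d+1:2d], and the 2-bit value picks one of the four stored images.
 * Bits above SW[7] (for example SW[9:8]) are ignored. */
unsigned image_for_display(uint32_t switches, unsigned display)
{
    return (switches >> (2u * display)) & 0x3u;
}
```

For example, a switch value of 0xE4 (binary 11 10 01 00) selects image 0 for display 0, image 1 for display 1, and so on.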
Task 1.3: Processing Setup and Image Flipping
To ensure you aren’t continuously processing each of the images, you must make a task whose sole purpose is to run each of the processing algorithms and save the result into the SDRAM every 500ms (this is to simulate some form of video, but the JTAG upload speed makes uploading footage take a very long time).
You must now test your new user interface by altering your image-flipping code from Milestone 1. Instead of instantly displaying the flipped image by pressing KEY[0], you should now process and flip (in both the x and y directions) the input image inside the processing task, and save it in SDRAM to be displayed as per Task 1.2.
Task 1.4: Benchmarking
In this task, you will set up a way to benchmark your CPU performance with your solution so far. In uC/OS-II, there is a function called OSTimeGet(). This function has no inputs and returns the current system time as a count of clock ticks, which corresponds to milliseconds when the system tick period is 1 ms (i.e. OS_TICKS_PER_SEC is 1000). By calling this function both before and after the execution of something, we can calculate the time that it took to complete said execution. You can use the initial time as the benchmark and baseline of your current processor performance.
The report that you will write at the end of this project requires you to benchmark the ongoing progress across various tasks. By the end of the project, you can showcase the improvements that you have made. Task 3 will require you to record benchmarks for the edge detector explicitly.
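The before-and-after measurement can be sketched as below. In uC/OS-II, OSTimeGet() reports time as a tick count, so the sketch converts ticks to milliseconds using the tick rate; on a system with a 1 ms tick the two values are identical. The helper names elapsed_ticks and ticks_to_ms are hypothetical, as is flip_image() in the usage comment.

```c
#include <stdint.h>

/* Ticks elapsed between two readings of a free-running 32-bit tick
 * counter. Unsigned subtraction stays correct across a single wrap. */
uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a tick count to milliseconds for a given tick rate. */
uint32_t ticks_to_ms(uint32_t ticks, uint32_t ticks_per_sec)
{
    return (uint32_t)(((uint64_t)ticks * 1000u) / ticks_per_sec);
}

/* Assumed usage on the target (not compiled here):
 *
 *   INT32U start = OSTimeGet();
 *   flip_image();                          // hypothetical work to time
 *   INT32U ms = ticks_to_ms(elapsed_ticks(start, OSTimeGet()),
 *                           OS_TICKS_PER_SEC);
 */
```

Keep in mind that the resolution of this measurement is one tick, so very short operations may need to be repeated several times and averaged.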
In your RTOS environment, create a task with the sole purpose of displaying statistics. This task should handle the following functionality:
a) CPU Usage display: Display the current CPU usage on two of the HEX displays.
b) Process Timing display: Display the time (in ms) that the selected process (listed below) takes to execute on the remaining 4 HEX displays. The process will be selected by the user inputting their choice on SW[9:8]. The benchmarks include the following:
i) 2’b00: Time it takes to flip an image (Task 1.3)
ii) 2’b01: Time it takes to blur an image (Task 2.1)
iii) 2’b10: Time it takes to run edge detection on an image (Task 2.2)
iv) 2’b11: Time it takes to do all 3
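Displaying a number on the HEX displays means splitting it into decimal digits and converting each digit to a segment pattern. The sketch below assumes active-low segments with bit 0 as segment a through bit 6 as segment g, which is how the DE10 HEX displays are commonly wired; confirm this against your own PIO setup. The names SEG_DIGIT and to_seven_segment are hypothetical.

```c
#include <stdint.h>

/* Active-low 7-segment patterns for digits 0-9
 * (assumed wiring: bit 0 = segment a ... bit 6 = segment g). */
static const uint8_t SEG_DIGIT[10] = {
    0x40, 0x79, 0x24, 0x30, 0x19, 0x12, 0x02, 0x78, 0x00, 0x10
};

/* Fill out[0..count-1] with segment patterns, least significant digit
 * first. Values that do not fit in `count` digits saturate at all 9s. */
void to_seven_segment(uint32_t value, uint8_t *out, int count)
{
    uint32_t limit = 1;
    for (int i = 0; i < count; i++) {
        limit *= 10u;
    }
    if (value >= limit) {
        value = limit - 1u;
    }
    for (int i = 0; i < count; i++) {
        out[i] = SEG_DIGIT[value % 10u];
        value /= 10u;
    }
}
```

The same helper can serve both parts of the task: two digits for the CPU usage percentage and four digits for the selected timing result. If the uC/OS-II statistics task is enabled, the CPU usage is typically available through the OSCPUUsage variable, but check your BSP configuration.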
Task 2: Implementation of Image-Processing Techniques
In recent years, there has been a growing interest in utilising FPGAs for image-processing tasks. This surge in interest stems from the unparalleled capabilities of FPGAs to execute parallel processing tasks with exceptional efficiency. Unlike traditional processors, FPGAs offer highly customizable architectures that can be tailored to suit the specific requirements of image processing algorithms. Moreover, the parallelism inherent in FPGA architectures enables the acceleration of image processing tasks, making them ideal for real-time applications where speed and accuracy are paramount. As such, leveraging FPGAs for image processing not only enhances computational performance but also opens up avenues for developing innovative imaging solutions across various domains, including medical imaging, surveillance, robotics, and more. In Task 2, you will be exploring basic to intermediate image processing techniques to be implemented on the FPGA.
Task 2.1: Image Blurring
For this first part of Task 2, our main objective would be to blur the input image. Fortunately, you are provided with some skeleton code to get you going!
In the convolve function, we pass in the starting address of our image (image_addr), the output base address (out_addr) to write out the output, and the width and height of the image. The function has two for-loops:
● the outer loop iterates through the rows
● the inner one iterates through the columns.
The code snippet for this is shown below:
Figure 8: Example code snippet within convolve function
Within the two for-loops, we calculate the position to retrieve the image data from the SDRAM. Each iteration of the inner loop works to unpack the 1-dimensional data stored in the SDRAM into a 2-dimensional image representation. An example of the unpacking procedure is shown below. Note that we will use a 3x3 pixel area to be unpacked which we shall call our 3x3 window from here on.
Image showing an example of the data stored in SDRAM. Each box represents a pixel. The numbers in the pixels represent the position in the SDRAM block. Here we assume our input image is 3 pixels in height and 9 pixels in width (a.k.a 3x9 input image).
Figure 9: Example image showing how a 3x9 image might be stored in SDRAM
Image showing our 3x3 window moving row-wise within our image:
Figure 10: Example of a 3x3 kernel filter moving row-wise through the example image above
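The unpacking step can be sketched as below: the pixel at (row, col) of an image that is `width` pixels wide sits at position row * width + col in the 1-dimensional data. This is an illustration only and may differ from the provided skeleton code; it assumes one byte per pixel, and the function name fill_window is hypothetical.

```c
#include <stdint.h>

/* Copy the 3x3 window whose top-left pixel is at (row, col) out of a
 * row-major image of the given width into a 2-D array. */
void fill_window(const uint8_t *image, int width, int row, int col,
                 uint8_t img_kernel[3][3])
{
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            img_kernel[r][c] = image[(row + r) * width + (col + c)];
        }
    }
}
```

For the 3x9 example image, the first window holds positions 0, 1, 2, 9, 10, 11, 18, 19 and 20; moving one pixel to the right shifts every position by one.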
Steps to be performed for blurring:
1. Using the skeleton code provided, sum all the pixel values within the img_kernel array that you have filled using data from the SDRAM and divide it by the total number of pixels within the kernel.
2. Store that value in the correct corresponding output address on the SDRAM.
3. With the nested for-loop provided, this will automatically move through the image in the raster-scan order.
4. At the end of the convolution, you should obtain an image of size (m-2) x (n-2), where the original image is of size m x n.
5. Optional: You may pad the border pixels for better visualisation. See padding under Image Blurring
6. Display the blurred image on the VGA. An example is shown below:
Make sure you benchmark your current implementation of this function as it may be edited later on. You should also collect important code snippets and implementation details for your report later on. Read Reporting for more details.
Figure 11: Original image (left). Blurred image (right)
Task 2.2: Sobel Edge Detection
Now that you have mastered convolutions through blurring, you will now implement edge detection. In this case, you will use simple edge detection filters called Sobel filters. Here are the steps in performing edge detection:
1. Load the Sobel Filters into the SDRAM. There are two filters, one for the horizontal direction and one for the vertical. The filters are shown here.
2. Extend the previously developed convolution function to apply the filters to the image saved in the SDRAM. Steps in convolution (to be performed for each filter in a raster scan order):
a. Multiply each 3x3 section in the image with the 3x3 filter. Sum all 9 values and store temporary values.
b. Move the scanning region one pixel to the right at a time and repeat step (a)
c. At the end of the row, move one pixel down and restart at the beginning of the row. Repeat steps (a) - (c) until the end of the image.
d. Repeat the steps above by calling the convolve function for the next filter.
3. At the end of both convolutions, sum both temporary values for each pixel in the image and provide a threshold value equivalent to half the maximum pixel value. If the summed value is greater than the threshold, set the summed value equal to the threshold. Store this new value for displaying on the VGA. HINT: Use absolute values when summing both temporary values.
The output of your edge detection should be placed in the SDRAM so that it may be displayed by the user interface you implemented in Task 1.2.
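Step 3 can be sketched as below, taking the two temporary values for one pixel and producing the stored output value. The sketch assumes 8-bit pixels when called with a maximum of 255; the function name edge_value is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Combine the horizontal and vertical filter responses for one pixel.
 * The absolute values are summed, then limited to a threshold equal to
 * half the maximum pixel value. */
uint8_t edge_value(int gx, int gy, int max_pixel)
{
    int threshold = max_pixel / 2;
    int sum = abs(gx) + abs(gy);
    if (sum > threshold) {
        sum = threshold;
    }
    return (uint8_t)sum;
}
```

Taking absolute values first matters because the filter responses are signed: without it, a strong negative response in one direction could cancel a positive response in the other.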
Make sure you benchmark your current implementation of this function as it may be edited later on. You should also collect important code snippets and implementation details for your report later on. Read Reporting for more details.
Task 3: Improvements and Investigations
This section is about the heuristics and user design decisions behind implementing an effective RTOS design. The report you will write will be heavily based on the content and changes made within this task. You can see more about the reporting process down below (Reporting) and we recommend you read that first before continuing.
By now, you should have documented both the implementation details of your above processing algorithms, as well as the execution time of each one. The goal of this task is to use various acceleration techniques (both hardware and software) to lower the execution time of your image processing algorithms. Improving the performance of your base edge-detection algorithm is mandatory, but it is encouraged that you also improve your blurring algorithm.
Task 3.1 Semaphores
As you should now understand from the labs, semaphores are a crucial part of ensuring resources and processes in an RTOS are managed effectively.
Although the application of semaphores might not significantly enhance the speed of your algorithms, they can be very useful. Mutual exclusion semaphores are crucial for safeguarding shared resources and signalling semaphores are a good tool that allows tasks to be paused when they are unnecessary, hence freeing up processing time.
Before moving on, you should refactor your RTOS code to utilise semaphores. Even if your system technically works in certain areas without the use of semaphores, you must utilise them wherever a task accesses a shared resource. See the Mark Distribution section and the rubric for specific expectations about the use of semaphores.
Note: To gain full marks for this section, you must use both mutex semaphores and signalling semaphores.
Hint: Now might be a good time to implement single-image mode.
Task 3.2 Self-Guided improvements.
The purpose of this exercise is to deepen your understanding of your hardware and software design.
We have provided a list of strategies you can use to improve the speed of your algorithms and the performance of your system. You and your teammate(s) must choose separate techniques to implement, and then you will compare them in your report. Some of the strategies might work as intended, and others might not. This task requires you to investigate and document why they behave as they do.
This task will require you to research and implement the techniques based on the brief outline in this document.
Improvement Techniques:
1. Nios II custom instruction: Using the Nios II manual, you will implement a simple custom instruction to facilitate the use of hardware to calculate the convolutions faster. (See Appendix B for implementation details)
2. Hardware Accelerated Arithmetic: Speed up the arithmetic of your convolutions by passing the data into a hardware-based system, performing all arithmetic there, and then passing it back into the Nios processor. (See Appendix C for implementation details)
3. Software Improvements: Perform a large overhaul on your code for task 2 to optimise the algorithms and lower the runtime as much as possible. It is important to benchmark the current state of your code (and keep a backup) so that you have a baseline for your improvements. This must involve more than one non-trivial attempt at improving your code. Some improvements may include
a. Performing both convolutions in a single loop
b. Reducing the instructions running inside each convolution
c. Analysing and improving segments of code that take a particularly long time to run.
4. Custom: If you have a way to greatly improve the speed of your code that is non-trivial, then feel free to implement it and compare it to the others. You can post it on Ed (privately) or ask during the lab/applied sessions to get feedback on your idea.
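As one illustration of software improvements (a) and (b), the sketch below computes both Sobel responses in a single pass and skips the zero coefficients, so no general multiplications or separate temporary images are needed. It assumes the commonly used Sobel coefficients, 8-bit pixels, and an output of size (height-2) x (width-2); the function name sobel_fused is hypothetical, and whether this is actually faster on your system is something to benchmark.

```c
#include <stdint.h>
#include <stdlib.h>

/* Single-pass Sobel edge detection over the interior of the image.
 * Output size is (height-2) x (width-2). */
void sobel_fused(const uint8_t *in, uint8_t *out, int width, int height,
                 int threshold)
{
    int out_w = width - 2;
    for (int r = 0; r < height - 2; r++) {
        const uint8_t *top = in + r * width;   /* three rows of the window */
        const uint8_t *mid = top + width;
        const uint8_t *bot = mid + width;
        for (int c = 0; c < out_w; c++) {
            /* Zero coefficients are skipped; the weight 2 is a cheap
             * multiply that compilers usually turn into a shift. */
            int gx = (top[c + 2] - top[c])
                   + 2 * (mid[c + 2] - mid[c])
                   + (bot[c + 2] - bot[c]);
            int gy = (bot[c] - top[c])
                   + 2 * (bot[c + 1] - top[c + 1])
                   + (bot[c + 2] - top[c + 2]);
            int sum = abs(gx) + abs(gy);
            out[r * out_w + c] = (uint8_t)(sum > threshold ? threshold : sum);
        }
    }
}
```

Keeping a copy of the original two-pass version lets you confirm that the optimised code still produces the same image before comparing timings.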
Expectation
As stated above, each team member must implement different techniques to see which works best and if a combined integrated solution is possible. Whether or not your attempts were successful, in the report, you must provide a detailed explanation of the methods/processes carried out and reasons for failure (if applicable). You will need to measure the timing improvement and record this for the edge detector algorithm. You may choose to include the improvements for the averaging kernel as well for the report.
To obtain full marks for task 3, you will need to complete the following:
● Demonstrate a 100% speed up based on your original benchmark (i.e. if it originally took 500ms, it now takes 250ms after implementing some improvements)
● Demonstrate your edge detector algorithm can run under 150ms
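The two targets can be checked with a small calculation. The formula below is an interpretation of the 500ms to 250ms example above (speed-up is the time saved relative to the new runtime); the function names are hypothetical.

```c
#include <stdint.h>

/* Percentage speed-up in the sense of the example above:
 * 500 ms -> 250 ms counts as 100%. Returns 0 when there is no
 * improvement or the improved time is not a valid measurement. */
uint32_t speedup_percent(uint32_t original_ms, uint32_t improved_ms)
{
    if (improved_ms == 0u || improved_ms >= original_ms) {
        return 0u;
    }
    return ((original_ms - improved_ms) * 100u) / improved_ms;
}

/* Non-zero when both targets are met: at least a 100% speed-up and an
 * edge detector runtime under 150 ms. */
int meets_task3_targets(uint32_t original_ms, uint32_t improved_ms)
{
    return speedup_percent(original_ms, improved_ms) >= 100u
        && improved_ms < 150u;
}
```

Note that the two targets are independent: going from 500ms to 250ms meets the speed-up target but not the 150ms target.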
Reporting
We are expecting you to submit a formal report of no more than 4 pages (roughly 2000 words) excluding HDL or other appendices. You will write the report in the context of writing a proof-of-concept report for your manager. Feel free to make up an (appropriate) pseudo-company name for your title page. A template can be found here. You may choose to make your own template from scratch.
Your report should:
● Provide a concise introduction outlining project objectives, purpose, and design requirements.
● Present an overview of major components, including both hardware and software elements, emphasising the seamless integration of functionalities.
● Discuss the design and testing processes in detail, highlighting methodologies and outcomes along with quantitative results from the recorded benchmarks.
● Analyze the Real-Time Operating System (RTOS) design, focusing on resource allocation strategies and semaphore utilisation.
● Summarize the project with a conclusion, affirming the functionality and deployment readiness of the design.
We realise that 4 pages is not enough space for this much information! Part of writing well technically is learning how to be concise, which is also part of your challenge with this project. In industry, your project manager will not have the time to wade through hundreds of pages, so you will need to highlight the key points found during the proof-of-concept part of your project investigation. Playing around with formatting to ‘squeeze in’ more text will not make your report clearer, and so will also most likely attract a deduction.
Note that your report must be based on the solution submitted at the end of week 11.
You can find further details on the standards and requirements of technical reports from the Monash Report Standards site.
Notes:
● The title page and appendix pages do not count towards the word count.
● Read the marking rubric (excel sheet) in more detail - this will be released 02/05
Mark Distribution - 120 marks
The mark distribution is broken down by the 3 tasks, interview, coding practice, and report. Additional information regarding the mark distribution can be found here.
Task 1: RTOS
Downscaled image works with method 2 5
Tasks created for each display correctly 2
Quad and single image mode switches correctly 3
Specific task runs processing algorithms correctly 3
Benchmarking is set up correctly with OSTimeGet() 2
HEX display shows CPU Usage correctly 2
HEX display shows benchmark results 3
__
Total for Task 1: 20
Task 2: Edge Detection
Successful image blurring 6
Successful image edge-detection 10
Live demo of edge-detection and blurring displayed on VGA 2
Accurate measurement and display of time benchmarking 2
__
Total for Task 2: 20
Task 3: Improvements and Investigations
Live demo of optimised application with time reduction 8
Discussion on methods used to attain the % improvement in benchmark 4
Discussion on methods used to attain the time (150ms) benchmark 4
Correct usage of mutexes or protection semaphores where appropriate 5
Correct usage of signalling semaphores where appropriate 4
__
Total for Task 3: 25
Interview
Ability to present working product efficiently 5
Ability to convey your understanding of the project and your contributions 10
Clear understanding of the investigations performed in task 3 5
__
Total for Interview: 20
Coding practice
HDL good style guide marks 2
C good style guide marks 3
__
Total for coding practice: 5
Reporting
Introduction 2
Overview showing an understanding of integrated components 4
Discussion of implemented optimisations 7
Elaboration on effective RTOS Design 7
Discussion and conclusion 4
Writing and presentation 4
All required information is included in appendices 2
__
Total for Report: 30
__
Total for M2 120
Live demonstration procedure
After you submit your assignment at the end of week 11, you will be required to demonstrate your assignment live during your assigned lab session in week 12. Your demonstration and interview must be based on the solution submitted at the end of week 11. You will book a timeslot within your lab session to do this. We will publish a survey for you to book your time slot and we will try our best to cater to your preference. The demonstration for each pair will take approximately 20 minutes and you will need to arrive 10 minutes before your allocated slot.
In this timeslot, you will download your submission from Moodle, and set up your Eclipse project in front of the demonstrator. You will then proceed to demonstrate your working solution to the project (for example, if you have done all 3 tasks, we will ask you to demonstrate the functionality of your project). We will also ask you to show us the code, and your platform designer setup. If you have not completed all 3 tasks, it is your responsibility to show your demonstrator what tasks you have completed.
You will only have 10 minutes to demonstrate your working solution to the TAs. You will need to become proficient in discussing and demonstrating the functionality of your project. After 10 minutes, we will interview each member of the group separately for 5 minutes. The interview will cover questions about:
● Your direct contribution to the milestone
● Your understanding of the project based on the parts you contributed
● Your understanding of your partner’s contribution in terms of the milestone
Although you may work on separate parts of the project simultaneously, you will need a holistic understanding of the whole project to receive exceptional marks.
In summary, the breakdown of the live demonstration procedure is:
● Arrive 10 minutes before the scheduled start time
● A TA will call you up to showcase the project
● First 10 minutes: You will demonstrate the functionality of the code in front of the TA. The TA will compile and set up the Quartus / Eclipse project beforehand.
● Next 5 minutes: We will interview one of the members about their understanding of the project
● Final 5 minutes: We will interview the other member about their understanding of the project
● There will be 5 minutes left for any final questions
● You are free to leave the lab after you have completed the demonstration and interview
The TA will directly mark your milestone 2 on Moodle. You will verify the marks in front of the TA. Once you leave the room, you will not be able to dispute your mark.
Appendix A: Programming style guides
It is always good practice to follow a good style guide when you are programming. In this unit, please use the following style guides:
Appendix B: Nios II custom instruction
The Nios II architecture allows for custom instructions, which can greatly reduce the overhead of transferring data to and from your FPGA logic for use in hardware-optimised systems.
The Nios II Custom Instruction manual contains all of the information you will need to implement one. We recommend you use the “combinational” custom instruction, which takes in two 32-bit inputs and outputs a single 32-bit number. You should think about how to fully utilise the entire 32 bits; using hardware for the internal actions of the instruction might greatly improve your run speed.
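As a rough illustration of using all 32 bits, four 8-bit values can be packed into each operand so that one instruction call works on several pixels at once. The sketch below is a plain-C software model only. The names `pack4`, `ci_model_dot4` and `kernel_row`, and the choice of a four-lane dot product as the operation, are hypothetical and not taken from the manual. On the board, the model function would be replaced by a call to your own custom instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 8-bit values into one 32-bit word, p0 in the lowest byte. */
uint32_t pack4(uint8_t p0, uint8_t p1, uint8_t p2, uint8_t p3)
{
    return (uint32_t)p0 | ((uint32_t)p1 << 8) |
           ((uint32_t)p2 << 16) | ((uint32_t)p3 << 24);
}

/* Software model of a hypothetical two-operand custom instruction:
   dataa holds four unsigned pixels, datab holds four signed kernel
   coefficients, and the 32-bit result is their dot product. */
int32_t ci_model_dot4(uint32_t dataa, uint32_t datab)
{
    int32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t pixel = (uint8_t)(dataa >> (8 * i));
        int8_t coeff = (int8_t)(uint8_t)(datab >> (8 * i));
        sum += (int32_t)pixel * coeff;
    }
    return sum;
}

/* One row of a 3x3 kernel applied to three neighbouring pixels.
   The unused fourth lane is padded with zero. */
int32_t kernel_row(const uint8_t px[3], const int8_t k[3])
{
    assert(px && k);
    uint32_t a = pack4(px[0], px[1], px[2], 0);
    uint32_t b = pack4((uint8_t)k[0], (uint8_t)k[1], (uint8_t)k[2], 0);
    return ci_model_dot4(a, b);
}
```

With pixels {10, 20, 30} and coefficients {-1, 0, 1}, `kernel_row` returns 20. Three such calls cover a full 3x3 kernel, which is one possible way to trade many single-pixel operations for a few packed ones.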
Appendix C: Hardware accelerated processing
Without using the Nios II custom instruction, you need to implement a system that processes the Sobel kernels in hardware, which should in theory take significantly less time than your software implementation.
The basic outline of this system is as follows:
1. Export the pixel and kernel data from your Nios processor.
2. Make a Verilog module that executes the convolution of a single kernel.
3. Re-import the final pixel value for that convolution.
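When building the three steps above, it can help to have a software reference to compare the re-imported hardware result against. The following C sketch is one such reference under stated assumptions: the function names are hypothetical, the window and kernels are stored row by row, the kernel is applied without flipping (which does not change the Sobel magnitude), and the magnitude is approximated as |Gx| + |Gy| clamped to 255.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Standard 3x3 Sobel kernels, stored row by row. */
static const int8_t SOBEL_X[9] = { -1, 0, 1, -2, 0, 2, -1, 0, 1 };
static const int8_t SOBEL_Y[9] = { -1, -2, -1, 0, 0, 0, 1, 2, 1 };

/* Multiply-accumulate of one 3x3 pixel window with one 3x3 kernel.
   This is the single-kernel operation the hardware module would perform. */
int32_t conv3x3(const uint8_t window[9], const int8_t kernel[9])
{
    assert(window && kernel);
    int32_t acc = 0;
    for (int i = 0; i < 9; i++) {
        acc += (int32_t)window[i] * kernel[i];
    }
    return acc;
}

/* Edge strength for the centre pixel of the window, using the
   |Gx| + |Gy| approximation (an assumption) clamped to 8 bits. */
uint8_t edge_pixel(const uint8_t window[9])
{
    int32_t gx = conv3x3(window, SOBEL_X);
    int32_t gy = conv3x3(window, SOBEL_Y);
    int32_t mag = abs(gx) + abs(gy);
    return (uint8_t)(mag > 255 ? 255 : mag);
}
```

A flat window gives 0, and a window whose right-hand column is brighter than the rest gives a non-zero value, so feeding the same windows to the hardware path and to `conv3x3` is a quick sanity check.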
References
[1] Edge detection on an angiographic image.
https://en.wikipedia.org/wiki/Edge_detection
[2] Bausys, R., Kazakeviciute-Januskeviciene, G., Cavallaro, F., & Usovaite, A. (2020). Algorithm selection for edge detection in satellite images by neutrosophic WASPAS method. Sustainability, 12(2), 548.
[3] Fundamentals of image gradients and edge detection
https://evergreenllc2020.medium.com/fundamentals-of-image-gradients-and-edge-detection-b093662ade1b