COMP2017 9017 Assignment 1
Due: 23:59 2 April 2025
1 Assignment 1 - [SEGfault SOUNDboard] - 10%
It is important that you continually back up your assignment files onto your own machine, flash drives, external hard drives and cloud storage providers (as private). You are encouraged to submit your assignment regularly while you are in the process of completing it.
Full reproduction steps (seed, description of what you tried) MUST be given if you are enquiring about a test failure or if you believe there is a bug in the marking script.
Academic Declaration
By submitting this assignment you declare the following: I declare that I have read and understood the University of Sydney Student Plagiarism: Academic Integrity Policy, Coursework Policy and Procedures, and except where specifically acknowledged, the work contained in this assignment/project is my own work, and has not been copied from other sources or been previously submitted for award or assessment.
I understand that failure to comply with the Student Plagiarism: Academic Integrity Policy, Coursework Policy and Procedures can lead to severe penalties as outlined under Chapter 8 of the University of Sydney By-Law 1999 (as amended). These penalties may be imposed in cases where any significant portion of my submitted work has been copied without proper acknowledgement from other sources, including published works, the Internet, Generative AI where approved, existing programs, the work of other students, or work previously submitted for other awards or assessments.
I acknowledge that I have reviewed and understood the University of Sydney’s guidelines on the responsible use of Generative AI 1 and will adhere to them in accordance with academic integrity policies.
I realise that I may be asked to identify those portions of the work contributed by me and required to demonstrate my knowledge of the relevant material by answering oral questions or by undertaking supplementary work, either written or in the tutorial, in order to arrive at the final assessment mark.
I acknowledge that the School of Computer Science, in assessing this assignment, may reproduce it entirely, may provide a copy to another member of faculty, and/or communicate a copy of this assignment to a plagiarism checking service or in-house computer program, and that a copy of the assignment may be maintained by the service or the School of Computer Science for the purpose of future plagiarism checking.
1 https://www.sydney.edu.au/students/academic-integrity/artificial-intelligence.html
2 Introduction
Editing audio involves various operations such as clipping, inserting, and moving. Clipping refers to selecting a portion of an audio file to keep or remove, inserting involves adding new portions at specific points, while moving would change a portion’s relative position in time.
A naive implementation supports these operations by moving or copying memory, which is inefficient. Instead, an audio editor’s backend should use a shared backing store, where multiple operations reference the same underlying data.
3 Task
4 Structure
The audio data is sourced from a WAV file. The entire WAV file is read and stored into a buffer.
A track is a data structure that copies a continuous region of the buffer. A track can represent the entire audio or specific parts.
Any number of tracks can be created, and each track can contain metadata that is useful to support the operations of this editor.
Each track is represented as an opaque data structure struct sound_seg that you must complete, according to the needs of your implementation. Each structure represents one audio track.
The audio editor exposes functions in section 5, which you will complete.
The functionalities of your program are divided into parts with varying levels of complexity. Each part has different requirements and is accompanied by specific assumptions. You should plan well for a particular level of achievement before coding. Writing helper functions is encouraged.
You are also required to answer the short questions described in section 6.
5 Functionality
5.1 Part 1: WAV file interaction, basic sound manipulation
Functions to interact between WAV files and a buffer
The wav_load() function reads raw audio samples from the specified WAV file fname and copies them into the destination buffer dest. The WAV file’s header is discarded during the loading process, leaving only the raw audio sample data in dest.
The wav_save() function creates or overwrites a WAV file, fname, using the audio samples provided in the source buffer src. The function constructs a valid WAV file, including the necessary header, and writes the audio samples to the file. Note: this function does not free the memory pointed to by src. You can find more about the WAV file format here.
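As an illustration of the header that wav_save() has to construct, the following is a minimal sketch only. It assumes the canonical 44-byte PCM header with mono, 16-bit samples; that layout, the sample rate passed in, and the helper name build_wav_header are assumptions for this example and are not specified in this section. Check the WAV format reference and the scaffold before relying on any of these values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44 /* canonical PCM header size (assumed layout) */

/* Store a 32-bit value in little-endian byte order. */
static void put_u32le(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Store a 16-bit value in little-endian byte order. */
static void put_u16le(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Hypothetical helper: fill hdr for n_samples mono 16-bit PCM samples. */
void build_wav_header(uint8_t hdr[WAV_HEADER_SIZE], size_t n_samples,
                      uint32_t sample_rate) {
    const uint16_t channels = 1;      /* assumption: mono */
    const uint16_t bits = 16;         /* matches int16_t samples */
    const uint16_t block_align = (uint16_t)(channels * (bits / 8));

    /* The RIFF size fields are 32-bit, so the data must fit. */
    assert(n_samples <= (UINT32_MAX - 36u) / block_align);
    const uint32_t data_bytes = (uint32_t)(n_samples * block_align);

    memcpy(hdr, "RIFF", 4);
    put_u32le(hdr + 4, 36u + data_bytes);          /* file size minus 8 */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_u32le(hdr + 16, 16);                       /* fmt chunk size */
    put_u16le(hdr + 20, 1);                        /* format 1 = PCM */
    put_u16le(hdr + 22, channels);
    put_u32le(hdr + 24, sample_rate);
    put_u32le(hdr + 28, sample_rate * block_align); /* byte rate */
    put_u16le(hdr + 32, block_align);
    put_u16le(hdr + 34, bits);
    memcpy(hdr + 36, "data", 4);
    put_u32le(hdr + 40, data_bytes);
}
```

Writing each field byte by byte keeps the output little-endian regardless of the host, which is why the sketch avoids writing a packed struct directly.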
Testing method: sanity
ASM 0.2: the provided path for wav_load(), wav_save() will always be valid. IO operations are always successful. dest will be large enough.
All other functions do not require reading a WAV file, and can be operated with int16_t arrays.
struct sound_seg* tr_init();
void tr_destroy(struct sound_seg* track);
tr_destroy() releases all associated resources and deallocates the heap allocated pointer to struct sound_seg.
Testing method: random
tr_length() will return the current number of samples contained within this track.
int16_t* dest, size_t pos, size_t len)
tr_read() copies len audio samples from position pos in the track data structure to a buffer dest. dest is an externally allocated buffer with guaranteed size of at least len.
const int16_t* src, size_t pos, size_t len)
tr_write() copies len audio samples from a buffer, src, to the specified position pos within the data structure track. Any previous data stored in the track for the range of pos to pos+len is overwritten.
If the number of audio samples to be written to the track extends beyond the length of the track, the track’s length is extended to accommodate the new data. Thus, a sequence of wav_load(), tr_init(), and tr_write() effectively transfers a WAV file to a track.
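The read, overwrite and extend behaviour described above can be sketched on a single growable array. This is a simplified illustration, not the required design: struct demo_track, demo_write and demo_read are hypothetical names, and the sketch ignores the shared backing store needed from 5.3 onwards.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a track: one contiguous heap buffer. */
struct demo_track {
    int16_t *data;
    size_t len; /* number of samples currently in the track */
};

/* Copy len samples from src into the track at pos, extending the track
 * when the write runs past its end. Returns 0 on success, -1 if the
 * allocation fails (the track is left unchanged in that case). */
int demo_write(struct demo_track *t, const int16_t *src, size_t pos,
               size_t len) {
    assert(pos <= t->len); /* start position may equal the length */
    if (len == 0) {
        return 0;
    }
    if (pos + len > t->len) {
        int16_t *grown = realloc(t->data, (pos + len) * sizeof *grown);
        if (grown == NULL) {
            return -1;
        }
        t->data = grown;
        t->len = pos + len;
    }
    memcpy(t->data + pos, src, len * sizeof *src);
    return 0;
}

/* Copy len samples starting at pos into the caller's buffer dest. */
void demo_read(const struct demo_track *t, int16_t *dest, size_t pos,
               size_t len) {
    assert(pos <= t->len && len <= t->len - pos);
    if (len > 0) {
        memcpy(dest, t->data + pos, len * sizeof *dest);
    }
}
```

For example, writing {1, 2, 3, 4} at position 0 and then {9, 8, 7} at position 2 leaves a track of length 5 holding 1, 2, 9, 8, 7.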
An ordering requirement when performing writes is to always write to lower indices before higher ones. This is only relevant for 5.3 onwards.
Testing method: sanity
Testing method: random
You should make the functionality of tr_init(), tr_destroy(), tr_length(), tr_read(), tr_write() your first priority. As the marking script uses them to check the behaviour of other
bool tr_delete_range(struct sound_seg* track, size_t pos, size_t len)
Note: Samples removed by tr_delete_range() do not necessarily have to be freed from memory immediately, but should be freed when tr_destroy() is called.
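A minimal sketch of range deletion on a flat array follows. It is an illustration under assumptions: demo_delete_range is a hypothetical helper, returning false for an out-of-range request is a choice made for this example, and the parent/child check that applies from 5.3 onwards is not modelled. Consistent with the note above, the sketch closes the gap with memmove and does not shrink the allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Remove len samples starting at pos from a flat sample array.
 * *track_len is updated on success. The buffer is not reallocated,
 * so the removed tail stays allocated until the owner frees it. */
bool demo_delete_range(int16_t *data, size_t *track_len, size_t pos,
                       size_t len) {
    assert(data != NULL && track_len != NULL);
    if (pos > *track_len || len > *track_len - pos) {
        return false; /* range does not fit inside the track */
    }
    size_t tail = *track_len - pos - len; /* samples after the range */
    memmove(data + pos, data + pos + len, tail * sizeof *data);
    *track_len -= len;
    return true;
}
```

memmove is used rather than memcpy because the source and destination regions overlap whenever the tail is longer than the deleted range.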
5.2 Part 2: Identify advertisements
In the modern world, audio media is often accompanied by an advertisement (ad). This is unwanted noise and we do not accept this. You will identify and remove these ads using Cross Correlation.
You are to create a function to search for the existence and locations of an ad within a target track.
const struct sound_seg* ad)
Functionality is tested by directly overwriting portions of the target with copies of the ad, ensuring identical amplitudes. The ads will always have the same amplitude and there is no scaling needed.
Functionality is tested by copying multiple ads over the target, with their amplitude values summed. Similarity is quantified by comparing the cross correlation of the overwritten portion and the ad against the ad’s autocorrelation (its cross correlation with itself) at zero relative time delay. The zero-delay autocorrelation is the reference and represents a 100% match. A portion is said to match if its correlation with the ad is at least 95% of the reference.
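The 95% rule above can be sketched as a comparison between two sums of products. This is a hedged illustration on raw int16_t arrays: correlate_at and demo_matches_at are hypothetical names, the use of double for accumulation is a choice made here, and the sketch only checks a single offset rather than scanning the whole target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of products of ad[i] and signal[offset + i] for i in [0, ad_len). */
static double correlate_at(const int16_t *signal, const int16_t *ad,
                           size_t ad_len, size_t offset) {
    double sum = 0.0;
    for (size_t i = 0; i < ad_len; i++) {
        sum += (double)signal[offset + i] * (double)ad[i];
    }
    return sum;
}

/* True if the target portion starting at offset correlates with the ad
 * at 95% or more of the ad's zero-delay autocorrelation. */
bool demo_matches_at(const int16_t *target, size_t target_len,
                     const int16_t *ad, size_t ad_len, size_t offset) {
    assert(ad_len > 0);
    if (offset > target_len || ad_len > target_len - offset) {
        return false; /* the ad does not fit at this offset */
    }
    double reference = correlate_at(ad, ad, ad_len, 0); /* 100% match */
    double score = correlate_at(target, ad, ad_len, offset);
    return reference > 0.0 && score >= 0.95 * reference;
}
```

With ad = {3, -4, 5} the reference is 50, so a target of {0, 0, 3, -4, 5, 0} matches at offset 2 (score 50) and at no other offset. Products of two int16_t values can reach about 2^30, so a wide accumulator is needed; a 64-bit integer would work equally well.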
The return method for tr_identify() function is poorly designed. You may be asked to address this issue with an explanation. See section 6 for more details.
Part 2 Checklist
5.3 Part 3: Complex insertions
tr_insert() performs a logical insertion of a portion from a source track into a destination track.
After insertion, dest_track’s data before destpos remains unchanged, followed by the inserted portion, and then the remaining original data from dest_track.
Note: This function is conceptually the inverse of delete_range().
A tr_insert() operation results in a parent-to-child relationship. The parent (src) and the child (dest) portions should have a shared backing store, so the data need only be stored once, saving memory. Further tr_insert() operations performed on the parent or child similarly extend this shared backing, such that a tr_write() to one sample in a portion of one track could result in changes across many other tracks. As tr_delete_range() and future tr_insert() calls do not change track data but track structure, their changes are not propagated.
Note: for self-insertions, the portion is determined at the time tr_insert() is called, before the portion is inserted. Thus, inserting a portion into oneself is well-defined.
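One way to picture a shared backing store is a track made of segments, each a view into sample memory that other tracks may also reference. The sketch below is an assumption-laden illustration, not the expected design: struct demo_seg, struct demo_view_track, demo_sample_at and demo_insert_view are hypothetical, the segment array is fixed-size for brevity, the inserted portion is assumed to be one contiguous run of samples, and no parent/child bookkeeping is kept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_SEGS 16

/* A view into sample memory that may be shared between tracks. */
struct demo_seg {
    int16_t *base;
    size_t len;
};

struct demo_view_track {
    struct demo_seg segs[DEMO_MAX_SEGS];
    size_t nsegs;
};

/* Pointer to the sample at a logical index, or NULL if out of range. */
int16_t *demo_sample_at(struct demo_view_track *t, size_t index) {
    for (size_t i = 0; i < t->nsegs; i++) {
        if (index < t->segs[i].len) {
            return t->segs[i].base + index;
        }
        index -= t->segs[i].len;
    }
    return NULL;
}

/* Ensure a segment boundary exists at logical position pos and return
 * the index of the segment that starts there (nsegs if pos is the end). */
static size_t split_at(struct demo_view_track *t, size_t pos) {
    size_t i = 0;
    while (i < t->nsegs && pos >= t->segs[i].len) {
        pos -= t->segs[i].len;
        i++;
    }
    if (pos == 0) {
        return i; /* already on a boundary */
    }
    assert(i < t->nsegs && t->nsegs < DEMO_MAX_SEGS);
    memmove(&t->segs[i + 2], &t->segs[i + 1],
            (t->nsegs - i - 1) * sizeof t->segs[0]);
    t->segs[i + 1].base = t->segs[i].base + pos;
    t->segs[i + 1].len = t->segs[i].len - pos;
    t->segs[i].len = pos;
    t->nsegs++;
    return i + 1;
}

/* Logically insert len shared samples at destpos without copying them. */
void demo_insert_view(struct demo_view_track *dest, size_t destpos,
                      int16_t *shared, size_t len) {
    size_t i = split_at(dest, destpos);
    assert(dest->nsegs < DEMO_MAX_SEGS);
    memmove(&dest->segs[i + 1], &dest->segs[i],
            (dest->nsegs - i) * sizeof dest->segs[0]);
    dest->segs[i].base = shared;
    dest->segs[i].len = len;
    dest->nsegs++;
}
```

Because the inserted segment points at the parent’s samples, a write through the child is visible in the parent. The source pointer is resolved before the destination is split, which mirrors the rule that the portion is determined when the insert is called.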
Testing method: random*. Due to complexity of this function, extra tiered restrictions have been laid out - you may find that they significantly decrease programming complexity:
Part 3 Checklist
5.4 [COMP9017 ONLY] Part 4: Cleaning Up
• Pj is no longer a child.
• Pi is no longer a parent if it does not have other children.
Consider tracks A, B, C, D, E with a shared portion between them and the corresponding parent->child relationships as A->B, B->C, C->D, A->E. If tr_resolve was called on {B, C}, then after calling tr_resolve():
• B->C no longer exists.
• A->B still exists, as A was not provided. By similar logic, C->D also exists.
• A->E still exists, as neither A nor E were provided.
• The portion in B can now be delete_range’d, as it is no longer a parent.
• A is a parent maintaining the portion (as before).
• C becomes a parent maintaining the portion (duplicated as a result of breaking from B).
tr_resolve() has now effectively split the shared backing store into two: the portions in A, B and E share one, and the portions in C and D share the other.
REQ 6.1: tr_resolve() removes every direct parent-child relationship if the list provided contains both parent and child.
5.5 Performance
Random testcases for tr_insert() enforce a max dynamic memory usage.
5.6 Global assumptions
ASM 7.2: The starting position for tr_write() and destpos for tr_insert() ranges from 0 to the target track length, inclusive.
6 Short answer questions
7 Marking
7.1 Compilation requirements
Using the make program, your submission should compile into an object file, which the user/marker will utilise.
Your submission must produce an object file named sound_seg.o using the command make sound_seg.o. The marking script will compile this into a shared library to be used. Thus, the flag -fPIC must be added.
You are free to (and encouraged to) add extra build rules and functions for your local testing, such as a main function or debug flags. ASAN is encouraged during local testing, and will be automatically added to your final submission.
When marking, your code will be compiled and run entirely on the Ed workspace. The marker will run the aforementioned make commands to compile your program and run the executable. If it does not compile on the environment, then your code will receive no marks for your attempt. When submitting your work, ensure that any binary files generated are not pushed onto the repository.
7.2 Test structure
• Creating temporary data,
• Orchestrating calling of functions,
• Comparing returned data with expected values.
7.3 Seeded testcases
If a failure is reached, the marking script attempts to return the input set that caused the failure, which you can use locally to debug. Additionally, you are also able to manipulate random testcases for your own testing - details have been provided in the EdStem lesson.
For each random testcase in a submission, the seed used is included in the feedback section and can be used to deterministically regenerate inputs. During the marking phase, a predetermined set (15+) of seeds will be used and the percentage passed will become your final mark for a specific test. The assignment EdStem lesson provides more details for configuring random testcases.
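The idea of regenerating inputs from a seed can be sketched as follows. This is purely illustrative: the marking script’s actual generator is not described here, and demo_fill_from_seed with its small linear congruential generator is a hypothetical stand-in. The point is only that the same seed always yields the same samples, which is what makes a reported failure reproducible locally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of a 32-bit linear congruential generator. */
static uint32_t lcg_next(uint32_t *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Fill buf with n pseudo-random samples fully determined by seed. */
void demo_fill_from_seed(int16_t *buf, size_t n, uint32_t seed) {
    assert(buf != NULL || n == 0);
    uint32_t state = seed;
    for (size_t i = 0; i < n; i++) {
        /* Use the high 16 bits and recentre them into the int16_t range. */
        int32_t centred = (int32_t)(lcg_next(&state) >> 16) - 32768;
        buf[i] = (int16_t)centred;
    }
}
```

Keeping the generator state local, rather than relying on global state, means two calls with the same seed cannot interfere with each other.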
From rudimentary analysis, passing insert_no_overlap_*_random for a single seed implies you will also pass 95% of other seeds, and passing other random tests for a single seed implies 99+%. If you only submit once and all 7 random testcases pass, you would expect a HD mark with very low variance. Submitting more than once, and thus testing using multiple seeds, greatly increases this confidence level; but even if you only submit once, the confidence of passing the reserved seeds far exceeds the confidence of passing a private testcase if only a static testcase is used.
All final test inputs will be posted after 17 April.
7.4 Marking criteria
| Marks | Item | Notes |
|---|---|---|
| 3/20 | Code Style | Manual marking |
| 5/20 | 5.1 Correctness | Automatic tests |
| 4/20 | 5.2 Correctness | Automatic tests |
| 8/20 | 5.3 Correctness | Automatic tests |
[COMP9017 ONLY] 9017 students will have their above marks scaled by 0.9. 5.4 Correctness counts for 2/20.
7.5 Restrictions
• The code must entirely be written in the C programming language.
• Must use dynamic memory for tracks.
• Free all dynamic memory that is used.
• NOT use any external libraries other than those in libc.
• NOT use VLAs.
• NOT have unclean repositories. This means no object, executable, or temporary files for any commit in the repository, just your final submission.
• Only include header files that contain declarations of functions and extern variables. Do not define functions within header files.
• Must use meaningful commits and meaningful comments on commits.
• Other restricted functions may come at a later date.
• Any and all comments must be written only in the English language.
• NOT manually use return code 42, reserved by ASAN.
The red flag items below will result in an immediate zero. Negative marks can be assigned if you do not follow the spec or if your code is unnecessarily or deliberately obfuscated:
• Any attempts to deceive or disrupt the marking system.
• Use any of the below functions. You shouldn’t need to use these functions at all in your program, and you are doing something terribly wrong if you are.
– _init, atexit(2), _exit(2), _Exit(3)
– dlopen(3), dlsym(3), dlclose(3)
– fork(2), vfork(2), execve(2), exec*(3), clone(2)
– kill(2), tkill(2), tgkill(2)
– getpid(2), getppid(2), ptrace(2), getpgrp(2), setpgrp(2)
8 Submission Checklist
• Submission has a valid makefile with the rule sound_seg.o and compiles.
• Reviewed all restrictions (not all are automatically checked).
• Program is organised into multiple source and header files (for larger programs).
• Not include any object file, binary, or junk data in your git repo.
• If you have used AI, references.zip formatted according to EdStem slides submitted with source code.
Glossary
child A portion that has been inserted from another part of a track. The portion is the child to the portion that it was copied from. A sample may only belong to one parent. Writes to the child must be reflected in the parent.
sanity A directed testcase targeting a specific functionality. For example, a sanity test for REQ 2.1 may be to create a track, write into it, modify the original buffer, then verify that the buffer and the track contents are different. Randomness may still be involved.
shared backing store A shared backing store is a memory management technique where multiple references to the same underlying data are used instead of copying or moving memory.
9 Appendix
9.1 Worked function example
Figure 1: This example uses two tracks. They are created and filled using a sequence of tr_init, and tr_write of data.
Figure 2: Either track can be extended via a call to tr_write. By calling write on the end of the parent, new data is effectively concatenated.
Figure 3: The initial insert extracts a portion s1 from the parent, and places the portion into the child, also extending it. Due to shared backing store, there is a logical relationship between s1 and d1.
Figure 4: A second, overlapping insert occurs, placing d2 before d1. Note that 1) while the child is extended and indices for d1 changed, the logical relationship remains. 2) the overlapping samples of s1 and s2 mean that parts d1 and d2 (highlighted in purple), even though unrelated, also share samples.
Figure 5: tr_delete_range will fail if any of the specified samples is a parent (in this case, s1 and s2). Child samples such as a part of d2 can still be deleted (the command deletes the last 5 samples of d2, and 5 samples after the end, for 10 total). Because d2 no longer contains the last 5 samples, the last 5 samples of s2 (in red) also stop being a parent; there is no immediate change, but those samples can now be deleted. Again, notice how the indices for d1 were shifted without impacting the parent-child relationship.