Hello, if you have any need, please feel free to consult us, this is my wechat: wx91due
MATH1005/MATH6005 Written Assignment 2: Error detecting codes
April 2024
Most communication channels are subject to noise that creates errors. We say the channel is unreliable. If you have played the game of “telephone”, also known as “whis- pers”, you have seen an extreme example of the errors that are brought into communi- cation via unreliable channels. One needs to realize that almost all of the channels of communication we use are unreliable.
Why are channels unreliable? Communication channels are subject to interference. Even when communication takes place using highly engineered equipment such as satellite and microwave transmission, signals fade, while weather and other disturbances occur between the transmitter and receiver. For example, here is a website that discusses the effects that heavy rain and snow, ice and water accumulation can have on a satellite TV reception: http://dish-cable. com/rain_fade. htm
Unreliable channels are a problem because we need communication to be reliable! One way to create reliable communication over unreliable channels is to add redundant information to the message in a clever way so that one can say either:(1) this message has definitely been corrupted and should not be acted upon; or
(2) there is no evidence that this message has been corrupted so it is probably an accurate record of what was sent.
What we are describing is an error detecting code.
1 Error Detecting Codes
Most error detecting codes work as follows. Let A denote the set of all possible messages. Let C denote another nonempty set. Let h : A 一 C denote a function called a hash function. When a sender wishes to transmit a message m e A, they instead transmit a pair (m,c), and element of A X C, with h(m) = c. So the sender sends the message and something else which is computed from the message. The recipient receives a transmission (m( , c( ). If the transmission went well, then m = m( and c( = h(m); but the recipient cannot know mto check this. The recipient can only compute h(m( ) and check it against c( . If h(m( ) c( , then the recipient knows that something definitely went wrong in the transmission; in this case, the recipient should not act upon the message (they may instead ask for a retransmission—an automatic repeat request might be built into the communications protocol). In the case that h(m( ) = c( , no error is detected. This does not necessarily mean that m = m( . Instead it means that one of the following has occurred:
1. m = m( and c = c(
2. m m( and c = c( and h(m) = h(m( )
3. m m( and c c( and h(m( ) = c(
The first case is the desirable scenario that the message was transmitted accurately and sender has no reason to doubt its accuracy. The second and third cases are scenarios in which the message contains errors, but the recipient has no reason to suspect this. These are dangerous, so we would like to minimize the occurrences of the second and third cases. The likelihood of the second case depends on how many different messages map to the same “hash code”. The likelihood of the the third case also depends on how many different message maps to the same hash-code, and how random-like the function h is. An excellent choice of h makes the second and third possibilities unlikely, while keeping C small so that h(m) is small for every message, and so that the transmissions are not much longer than the messages.
2 ISBN-10
Notational Convention: In what follows, for any positive integer d and any integers a,b, we shall write a b as alternate notation for a b ( mod d). Thus a b means d that b — a is divisible by d (equivalently, a and b leave the same remainder upon division by d).
An ISBN-10 is a 10-digit code assigned to a publication (books etc). Its use was standard in the publishing industry between 1970 and 2007. The first 9 characters of an ISBN-10 code are digits that uniquely identify the publication (the title, author, edition etc). The tenth character, which is either a digit or the letter X, is a check character used to detect errors in transmission or recording.
More formally, let X = 10 and
D = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
C = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, X},
A = X D X · · · .
9 times
We define a function h : A 一 C by the rule
A(a1 ,..., a9 ) e A h((a1 ,... a9 ))
iai.
When the ISBN-10 is written, no parentheses or commas are included. Often hyphens are used (for reasons we may ignore).
For example, the paperback third edition of Herstein’s “Abstract Algebra” was assigned the ISBN-10 code 0471368792. In this case we have
a1 = 0, a2 = 4, a3 = 7, a4 = 1, a5 = 3, a6 = 6, a7 = 8, a8 = 7, a9 = 9.
Then
iai = 1(0) + 2(4) + 3(7) + 4(1) + 5(3) + 6(6) + 7(8) + 8(7) + 9(9)
= 277 = 25(11) + 2.
It follows that that the message (9 digits that uniquely identify the publication)is m = (0, 4, 7, 1, 3, 6, 8, 7, 9), the hash-code (something computed from the message) is h(m) = 2 and the transmission (the ISBN 10) is the concatenation of these with the parentheses omitted.
Definition 1. An element (a1 ,..., a9 , c) ∈ D × ... D × C is a valid ISBN-10 if
c ≡
11
iai.
Lemma 2. An element (a1 ,..., a9 , a10 ) ∈ D × ... D × C is a valid ISBN-10 if and only
if
iai 0.
Proof. Let (a1 ,..., a9 , a10 ) e D X ...D X C. Then
(a1 ,..., a9 , a10 ) (is a valid ISBN-10)
a10 11() iai (using the definition above)
0 11()( iai)- (a10 )
011()( iai)+ (-1)a10
0 11() iai)+ 10a10 (because 10 11(三) -1)
0 11() iai.
Theorem 3 (The ISBN-10 detects any transmission error in which exactly one character
is changed). Let m = (a1 ,... a9 , a10 ) and m\ = (a,..., a, a) be elements of DX ... DX
C. If m is a valid ISBN-10, and m and m\ differ in exactly one component, then m\ is not a valid ISBN-10.
Proof. Suppose that misa valid ISBN-10, and m and m\ differ in exactly one component.
Then there exists an integer j such that 1 j 10 and a aj and a = ak for all
integers k such that 1 三 k 三 10 and k j.
By Lemma 2, to show that m\ is not a valid ISBN-10 it suffices to show that 0 三1\1 ia .
Since a d() b if and only if d divides b - a, it suffices to show that ia is not divisible
by 11. Now
ia 11() ia — 0
11() ia — iai (by Lemma 2, because m is a valid ISBN-10)
11() i(a — ai ) (using algebraic properties of summation)
11()j(a — aj ) (because ak = a when k j).
Since 1 三 j 三 10, 11 does not divide j. Since 0 三 aj 三 10 and 0 三 a 三 10, — 10 三 aj — a 三 10. Since aj a , aj — a 0. Hence
aj — a e {— 10, . . . , — 1} n {1, . . . 10},
and 11 does not divide aj — a. By the Prime Divisibility Property, 11 does not divide j(aj — a). Hence m\ is not a valid ISBN-10.
Theorem 4. If any two distinct digits in a valid ISBN-10 are transposed, then the resulting string is not a valid ISBN-10.
roof. Let (a1 ,..., a10 ), (b1 ,..., b10 ) e D X ··· X D X C. Suppose that (a1 ,..., a10 ) is a valid ISBN-10 and (a,..., a) is obtained from (a1 ,..., a10 ) by transposing two distinct digits. Then there exist integers j,k such that 1 三 j < k 三 10 and a = ak and a = aj
and a = ai for all integers i such that 1 三 i 三 10 and i j,k. The fact that the digits
transposed are distinct means that aj ak .
By Lemma 2, to show that (a,..., a) is not a valid ISBN-10 it suffices to show that
10 10
0 三1\1 ia . Since a d(三) b if and only if d divides b — a, it suffices to show that ia is
not divisible by 11. Now
ia — 0
ia — iai
i(a — ai )
j(a — aj ) + k(a
j(ak — aj ) + k(aj
(k — j)(aj — ak )
— ak )
— ak )
(by Lemma 2, because m is a valid ISBN-10)
(using algebraic properties of summation)
(because ai = a when i j, k).
(because a = ak and a = aj ).
Since 1 三 j < k 三 10, we have that 1 三 k — j 三 9; it follows that 11 does not divide k — j. Since 1 aj , ak 10 and aj ak , we have that — 10 aj — ak 10 and aj — ak 0; it follows that 11 does not divide aj — ak . By the Prime Divisibility Property, 11 does not
divide (k — j)(aj — ak ). HenceΣi(1)1 ia is not divisible by 11. II
3 ISBN-13
An ISBN-13 is a 13-digit code assigned to a publication (books etc). It replaced the ISBN-10 on January 1, 2007. The first 12 characters of an ISBN-13 code are digits that uniquely identify the publication (the title, author, edition etc). The thirteenth character is a check digit used to detect errors in transmission or recording. More formally, let D = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. An element (a1 ,..., a13 ) e D X ··· X D is a valid ISBN-13
if
a1 + 3a2 + a3 + 3a4 + ··· + a11 + 3a12 + a13 三 0.
10
We usually write an ISBN-13 as a string of digits, rather than a 13-tuple of digits.
Theorem 5. If one digit in a valid ISBN-13 is changed, then the resulting string of 13 digits is not a valid ISBN-13.
Proof. Let m = (a1 ,..., a13 ) and m( = (a,..., a) be 13-tuples of digits. Suppose that m is a valid ISBN-13 and m( is obtained from m by changing one entry. Then there
exists j e {1, . . . , 13} such that a aj and a = ak for all k j.
To show that m( is not a valid ISBN-13 we must show that
0 =1\0 a + 3a + a + 3a + ... 3a + a .
Since a d(=) b if and only if d divides b - a, it suffices to show that
a + 3a + a + 3a + ... 3a + a
is not divisible by 10. Now
a + 3a + a + 3a + ... 3a + a - 0
a + 3a + a + 3a + ... 3a + a - (a1 + 3a2 + a3 + 3a4 + ... 3a12 + a13 )
(because m is a valid ISBN-13)
(a - a1 ) + 3(a - a2 ) + ... 3(a - a12 ) + (a - a13 )
3 - a(aj)j ) if(if)j(j) is(is) eve(od)n(d)
(because ak = a when k j).
Consider first the case that j is odd. Since 0 三 aj 三 9 and 0 三 a 三 9, -9 三 a - aj 三 9. Since aj a , a - aj 0. Hence
a - aj e {-9, . . . , -1} n {1, . . . 9},
and 10 does not divide a - aj . Thus we have that a + 3a + a + 3a + ... 3a + a is
not divisible by 10.
Now consider the case that j is even. Since 0 三 aj 三 9 and 0 三 a 三 9, -9 三 a -aj 三
9. Since aj a , a 一 aj 0. Hence
a 一 aj e {一9). . . )一1} n {1). . . 9}.
It follows that
3(a 一 aj ) e {一27)一24 . . . )一3} n {3)6). . . 27})
and 10 does not divide 3(a 一 aj ). Thus we have that a + 3a + a + 3a + ... 3a + a is not divisible by 10.
Unfortunately, the ISBN-13 error detecting code does not always detect the transpo- sition of distinct digits. We note that both 1020000000007 and 2010000000007 are valid ISBN-13’s; perhaps more alarmingly, we note that both 160000000001 and 610000000001
are valid ISBN-13’s.
4 The Task: TrySBN-16
Imagine that publications are now to receive 15-digit identification numbers. Publishers agree to add a 16th character which will be an error detecting code. Invent an error detecting code scheme, called TrySBN-16, that is guaranteed to detect:
(T1) errors in which exactly one character is wrong; and
(T2) errors in which two distinct characters are transposed.
In no more than 6 pages (comprising a memorandum of at most 3 pages together with a technical appendix of at most three pages), describe your code scheme and persuade the leaders of the publishing industry that your scheme achieves its aims, and is a good choice for their industry.
. An excellent memorandum will be three pages (or less) of typeset text that gives some context of the scheme, describes the scheme with precision but without assuming technical expertise of the reader, and persuades the reader that the scheme is a good choice for the publishing industry. The audience for your memorandum comprises well-educated, but not necessarily technically-trained, professionals in the publishing industry. You should make appropriate formatting choices.
. An excellent technical appendix will be three pages (or less) of typeset text that proves that the error detection scheme will detect errors of type (T1) and (T2), and give examples of errors that will not be detected by your code scheme. The audience for your technical appendix is familiar with congruence arithmetic and with the details of ISBN-10 and ISBN-13, but not with your scheme. You should mak appropriate formatting choices.
You must cite all sources used. You may NOT use AI in preparing this assignment. You will submit your assignment in two ways. You will submit one file, comprising both the memorandum and the technical appendix, through gradescope. You will also submit the memorandum (without the technical appendix) through a Wattle portal that will run your submission through TurnItIn.