Math549 Coding Theory and Cryptography
The Entropy and Redundancy of Language
Due date: May 9th, 11:59pm
The following rudimentary language has been developed:
• There are 20 characters.
• Messages consist of singletons, digrams and trigrams only.
• For any message, there is a 0.5 probability it will be a singleton, a 0.3 probability it will be a digram, and a 0.2 probability it will be a trigram.
• Among singleton messages: there are 10 minor characters and 10 major characters. The probability a major character occurs is 3 times as much as the probability a minor character occurs. So if the probability a minor character occurs is r, then the probability a major character occurs is 3r. All major characters are equally likely and all minor characters are equally likely.
• Among digram messages: let p be the probability of a digram occurring which consists of two minor characters. Then the probability of a digram with two major characters occurring is 3p and the probability of a digram with one minor and one major character occurring is 2p.
• Among trigram messages: all trigrams that occur include at least one major character. Let q be the probability of a trigram occurring which contains exactly one major character. Then the probability of a trigram containing two major characters is 3q and the probability of a trigram containing three major characters is 6q. Note that this means the probability that a trigram occurs consisting of all minor characters is zero.
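As an illustration of the counting that this description implies, the following is a minimal sketch that enumerates every message of a given length and groups the messages by how many major characters they contain. The character names (m0..m9, M0..M9) and the helper count_by_majors are hypothetical, and the sketch assumes that the order of characters matters, so a minor-major digram and a major-minor digram count as different messages.

```python
from collections import Counter
from itertools import product

# Hypothetical labels: 10 minor characters and 10 major characters.
MINOR = [f"m{i}" for i in range(10)]
MAJOR = [f"M{i}" for i in range(10)]
ALPHABET = MINOR + MAJOR
MAJOR_SET = set(MAJOR)


def count_by_majors(length):
    """Count ordered messages of the given length, keyed by number of major characters."""
    counts = Counter()
    for message in product(ALPHABET, repeat=length):
        counts[sum(ch in MAJOR_SET for ch in message)] += 1
    return dict(counts)


singletons = count_by_majors(1)
digrams = count_by_majors(2)
trigrams = count_by_majors(3)

print("singletons:", singletons)
print("digrams:   ", digrams)
print("trigrams:  ", trigrams)
```

The key 0 in the trigram result corresponds to the all-minor trigrams, which the language assigns probability zero.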
Using the above facts, calculate the Entropy and the Redundancy of the language. Show, and explain, all working. You will need to do the following steps:
• Calculate p, q and r. To do this you’ll need to work out how many messages there are of each type: how many digrams consist of two minor characters, how many of one major and one minor character, and how many of two major characters. Likewise for trigrams.
• Using the values of p, q and r, you can now determine the entropy of this language.
• The absolute rate of language is exactly the same as the maximum possible entropy, and to calculate that you need to know how many possible messages there are. (See your first step.)
• To calculate the redundancy of the language, you should do it directly as a comparison of entropies, as the rate of language for our language is skewed by the fact we’re allowing three different lengths. (Rate of language is generally estimated by taking a fixed length of the language, the longer the more accurate; this allows a direct comparison against the absolute rate of language for that length.)
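The steps above can be sketched as follows. This is only one possible reading of the problem, not a model answer, and it rests on three assumptions that you should check against the notes: (1) r, p and q are unconditional probabilities of an individual message, so the messages of each length sum to 0.5, 0.3 and 0.2 respectively; (2) the order of characters matters when counting messages; (3) the "possible messages" used for the maximum entropy exclude the all-minor trigrams, since they never occur. All variable names are hypothetical.

```python
from math import log2

# Assumption (1): per-message probabilities, with each length class summing
# to its stated share (0.5 singletons, 0.3 digrams, 0.2 trigrams).
r = 0.5 / (10 * 1 + 10 * 3)               # 10 minor at r, 10 major at 3r
p = 0.3 / (100 * 1 + 200 * 2 + 100 * 3)   # minor-minor p, mixed 2p, major-major 3p
q = 0.2 / (3000 * 1 + 3000 * 3 + 1000 * 6)  # one major q, two 3q, three 6q

# Each entry is (number of messages in the class, probability of each message).
classes = [
    (10, r), (10, 3 * r),
    (100, p), (200, 2 * p), (100, 3 * p),
    (3000, q), (3000, 3 * q), (1000, 6 * q),
]

total_probability = sum(n * pr for n, pr in classes)

# Shannon entropy in bits: every message in a class contributes -pr * log2(pr).
entropy = -sum(n * pr * log2(pr) for n, pr in classes)

# Assumption (3): only messages with non-zero probability count as possible.
n_messages = sum(n for n, _ in classes)
max_entropy = log2(n_messages)

# Redundancy as a comparison of entropies, expressed as a fraction of the maximum.
redundancy = (max_entropy - entropy) / max_entropy

print(f"r = {r}, p = {p}, q = {q}")
print(f"total probability = {total_probability}")
print(f"entropy = {entropy:.4f} bits, maximum = {max_entropy:.4f} bits")
print(f"redundancy = {100 * redundancy:.2f}%")
```

If you read the probabilities as conditional on message length, or count the all-minor trigrams among the possible messages, the numbers change, so state whichever interpretation you adopt in your working.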
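Thus, to calculate the redundancy, you need to calculate
(H_max - H) / H_max
as a percentage, where H is the entropy of the language and H_max is the maximum possible entropy.
Important points: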
• You shouldn’t need any references (beyond the notes), but if you do use other sources, please cite them. If you do not use any other source, you should state that as well.
All submissions must be in a single PDF file through Canvas before the deadline.
Assessment Value: 40 points