Hello Dee296,
With the values you provided and following the excellent guide at
http://www.cs.duke.edu/csed/poop/huff/info/
the huffman coding I came up with is:
X1: 0
X2: 11
X3: 101
X4: 1001
X5: 10001
X6: 100001
X7: 100000
I build up the tree staring with the least frequent values as follows:
Note: (A , B) refers to the one / zero bit values for the left / right
branches of the tree.
(X6, X7), total frequency 0.020
(X5, (X6 , X7)), total frequency 0.050
(X4, (X5, (X6, X7))), total frequency 0.150
(X3, (X4, (X5, (X6, X7)))), total frequency 0.350
at this point, there are three "trees" in the forest remaining, the
one above at frequency 0.350, X1 at 0.350 and X2 at 0.300. I chose to
match the big tree with X2 but matching X2 with X1 works as well. With
my choice, I get
(X2, (X3, (X4, (X5, (X6, X7))))), total frequency 0.650
and
((X2, (X3, (X4, (X5, (X6, X7))))), X1), total frequency 1.000
Walking the tree (left is zero, right is one) provides the encodings listed above.
According to the source code referred to at
http://www.nist.gov/dads/HTML/huffmanEncoding.html
or more directly at
http://www.cs.mu.oz.au/~alistair/inplace.c
the calculation of entropy is
sum (-p(i) * ln2 (p(i)))
where p(i) is the probability for each symbol (1 through 7 in this
case). Using that formula I get
entropy = 2.128638214
which is close to the computed "average number of bits"
sum ( p(i) * lenght of encoded symbol i)
or
average bits = 2.22
I found this information using the search phrase
huffman coding
and following the most promising references.
If this is some homework assignment - I strongly suggest you check the
result I provided carefully. The algorithm I referred to may not be
exactly the same as expected by your professor and I assumed the term
"upper branch" refers to the branch with higher probability and "lower
branch" refers to the branch with lower probability.
If you need some additional information or if you find the answer
unclear, please make a clarification request prior to your deadline so
I can provide some further explanation.
--Maniac |