Problems.P50

Description

Part of Ninety-Nine Haskell Problems. Some solutions are in Solutions.P50.

Synopsis

# Documentation

huffman :: [(Char, Int)] -> [(Char, String)] Source #

Given a list of characters and their number of occurrences, construct a list of the characters and their Huffman encoding.

In the encoding, '0' and '1' will denote the 0 and 1 bits.

### Examples

>>> huffman [('a',45),('b',13),('c',12),('d',16),('e',9),('f',5)]
[('a',"0"),('b',"101"),('c',"100"),('d',"111"),('e',"1101"),('f',"1100")]


The encoding table computed by huffman can be used to compress data:

>>> length $encodeHuffman (huffman$ countCharacters text) text
3552


Compare this to the length of a fixed-length 5-bit encoding:

>>> length $encodeHuffman loweralpha text 4375  or the length of the more standard ASCII encoding with 8 bits: >>> length$ encodeHuffman ascii text
7000


Huffman coding is unambiguous, so we can get back the original text:

>>> let table = huffman \$ countCharacters text
>>> let encodedText = encodeHuffman table text
>>> let decodedText = decodeHuffman table encodedText
>>> decodedText == text
True


# Supporting functions

The functions below are not part of the problem. Instead, they are used to illustrate the use of Huffman coding.

countCharacters :: String -> [(Char, Int)] Source #

Count the number of occurrences of a character in a string.

encodeHuffman :: [(Char, String)] -> String -> String Source #

Given an encoding table and a string, encode the string.

While this is intended for use in illustrating Huffman coding, it is not limited to such. In particular, it can encode with a fixed-length encoding table.

decodeHuffman :: [(Char, String)] -> String -> String Source #

Given an encoding table and a string, decode the string.

While this is intended for use in illustrating Huffman coding, it is not limited to such. In particular, it can decode with a fixed-length encoding table.

loweralpha :: [(Char, String)] Source #

Fixed-length encoding of lower case letters and a space using 5 bits.

ascii :: [(Char, String)] Source #

Fixed-length encoding of ASCII characters using 8 bits.

Long text against which various encoding schemes can be tried.