Copyright | Copyright (C) 2021 Yoo Chung |
---|---|

License | GPL-3.0-or-later |

Maintainer | dev@chungyc.org |

Safe Haskell | Safe-Inferred |

Language | GHC2021 |

Part of Ninety-Nine Haskell Problems. Some solutions are in Solutions.P50.

# Documentation

huffman :: [(Char, Int)] -> [(Char, String)] Source #

Given a list of characters and their number of occurrences, construct a list of the characters and their Huffman encoding.

In the encoding, `'0'`

and `'1'`

will denote the 0 and 1 bits.

### Examples

`>>>`

[('a',"0"),('b',"101"),('c',"100"),('d',"111"),('e',"1101"),('f',"1100")]`huffman [('a',45),('b',13),('c',12),('d',16),('e',9),('f',5)]`

The encoding table computed by `huffman`

can be used to compress data:

`>>>`

3552`length $ encodeHuffman (huffman $ countCharacters text) text`

Compare this to the length of a fixed-length 5-bit encoding:

`>>>`

4375`length $ encodeHuffman loweralpha text`

or the length of the more standard ASCII encoding with 8 bits:

`>>>`

7000`length $ encodeHuffman ascii text`

Huffman coding is unambiguous, so we can get back the original text:

`>>>`

`let table = huffman $ countCharacters text`

`>>>`

`let encodedText = encodeHuffman table text`

`>>>`

`let decodedText = decodeHuffman table encodedText`

`>>>`

True`decodedText == text`

# Supporting functions

The functions below are not part of the problem. Instead, they are used to illustrate the use of Huffman coding.

countCharacters :: String -> [(Char, Int)] Source #

Count the number of occurrences of a character in a string.

encodeHuffman :: [(Char, String)] -> String -> String Source #

Given an encoding table and a string, encode the string.

While this is intended for use in illustrating Huffman coding, it is not limited to such. In particular, it can encode with a fixed-length encoding table.

decodeHuffman :: [(Char, String)] -> String -> String Source #

Given an encoding table and a string, decode the string.

While this is intended for use in illustrating Huffman coding, it is not limited to such. In particular, it can decode with a fixed-length encoding table.

loweralpha :: [(Char, String)] Source #

Fixed-length encoding of lower case letters and a space using 5 bits.