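The Huffman coding algorithm uses a table of how often each character occurs in a file to build an optimal representation of each character as a string of 0s and 1s. Frequent characters get shorter codes and rare characters get longer codes, which can substantially reduce the total encoded length.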
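Huffman coding has two steps:
- Encoding (build a Huffman tree from the input characters and convert the string into a 0/1 code)
- Decoding (traverse the Huffman tree to convert the 0/1 code back into characters)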
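Building the Huffman tree:
- Compute the frequency of every character in the input and insert one leaf node per character into a min heap keyed on frequency.
- Extract the two nodes with the lowest frequencies from the min heap.
- Create a new internal node whose frequency is the sum of the two extracted nodes' frequencies. Make the first extracted node its left child and the second its right child, then add the new node to the min heap.
- Repeat steps 2 and 3 until only one node remains in the min heap. That node is the root of the Huffman tree.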
Suppose we have the following letters and the number of times each occurs (its frequency):
Character | Count |
---|---|
a | 5 |
b | 4 |
c | 3 |
d | 2 |
e | 1 |
Online demo of Huffman tree construction: https://people.ok.ubc.ca/ylucet/DS/Huffman.html
C++ implementation using the STL:
```cpp
// C++ program for Huffman Coding
#include <iostream>
#include <queue>
#include <string>
#include <vector>
using namespace std;

// A Huffman tree node
struct MinHeapNode {
    // One of the input characters
    char data;
    // Frequency of the character
    unsigned freq;
    // Left and right child
    MinHeapNode *left, *right;

    MinHeapNode(char data, unsigned freq)
    {
        left = right = nullptr;
        this->data = data;
        this->freq = freq;
    }
};

// Comparison of two heap nodes: the node with the
// smaller frequency has higher priority (min heap)
struct compare {
    bool operator()(MinHeapNode* l, MinHeapNode* r)
    {
        return (l->freq > r->freq);
    }
};

// Prints Huffman codes from the root of the Huffman tree.
void printCodes(MinHeapNode* root, string str)
{
    if (!root)
        return;
    // Only leaves hold input characters; internal nodes are skipped
    if (!root->left && !root->right)
        cout << root->data << ": " << str << "\n";
    printCodes(root->left, str + "0");
    printCodes(root->right, str + "1");
}

// Builds a Huffman tree and prints the codes
// by traversing the built tree
void HuffmanCodes(char data[], int freq[], int size)
{
    if (size <= 0)
        return; // nothing to encode
    MinHeapNode *left, *right, *top;

    // Create a min heap and insert all characters of data[]
    priority_queue<MinHeapNode*, vector<MinHeapNode*>, compare> minHeap;
    for (int i = 0; i < size; ++i)
        minHeap.push(new MinHeapNode(data[i], freq[i]));

    // Iterate until only one node is left in the heap
    while (minHeap.size() != 1) {
        // Extract the two nodes with the lowest frequencies
        left = minHeap.top();
        minHeap.pop();
        right = minHeap.top();
        minHeap.pop();

        // Create a new internal node whose frequency is the sum of
        // the two nodes' frequencies, make the two extracted nodes
        // its left and right children, and add it to the min heap.
        // '$' is a placeholder value for internal nodes; it is not used.
        top = new MinHeapNode('$', left->freq + right->freq);
        top->left = left;
        top->right = right;
        minHeap.push(top);
    }

    // Print Huffman codes using the Huffman tree built above
    // (the nodes are never freed in this short demo)
    printCodes(minHeap.top(), "");
}

// Driver program to test the functions above
int main()
{
    char arr[] = { 'a', 'b', 'c', 'd', 'e', 'f' };
    int freq[] = { 5, 9, 12, 13, 16, 45 };
    int size = sizeof(arr) / sizeof(arr[0]);
    HuffmanCodes(arr, freq, size);
    return 0;
}
// This code is contributed by Aditya Goel
```
Reference: