哈夫曼树原理
秉着能不写就不写的理念,关于哈夫曼树的原理及其构建,还是贴一篇博客吧。
https://www.gaodaima.com/article/97396.htm
其大概流程
哈夫曼编码代码
# 树节点类构建 class TreeNode(object): def __init__(self, data): self.val = data[0] self.priority = data[1] self.leftChild = None self.rightChild = None self.code = "" # 创建树节点队列函数 def creatnodeQ(codes): q = [] for code in codes: q.append(TreeNode(code)) return q # 为队列添加节点元素,并保证优先度从大到小排列 def addQ(queue, nodeNew): if len(queue) == 0: return [nodeNew] for i in range(len(queue)): if queue[i].priority >= nodeNew.priority: return queue[:i] + [nodeNew] + queue[i:] return queue + [nodeNew] # 节点队列类定义 class nodeQeuen(object): def __init__(self, code): self.que = creatnodeQ(code) self.size = len(self.que) def addNode(self,node): self.que = addQ(self.que, node) self.size += 1 def popNode(self): self.size -= 1 return self.que.pop(0) # 各个字符在字符串中出现的次数,即计算优先度 def freChar(string): d ={} for c in string: if not c in d: d[c] = 1 else: d[c] += 1 return sorted(d.items(),key=lambda x:x[1]) # 创建哈夫曼树 def creatHuffmanTree(nodeQ): while nodeQ.size != 1: node1 = nodeQ.popNode() node2 = nodeQ.popNode() r = TreeNode([None, node1.priority+node2.priority]) r.leftChild = node1 r.rightChild = node2 nodeQ.addNode(r) return nodeQ.popNode() codeDic1 = {} codeDic2 = {} # 由哈夫曼树得到哈夫曼编码表 def HuffmanCodeDic(head, x): global codeDic, codeList if head: HuffmanCodeDic(head.leftChild, x+'0') head.code += x if head.val: codeDic2[head.code] = head.val codeDic1[head.val] = head.code HuffmanCodeDic(head.rightChild, x+'1') # 字符串编码 def TransEncode(string): global codeDic1 transcode = "" for c in string: transcode += codeDic1[c] return transcode # 字符串解码 def TransDecode(StringCode): global codeDic2 code = "" ans = "" for ch <a>本文来源gao($daima.com搞@代@#码(网</a>in StringCode: code += ch if code in codeDic2: ans += codeDic2[code] code = "" return ans # 举例 string = "AAGGDCCCDDDGFBBBFFGGDDDDGGGEFFDDCCCCDDFGAAA" t = nodeQeuen(freChar(string)) tree = creatHuffmanTree(t) HuffmanCodeDic(tree, '') print(codeDic1,codeDic2) a = TransEncode(string) print(a) aa = TransDecode(a) print(aa) print(string == aa)
接下来就是一段一段分析代码
1.树结点类的构建:
共有5个属性:结点的值,结点的优先度,结点的左子结点,结点的右子结点,结点值的编码(这个没有什么好说的,这些属性都是被需要的)
2.创建树结点队列函数:
对于所有的字母结点,我们将其组成一个队列,这里使用list列表来完成队列的功能。将所有树节点够放进列表中,当然传进来的是按优先度从小到大已排序的元素列表
3.为队列添加节点元素,并保证优先度从大到小排列:
当有新生成的结点时,需将其插入列表,并放在合适位置,使队列依然时按优先度从小打到排列的。