一、問題描述和基本要求
對一幅BMP格式的灰度圖像進行二元Fano編碼宫仗、譯碼蜒犯。
二组橄、算法思想
1949年,費諾(R.M. Fano)提出了一種編碼方法罚随,稱之為費諾碼或Fano碼玉工。它屬于概率匹配編碼,但一般也不是最佳的編碼方法淘菩,只有當信源的概率分布呈現(xiàn)分布形式的條件下遵班,才能達到最佳碼的性能。元費諾碼的編碼步驟如下:
① 將個信源符號按概率分布的大小潮改,以遞減的次序排列起來狭郑,設(shè)
②將排列好的信源符號按概率值劃分成r大組,使每組的概率之和接近于相等汇在,并對每組各賦予一個元碼符號翰萨。
③ 將每一大組的信源符號再分成組,使劃分后的兩個組的概率之和接近于相等糕殉,再分別賦予一個元碼符號亩鬼。
④ 依次下去,直至每個小組只剩一個信源符號為止阿蝶。
⑤ 將逐次分組過程中得到的碼元排列起來就是各信源符號的編碼雳锋。
費諾碼的編碼方法實際上是一種構(gòu)造碼樹的方法,所以費諾碼是即時碼羡洁,并且它考慮了信源的統(tǒng)計特性玷过,使概率大的信源符號能對應(yīng)碼長較短的碼字,從而有效地提高了編碼效率。
對于本課設(shè)的費諾編碼辛蚊,首先掃描第一遍輸入符號流粤蝎,統(tǒng)計各個符號出現(xiàn)的概率,之后按照上述算法構(gòu)造費諾編碼袋马,再對輸入符號流進行第二遍掃描進行編碼诽里。
根據(jù)費諾編碼的原理,將滿足上述條件的概率序列飞蛹,斷開為兩個子序列和,然后對這兩個子序列對應(yīng)的各事件分別進行0和1編碼灸眼,遞歸地對子序列重復(fù)該過程卧檐,直到每個子序列的長度都為1,就可以得到每個事件的編碼焰宣。這顯然就是分治法的思想霉囚,因此可以用分治法來解決費諾編碼問題。由上式可知匕积,費諾編碼的關(guān)鍵問題就是找到分界點k盈罐,使得k兩側(cè)的概率序列的和相差最小。
為求得分界點闪唆,一個很自然的想法就是先找到序列概率和的一半對應(yīng)的兩側(cè)位置和盅粪,然后觀測這兩個分界點哪個使得分界點兩側(cè)的概率和相差最小。
設(shè)計以下遞歸結(jié)構(gòu), 其中為開始位置悄蕾,為結(jié)束位置票顾,僅為計算方便,為當前遞歸結(jié)構(gòu)對應(yīng)的碼字帆调,那么F的形式化描述為:
F(start,end,possibility,string):
if start == end:
p_start對應(yīng)的信源符號s_start的碼字為string
return
找到符合上述條件的k_1和k_2奠骄,分別計算偏差值d_1和d_2,選擇偏差值最小的k
F(start,k,sum(p_start,p_{start+1},…,p_k ),string+"0")
F(k+1,end,sum(p_{k+1},p_{k+2},…,p_end ),string+"1")
由上述遞歸結(jié)構(gòu)F即可求得每個信源符號的碼字番刊,從而可以對文件進行編碼和解碼含鳞。
三、源代碼
#!/usr/bin/env python3
import os
import pickle
from math import log
# python3 -m pip install matplotlib
import matplotlib.pyplot as plt
# python3 -m pip install numpy
import numpy as np
# python3 -m pip install opencv-python
from cv2 import IMREAD_GRAYSCALE, absdiff, imread, imshow, imwrite
# python3 -m pip install prettytable
from prettytable import PrettyTable
class WriterWrapper:
def __init__(self, f):
self.buf = ""
self.f = f
def __call__(self, con: str):
self.buf += con
while len(self.buf) >= 8:
out = self.buf[0:8]
self.buf = self.buf[8:]
conint = int(out, 2)
self.f.write(bytes([conint]))
def flush(self):
if len(self.buf) > 0:
self.buf += "0"*(7 - ((len(self.buf) - 1) % 8))
while len(self.buf) > 0:
out = self.buf[0:8]
self.buf = self.buf[8:]
conint = int(out, 2)
self.f.write(bytes([conint]))
class ReaderWrapper:
def __init__(self, f):
self.buf = ""
self.f = f
def next(self):
if self.buf != "":
c = self.buf[0]
self.buf = self.buf[1:]
return c
c = self.f.read(1)
if c == b"":
return ""
bstr = bin(ord(c))[2:]
bstr = "0"*(7 - ((len(bstr) - 1) % 8)) + bstr
self.buf = bstr
return self.next()
def read(self, c : int):
s = ""
for i in c:
s += self.next()
return s
class Symbol:
def __init__(self, symbol, weight, pos):
self.word = ""
self.pos = pos
self.weight = weight
self.symbol = symbol
def __lt__(self, value):
return self.pos < value.pos
class FanoNode:
def __init__(self, left = None, right = None):
self.symbol = None
self.left = left
self.right = right
def Fano(start, end, possibility, string, syms) -> FanoNode:
leaf = FanoNode()
if start == end - 1:
leaf.symbol = syms[start]
syms[start].word = string
return leaf
mid = possibility / 2.0
target = 0.0
index = start
while target < mid:
target += syms[index].pos
index += 1
subseq1left = syms[start:index-1]
subseq1right = syms[index-1:end]
pos1left = sum(map(lambda x: x.pos, subseq1left))
pos1right = sum(map(lambda x: x.pos, subseq1right))
delta1 = abs(pos1left-pos1right)
subseq2left = syms[start:index]
subseq2right = syms[index:end]
pos2left = sum(map(lambda x: x.pos, subseq2left))
pos2right = sum(map(lambda x: x.pos, subseq2right))
delta2 = abs(pos2left-pos2right)
if delta1 < delta2:
leaf.left = Fano(start, index-1, pos1left, string + "0", syms)
leaf.right = Fano(index-1, end, pos1right, string + "1", syms)
else:
leaf.left = Fano(start, index, pos2left, string + "0", syms)
leaf.right = Fano(index, end, pos2right, string + "1", syms)
return leaf
if __name__ == "__main__":
print("請將要壓縮的圖像文件命名為origin.bmp芹务。")
print("讀入源圖像文件...", end="")
originimg = imread("origin.bmp", IMREAD_GRAYSCALE)
originfilesize = os.path.getsize("origin.bmp")
print("成功")
print("分析原圖像文件...", end="")
originshape = originimg.shape
originsize = originimg.size
flatten = originimg.flatten()
counter = dict.fromkeys(set(flatten), 0)
table = PrettyTable()
table.field_names = ["信源符號", "數(shù)量", "概率"]
SymbolStringMap = {}
syms = []
for item in flatten:
counter[item] += 1
for item in counter:
syms.append(Symbol(item, counter[item], counter[item] / originsize))
table.add_row([item, counter[item], counter[item] / originsize])
syms.sort(reverse=True)
Hs = sum(map(lambda weight: -weight / originsize *
log(weight / originsize, 2), counter.values()))
table.sortby = "數(shù)量"
table.reversesort = True
print("完成")
print("原圖像分辨率為:{1}*{0}蝉绷,作為信源,共有{2}個锄禽、{3}種信源符號潜必,熵為{4:.2f}, "
"信源概率表如下:".format(*originshape, originsize, len(counter), Hs))
print(table)
print("正在進行費諾編碼...")
Fano(0, len(syms), 1.0, "", syms)
table = PrettyTable()
table.field_names = ["信源符號", "碼字", "碼字長度", "數(shù)量", "概率"]
Len = 0
for sym in syms:
table.add_row([sym.symbol, sym.word, len(sym.word),
sym.weight, sym.pos])
Len += sym.weight*len(sym.word)
SymbolStringMap[sym.symbol] = sym.word
print("編碼完成,平均碼長為{:.2f}沃但,編碼表如下:".format(Len/originsize))
table.sortby = "數(shù)量"
print(table)
print("信源的熵為{:.2f}磁滚,平均碼長為{:.2f},編碼效率為{:.2f}%".format(Hs, Len/originsize, Hs/Len*originsize*100))
print("將編碼后的文件寫入到新文件target.bin中。編碼采用費諾編碼垂攘,先寫入圖像尺寸和文件長度维雇,再寫入編碼表,最后寫入編碼后圖像矩陣")
f = open("target.bin", "wb")
# 寫入文件分辨率和長度
f.write(pickle.dumps(originshape))
f.write(pickle.dumps(Len))
# 寫入編碼表
f.write(pickle.dumps(counter))
# 寫入圖像矩陣
wt = WriterWrapper(f)
for item in flatten:
wt(SymbolStringMap[item])
wt.flush()
f.close()
TarLen = os.path.getsize("target.bin")
print("新文件target.bin的大小為{}字節(jié)晒他,源文件大小為{}字節(jié)吱型,壓縮比為{:.2f}%".format(
TarLen, originfilesize, TarLen/originfilesize*100))
print("現(xiàn)在開始從新文件target.bin中恢復(fù)圖像信息...")
f = open("target.bin", "rb")
# 讀取圖像分辨率
originshape = pickle.load(f)
# 讀取長度
Len = pickle.load(f)
# 讀取編碼表
counter = pickle.load(f)
print("正在進行費諾譯碼...")
syms = []
originsize = originshape[0]*originshape[1]
for item in counter:
syms.append(Symbol(item, counter[item], counter[item] / originsize))
syms.sort(reverse=True)
p = root = Fano(0, len(syms), 1.0, "", syms)
cur = 0
rd = ReaderWrapper(f)
imgflatten = []
while (cur != Len):
c = rd.next()
cur += 1
if c == "0":
p = p.left
elif c == "1":
p = p.right
else:
raise AssertionError("c must be 0 or 1")
break
if (p.left == None) and (p.right == None):
imgflatten.append(p.symbol.symbol)
p = root
imgarr = np.array(imgflatten, dtype=np.uint8)
imgarr.resize(originshape)
imwrite("target.bmp", imgarr)
dif = absdiff(originimg, imgarr)
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
fig = plt.figure('result')
plt.subplot(1, 3, 1)
plt.title('原圖像')
plt.imshow(originimg, plt.cm.gray)
plt.axis('off')
plt.subplot(1, 3, 2)
plt.title('編解碼后的圖像')
plt.imshow(imgarr, plt.cm.gray)
plt.axis('off')
plt.subplot(1, 3, 3)
plt.title('差異')
plt.imshow(dif, plt.cm.gray)
plt.subplots_adjust(wspace=0.15)
plt.axis('off')
plt.show()