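Preface
Welcome back, folks!
Over the past couple of days I have been looking into how to detect and record a computer's audio. I expect this to be useful for future automated-testing scenarios, so I am writing down what I learned. This is a first exploration: treat it as a reference, not as guidance~
As usual, I am practicing with Python. The keyword for this post:
- PyAudio
Along the way, I also practice with a tool from an earlier post: the Python command-line library Fire.
The code repository for this post, for reference: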
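Learning path
- Introduction to the PyAudio module;
- Installing the PyAudio module;
- Working with audio using the PyAudio module;
- Automatic sound detection and recording with the PyAudio module;
1. Introduction to the PyAudio module
PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library, so Python programs can play and record audio.
2. Installing the PyAudio module
(using macOS as an example)
1. Install portaudio;
- Run the following command in a terminal: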
brew install portaudio
- Without it, installing PyAudio fails with an error.
2. Install the PyAudio module;
- Run the following command in a terminal:
pip3 install PyAudio
- When the installation succeeds, pip reports PyAudio as successfully installed.
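A quick way to double-check the installation from Python, as a minimal sketch: it only asks the import system whether a module can be found, using the standard library. The helper name `is_module_installed` is made up for this illustration.

```python
import importlib.util


def is_module_installed(module_name: str) -> bool:
    """Return True if the import system can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "pyaudio" is the import name of the PyAudio package.
    if is_module_installed("pyaudio"):
        print("PyAudio is importable")
    else:
        print("PyAudio not found; install portaudio first, then PyAudio")
```

This does not open any audio device, so it only confirms that the package is visible to the interpreter you are running.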
3. Working with audio using the PyAudio module
Before working with audio in PyAudio, here is a website where wav files can be downloaded for free. Most sample material online costs money; I stumbled on this site while searching:
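If no sample file is at hand, a short test tone can also be generated locally. The following is a minimal sketch using only the standard library; the function name `write_sine_wav` and its default values (440 Hz, 44100 Hz sample rate, half amplitude) are assumptions chosen for illustration.

```python
import math
import struct
import wave


def write_sine_wav(path, seconds=1.0, freq=440.0, rate=44100, amplitude=0.5):
    """Write a mono, 16-bit PCM sine tone to `path`; return the frame count."""
    n_frames = int(seconds * rate)
    peak = int(amplitude * 32767)  # 32767 is the largest 16-bit sample value
    samples = (
        int(peak * math.sin(2 * math.pi * freq * i / rate))
        for i in range(n_frames)
    )
    # "<h" packs each sample as a little-endian signed 16-bit integer,
    # which is the byte order WAV files use.
    payload = b"".join(struct.pack("<h", s) for s in samples)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(payload)
    return n_frames
```

Calling `write_sine_wav("test.wav", seconds=2.0)` should give a two-second file that can serve as `test.wav` in the commands used later in this post.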
Now let's get into working with audio using the PyAudio module;
- Playing an audio file;
- Create a Python file to demonstrate playing an audio file, e.g. player.py:
import pyaudio
import wave
import sys

CHUNK = 1024  # frames read from the file per write

if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)

wf = wave.open(sys.argv[1], 'rb')
p = pyaudio.PyAudio()

# Open an output stream that matches the file's format.
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)

# Blocking playback: write chunk by chunk until the file is exhausted.
data = wf.readframes(CHUNK)
while data != b"":
    stream.write(data)
    data = wf.readframes(CHUNK)

stream.stop_stream()
stream.close()
p.terminate()
- Play an audio file with the following command:
python3 player.py test.wav
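The player takes the sample width, channel count and frame rate from the WAV header and then reads CHUNK frames at a time. To see those values for a given file without opening an audio device, here is a small sketch using only the standard library; `describe_wav` is a hypothetical helper, not part of PyAudio.

```python
import math
import wave


def describe_wav(path, chunk=1024):
    """Return a WAV file's stream parameters and how many reads of
    `chunk` frames it takes to consume the whole file."""
    with wave.open(path, "rb") as wf:
        n_frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "rate": rate,
            "duration_seconds": n_frames / rate,
            # The final read may return fewer than `chunk` frames.
            "reads_needed": math.ceil(n_frames / chunk),
        }
```

For example, `describe_wav("test.wav")` reports the same channels, sample width and rate that player.py hands to `p.open`.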
- Recording;
- Create a Python file to demonstrate recording audio, e.g. recorder.py:
import pyaudio
import wave

CHUNK = 1024              # frames per read
FORMAT = pyaudio.paInt16  # 16-bit samples
CHANNELS = 1
RATE = 44100              # frames per second
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"

p = pyaudio.PyAudio()

# Open an input stream on the default input device.
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("* recording")

frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)

print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

# Save the captured chunks as a WAV file.
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
- Use the following command to record from the computer's default audio input, microphone included:
python3 recorder.py
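The loop bound `int(RATE / CHUNK * RECORD_SECONDS)` is worth working through once. With the values above it gives 215 reads, which is 220160 frames, or about 4.99 seconds rather than exactly 5, because the division is truncated. A small sketch of that arithmetic (`recording_plan` is a name invented for this example):

```python
def recording_plan(rate=44100, chunk=1024, seconds=5, channels=1, sample_width=2):
    """Mirror the loop bound used in recorder.py and derive what it records."""
    reads = int(rate / chunk * seconds)  # same expression as the recording loop
    frames = reads * chunk               # frames actually captured
    return {
        "reads": reads,
        "frames": frames,
        "actual_seconds": frames / rate,
        # Size of the raw sample data, excluding the WAV header.
        "payload_bytes": frames * channels * sample_width,
    }
```

With the defaults, the raw sample data comes to 440320 bytes, which is a handy sanity check on the size of output.wav.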
- Recording and playing back immediately;
- Create a Python file to demonstrate recording and playing back immediately, e.g. recordAndPlayImmediately.py:
import pyaudio

CHUNK = 1024
WIDTH = 2
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 5

p = pyaudio.PyAudio()

# One stream opened for both input and output.
stream = p.open(format=p.get_format_from_width(WIDTH),
                channels=CHANNELS,
                rate=RATE,
                input=True,
                output=True,
                frames_per_buffer=CHUNK)

print("* recording")

# Read a chunk from the input and write it straight back to the output.
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    stream.write(data, CHUNK)

print("* done")

stream.stop_stream()
stream.close()
p.terminate()
- Use the following command to record and play back immediately:
python3 recordAndPlayImmediately.py
- Playing an audio file (callback version);
- Create a Python file to demonstrate the callback way of playing an audio file, e.g. playerCallbackVersion.py:
import pyaudio
import wave
import time
import sys

if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)

wf = wave.open(sys.argv[1], 'rb')
p = pyaudio.PyAudio()


# PyAudio calls this whenever it needs more frames to play.
def callback(in_data, frame_count, time_info, status):
    data = wf.readframes(frame_count)
    return data, pyaudio.paContinue


stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True,
                stream_callback=callback)

stream.start_stream()

# Playback runs in the background; wait until the stream finishes.
while stream.is_active():
    time.sleep(0.1)

stream.stop_stream()
stream.close()
wf.close()
p.terminate()
- Play an audio file (callback version) with the following command:
python3 playerCallbackVersion.py test.wav
- Recording and playing back immediately (callback version);
- Create a Python file to demonstrate recording and playing back immediately (callback version), e.g. recordAndPlayImmediatelyCallbackVersion.py:
import pyaudio
import time

WIDTH = 2
CHANNELS = 1
RATE = 44100

p = pyaudio.PyAudio()


# Pass every captured buffer straight to the output.
def callback(in_data, frame_count, time_info, status):
    return in_data, pyaudio.paContinue


stream = p.open(format=p.get_format_from_width(WIDTH),
                channels=CHANNELS,
                rate=RATE,
                input=True,
                output=True,
                stream_callback=callback)

stream.start_stream()

while stream.is_active():
    time.sleep(0.1)

stream.stop_stream()
stream.close()
p.terminate()
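- Use the following command to record and play back immediately (callback version):
python3 recordAndPlayImmediatelyCallbackVersion.py
(Source of the example code above: https://docs.python.org/zh-cn/3/library/audioop.html)
Of course, this command keeps running and capturing sound: the callback always returns pyaudio.paContinue, so stream.is_active() stays True, and the input device keeps delivering data whether the system and microphone are loud or quiet.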
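One simple way to make such a stream end on its own is to have the callback return a "complete" flag after a fixed number of frames. The following is a minimal sketch of that idea, not the approach this post ends up using. `make_limited_callback` is a hypothetical helper, and its default flag values 0 and 1 are assumed to mirror `pyaudio.paContinue` and `pyaudio.paComplete`, so the sketch runs without PyAudio installed.

```python
def make_limited_callback(max_frames, continue_flag=0, complete_flag=1):
    """Build a pass-through stream callback that stops after `max_frames`.

    The defaults 0 and 1 are assumed to match pyaudio.paContinue and
    pyaudio.paComplete; with PyAudio installed, pass those constants explicitly.
    """
    state = {"seen": 0}

    def callback(in_data, frame_count, time_info, status):
        state["seen"] += frame_count
        if state["seen"] >= max_frames:
            # Returning the "complete" flag ends the stream, so
            # stream.is_active() becomes False and the wait loop exits.
            return in_data, complete_flag
        return in_data, continue_flag

    return callback
```

In the script above this could be wired in as `stream_callback=make_limited_callback(RATE * 5, pyaudio.paContinue, pyaudio.paComplete)` for roughly five seconds of pass-through. It stops on elapsed frames only and knows nothing about volume.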
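To make capture stop automatically, let's next explore automatic detection and recording of sound~
4. Automatic sound detection and recording with the PyAudio module
Next, I will capture and record sound, and stop capturing automatically based on volume, in the following scenario:
- While the system is playing audio, record for as long as audio is being captured from the system, until no more audio is captured from the system;
Here, "audio is captured from the system" means the captured audio has a certain loudness or volume; conversely, "no audio is captured from the system" means the loudness or volume has dropped below a certain level;
To avoid interference from the microphone and make automatic stopping possible, we can:
- turn off the computer's microphone;
- set a loudness or volume threshold: when the captured audio stays below that threshold for some period of time, treat it as "no audio captured from the system";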
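Because audio is read in fixed-size chunks, "some period of time" in practice becomes a number of consecutive quiet chunks. A small sketch of the conversion in both directions; the function names and the defaults (44100 Hz, 1024-frame chunks) are assumptions for illustration.

```python
import math


def chunks_for_silence(seconds, rate=44100, chunk=1024):
    """Number of consecutive quiet chunks that span at least `seconds`."""
    return math.ceil(seconds * rate / chunk)


def seconds_for_chunks(n_chunks, rate=44100, chunk=1024):
    """Duration covered by `n_chunks` consecutive chunks."""
    return n_chunks * chunk / rate
```

With these defaults, 100 consecutive chunks cover about 2.3 seconds, and a two-second silence window needs 87 chunks.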
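I won't go into turning off the microphone, since that is a system setting. Let's look at the second point. To evaluate "loudness or volume" in real time, we need a module that performs math on sound fragments:
- audioop
This module ships with Python's standard library (it is deprecated since Python 3.11 and removed in Python 3.13). Documentation: https://docs.python.org/zh-cn/3/library/audioop.html
We can use audioop's rms function, which returns the root mean square of a sound fragment, to gauge the fragment's "loudness or volume".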
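To make the measure concrete, here is a pure-Python sketch of the same calculation for 16-bit samples. It is intended as a stand-in for `audioop.rms(fragment, 2)`, not a drop-in replacement. It assumes little-endian samples (as in WAV data on common machines) and truncates the result to an integer; `rms_int16` is a name invented for this example.

```python
import math
import struct


def rms_int16(fragment: bytes) -> int:
    """Root mean square of little-endian signed 16-bit samples."""
    n_samples = len(fragment) // 2
    if n_samples == 0:
        return 0
    # Decode the raw bytes into signed 16-bit integers.
    samples = struct.unpack("<%dh" % n_samples, fragment[: n_samples * 2])
    mean_square = sum(s * s for s in samples) / n_samples
    return int(math.sqrt(mean_square))
```

Silence gives 0, and a signal that alternates between +1000 and -1000 gives 1000. The louder the fragment, the larger the value.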
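Then pick a suitable rms threshold based on the computer's volume setting and the ambient noise during testing (obviously, we should not capture sound in a very noisy environment, and audio captured that way would be meaningless anyway).
For example, with my computer's volume set to 50% and no particular noise around, the rms threshold can be set to 100; on a rainy day it can be set to 500. Once the rms of the captured sound stays below the threshold continuously, capture stops automatically;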
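The stopping rule can be tried on plain numbers before any audio is involved. Below is a minimal sketch: the class name `SilenceDetector` and its parameter names are made up here, and the rule is the one described above, where a run of quiet chunks longer than a limit means "stop".

```python
class SilenceDetector:
    """Signal a stop once more than `max_low_chunks` consecutive chunks
    have an RMS at or below `min_rms`."""

    def __init__(self, min_rms=500, max_low_chunks=100):
        self.min_rms = min_rms
        self.max_low_chunks = max_low_chunks
        self.low_count = 0

    def update(self, rms):
        """Feed one chunk's RMS; return True when capture should stop."""
        # Any loud chunk resets the run of quiet chunks.
        self.low_count = 0 if rms > self.min_rms else self.low_count + 1
        return self.low_count > self.max_low_chunks
```

Feeding it a list of rms readings shows that a single loud chunk in the middle restarts the count, so only an unbroken quiet stretch triggers the stop.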
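Other details: to record while playing, we use the multiprocessing module; and to practice the tool from an earlier post, the command-line interface is built with the Python command-line library Fire.
This is the approach I can come up with for now. The code lives in main.py and is for reference only: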
import audioop
from multiprocessing import Process
import wave

import fire
import pyaudio

stream_format = pyaudio.paInt16
pyaudio_instance = pyaudio.PyAudio()
sample_width = pyaudio_instance.get_sample_size(stream_format)


class Detector(object):
    def __init__(self):
        self.source_file = ""
        self.channels = None
        self.rate = None
        self.chunk = None
        self.audio_min_rms = None
        self.max_low_audio_flag = None
        self.recording = False
        self.recording_file = ""
        self.audio_frames = []

    def __str__(self):
        # Empty string, so Fire has nothing to print when a method returns self.
        return ""

    def play(self, source_file="", chunk=None):
        # Values already stored on the instance take precedence over arguments.
        source_file = source_file if not self.source_file else self.source_file
        chunk = chunk if not self.chunk else self.chunk
        f = wave.open(source_file, "rb")
        p = pyaudio.PyAudio()
        file_format = p.get_format_from_width(f.getsampwidth())
        stream = p.open(format=file_format, channels=f.getnchannels(), rate=f.getframerate(), output=True)
        data = f.readframes(chunk)
        while data != b"":
            stream.write(data)
            data = f.readframes(chunk)
        stream.stop_stream()
        stream.close()
        f.close()
        p.terminate()
        return self

    def detect_audio(self, channels=None, rate=None, chunk=None, audio_min_rms=None, max_low_audio_flag=None,
                     recording=False, recording_file=""):
        channels = channels if not self.channels else self.channels
        rate = rate if not self.rate else self.rate
        chunk = chunk if not self.chunk else self.chunk
        audio_min_rms = audio_min_rms if not self.audio_min_rms else self.audio_min_rms
        max_low_audio_flag = max_low_audio_flag if not self.max_low_audio_flag else self.max_low_audio_flag
        recording = recording if not self.recording else self.recording
        recording_file = recording_file if not self.recording_file else self.recording_file
        # Remember the settings so a chained record() call can reuse them.
        self.channels = channels
        self.rate = rate
        self.chunk = chunk
        self.audio_min_rms = audio_min_rms
        self.max_low_audio_flag = max_low_audio_flag
        self.recording = recording
        self.recording_file = recording_file
        print("* start detecting audio ~")
        stream = pyaudio_instance.open(format=stream_format,
                                       channels=channels,
                                       rate=rate,
                                       input=True,
                                       frames_per_buffer=chunk)
        low_audio_flag = 0  # consecutive chunks at or below the threshold
        detect_count = 0
        while True:
            detect_count += 1
            stream_data = stream.read(chunk)
            rms = audioop.rms(stream_data, sample_width)
            print(f"the {detect_count} time detecting:", rms)
            low_audio_flag = 0 if rms > audio_min_rms else low_audio_flag + 1
            # max_low_audio_flag is an empirical value (e.g. 100): once more than that many
            # consecutive chunks are quiet, treat it as "no audio"; tune it to your setup.
            if low_audio_flag > max_low_audio_flag:
                print("* no audio detected, stop detecting ~")
                break
            self.audio_frames.append(stream_data)
        stream.stop_stream()
        stream.close()
        pyaudio_instance.terminate()
        if recording:
            self.record()
        return self

    def record(self, recording_file=""):
        recording_file = recording_file if not self.recording_file else self.recording_file
        self.recording_file = recording_file
        # Write the frames collected by detect_audio() to a WAV file.
        wf = wave.open(recording_file, 'wb')
        wf.setnchannels(self.channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(self.rate)
        wf.writeframes(b''.join(self.audio_frames))
        wf.close()
        return self

    def play_and_detect(self, source_file, channels, rate, chunk, audio_min_rms, max_low_audio_flag, recording,
                        recording_file):
        self.source_file = source_file
        self.channels = channels
        self.rate = rate
        self.chunk = chunk
        self.audio_min_rms = audio_min_rms
        self.max_low_audio_flag = max_low_audio_flag
        self.recording = recording
        self.recording_file = recording_file
        # Play and detect in separate processes so they run at the same time.
        play_process = Process(target=self.play)
        detect_process = Process(target=self.detect_audio)
        play_process.start()
        detect_process.start()
        play_process.join()
        detect_process.join()
        return self


if __name__ == '__main__':
    fire.Fire(Detector)
Testing:
1. Play an audio file;
python3 main.py - play --source_file=test.wav --chunk=1024
2. Detect audio only;
python3 main.py - detect_audio --channels=1 --rate=44100 --chunk=1024 --audio_min_rms=500 --max_low_audio_flag=100
3. Detect and record audio;
python3 main.py - detect_audio --channels=1 --rate=44100 --chunk=1024 --audio_min_rms=500 --max_low_audio_flag=100 - record --recording_file=recording.wav
4. Record audio while playing audio;
python3 main.py - play_and_detect --source_file=test.wav --channels=1 --rate=44100 --chunk=1024 --audio_min_rms=500 --max_low_audio_flag=100 --recording=True --recording_file=recording.wav
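If the recorded file is to be compared against the original audio, some noise reduction may also be needed. A simple option is a module such as noisereduce; I don't know the more sophisticated approaches yet, so we'll dig into that when there's a chance;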
OK, that's enough playing around for today. Until next time~
If this post helped you, would you mind giving it a like?
Thank you very much!