https://juejin.im/post/5c26d44ae51d45619a4b8b1e
If you'd like to see the final result before deciding whether to read the article -> bilibili
Sample code download
Part 1: Step-by-Step iOS Audio Spectrum Animation (1)
This is the second article in the series. The previous one covered audio playback and spectrum data computation; this one covers data processing and drawing the animation.
Preface
In the previous article we obtained the spectrum data and learned that each array element represents an amplitude. So how are the elements related to one another? According to the FFT principle, N audio samples produce N/2 output values (2048/2 = 1024), with a frequency resolution of Δf = Fs/N = 44100/2048 ≈ 21.5 Hz. Since adjacent values are equally spaced in frequency, these 1024 values represent the amplitudes at 0 Hz, 21.5 Hz, 43.0 Hz, ..., 22050 Hz.
So can we just draw these 1024 values directly as the animation? Sure, if you happen to need exactly 1024 animated bars! But if you want the flexibility to adjust that number, you need to divide the data into frequency bands.
Strictly speaking there are 1025 values: the FFT computation in the previous article simply discarded the 1025th one via fftInOut.imagp[0] = 0. That 1025th value is the real part of the component at the Nyquist frequency. As for why it is stored in the imaginary part of the first FFT output, see the first article.
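A quick numeric sanity check of these figures. This is a standalone Python sketch, separate from the project's Swift code:

```python
fft_size = 2048                           # samples per FFT, as in the article
sample_rate = 44100.0

delta_f = sample_rate / fft_size          # frequency resolution ≈ 21.5 Hz
bins = fft_size // 2                      # number of usable FFT outputs

# the frequency each bin represents
bin_frequencies = [i * delta_f for i in range(bins)]

print(round(delta_f, 1))                  # 21.5
print(len(bin_frequencies))               # 1024
```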
Frequency Band Division
The more important reason for band division is this: according to psychoacoustics, the human ear easily distinguishes a 100 Hz tone from a 200 Hz one, but struggles to distinguish 8100 Hz from 8200 Hz, even though both pairs differ by 100 Hz. The relationship between frequency and perceived pitch is not linear but roughly logarithmic. So when drawing the animation, dividing the data into logarithmically growing intervals rather than equal frequency intervals matches human hearing better.
Figure 1: Frequency band division (image failed to upload)
Open the AudioSpectrum02-starter project. You'll find it differs slightly from the earlier AudioSpectrum01 project: the FFT-related computation has been moved into a new class, RealtimeAnalyzer, giving AudioSpectrumPlayer and RealtimeAnalyzer clearer responsibilities.
If you just want to browse the finished implementation, open the AudioSpectrum02-final project instead; it already contains all the code from this article.
Looking at the RealtimeAnalyzer class, three properties are already defined: frequencyBands, startFrequency and endFrequency. They determine the number of bands and the frequency range they cover.
public var frequencyBands: Int = 80 // number of frequency bands
public var startFrequency: Float = 100 // start frequency
public var endFrequency: Float = 18000 // end frequency
Now the new bands can be derived from these properties:
private lazy var bands: [(lowerFrequency: Float, upperFrequency: Float)] = {
    var bands = [(lowerFrequency: Float, upperFrequency: Float)]()
    //1: determine the growth exponent n from the start/end frequencies and the band count: 2^n
    let n = log2(endFrequency / startFrequency) / Float(frequencyBands)
    var nextBand: (lowerFrequency: Float, upperFrequency: Float) = (startFrequency, 0)
    for i in 1...frequencyBands {
        //2: each band's upper frequency is 2^n times its lower frequency
        let highFrequency = nextBand.lowerFrequency * powf(2, n)
        nextBand.upperFrequency = i == frequencyBands ? endFrequency : highFrequency
        bands.append(nextBand)
        nextBand.lowerFrequency = highFrequency
    }
    return bands
}()
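The band math can be verified independently with a small Python sketch (the function name and structure are mine, mirroring the Swift above):

```python
import math

def make_bands(frequency_bands=80, start_frequency=100.0, end_frequency=18000.0):
    """Split [start, end] into log-spaced (lower, upper) frequency pairs."""
    n = math.log2(end_frequency / start_frequency) / frequency_bands
    ratio = 2.0 ** n                     # upper = lower * 2^n for every band
    bands = []
    lower = start_frequency
    for i in range(frequency_bands):
        upper = end_frequency if i == frequency_bands - 1 else lower * ratio
        bands.append((lower, upper))
        lower *= ratio
    return bands

bands = make_bands()
print(len(bands), bands[0][0], bands[-1][1])   # 80 bands from 100 Hz to 18000 Hz
```

Every band spans the same ratio, so the low end gets many narrow bands and the high end a few wide ones, which is exactly the logarithmic spacing the psychoacoustics argument calls for.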
Next, create a function findMaxAmplitude to compute each new band's value. The approach: take the maximum of the raw amplitude values that fall within the band's frequency range:
private func findMaxAmplitude(for band: (lowerFrequency: Float, upperFrequency: Float), in amplitudes: [Float], with bandWidth: Float) -> Float {
    let startIndex = Int(round(band.lowerFrequency / bandWidth))
    let endIndex = min(Int(round(band.upperFrequency / bandWidth)), amplitudes.count - 1)
    return amplitudes[startIndex...endIndex].max()!
}
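The index mapping can be illustrated with a Python sketch (names are mine; bandWidth is the Δf from earlier):

```python
def find_max_amplitude(band, amplitudes, band_width):
    """Return the largest raw amplitude whose FFT bin falls inside `band`."""
    lower, upper = band
    start = int(round(lower / band_width))
    end = min(int(round(upper / band_width)), len(amplitudes) - 1)
    return max(amplitudes[start:end + 1])

# With Δf ≈ 21.5 Hz, the band 100–200 Hz covers bins 5 through 9.
amplitudes = [0.0] * 1024
amplitudes[7] = 0.9
print(find_max_amplitude((100.0, 200.0), amplitudes, 44100.0 / 2048))   # 0.9
```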
With that, a new analyse function can receive the raw audio data and provide the processed spectrum data to callers:
func analyse(with buffer: AVAudioPCMBuffer) -> [[Float]] {
    let channelsAmplitudes = fft(buffer)
    var spectra = [[Float]]()
    for amplitudes in channelsAmplitudes {
        let spectrum = bands.map {
            findMaxAmplitude(for: $0, in: amplitudes, with: Float(buffer.format.sampleRate) / Float(self.fftSize))
        }
        spectra.append(spectrum)
    }
    return spectra
}
Drawing the Animation
The data looks ready, so let's roll up our sleeves and start drawing! Open the custom view file SpectrumView and first create two CAGradientLayer instances:
var leftGradientLayer = CAGradientLayer()
var rightGradientLayer = CAGradientLayer()
Create a function setupView() that sets each layer's colors and locations properties (which determine the gradient's colors and color-stop positions) and adds them as sublayers of the view's layer. They will host the left- and right-channel animations.
private func setupView() {
    rightGradientLayer.colors = [UIColor.init(red: 52/255, green: 232/255, blue: 158/255, alpha: 1.0).cgColor,
                                 UIColor.init(red: 15/255, green: 52/255, blue: 67/255, alpha: 1.0).cgColor]
    rightGradientLayer.locations = [0.6, 1.0]
    self.layer.addSublayer(rightGradientLayer)
    leftGradientLayer.colors = [UIColor.init(red: 194/255, green: 21/255, blue: 0/255, alpha: 1.0).cgColor,
                                UIColor.init(red: 255/255, green: 197/255, blue: 0/255, alpha: 1.0).cgColor]
    leftGradientLayer.locations = [0.6, 1.0]
    self.layer.addSublayer(leftGradientLayer)
}
Then call it from both of the view's initializers, init(frame: CGRect) and init?(coder aDecoder: NSCoder), so that SpectrumView initializes correctly whether it is created in code or from a Storyboard.
override init(frame: CGRect) {
    super.init(frame: frame)
    setupView()
}
required init?(coder aDecoder: NSCoder) {
    super.init(coder: aDecoder)
    setupView()
}
Now for the key part: define a spectra property to receive spectrum data from outside. Its didSet observer builds a UIBezierPath of bars for each channel, wraps it in a CAShapeLayer, and assigns that to the corresponding CAGradientLayer's mask property, producing gradient-filled bars.
var spectra: [[Float]]? {
    didSet {
        if let spectra = spectra {
            // left channel
            let leftPath = UIBezierPath()
            for (i, amplitude) in spectra[0].enumerated() {
                let x = CGFloat(i) * (barWidth + space) + space
                let y = translateAmplitudeToYPosition(amplitude: amplitude)
                let bar = UIBezierPath(rect: CGRect(x: x, y: y, width: barWidth, height: bounds.height - bottomSpace - y))
                leftPath.append(bar)
            }
            let leftMaskLayer = CAShapeLayer()
            leftMaskLayer.path = leftPath.cgPath
            leftGradientLayer.frame = CGRect(x: 0, y: topSpace, width: bounds.width, height: bounds.height - topSpace - bottomSpace)
            leftGradientLayer.mask = leftMaskLayer
            // right channel
            if spectra.count >= 2 {
                let rightPath = UIBezierPath()
                for (i, amplitude) in spectra[1].enumerated() {
                    let x = CGFloat(spectra[1].count - 1 - i) * (barWidth + space) + space
                    let y = translateAmplitudeToYPosition(amplitude: amplitude)
                    let bar = UIBezierPath(rect: CGRect(x: x, y: y, width: barWidth, height: bounds.height - bottomSpace - y))
                    rightPath.append(bar)
                }
                let rightMaskLayer = CAShapeLayer()
                rightMaskLayer.path = rightPath.cgPath
                rightGradientLayer.frame = CGRect(x: 0, y: topSpace, width: bounds.width, height: bounds.height - topSpace - bottomSpace)
                rightGradientLayer.mask = rightMaskLayer
            }
        }
    }
}
The translateAmplitudeToYPosition function converts an amplitude into a Y coordinate in the view's coordinate system:
private func translateAmplitudeToYPosition(amplitude: Float) -> CGFloat {
    let barHeight: CGFloat = CGFloat(amplitude) * (bounds.height - bottomSpace - topSpace)
    return bounds.height - bottomSpace - barHeight
}
Back in ViewController, the AudioSpectrumPlayerDelegate method simply hands the received data to spectrumView:
// MARK: AudioSpectrumPlayerDelegate
extension ViewController: AudioSpectrumPlayerDelegate {
    func player(_ player: AudioSpectrumPlayer, didGenerateSpectrum spectra: [[Float]]) {
        DispatchQueue.main.async {
            //1: hand the data to spectrumView
            self.spectrumView.spectra = spectra
        }
    }
}
After all that typing we can finally run it and take a look! Hmm... it doesn't look great. Don't worry: grab a coffee and relax, and we'll fix the issues one by one.
Figure 2: Initial animation result
Tuning and Optimization
The poor result comes down to three things: 1) the animation doesn't track the music's rhythm well; 2) the waveform is full of jagged spikes; 3) the animation flickers noticeably. Let's tackle the first problem:
Rhythm Matching
Part of the mismatch is that the animation's amplitude is too small, especially in the mid and high frequencies. Let's amplify everything 5x and see. Modify the analyse function:
func analyse(with buffer: AVAudioPCMBuffer) -> [[Float]] {
    let channelsAmplitudes = fft(buffer)
    var spectra = [[Float]]()
    for amplitudes in channelsAmplitudes {
        let spectrum = bands.map {
            //1: simply multiply by 5 after the function call
            findMaxAmplitude(for: $0, in: amplitudes, with: Float(buffer.format.sampleRate) / Float(self.fftSize)) * 5
        }
        spectra.append(spectrum)
    }
    return spectra
}
Figure 3: After 5x amplification, the low-frequency bars overflow the view
The low-frequency bands carry far more energy than the mid and high ones, yet the bass doesn't actually sound that prominent. Why? This is where the concept of loudness comes in:
Loudness (also called volume) is the perceptual magnitude of a sound corresponding to its intensity. Sound intensity is an objective physical quantity; loudness is a subjective psychological one. Loudness depends not only on intensity but also on frequency. Pure tones of different frequencies that sound as loud as a 1000 Hz tone at a given sound pressure level have different SPLs themselves. Plotting those SPLs as a function of frequency yields an equal-loudness contour, and varying the 1000 Hz reference level produces a family of such contours. The bottom 0-phon curve is the quietest sound humans can hear (the hearing threshold); the top curve is the loudest sound humans can endure (the pain threshold).
Figure 4: Frequency on the horizontal axis, sound pressure level on the vertical axis; the undulating curves are equal-loudness contours, showing how frequency and SPL relate at the same loudness level (image failed to upload)
So the ear's sensitivity varies with frequency: two sounds at the same sound pressure level but different frequencies can be perceived with different loudness. For this reason we apply a frequency weighting to approximate what the ear actually hears. Common weightings include A, B, C and D; A-weighting is the most widely used and attenuates the low frequencies the most, so that's what we'll use here.
Figure 5: The blue curve is the A-weighting, an approximate inverse of the 40-phon equal-loudness contour (image failed to upload)
In the RealtimeAnalyzer class, create a function createFrequencyWeights() that returns an array of A-weighting coefficients:
private func createFrequencyWeights() -> [Float] {
    let Δf = 44100.0 / Float(fftSize)
    let bins = fftSize / 2 // size of the returned array
    var f = (0..<bins).map { Float($0) * Δf }
    f = f.map { $0 * $0 }
    let c1 = powf(12194.217, 2.0)
    let c2 = powf(20.598997, 2.0)
    let c3 = powf(107.65265, 2.0)
    let c4 = powf(737.86223, 2.0)
    let num = f.map { c1 * $0 * $0 }
    let den = f.map { ($0 + c2) * sqrtf(($0 + c3) * ($0 + c4)) * ($0 + c1) }
    let weights = num.enumerated().map { (index, ele) in
        return 1.2589 * ele / den[index]
    }
    return weights
}
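A useful property for checking these constants: A-weighting is defined to be 0 dB, i.e. a linear gain of 1, at 1 kHz. Here is the same formula as a standalone Python sketch:

```python
import math

def a_weight(frequency):
    """Linear A-weighting gain for a frequency in Hz (same constants as the Swift code)."""
    f2 = frequency * frequency
    c1 = 12194.217 ** 2
    c2 = 20.598997 ** 2
    c3 = 107.65265 ** 2
    c4 = 737.86223 ** 2
    num = c1 * f2 * f2
    den = (f2 + c2) * math.sqrt((f2 + c3) * (f2 + c4)) * (f2 + c1)
    return 1.2589 * num / den   # 1.2589 ≈ 10^(2/20), the +2.0 dB normalization

print(round(a_weight(1000.0), 2))            # ≈ 1.0 (0 dB at 1 kHz)
print(a_weight(100.0) < a_weight(1000.0))    # low frequencies are attenuated
```

The factor 1.2589 in the last line of the Swift code is exactly this +2.0 dB normalization that pins the curve to unity gain at 1 kHz.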
Update the analyse function accordingly:
func analyse(with buffer: AVAudioPCMBuffer) -> [[Float]] {
    let channelsAmplitudes = fft(buffer)
    var spectra = [[Float]]()
    //1: create the weighting array
    let aWeights = createFrequencyWeights()
    for amplitudes in channelsAmplitudes {
        //2: multiply the raw spectrum data by the weights, element by element
        let weightedAmplitudes = amplitudes.enumerated().map { (index, element) in
            return element * aWeights[index]
        }
        let spectrum = bands.map {
            //3: findMaxAmplitude now searches the weighted amplitudes
            findMaxAmplitude(for: $0, in: weightedAmplitudes, with: Float(buffer.format.sampleRate) / Float(self.fftSize)) * 5
        }
        spectra.append(spectrum)
    }
    return spectra
}
Run the project again. Much better, right?
Figure 6: Animation with A-weighting applied
Removing the Jaggedness
Next, the jaggedness. The idea is to shorten bars that tower over their neighbors and lengthen ones that dip below them; a common way to do this is a weighted moving average. Create a function highlightWaveform():
private func highlightWaveform(spectrum: [Float]) -> [Float] {
    //1: define the weights; the 5 in the middle is the element's own weight
    //   feel free to adjust them, but the count must be odd
    let weights: [Float] = [1, 2, 3, 5, 3, 2, 1]
    let totalWeights = Float(weights.reduce(0, +))
    let startIndex = weights.count / 2
    //2: the first few elements are excluded from the averaging
    var averagedSpectrum = Array(spectrum[0..<startIndex])
    for i in startIndex..<spectrum.count - startIndex {
        //3: zip works like: zip([a,b,c], [x,y,z]) -> [(a,x), (b,y), (c,z)]
        let zipped = zip(Array(spectrum[i - startIndex...i + startIndex]), weights)
        let averaged = zipped.map { $0.0 * $0.1 }.reduce(0, +) / totalWeights
        averagedSpectrum.append(averaged)
    }
    //4: the last few elements are excluded as well
    averagedSpectrum.append(contentsOf: Array(spectrum.suffix(startIndex)))
    return averagedSpectrum
}
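To see what the weighted moving average does to a spike, here is the same computation in Python (a sketch; the names are mine):

```python
def highlight_waveform(spectrum, weights=(1, 2, 3, 5, 3, 2, 1)):
    """Weighted moving average; the first and last few elements pass through unchanged."""
    total = float(sum(weights))
    half = len(weights) // 2
    out = list(spectrum[:half])
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        out.append(sum(w * x for w, x in zip(weights, window)) / total)
    out.extend(spectrum[len(spectrum) - half:])
    return out

spiky = [0.0] * 5 + [1.0] + [0.0] * 5
print(highlight_waveform(spiky))   # the lone spike is spread across its neighbors
```

A single bar of height 1.0 becomes a bump peaking at 5/17 with gently sloping shoulders, which is exactly the de-jagging effect we want.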
The analyse function needs another update:
func analyse(with buffer: AVAudioPCMBuffer) -> [[Float]] {
    let channelsAmplitudes = fft(buffer)
    var spectra = [[Float]]()
    let aWeights = createFrequencyWeights()
    for amplitudes in channelsAmplitudes {
        let weightedAmplitudes = amplitudes.enumerated().map { (index, element) in
            return element * aWeights[index]
        }
        let spectrum = bands.map {
            findMaxAmplitude(for: $0, in: weightedAmplitudes, with: Float(buffer.format.sampleRate) / Float(self.fftSize)) * 5
        }
        //1: run highlightWaveform before appending
        spectra.append(highlightWaveform(spectrum: spectrum))
    }
    return spectra
}
Figure 7: Fewer jagged edges; the waveform shape is clearer
Reducing the Flicker
The flicker makes the animation look like it's dropping frames. It happens because a band's value can change drastically between consecutive frames. The fix: cache the previous frame's values and combine them with the current frame's using... that's right, another weighted average! (⊙﹏⊙)b First define two properties:
// cache of the previous frame's values
private var spectrumBuffer = [[Float]]()
// smoothing factor; the larger it is, the "slower" the animation
public var spectrumSmooth: Float = 0.5 {
    didSet {
        spectrumSmooth = max(0.0, spectrumSmooth)
        spectrumSmooth = min(1.0, spectrumSmooth)
    }
}
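This per-band blend between frames is a standard exponential moving average. A minimal Python sketch of the idea (names are mine):

```python
def smooth_frame(previous, current, smooth=0.5):
    """Blend the cached frame with the new one; larger `smooth` means a slower animation."""
    smooth = min(1.0, max(0.0, smooth))   # clamp, like the didSet observer
    return [p * smooth + c * (1.0 - smooth) for p, c in zip(previous, current)]

previous = [0.0, 0.0, 0.0]
current = [1.0, 0.5, 0.2]
print(smooth_frame(previous, current))        # halfway toward the new frame
print(smooth_frame(previous, current, 0.9))   # moves only 10% of the way
```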
Then modify the analyse function:
func analyse(with buffer: AVAudioPCMBuffer) -> [[Float]] {
    let channelsAmplitudes = fft(buffer)
    let aWeights = createFrequencyWeights()
    //1: initialize spectrumBuffer on first use
    if spectrumBuffer.count == 0 {
        for _ in 0..<channelsAmplitudes.count {
            spectrumBuffer.append(Array<Float>(repeating: 0, count: frequencyBands))
        }
    }
    //2: the index is needed when writing back into spectrumBuffer
    for (index, amplitudes) in channelsAmplitudes.enumerated() {
        let weightedAmplitudes = amplitudes.enumerated().map { (index, element) in
            return element * aWeights[index]
        }
        var spectrum = bands.map {
            findMaxAmplitude(for: $0, in: weightedAmplitudes, with: Float(buffer.format.sampleRate) / Float(self.fftSize)) * 5
        }
        spectrum = highlightWaveform(spectrum: spectrum)
        //3: zip was introduced earlier; blend the cached frame with the current one
        let zipped = zip(spectrumBuffer[index], spectrum)
        spectrumBuffer[index] = zipped.map { $0.0 * spectrumSmooth + $0.1 * (1 - spectrumSmooth) }
    }
    return spectrumBuffer
}
Run the project once more for the final result:
Wrap-up
That completes the audio spectrum animation. I had no prior experience with audio or acoustics; the methods and theory in these two articles were all gathered from the internet, so there are surely mistakes. Corrections are welcome.
References
[1] Wikipedia, Octave band, en.wikipedia.org/wiki/Octave…
[2] Wikipedia, Loudness, zh.wikipedia.org/wiki/%E9%9F…
[3] MathWorks, A-weighting Filter with Matlab, www.mathworks.com/matlabcentr…
[4] Animation references: the NetEase Cloud Music app and the MOO Music app. Interested readers can compare with these two apps using the piano version of Canon and spot the differences.
Author: potato04
Link: https://juejin.im/post/5c26d44ae51d45619a4b8b1e
Source: Juejin
Copyright belongs to the author. For commercial reuse, please contact the author for authorization; for non-commercial reuse, please credit the source.