WeChat Mini Program recording + Baidu speech API + emotion recognition API: a simple example

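Architecture overview

Front end

A WeChat Mini Program records audio and uploads the recording to the back end, then uses the result the back end returns to show which emotion the recording expresses.

Back end

A simple back end built on Python's Flask framework saves the uploaded recording and transcodes it, then submits it to the Baidu speech recognition API to obtain the recognized text. The text is then submitted to the Baidu conversational emotion recognition API, which returns a probability for each emotion, and the expressed emotion is determined from those probabilities.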
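The last step of that pipeline, turning the returned probabilities into a single emotion, can be sketched as a small pure function. This is only an illustration: the helper name `pick_emotion`, the English labels "happy" and "sad", and the sample data are assumptions, and the input is assumed to be a list of entries with `label` and `prob` keys, as the back-end code in this post reads them.

```python
from typing import Any, Iterable, Mapping, Optional


def pick_emotion(items: Iterable[Mapping[str, Any]]) -> Optional[str]:
    """Return "happy" or "sad" from a list of {"label", "prob"} entries.

    Returns None when neither an optimistic nor a pessimistic score is present.
    """
    probs = {item.get("label"): float(item.get("prob", 0.0)) for item in items}
    if "optimistic" not in probs and "pessimistic" not in probs:
        return None
    optimistic = probs.get("optimistic", 0.0)
    pessimistic = probs.get("pessimistic", 0.0)
    # Ties go to "happy", matching the >= comparison in the back-end code.
    return "happy" if optimistic >= pessimistic else "sad"


# Hypothetical sample data, not a real API response.
sample = [
    {"label": "optimistic", "prob": 0.7},
    {"label": "neutral", "prob": 0.2},
    {"label": "pessimistic", "prob": 0.1},
]
print(pick_emotion(sample))
```

Keeping the decision separate from the API call makes it easy to test without network access or credentials.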

Requirements

Language environment

Python 3.0+ (with the required modules installed: pip install flask, pip install baidu-aip)

Tools

WeChat DevTools (for developing the WeChat Mini Program)
Download link: https://developers.weixin.qq.com/miniprogram/dev/devtools/download.html
ffmpeg (for transcoding the recorded audio)
Download link: https://ffmpeg.zeranoe.com/builds/
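Since the back end calls ffmpeg as an external program, it can help to confirm that the executable is reachable before starting the server. A minimal sketch, assuming ffmpeg is expected on the system PATH; the helper name `ffmpeg_available` is hypothetical:

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the executable can be found on PATH."""
    return shutil.which(executable) is not None


if not ffmpeg_available():
    print("ffmpeg was not found on PATH; transcoding will fail")
```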

WeChat Mini Program development

Global configuration file

app.json

{
  "pages": [
    "pages/index/index",
    "pages/show/show",
    "pages/logs/logs"
  ],
  "window": {
    "backgroundTextStyle": "light",
    "navigationBarBackgroundColor": "#0094ff",
    "navigationBarTitleText": "Recording",
    "navigationBarTextStyle": "black"
  },
  "tabBar": {
    "color": "#000",
    "selectedColor": "#f23030",
    "backgroundColor": "#0094ff",
    "borderStyle": "white",
    "list": [
      {
        "pagePath": "pages/index/index",
        "text": "Home"
      },
      {
        "pagePath": "pages/show/show",
        "text": "View result"
      }
    ]
  },
  "sitemapLocation": "sitemap.json"
}

Recording and request page

Page layout code (pages/index/index.wxml)

<view class="main">
  <button bindtap='play'>Play recording</button>
  <button bindtap='submit'>Upload</button>
  
  <button bindtouchstart="clickDown" bind:touchend="clickUp" bindtouchcancel="clickUp" bindtouchmove="clickMove" class="record_btn">
    <text wx:if="{{recording}}">{{cancel_record?'Release to cancel':'Release to send'}}</text>
    <text wx:else>Hold to record</text>
  </button>
</view>

Recording and upload code (pages/index/index.js)

const app = getApp();
const recorder = wx.getRecorderManager();
const player = wx.createInnerAudioContext();
const file = wx.getFileSystemManager();
var that;
var file_path;

Page({
  onLoad: function (options) {
    that = this;

    // Decide up front what happens once recording stops
    recorder.onStop(function suc(e) {
      console.log(e.tempFilePath);
      file_path = e.tempFilePath;
      // Keep the temporary path of the recording file
      that.setData({
        filePath: e.tempFilePath,
      })
      wx.setStorageSync('filePath', e.tempFilePath);
      wx.showLoading({
        title: 'Reading file...'
      })
      wx.hideLoading();
    })
  },
  submit() {
    wx.uploadFile({
      // Upload the audio
      url: 'http://127.0.0.1:5000/voice',
      filePath: file_path,
      name: 'file',
      header: {
        'content-type': 'multipart/form-data'
      },
      success: function (res) {
        console.log(res.data);
        app.globalData.result = res.data;
        // Store the returned data in the global object
      },
      fail: function (res) {
        console.error('upload failed', res);
      }
    });

  },
  // Finger pressed down
  clickDown(e) {
    console.log('start');
    that.setData({
      recording: true,
      start_y: e.touches[0].clientY,
      cancel_record: false,
    })

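    // Start recording
    recorder.start({
      // duration: 60000, // maximum duration
      // sampleRate: that.data.rate, // sample rate
      // numberOfChannels: 1, // number of recording channels
      // encodeBitRate: 16000, // encoding bit rate; see the official table for valid values
      format: 'silk', // audio format
      // frameSize: 2000, // frame size in KB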
    })
  },
  // Finger moved
  clickMove(e) {
    if (e.touches[0].clientY - that.data.start_y <= -50) {
      that.setData({
        cancel_record: true,
      })
    } else {
      that.setData({
        cancel_record: false,
      })
    }
    return false;
  },
  // Finger released
  clickUp(e) {
    if (that.data.cancel_record) {
      wx.showModal({
        title: 'Notice',
        content: 'You chose to cancel sending. Are you sure?',
        // Button labels are kept to four characters or fewer
        confirmText: 'Send',
        cancelText: 'Redo',
        success: res => {
          if (res.confirm) {
            wx.showToast({
              title: 'Sent',
            })
          } else {
            wx.showToast({
              title: 'Canceled',
            })
          }
          that.setData({
            recording: false
          })
        }
      })
    } else {
      wx.showToast({
        title: 'Sent',
      })
      that.setData({
        recording: false
      })
    }
    recorder.stop();
    return false;
  },
  // Play back
  play() {
    player.src = that.data.filePath;
    player.play();
  },
})

Note:
During development, remember to turn off the validation of legal domain names and similar checks in WeChat DevTools.

Result display page

Page layout code (pages/show/show.wxml)

<text>What you said:</text>
<view> {{result.result.text}}</view>
<text>Recognized emotion:</text>
<view> {{result.result.emotion}}</view>

Data loading code (pages/show/show.js)

var app = getApp();
Page({
  onLoad: function (options) {
    console.log(app.globalData);
    var that = this;
    
    if (app.globalData.result){
      that.setData({
        result: JSON.parse(app.globalData.result),
        // Parse the stored string as JSON so the page can read it
      });
    }else{

    }    
  },
  onShow() {
    this.onLoad();
    // Reload the data every time the page is shown
  },
})

Back-end code (.py)

from flask import Flask, request
from aip import AipSpeech, AipNlp
import os
import json

app = Flask(__name__)

@app.route('/voice', methods=['GET','POST'])
def main():
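    '''Request handler: save the audio uploaded by the Mini Program, transcode it, send it to the speech recognition API to get the text, then call the conversational emotion recognition API.'''
    result = {"status": 0}
    try:
        f = request.files['file']
        f.save("voice/xxx.silk")
        # Save the file locally; create a "voice" folder next to the .py file beforehand
        os.system("ffmpeg -y -i voice/xxx.silk -acodec pcm_s16le -f s16le -ac 1 -ar 16000 voice/a.pcm")
        # Transcode the file
        text = get_result()
        # Call the speech recognition API to get the recognized text
        emotion = get_emotion(text)
        # Call the conversational emotion recognition API to get the verdict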
        result["status"] = 1
        result["result"] = emotion
    except Exception as e:
        print(repr(e))
    finally:
        return json.dumps(result)

def get_result(path='voice/a.pcm'):
    '''Call the Baidu speech recognition API and return the recognized text.'''
    # ID, key and secret for the Baidu speech recognition API
    APP_ID = ''
    API_KEY = ''
    SECRET_KEY = ''
    client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
    # Read the transcoded file
    def get_file_content(filePath):
        with open(filePath, 'rb') as fp:
            return fp.read()
    result = client.asr(get_file_content(path), 'pcm', 16000, {
        'dev_pid': 1536,
    })
    print(result)
    return result["result"][0]

def get_emotion(text=""):
    '''Call the Baidu conversational emotion recognition API and return the judged emotion.'''
    # ID, key and secret for the Baidu natural language processing API
    APP_ID = ''
    API_KEY = ''
    SECRET_KEY = ''
    emotion_result = {
        "text": text
        }
    client = AipNlp(APP_ID, API_KEY, SECRET_KEY)
    options = {
        "scene": "talk"
    }
    try:
        result = client.emotion(text, options)
        for each in result["items"]:
            if each["label"] == "optimistic":
                opti = each["prob"]
            elif each["label"] == "pessimistic":
                pess = each["prob"]

        if opti >= pess:
            emotion_result["emotion"] = "happy"
        else:
            emotion_result["emotion"] = "sad"
    except Exception as e:
        print(repr(e))
    finally:
        return emotion_result

if __name__ == '__main__':
    app.run(debug=True)
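In the listing above, get_result indexes result["result"][0] directly, so a failed recognition only surfaces as a KeyError in the caller's except block. A hedged sketch of a more explicit check follows. It assumes the response is a dict that carries `err_no` and `err_msg` on failure and a `result` list on success; the helper `extract_text` and the exception class are hypothetical names, not part of the baidu-aip package.

```python
class SpeechRecognitionError(Exception):
    """Raised when the speech recognition response does not contain text."""


def extract_text(response: dict) -> str:
    """Return the first recognized sentence or raise a descriptive error."""
    err_no = response.get("err_no", 0)
    if err_no != 0:
        message = response.get("err_msg", "")
        raise SpeechRecognitionError(f"ASR failed with err_no={err_no}: {message}")
    results = response.get("result") or []
    if not results:
        raise SpeechRecognitionError("ASR returned no result")
    return results[0]


# Hypothetical responses used only to exercise the helper.
print(extract_text({"err_no": 0, "result": ["hello"]}))
try:
    extract_text({"err_no": 3301, "err_msg": "speech quality error."})
except SpeechRecognitionError as exc:
    print(exc)
```

With a helper like this, the route handler could log the error code instead of a bare KeyError, which makes problems such as 3301 easier to spot.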

Testing

Steps

First run the Python back-end file, then in the WeChat Mini Program record and tap Upload. Once the returned result appears in the console, open the View result page to see it.
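The body that appears in the console is the JSON string produced by the back end: a `status` field plus, on success, a `result` object with `text` and `emotion`. A small sketch of reading it in Python, for example when checking the back end without the Mini Program; the helper name `summarize_response` and the sample strings are assumptions:

```python
import json


def summarize_response(body: str) -> str:
    """Turn the back end's JSON reply into a one-line summary."""
    data = json.loads(body)
    if data.get("status") != 1:
        return "request failed"
    result = data.get("result", {})
    text = result.get("text", "")
    # The emotion key can be missing when the emotion API call fails.
    emotion = result.get("emotion", "unknown")
    return f"text={text!r}, emotion={emotion}"


print(summarize_response('{"status": 1, "result": {"text": "hi", "emotion": "happy"}}'))
print(summarize_response('{"status": 0}'))
```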

Result screenshots

Test 1

Test 2 (feels so real...)

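Points to note

Audio transcoding for the Baidu speech recognition API

When calling the Baidu speech recognition API, simply uploading the audio does not give a correct result. The audio must be transcoded before upload; otherwise the call usually returns error code 3301, which indicates poor audio quality. Transcoding relies on the ffmpeg tool; for the relevant commands see the official documentation: https://cloud.baidu.com/doc/SPEECH/ASR-Tool.html#.E8.BD.AC.E6.8D.A2.E5.91.BD.E4.BB.A4.E7.A4.BA.E4.BE.8B
Since the WeChat Mini Program recording format is rather special, it is recommended to record directly to a .silk file and then use the following command to produce the transcoded .pcm file for the API: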

ffmpeg -y -i xxx.silk -acodec pcm_s16le -f s16le -ac 1 -ar 16000 xxx.pcm
(xxx.silk is the source file, xxx.pcm is the output file)
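The same command can be issued from Python with an argument list instead of a shell string, which avoids quoting problems and lets a failed conversion raise an exception. This is a sketch under the assumption that ffmpeg is on PATH; the function names are hypothetical, and the flags are the ones from the command above.

```python
import subprocess
from typing import List


def build_transcode_command(src: str, dst: str) -> List[str]:
    """Build the ffmpeg argument list: 16 kHz, mono, 16-bit little-endian raw PCM."""
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-acodec", "pcm_s16le",
        "-f", "s16le",
        "-ac", "1",
        "-ar", "16000",
        dst,
    ]


def transcode(src: str, dst: str) -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_transcode_command(src, dst), check=True)


print(" ".join(build_transcode_command("voice/xxx.silk", "voice/a.pcm")))
```

Calling transcode("voice/xxx.silk", "voice/a.pcm") would stand in for the os.system call in the back-end code.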

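WeChat Mini Program recording format

Because the WeChat Mini Program recording format is rather special (apparently silk), the file that the back end saves locally cannot be opened and played directly, and the Baidu speech recognition API cannot recognize it as-is either. To play the audio locally, download silk v3 (the full version is recommended) and convert the file to another format first. Software link: https://kn007.net/topics/batch-convert-silk-v3-audio-files-to-mp3-in-windows/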
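One way to check whether a saved upload really is silk is to look at its first bytes. The sketch below rests on an assumption that is not confirmed by this post: that SILK v3 files begin with the ASCII marker `#!SILK_V3`, and that WeChat recordings may put a single 0x02 byte in front of it. The helper name `looks_like_silk` is hypothetical.

```python
SILK_MAGIC = b"#!SILK_V3"  # assumed file marker, see the note above


def looks_like_silk(data: bytes) -> bool:
    """Return True if the bytes start with the assumed SILK v3 marker."""
    if data.startswith(SILK_MAGIC):
        return True
    # Assumed WeChat variant: one 0x02 byte before the marker.
    return data[:1] == b"\x02" and data[1:].startswith(SILK_MAGIC)


def file_looks_like_silk(path: str) -> bool:
    """Read only the first few bytes of a file and apply the same check."""
    with open(path, "rb") as fp:
        return looks_like_silk(fp.read(len(SILK_MAGIC) + 1))
```

If the check fails for a file recorded by the Mini Program, the recording format setting is worth inspecting before blaming the transcoding step.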

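Usage steps

1. After downloading the software, copy all the files in the archive into one folder and open silk2mp3.exe.

2. Import the silk file and choose Decode -> Start conversion. The decoded file can then be played.

Other

ffmpeg reference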

https://cloud.tencent.com/developer/article/1119403