點燈坊

Now, with nothing left to lose, is your chance to grow stronger

Introduction to VAD

Sam Xiao 2024-06-11

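VAD stands for Voice Activity Detection. It detects when speech starts and when it ends, which lets you decide when to send the captured speech to a server. This post covers the plain HTML approach.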

Version

Voice Activity Detection for JavaScript (@ricky0123/vad-web) 0.0.7

HTML

<!doctype html>
<html>
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.14.0/dist/ort.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/@ricky0123/vad-web@0.0.7/dist/bundle.min.js"></script>
    <title>HTML Lab</title>
  </head>
  <body>
    <div>VAD Test</div>
  </body>
  <script type="module">
    let myVad = await vad.MicVAD.new({
      onSpeechStart: () => {
        console.log('onSpeechStart')
      },
      onSpeechEnd: () => {
        console.warn('onSpeechEnd')
      },
    })
    myVad.start()
  </script>
</html>

Line 6

<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.14.0/dist/ort.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@ricky0123/vad-web@0.0.7/dist/bundle.min.js"></script>
  • To use VAD, both of the scripts above must be included: onnxruntime-web and vad-web
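
If the page stays silent, a quick first check is whether both scripts actually loaded. The sketch below assumes that ort.js defines a global named `ort` and that the vad-web bundle defines `vad` (the latter is the name the page's own code calls). The helper name `missingVadGlobals` is hypothetical and not part of either library.

```javascript
// Hypothetical helper: list which of the expected globals are absent.
// Assumption: ort.js defines `ort` and the vad-web bundle defines `vad`.
function missingVadGlobals(scope = globalThis) {
  return ['ort', 'vad'].filter((name) => typeof scope[name] === 'undefined')
}

// In the browser, run this after both <script> tags have loaded.
const missing = missingVadGlobals()
if (missing.length > 0) {
  console.error(`Missing globals: ${missing.join(', ')}`)
}
```

An empty result means both globals exist. It does not prove that the versions are compatible with each other.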

Line 13

<script type="module">
  let myVad = await vad.MicVAD.new({
    onSpeechStart: () => {
      console.log('onSpeechStart')
    },
    onSpeechEnd: () => {
      console.warn('onSpeechEnd')
    },
  })
  myVad.start()
</script>
  • Top-level await only works in module scripts, so <script> needs type="module"
  • The myVad object is created by awaiting vad.MicVAD.new()
  • onSpeechStart: called when the start of speech is detected
  • onSpeechEnd: called when the end of speech is detected
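
The introduction mentions sending detected speech to a server, while the callbacks above only log. The following is a minimal sketch of one way to do that, not the library's own method. It assumes that onSpeechEnd is passed the speech segment as a Float32Array of mono samples at 16000 Hz, which should be confirmed against the library documentation for version 0.0.7. `encodeWav`, `uploadSpeech` and the `/api/speech` endpoint are hypothetical names introduced for illustration.

```javascript
// Encode mono Float32 samples (range -1..1) as a 16-bit PCM WAV file.
function encodeWav(samples, sampleRate = 16000) {
  const dataSize = samples.length * 2 // 2 bytes per 16-bit sample
  const buffer = new ArrayBuffer(44 + dataSize) // 44-byte header + data
  const view = new DataView(buffer)
  const writeText = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i))
    }
  }

  writeText(0, 'RIFF')
  view.setUint32(4, 36 + dataSize, true) // file size minus the first 8 bytes
  writeText(8, 'WAVE')
  writeText(12, 'fmt ')
  view.setUint32(16, 16, true) // size of the fmt chunk
  view.setUint16(20, 1, true) // audio format 1 = PCM
  view.setUint16(22, 1, true) // one channel (mono)
  view.setUint32(24, sampleRate, true)
  view.setUint32(28, sampleRate * 2, true) // byte rate
  view.setUint16(32, 2, true) // block align
  view.setUint16(34, 16, true) // bits per sample
  writeText(36, 'data')
  view.setUint32(40, dataSize, true)

  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])) // clamp to -1..1
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true)
  }
  return buffer
}

// Hypothetical upload helper; '/api/speech' is a placeholder endpoint.
async function uploadSpeech(audio, url = '/api/speech') {
  const body = new Blob([encodeWav(audio)], { type: 'audio/wav' })
  return fetch(url, { method: 'POST', body })
}

// Usage inside the page (assumes onSpeechEnd is passed the samples):
//   onSpeechEnd: (audio) => { uploadSpeech(audio) }
```

WAV is used here only because it needs no extra dependency. A server that accepts raw PCM or a compressed format would call for a different encoder.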

Conclusion

  • With the plain HTML approach, including two JavaScript files is all it takes to use VAD

Reference

Ricky Samore, Voice Activity Detection for JavaScript