Easily Transcribe Audio to Text with Faster-Whisper

2024-3-5·devcxl

This post explains how to use Faster-Whisper to transcribe a large number of audio files.

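I recently needed to turn a batch of phone-call recordings into text, and the first thing that came to mind was OpenAI's Whisper API.

However, these recordings contain sensitive information, and I did not want to upload them to OpenAI. OpenAI's privacy policy says it will not use user data for training, but in practice, who knows (shrug).

So I processed them with OpenAI's whisper instead. The recordings are long and numerous, though, and whisper was fairly slow on them, so I switched to faster-whisper, which sped things up considerably.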

Here is how to use it.

I won't go into detail on installing Python and CUDA; let's get straight to the main points.

First, install faster-whisper and python-docx:

pip install faster-whisper python-docx

Next, collect the absolute paths of the files to transcribe:

import os
path = '/path/to/your/filesDir/'
all_items = os.listdir(path)
full_paths = [os.path.join(path, f) for f in all_items if os.path.isfile(os.path.join(path, f))]
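
The listing above picks up every regular file in the directory, including any non-audio files that happen to be there. As a minimal sketch, assuming the recordings use a handful of common extensions, the list could be narrowed as follows. The function name `list_audio_files` and the `AUDIO_EXTENSIONS` set are illustrative and not part of the original script.

```python
import os

# Assumed set of extensions; adjust it to match the actual recordings.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}

def list_audio_files(directory, extensions=AUDIO_EXTENSIONS):
    """Return sorted absolute paths of regular files in `directory` with an audio extension."""
    result = []
    for name in os.listdir(directory):
        full = os.path.abspath(os.path.join(directory, name))
        # Compare extensions case-insensitively so "CALL.WAV" is also matched.
        ext = os.path.splitext(name)[1].lower()
        if os.path.isfile(full) and ext in extensions:
            result.append(full)
    return sorted(result)
```

With this helper, `full_paths = list_audio_files(path)` could stand in for the list comprehension.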

Then define a function, generatorWord, that generates the document:

from docx import Document
from docx.shared import Pt
from docx.enum.text import WD_ALIGN_PARAGRAPH

def generatorWord(transcript, title, filepath):
    # Create a new Word document with a centered 16 pt title
    doc = Document()
    title_paragraph = doc.add_heading(title, level=0)
    title_paragraph.runs[0].font.size = Pt(16)
    title_paragraph.paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER
    # Add each transcript line to the document, timestamp first
    for line in transcript:
        # Split on the first '] ' only, so a '] ' inside the spoken text cannot break the unpacking
        timestamp, text = line.split('] ', 1)
        p = doc.add_paragraph()
        run = p.add_run(timestamp + '] ')
        run.font.size = Pt(12)
        run = p.add_run(text)
        run.font.size = Pt(12)
        p.paragraph_format.space_after = Pt(2)  # 2 pt of space after the paragraph
        p.paragraph_format.space_before = Pt(0)  # no space before the paragraph

    doc.save(filepath)

    print("Document generated: %s" % (filepath))
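
generatorWord expects every transcript line to look like `[0.00s -> 4.20s] text`. As a small sketch of that split in isolation, the helper below uses `str.partition`, so a line whose text itself contains `] ` still yields exactly two parts, and a line without a timestamp is returned as plain text. The name `split_transcript_line` is hypothetical and does not appear in the original script.

```python
def split_transcript_line(line):
    """Split '[start -> end] text' into ('[start -> end]', 'text')."""
    # partition() cuts at the first '] ' only and never raises.
    head, sep, text = line.partition("] ")
    if not sep:
        # No timestamp prefix found: treat the whole line as text.
        return "", line
    return head + "]", text
```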

Finally, use faster-whisper to transcribe each audio file and call generatorWord to produce the corresponding docx document:

from faster_whisper import WhisperModel
model_size = "large-v3"
# Run on GPU with FP32
model = WhisperModel(model_size, device="cuda", compute_type="float32")

for full_path in full_paths:
    segments, info = model.transcribe(full_path, beam_size=5)
    print("Processing file %s | %s(%f)" % (full_path, info.language, info.language_probability))
    handler_item = []
    # segments is a generator: the transcription actually runs while this loop iterates
    for segment in segments:
        handler_item.append("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
    print("Speech recognition finished, converting to a Word document.")
    filename = os.path.basename(full_path)
    generatorWord(handler_item, filename, f'/path/to/your/outdir/{filename}.docx')
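
With many long recordings, a run may get interrupted partway through. As a rough sketch, assuming each output is named after the audio file with `.docx` appended (as in the loop above), files that already have a document can be skipped on the next run. The helper `pending_jobs` and its parameters are illustrative only.

```python
import os

def pending_jobs(audio_paths, out_dir):
    """Yield (audio_path, docx_path) pairs whose docx file does not exist yet."""
    for audio_path in audio_paths:
        filename = os.path.basename(audio_path)
        docx_path = os.path.join(out_dir, filename + ".docx")
        # Skip recordings that were already converted in an earlier run.
        if not os.path.exists(docx_path):
            yield audio_path, docx_path
```

The main loop could then iterate over `pending_jobs(full_paths, out_dir)` and pass each `docx_path` to generatorWord, where `out_dir` is the output directory.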