Audio to Text using Python

Python Coding
Dec 27, 2024

import speech_recognition as sr
from pydub import AudioSegment

recognizer = sr.Recognizer()

# Source MP3 and the WAV file it is converted to (sr.AudioFile cannot read MP3)
mp3_file = "audiobook.mp3"
audio_file = "audiobook.wav"
AudioSegment.from_mp3(mp3_file).export(audio_file, format="wav")

# Load the whole WAV file into memory as audio data
with sr.AudioFile(audio_file) as source:
    audio_data = recognizer.record(source)

# Send the audio to Google's recognizer and print the transcript
try:
    text = recognizer.recognize_google(audio_data)
    print("Extracted Text:", text)
except sr.UnknownValueError:
    print("Could not understand the audio.")
except sr.RequestError as e:
    print("API Error:", e)

This code converts an MP3 audio file to WAV format with the pydub library, then transcribes the speech in it to text using the speech_recognition library and the Google Speech Recognition API. Here's a step-by-step explanation:

1. Importing Libraries

import speech_recognition as sr
from pydub import AudioSegment
  • speech_recognition: Used for speech-to-text conversion.
  • pydub: A library for processing audio files, including format conversions.

2. Creating a Recognizer

recognizer = sr.Recognizer()
  • Creates a Recognizer object, which is used for processing audio and converting it to text.
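
The same Recognizer can also read just part of a file: record() accepts offset and duration values in seconds, and recognize_google() accepts a language tag. The sketch below is one possible way to use that for a long recording, not something the original script does. The helper names chunk_windows and transcribe_window, and the 30-second window in the usage note, are assumptions made for illustration.

```python
def chunk_windows(total_seconds, chunk_seconds):
    """Split a recording length into (offset, duration) windows, in seconds.

    Hypothetical helper, not part of speech_recognition.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    windows = []
    offset = 0.0
    while offset < total_seconds:
        # The last window is shortened so it stops at the end of the audio.
        windows.append((offset, min(chunk_seconds, total_seconds - offset)))
        offset += chunk_seconds
    return windows


def transcribe_window(wav_path, offset, duration, language="en-US"):
    """Transcribe one window of a WAV file (hypothetical helper)."""
    # Imported here so chunk_windows() works without the library installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        # Skip `offset` seconds, then read `duration` seconds of audio.
        audio_data = recognizer.record(source, offset=offset, duration=duration)
    # Can raise sr.UnknownValueError or sr.RequestError, like the main script.
    return recognizer.recognize_google(audio_data, language=language)
```

For a 70-second file, chunk_windows(70, 30) gives three windows starting at 0, 30, and 60 seconds, and each one can be passed to transcribe_window("audiobook.wav", offset, duration).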

3. Setting Up the Audio File
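
In the script above, this step names the audio file and converts the MP3 to WAV before it is opened. A minimal sketch of that setup, assuming the source is an MP3 and that ffmpeg is installed so pydub can decode it. The helper names wav_path_for and convert_to_wav are hypothetical and not part of pydub.

```python
from pathlib import Path


def wav_path_for(source):
    """Return a .wav path next to the source file (hypothetical helper)."""
    return Path(source).with_suffix(".wav")


def convert_to_wav(source):
    """Convert an MP3 file to WAV and return the new file's path."""
    # Imported here so wav_path_for() works without pydub installed.
    from pydub import AudioSegment

    wav_path = wav_path_for(source)
    # Read from the MP3 and write to a separate WAV file,
    # so the input is never overwritten by its own output.
    AudioSegment.from_mp3(source).export(str(wav_path), format="wav")
    return wav_path
```

Calling convert_to_wav("audiobook.mp3") would write audiobook.wav in the same folder, and that path can then be passed to sr.AudioFile.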
