Speech recognition in Python using the Google Speech API

Speech recognition is one of the most useful features in applications such as home automation and artificial intelligence. In this section, we will learn how to perform speech recognition using Python and Google's Speech API.

Here, we will use a microphone to supply the audio for recognition. The microphone is configured through a few parameters.

To do this, we must install the SpeechRecognition module. There is also another module called PyAudio. It is optional in general, but it is required when the audio is captured from a microphone, as it is here. With PyAudio, we can set different audio input modes.

sudo pip3 install SpeechRecognition
sudo apt-get install python3-pyaudio

The first parameter identifies the microphone. For external or USB microphones, we need to specify the exact device to avoid problems. On Linux, the 'lsusb' command displays information about the connected USB devices.
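
Choosing the exact device can be sketched as a lookup by name. The helper find_device_index and the device names below are hypothetical and exist only for illustration. With SpeechRecognition and PyAudio installed, the real list of names would come from Microphone.list_microphone_names(), and the resulting index could be passed to Microphone through its device_index parameter.

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword, or None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # Skip entries without a usable name
        if name and keyword in name.lower():
            return index
    return None


# Hypothetical device names, used so the sketch runs without audio hardware.
# On a real system the list would come from
# speech_recognition.Microphone.list_microphone_names().
example_names = [
    "HDA Intel PCH: ALC3232 Analog (hw:0,0)",
    "USB PnP Sound Device: Audio (hw:1,0)",
    "default",
]

usb_index = find_device_index(example_names, "usb")
print(usb_index)
# The index could then be used as Microphone(device_index=usb_index, ...).
```

Matching on part of the name is only one possible design. It is usually more robust than hard-coding an index, because device indexes can change when hardware is plugged in or removed.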

The second parameter is the 'chunk size'. With this option, we can specify how much data to read at a time. This value should be a power of 2, for example 1024 or 2048.

We also need to specify the sampling rate, which determines how many audio samples are recorded per second for processing.
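
The chunk size and the sampling rate together determine how much audio time one read covers. The following is a minimal sketch of that arithmetic, assuming the chunk size is counted in samples. The helper names is_power_of_two and chunk_duration_ms are made up for this illustration and are not part of any library.

```python
def is_power_of_two(n):
    """True if n is a positive power of 2 (1, 2, 4, ..., 1024, 2048, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def chunk_duration_ms(chunk_size, sample_rate):
    """Milliseconds of audio contained in one chunk of chunk_size samples."""
    return 1000.0 * chunk_size / sample_rate


# Compare a few chunk sizes at a 48000 Hz sampling rate
for chunk_size in (1024, 2048, 8192):
    assert is_power_of_two(chunk_size)
    print(chunk_size, round(chunk_duration_ms(chunk_size, 48000), 1))
```

At 48000 Hz, a chunk of 1024 samples holds roughly 21 ms of audio, while a chunk of 8192 samples holds roughly 171 ms. Larger chunks mean fewer reads but coarser timing.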

Since some background noise is usually unavoidable, we must adjust for ambient noise so that the speech is captured accurately.
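
The idea behind ambient noise adjustment can be illustrated with a simple energy threshold: measure the level of the background, then treat only noticeably louder audio as speech. This is a simplified sketch and not the SpeechRecognition library's actual algorithm. The function names, the ratio of 1.5, and the sample values are assumptions chosen for the example.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of integer audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_threshold(noise_chunks, ratio=1.5):
    """Set the speech threshold to `ratio` times the loudest background chunk."""
    return ratio * max(rms(chunk) for chunk in noise_chunks)


# Hypothetical 16-bit sample values standing in for recorded audio
background = [[100, -120, 90, -80], [60, -70, 110, -100]]
speech = [4000, -3500, 3800, -4200]

threshold = calibrate_threshold(background)
print(rms(speech) > threshold)         # the loud chunk counts as speech
print(rms(background[0]) > threshold)  # the background chunk does not
```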

Steps to recognize voice

Get the information related to the microphone.
Configure the microphone with the chunk size, sampling rate, and ambient noise adjustment.
Wait for some time to capture the voice.
After capturing the voice, try to convert it to text; if that fails, report an error.
Stop the process.

Example Code

import speech_recognition as spreg

# Set up the sampling rate and the chunk (data) size
sample_rate = 48000
data_size = 8192

recog = spreg.Recognizer()
with spreg.Microphone(sample_rate=sample_rate, chunk_size=data_size) as source:
    # Calibrate against the background noise before listening
    recog.adjust_for_ambient_noise(source)
    print('Tell Something: ')
    speech = recog.listen(source)

try:
    # Send the captured audio to Google's speech recognition service
    text = recog.recognize_google(speech)
    print('You have said: ' + text)
except spreg.UnknownValueError:
    print('Unable to recognize the audio')
except spreg.RequestError as e:
    print("Request error from Google Speech Recognition service; {}".format(e))

Output Result

$ python3 318.speech_recognition.py
Tell Something:
You have said: here we are considering the asymptotic notation Pico to calculate the upper bound
of the time complexity so then the definition of the big O notation is like this one
$

Instead of a microphone, we can also use an audio file as the input and convert the speech it contains into text.

Example Code

import speech_recognition as spreg

sound_file = 'sample_audio.wav'

recog = spreg.Recognizer()
with spreg.AudioFile(sound_file) as source:
    # Use record() instead of listen() to read the whole file
    speech = recog.record(source)
    try:
        text = recog.recognize_google(speech)
        print('The file contains: ' + text)
    except spreg.UnknownValueError:
        print('Unable to recognize the audio')
    except spreg.RequestError as e:
        print("Request error from Google Speech Recognition service; {}".format(e))

Output Result

$ python3 318a.speech_recognition_file.py
The file contains: staying ahead of the curve demand planning new technology it also helps you progress in your career
$
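
Before passing a WAV file to AudioFile, it can help to check its basic properties, such as the number of channels, the sampling rate, and the duration. The sketch below uses only Python's standard wave module. The helper describe_wav and the generated file silence.wav are hypothetical stand-ins, written so that the example runs without an existing recording.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return (channels, sample_rate, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return wav.getnchannels(), rate, frames / rate


# Write one second of 16-bit mono silence at 16 kHz as a stand-in
# for a real recording such as sample_audio.wav.
path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
    wav.setframerate(16000)
    wav.writeframes(b"\x00\x00" * 16000)

print(describe_wav(path))
```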
