Hi everyone, in this article we talk about a different topic: speech to text. My friends from university have been trying to develop an application that listens to speech and converts it to text. They found a bash script for this, which is nice and easy 🙂 But they ran into a problem with the audio codec and the speech-to-text API. I wanted to help them so I could learn some new technologies, and I found a few new things.
We can use the IBM Watson service API instead of the Google API for speech to text. It works with cURL, so it is easy to use from a bash script. But we need an account on IBM Cloud.
1- Go to the Service Page and Sign Up
2- Fill in the form
3- Activate the account from your mail inbox and log in.
4- After logging in, you will see a page like this:
5- After closing the “Welcome message”, click “Existing Services” in the left menu. We came here from “Get Started”, so the service was created automatically when we signed up. Select a “Region” and your service will be listed.
6- Click the service name and you will be redirected. The new page includes the credentials. Click “Show” in the top right corner of the “Credential” container.
7- Bash script:
echo "Recording your Speech (Ctrl+C to Transcribe)"
arecord -q -f cd -t wav -d 0 -r 16000 | flac - -f --best -s -o test.flac
echo "Converting Speech to Text..."
curl -X POST -u 3b26a41f-046d-.... -6fcd49e99b45:yF....hN \
  --header "Content-Type: audio/flac" \
  --data-binary @test.flac \
  "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize"
8- Save this as “speech2text.sh” and make it executable:
chmod +x speech2text.sh
9- Finally! Run the script.
Here is the output:
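The service replies with JSON, so if you only want the recognized text you can filter the response. Below is a minimal sketch, assuming the response contains "transcript" fields nested under "results" and "alternatives". The helper name extract_transcript and the sample response are my own, written by hand for illustration, not copied from real service output. The filter is deliberately simple and would break on transcripts that contain escaped quotes.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read a Watson-style JSON response on stdin and
# print the value of every "transcript" field, one per line.
extract_transcript() {
  # Keep only the "transcript": "..." pairs, then strip the key,
  # the surrounding quotes, and any trailing whitespace in the value.
  grep -o '"transcript"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed -e 's/^"transcript"[[:space:]]*:[[:space:]]*"//' \
          -e 's/[[:space:]]*"$//'
}

# Hand-written sample response (an assumption about the response shape).
sample='{"results":[{"alternatives":[{"confidence":0.9,"transcript":"hello world "}],"final":true}],"result_index":0}'

printf '%s\n' "$sample" | extract_transcript
```

To use it in the script, you could pipe the curl command into extract_transcript. If jq is installed, a proper JSON parser is a safer choice than grep and sed.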