Speech to text in Pega

Using the Web Speech API in a Pega application

Pega applications have evolved and are now used on all kinds of devices: desktops, tablets, mobile phones, and so on. Incorporating newer browser technologies into the application can improve the user experience. With the speech recognition API, use cases such as taking a note or drafting an email by voice become possible.

Let us see how we can implement this speech recognition API.

We will modify Pega Pulse so that it takes notes by listening to the user.

We will use the Web Speech API for this use case. To learn more about this API, see the MDN documentation.

For this use case we need a JavaScript file. This file contains the functions and logic that invoke the speech recognition API. These functions convert speech into text and assign the text to the textarea in the Pulse section.
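The following is a minimal sketch of the kind of logic this file could contain, not the exact script used here. It relies only on the standard Web Speech API event shape (`resultIndex`, `results`, `isFinal`, `transcript`). The function names `splitTranscripts` and `createRecognizer` and the `onText` callback are illustrative assumptions.

```javascript
// Split a SpeechRecognitionEvent into finalized text and text that may still change.
// Only results from event.resultIndex onward are new or updated in this event.
function splitTranscripts(event) {
  let finalText = "";
  let interimText = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const transcript = result[0].transcript; // first (best) alternative
    if (result.isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText, interimText };
}

// Create a recognizer if the browser supports one; return null otherwise.
// onText is a hypothetical callback that receives { finalText, interimText }.
function createRecognizer(onText) {
  const scope = typeof window !== "undefined" ? window : {};
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  if (!Recognition) {
    return null;
  }
  const recognition = new Recognition();
  recognition.continuous = true;     // keep listening until stop() is called
  recognition.interimResults = true; // report partial results while the user speaks
  recognition.onresult = (event) => onText(splitTranscripts(event));
  return recognition;
}
```

Checking for both `SpeechRecognition` and the prefixed `webkitSpeechRecognition` matters because some browsers only expose the prefixed constructor, and others do not support the API at all, in which case this sketch returns `null`.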

Steps to configure the speech recognition API

  1. Modify the Pega Pulse section to include two buttons, 'Start' and 'Stop'. These buttons activate and deactivate speech recognition. Add two empty labels to the same section.

  2. Give Tour IDs to these two buttons and labels:

    • Start button - startSpeech
    • Stop button - stopSpeech
    • Label1 - speechInterim
    • Label2 - speechStatus

  3. Provide a Tour ID of 'pulseNote' to the textarea field in the Pega Pulse section.

  4. Create a JavaScript binary file in Pega and include this .js file in the scripts and styles tab of the container harness rule.

  5. Copy the code from this Gist, paste it into the JavaScript file, and save the rule.

  6. Save all the rules and test the functionality. When the Start button is clicked for the first time, the browser will ask for permission to use the microphone. Make sure this permission is granted.
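To show how the Tour IDs from the steps above could tie the script to the section, here is a hedged sketch. It assumes Pega renders a Tour ID as a `data-tour-id` attribute on the element itself, which is worth confirming in the generated markup since the attribute may sit on a wrapper. The helper names `tourSelector`, `appendNote`, and `wireSpeechControls` and the status strings are hypothetical.

```javascript
// Build a CSS selector for a Pega Tour ID.
// Assumption: the Tour ID is rendered as a data-tour-id attribute.
function tourSelector(tourId) {
  return '[data-tour-id="' + tourId + '"]';
}

// Append recognized text to the existing note, separated by one space.
function appendNote(existing, addition) {
  const extra = addition.trim();
  if (!extra) {
    return existing;
  }
  return existing ? existing.replace(/\s+$/, "") + " " + extra : extra;
}

// Wire the buttons, labels, and textarea to a SpeechRecognition instance.
// root is a Document or element that contains the Pulse section.
function wireSpeechControls(root, recognition) {
  const find = (id) => root.querySelector(tourSelector(id));
  const note = find("pulseNote");
  const interim = find("speechInterim");
  const status = find("speechStatus");
  let listening = false;

  recognition.onstart = () => {
    listening = true;
    status.textContent = "Listening...";
  };
  recognition.onend = () => {
    listening = false;
    interim.textContent = "";
    status.textContent = "Stopped";
  };
  recognition.onresult = (event) => {
    let interimText = "";
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const text = event.results[i][0].transcript;
      if (event.results[i].isFinal) {
        note.value = appendNote(note.value, text); // finalized text goes to the note
      } else {
        interimText += text; // partial text is only previewed
      }
    }
    interim.textContent = interimText;
  };

  find("startSpeech").addEventListener("click", () => {
    if (!listening) {
      recognition.start(); // start() throws if recognition is already running
    }
  });
  find("stopSpeech").addEventListener("click", () => recognition.stop());
}
```

In a browser this would be called as `wireSpeechControls(document, recognition)` once the Pulse section has rendered. Setting `note.value` directly may not be enough for Pega to register the change, so dispatching a `change` event on the textarea afterwards could be necessary; that part is left out because it depends on the application.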

At this point, the browser captures your speech, converts it to text, and assigns this text to the textarea in the Pega Pulse section. To stop speech recognition, just click the Stop button.
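If no text appears, the recognizer's `error` event usually explains why, for example when the microphone permission was denied. The sketch below maps the error codes defined by the Web Speech API to messages that could be shown in the 'speechStatus' label. The message wording and the helper names `describeSpeechError` and `reportSpeechErrors` are assumptions made for illustration.

```javascript
// Messages for SpeechRecognition error codes.
// The codes come from the Web Speech API; the wording is illustrative.
const SPEECH_ERROR_MESSAGES = {
  "not-allowed": "Microphone access was blocked. Allow it in the browser settings.",
  "service-not-allowed": "Speech recognition is not permitted in this browser.",
  "audio-capture": "No microphone could be used for capture.",
  "no-speech": "No speech was detected. Please try again.",
  "network": "A network error interrupted speech recognition.",
};

// Return a readable message for an error code, with a generic fallback.
function describeSpeechError(code) {
  const known = Object.prototype.hasOwnProperty.call(SPEECH_ERROR_MESSAGES, code);
  return known ? SPEECH_ERROR_MESSAGES[code] : "Speech recognition error: " + code;
}

// Hypothetical helper: show recognizer errors in the given status element.
function reportSpeechErrors(recognition, statusElement) {
  recognition.onerror = (event) => {
    statusElement.textContent = describeSpeechError(event.error);
  };
}
```

Surfacing the error in the status label gives the user a hint about what to fix instead of leaving the Start button apparently unresponsive.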

This is just a sample to get started with. Speech recognition can be applied to a wide range of use cases in Pega workflows, and in non-Pega applications as well.

I hope this gives you a good idea of how to explore browser APIs in Pega applications. I would like to hear about other interesting use cases where you have applied this API.

Reach out to me if you encounter any issues or have any questions.

Thank you.
