Plugin Development - Voice Control


Starting with version 3.4, plugins support custom voice control.  To offload voice control to a plugin, the plugin must provide two things:

  1. Method:  A method that returns an IDictionary<string, string> used to create a voice command group with voice commands.
  2. Handler:  A method that accepts a Dictionary<string, string> parameter to process the results.

When a voice command created through a plugin is recognized in FoxVox, it will invoke the handler method specified by the plugin to handle the recognition operation.  This allows the plugin to be responsible for both setting up the voice recognition and processing the results.

An optional method, called Register, is also available to support registering FoxVox with the plugin.  This method must accept two parameters, of types Action<string> and Dictionary<string, string>, which FoxVox passes in.  The first parameter is a callback into FoxVox that posts a custom message to its event log; its single string argument is the message to post.  The second parameter is a dictionary of string values supplied by FoxVox.  Currently it contains a single entry, "Version", which provides the current version of FoxVox.  More entries may be added in the future if needed.
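As a rough illustration of the Register contract described above, here is a minimal sketch of a plugin class that stores the host version and posts one message through the callback.  The class name MinimalPlugin and the HostVersion property are hypothetical names chosen for this example; only the two parameter types and the "Version" entry come from the description above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical plugin class, used only to illustrate the Register signature.
public static class MinimalPlugin
{
    // Callback into the host's event log; stays null until Register is called.
    private static Action<string>? postMessage;

    // Host version reported at registration, or "unknown" if none was supplied.
    public static string HostVersion { get; private set; } = "unknown";

    public static void Register(Action<string> msgInvoke, Dictionary<string, string> info)
    {
        // Keep the callback so other methods can post messages later
        postMessage = msgInvoke;
        if (info.TryGetValue("Version", out var version)) HostVersion = version;
        msgInvoke($"MinimalPlugin registered, host version {HostVersion}");
    }
}
```

Outside FoxVox, the method can be exercised by passing any Action<string>, for example Console.WriteLine, together with a dictionary containing a "Version" entry.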

To have a plugin provide voice commands, designate it for binding to voice key values, as shown here:

[Image: Designate plugin for voice control]

Voice Group/Command Creation

Voice commands created through a plugin are essentially the same as those created through the UI.  They adhere to the same rules and perform the same way as their manual counterparts.  The plugin provides the voice key values by way of a dictionary collection, where the key is the name of the group and command (separated by '|'), and the value provides the voice key definition using a structured format.

When defining the voice key for a new command group, omit the voice command name.  This entry is then used to create the Voice Command Group container which holds all of the commands.  Every voice command created within the group should use a key that starts with the group name followed by a '|' separator.  If a voice command names a group that has no entry of its own (an entry whose key has no separator and no command name), a default group with that name and no voice keys is created.  Remember that the voice keys on the command group indicate the first word or words to be spoken in the command, unless otherwise indicated with a wildcard.  A group with no voice keys becomes a default group, selected only when no other voice command group matches the spoken phrase.  Once the Voice Command Group is created, all of the other entries are created as Voice Commands on that group, each named by the command part of its key.

Here are a few examples:

        {"Test", ""}
Creates a voice group "Test" with no defined voice keys, which will act as a default group.

        {"Test", "Test,Testing"}
Creates a voice group "Test" with two voice keys, "Test" and "Testing".

        {"Test|One", "One"}
Creates a voice command "One" on voice group "Test" with a single voice key "One".  To access this command it would be necessary to speak a phrase matching the voice key for the group "Test" as well as containing the word "One" to satisfy the individual command.
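The naming rules above can be sketched in a few lines of C#.  This is only an illustration of how the dictionary keys could be read, not FoxVox's actual implementation; the class and method names (VoiceKeyNames, Split, ImplicitGroups) are hypothetical.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the "Group|Command" key convention.
public static class VoiceKeyNames
{
    // Splits "Test|One" into ("Test", "One").
    // A key without '|' names a group only, so Command is null.
    public static (string Group, string? Command) Split(string key)
    {
        var index = key.IndexOf('|');
        if (index < 0) return (key, null);
        return (key.Substring(0, index), key.Substring(index + 1));
    }

    // Returns the groups that are referenced by a command entry but have
    // no group entry of their own.  Per the text above, such groups are
    // created as default groups with no voice keys.
    public static List<string> ImplicitGroups(IEnumerable<string> keys)
    {
        var declared = new HashSet<string>();
        var referenced = new List<string>();
        foreach (var key in keys)
        {
            var (group, command) = Split(key);
            if (command is null) declared.Add(group);
            else if (!referenced.Contains(group)) referenced.Add(group);
        }
        referenced.RemoveAll(declared.Contains);
        return referenced;
    }
}
```

Running ImplicitGroups over the keys of a dictionary is a quick way to spot groups that will end up as default groups because their own entry was left out.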

Voice Key Structured Format

Each value in the key/value pair in the dictionary is a specially formatted string.  There are four tokens used in voice command key creation:

  • | = (vertical bar) Separator for voice keys
  • , = (comma) Separator for key values
  • ! = (exclamation) Designates a blocking key value
  • @ = (at sign) Designates an assisting phrase only

Here are some examples:

        "red|balloon"
Creates 2 voice keys, the first with one key value "red" and the second with one key value "balloon".  User must say "red balloon".

        "red,blue|balloon"
Creates 2 voice keys, the first with two key values, "red" and "blue".  User must say "red balloon" or "blue balloon".

        "red,!blue|ball,!balloon"
Creates 2 voice keys.  User must say "red" and "ball", but must not say either "blue" or "balloon".

        "red,!blue|ball,balloon,@bouncy"
Creates 2 voice keys.  User must say "red" and either "ball" or "balloon", but not "blue".  The word "bouncy" is also recognized if spoken in the phrase, but it is not required.  Note that it doesn't matter which key @bouncy is defined on: all assisting phrases are captured for the command.

This structured format applies to voice keys for the command group as well as each command, with one exception: the command group only supports a single voice key (with multiple key values) so any extra keys designated with '|' will be ignored.
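To make the four tokens concrete, here is a small sketch that parses a voice key string and checks a phrase against it.  It is a naive reading of the format for illustration only and is not how FoxVox itself parses or matches speech: it ignores word order, only matches single-word key values, and treats a key that has only blocking values as satisfied when none of them is spoken.  All type and method names are hypothetical.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// One voice key: values that satisfy it and values that block it.
public sealed class VoiceKey
{
    public List<string> Required { get; } = new List<string>();
    public List<string> Blocking { get; } = new List<string>();
}

// A parsed voice key string: its keys plus all assisting phrases.
public sealed class VoiceKeyDefinition
{
    public List<VoiceKey> Keys { get; } = new List<VoiceKey>();
    public List<string> Assisting { get; } = new List<string>();
}

public static class VoiceKeyFormat
{
    public static VoiceKeyDefinition Parse(string value)
    {
        var definition = new VoiceKeyDefinition();
        foreach (var part in value.Split('|'))          // '|' separates voice keys
        {
            var key = new VoiceKey();
            foreach (var raw in part.Split(','))        // ',' separates key values
            {
                var token = raw.Trim();
                if (token.Length == 0) continue;
                if (token[0] == '!') key.Blocking.Add(token.Substring(1).Trim());
                else if (token[0] == '@') definition.Assisting.Add(token.Substring(1).Trim());
                else key.Required.Add(token);
            }
            // Skip keys that are empty or held only assisting phrases
            if (key.Required.Count > 0 || key.Blocking.Count > 0) definition.Keys.Add(key);
        }
        return definition;
    }

    // Assumption: a phrase matches when every key has one of its required
    // values among the spoken words and none of its blocking values.
    public static bool Matches(VoiceKeyDefinition definition, string phrase)
    {
        var words = new HashSet<string>(
            phrase.Split(' ', StringSplitOptions.RemoveEmptyEntries),
            StringComparer.OrdinalIgnoreCase);
        return definition.Keys.All(key =>
            (key.Required.Count == 0 || key.Required.Any(v => words.Contains(v))) &&
            !key.Blocking.Any(v => words.Contains(v)));
    }
}
```

Under these assumptions an empty string parses to zero keys and therefore matches any phrase, which mirrors the default behaviour described earlier for entries with no voice keys.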

Putting it all together, here is an example of a valid method in C#:

        public static Dictionary<string,string> GetVoiceKeys()
        {
            return new Dictionary<string, string>
            {
                //Create 'Default' VC on 'Default' VCG
                {"Default|Default", ""},
                //Create 'Test' VCG
                {"Test", "Plugin,Test,!Code,@hint"},
                //Create 'Default' on 'Test' VCG
                {"Test|Default", ""},
                //Create VC's on 'Test' VCG
                {"Test|One", "Test|One,!Two,@hint one"},
                {"Test|Two", "Test|Two,!One,@hint two"},
            };
        }

Usually it is simpler to create all command groups and commands in a single method.  However, it is also possible to create multiple voice command groups from within the same plugin utilizing another method (or alternatively by utilizing a method parameter).  Here's an example of another method in the same class used to create a default command group for unrecognized commands:

        public static Dictionary<string,string> GetDefaultKey()
        {
            return new Dictionary<string, string>
            {
                {"Default|Default", ""},
            };
        }

Create a default command group with a default voice key to capture all unrecognized speech

Simply add another plugin to FoxVox (or, more easily, copy the existing one with a right-click drag/drop) and set its method to the alternate one.

Voice Command Handling

After setting up a voice command group with voice commands, it is necessary to designate a method to handle processing speech results.  FoxVox requires the method to take a single parameter of type Dictionary<string, string>, which contains the library variable values at the moment the speech is processed.  This includes all the speech (Sp_) variables, which hold the processed phrase, the recognized voice command group and command (omitted if not identified), various formats of any recognized number, and so on, along with any other library variables being tracked.  The method can use these variables to decide what to execute.

Here is an example of a handler method in C# that writes all the parameter variable values to a text file and opens it for display:

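        // Uses System.Collections.Generic, System.Diagnostics, System.IO and System.Linq
        // (import them explicitly if implicit usings are disabled)
        public static void SpeechHandler(Dictionary<string, string> results)
        {
            // Nothing to write if no variables were passed in
            if (results?.Any() != true) return;
            var file = "D:\\Output.txt";
            // Write each variable as "name: value", one per line
            File.WriteAllText(
                file,
                results
                    .Select(v => $"{v.Key}: {v.Value}")
                    .Aggregate((s1, s2) => s1 + Environment.NewLine + s2));
            // Open the file with its associated application
            Process.Start(new ProcessStartInfo { FileName = file, UseShellExecute = true, });
        }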

Here are the contents written to the file when the phrase "Test One" is spoken:

Sp_Num: 1
Sp_NumEnd: 1
Sp_NumInt: 1
Sp_NumSep: 1
Sp_HasNum: True
Sp_Confidence: 0.94
Sp_First: Test
Sp_Last: One
Sp_Phrase: Test One
Sp_Group: MyFunctions
Sp_Command: One

It is possible to use the same method for handling speech processing on multiple voice command groups, or split them out into their own handlers on each one.  It's just a matter of designating the desired Handler on each plugin defined.
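When one handler serves several groups, it typically branches on the Sp_Group and Sp_Command variables.  The sketch below shows one possible way to do that.  The variable names are taken from the sample output above, but the group and command values in the switch ("Test", "One", "Two") are assumptions, as are the SpeechRouter and Route names; check the values FoxVox actually reports for your plugin before relying on them.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical router showing how a shared handler could branch on the results.
public static class SpeechRouter
{
    // Returns a short description of the action chosen for a recognition result.
    public static string Route(Dictionary<string, string>? results)
    {
        if (results is null || results.Count == 0) return "nothing to process";

        // Group and command are omitted when not identified, so use TryGetValue
        results.TryGetValue("Sp_Group", out var group);
        results.TryGetValue("Sp_Command", out var command);
        results.TryGetValue("Sp_Phrase", out var phrase);

        return (group, command) switch
        {
            ("Test", "One") => "run command One",
            ("Test", "Two") => "run command Two",
            ("Test", _) => $"Test group fallback: {phrase}",
            _ => $"unrecognized speech: {phrase}",
        };
    }

    // Handler with the signature FoxVox expects; it only prints the decision here.
    public static void SpeechHandler(Dictionary<string, string> results)
    {
        Console.WriteLine(Route(results));
    }
}
```

Keeping the decision in a method that returns a value, separate from the handler itself, makes the routing easy to check without speaking into a microphone.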

Sample Plugin

For those interested, here is the full code used for the plugin in this example.  The first example is a static class, whereas the second is an instantiable class which can preserve and track its own values.  Note that, for now, instantiated classes require a parameterless constructor in order to be created in FoxVox.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
namespace MyLib
{
    public static class MyFunctions
    {
        private static Version? FVVersion;
        public static void RegisterFV(Action<string> msgInvoke, Dictionary<string, string> info)
        {
            FVMessage = msgInvoke;
            FVVersion = info.TryGetValue("Version", out var v) && Version.TryParse(v, out var version) ? version : null;
            ShowMessage($"Plugin is {((FVVersion?.Major ?? 0) == 3 ? "" : "in")}compatible with v{FVVersion}");
        }
        // Legend for voice command creation:
        // Key = Voice Command Group Name | Voice Command Name
        //   | = Separator between VCG name and VC
        //   Note: Create VCG by omitting the "|" and VC name
        // Value = Voice command keys
        //   | = Separator for voice keys
        //   , = Separator for key values
        //   ! = Designates a blocking key
        //   @ = Designates an assisting phrase
        public static Dictionary<string,string> GetVoiceKeys()
        {
            return new Dictionary<string, string>
            {
                //Create 'Default' VC on 'Default' VCG
                {"Default|Default", ""},
                //Create 'Test' VCG
                {"Test", "Plugin,Test,!Code,@hint"},
                //Create 'Default' on 'Test' VCG
                {"Test|Default", ""},
                //Create VC's on 'Test' VCG
                {"Test|One", "Test|One,!Two,@hint one"},
                {"Test|Two", "Test|Two,!One,@hint two"},
            };
        }
        public static void SpeechHandler(Dictionary<string, string> results)
        {
            if (results?.Any() != true) return;
            var file = "D:\\Output.txt";
            File.WriteAllText(
                file,
                results
                    .Select(v => $"{v.Key}: {v.Value}")
                    .Aggregate((s1, s2) => s1 + Environment.NewLine + s2));
            Process.Start(new ProcessStartInfo { FileName = file, UseShellExecute = true, });
            ShowMessage("Results shown!");
        }
        private static Action<string>? FVMessage;
        public static void ShowMessage(string message)
        {
            FVMessage?.Invoke(message);
        }
        public static void Cmd(string StartDir = "")
        {
            Process.Start(new ProcessStartInfo
            {
                FileName = Path.Combine(Environment.SystemDirectory, "cmd.exe"),
                WorkingDirectory = StartDir,
            });
        }
        public static double Triple(double val)
        {
            return val * 3;
        }
    }
}

Example of a non-instantiable static class providing methods for custom voice control

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
namespace MyLib
{
    public class MyFunctionsObj
    {
        private readonly Guid Id = Guid.NewGuid();
        private Version? FVVersion;
        public void RegisterFV(Action<string> msgInvoke, Dictionary<string, string> info)
        {
            FVMessage = msgInvoke;
            FVVersion = info.TryGetValue("Version", out var v) && Version.TryParse(v, out var version) ? version : null;
            ShowMessage($"Plugin is {((FVVersion?.Major ?? 0) == 3 ? "" : "in")}compatible with v{FVVersion}");
        }
        // Legend for voice command creation:
        // Key = Voice Command Group Name | Voice Command Name
        //   | = Separator between VCG name and VC
        //   Note: Create VCG by omitting the "|" and VC name
        // Value = Voice command keys
        //   | = Separator for voice keys
        //   , = Separator for key values
        //   ! = Designates a blocking key
        //   @ = Designates an assisting phrase
        public Dictionary<string,string> GetVoiceKeys()
        {
            return new Dictionary<string, string>
            {
                //Create 'Default' VC on 'Default' VCG
                {"Default|Default", ""},
                //Create 'Test' VCG
                {"Test", "Plugin,Test,!Code,@hint"},
                //Create 'Default' on 'Test' VCG
                {"Test|Default", ""},
                //Create VC's on 'Test' VCG
                {"Test|One", "Test|One,!Two,@hint one"},
                {"Test|Two", "Test|Two,!One,@hint two"},
            };
        }
        public void SpeechHandler(Dictionary<string, string> results)
        {
            if (results?.Any() != true) return;
            var file = "D:\\Output.txt";
            File.WriteAllText(
                file,
                results
                    .Select(v => $"{v.Key}: {v.Value}")
                    .Aggregate((s1, s2) => s1 + Environment.NewLine + s2));
            Process.Start(new ProcessStartInfo { FileName = file, UseShellExecute = true, });
            ShowMessage($"Results shown by object {Id}!");
        }
        private Action<string>? FVMessage;
        public void ShowMessage(string message)
        {
            FVMessage?.Invoke(message);
        }
        public double LastVal { get; set; }
        public double Triple(double val)
        {
            LastVal = val * 3;
            return LastVal;
        }
    }
}

Example of a non-static instantiable class providing instance methods for custom voice control

These plugin examples also contain a couple of methods used for integrating custom variables and functions, as detailed in another tutorial.  Have fun creating your own custom plugin for voice control with FoxVox!
