Speech To Text
| Property | Type | Access | Description |
|---|---|---|---|
| ShowAdvancedOptions | bool | get/set | Whether to reveal advanced configuration in the editor. [default=false]. Toggle on to show options like the script callback fields. |
| ModelSourceUrl | Uri | get/set | Path to the speech recognition model file (.bin) used for transcription. Larger models give more accurate recognition but cost more processing time per second of audio. Loading a new model reinitialises the operator. |
| AutoStart | bool | get/set | Whether to start listening automatically once the model finishes loading. [default=true]. Saves a manual click when the project is loaded fresh; turn off if you want to start transcription only on demand. |
| OperatorState | OperatorState | get | Current state of the operator (read-only). Reports whether the model is loading, ready, running, stopped, or in an error state. |
| StartCommand | Command | get | Begin transcribing the layer's audio. Available once a valid model is loaded. |
| StopCommand | Command | get | Stop transcribing audio. |
| ClearCommand | Command | get | Clear the on-screen captions and reset the audio context so previous speech doesn't bleed back in. |
| ConfidenceLevel | int | get/set | Minimum confidence (in percent) a recognised word must score to appear on screen. [min=10, max=100, default=70]. Raise to suppress weak guesses: fewer words appear, but the ones that do are more likely correct. Lower to surface more text, at the cost of occasional misheard words. |
| AudioBuffer | int | get/set | How much audio is collected before each transcription pass, in milliseconds. [min=100, max=2000, default=300]. Lower values give snappier captions at the cost of accuracy, because there's less context for the model to reason from. Higher values give better accuracy but captions appear with more delay. |
| PauseThreshold | int | get/set | How long the speaker must be silent before a new subtitle card is started, in milliseconds. [min=0, max=5000, default=750]. Shorter values break the captions into smaller chunks more often. Longer values let long sentences flow as one block but cards stay on screen longer between sentences. |
| NoSpeechThreshold | int | get/set | How aggressively to filter audio segments that probably contain no speech, as a percentage. [min=0, max=100, default=60]. Lower values filter more aggressively, which suits noisy environments where the model hears phantom words during silence. Raise if quiet speech is being missed. |
| SubtitlesScreenTimout | int | get/set | How long captions stay on screen after the speaker stops talking, in milliseconds. [min=100, max=10000, default=5000]. Longer values give the audience more reading time. Shorter values keep the screen uncluttered between sentences. |
| ResetThresholdValuesCommand | Command | get | Reset all threshold values to their defaults (confidence, audio buffer, pause detection, no-speech, screen timeout). |
| ShowSubtitlesCheckBox | bool | get/set | Whether to draw captions on the output image. [default=true]. Turn off if you only want to use the recognised text from a script (via RecentText and the callback) without on-screen captions. |
| SubtitlesPosX | int | get/set | Horizontal position of the caption block, in pixels from the left edge. [min=0, max=4096]. |
| SubtitlesPosY | int | get/set | Vertical position of the caption block, in pixels from the top edge. [min=0, max=4096]. |
| ResetTextPositionCommand | Command | get | Reset caption position to the default location. |
| RecentText | FormattedMessage | get | Most recently transcribed text (read-only). Updates every time the speech recognition engine produces a new result. Read this from a script to forward the live transcript to chat overlays, captions widgets, or any external system. |
| IsSubtitleActive | bool | get | True while a caption is currently being shown (read-only). Resets when the text-on-screen timeout expires or the operator is cleared. Useful for scripts that need to react when speech starts or ends. |
| SubtitleStartTime | string | get | UTC timestamp when the current subtitle segment started (read-only). Updated on each new speech segment. Useful for tagging subtitles with absolute time when feeding them to external systems. |
| SubtitleEndTime | string | get | UTC timestamp when the current subtitle segment ended (read-only). Empty while the speaker is still talking; populated once the segment closes. |
| SubtitleStartPts | long | get | Video stream timestamp marking when the current subtitle segment started (read-only). Zero if the input does not provide a presentation timestamp. Useful for matching captions to specific frames when post-processing recordings. |
| SubtitleEndPts | long | get | Video stream timestamp marking when the current subtitle segment ended (read-only). Zero while speech is still active, or if the input does not provide a presentation timestamp. |
| FontSize | int | get/set | Caption font size, in pixels. [min=11, max=60, default=32]. |
| FontColorR | int | get/set | Red component of the caption text colour. [min=0, max=255, default=255]. |
| FontColorG | int | get/set | Green component of the caption text colour. [min=0, max=255, default=255]. |
| FontColorB | int | get/set | Blue component of the caption text colour. [min=0, max=255, default=255]. |
| FontAlpha | int | get/set | Caption text opacity. [min=0, max=255, default=255]. 0 is fully transparent, 255 is fully solid. |
| SubtitleBackgroundAlpha | int | get/set | Caption background opacity. [min=0, max=255, default=90]. 0 hides the background, 255 makes it fully solid. A subtle dark background helps readability over busy footage. |
| ResetTextAppearanceCommand | Command | get | Reset all text appearance settings (font size, colour, alpha, background) to their defaults. |
| MaxLineLimit | int | get/set | Maximum number of caption lines on screen at once. [min=1, max=10]. When the limit is reached the oldest line scrolls away. Lower values keep the screen uncluttered; higher values give more reading time across longer monologues. |
| MaxCharPerLineLimit | int | get/set | Maximum characters per caption line before wrapping. [min=1, max=200]. |
| SmallLettersOnly | bool | get/set | Whether all captions are forced to lower-case. [default=false]. "I" and a few common contractions are still kept capitalised for readability. |
| ResetTextSettingsCommand | Command | get | Reset text settings (max lines, max chars per line, lower-case mode) to their defaults. |
| EnableTextReplacement | bool | get/set | Whether to apply find-and-replace rules to the recognised text. [default=false]. Useful for fixing recurring misheard words ("Hugh" → "you"), expanding domain abbreviations, or filtering profanity. Pair with a rules file via TextReplacementFileUrl. |
| TextReplacementFileUrl | Uri | get/set | Path to a JSON file containing find-and-replace rules. Each rule is a key/value pair where the key is the pattern to find and the value is the replacement. Supports exact matches and wildcard patterns (`*` for any characters, `?` for one character). Reload Media re-reads the file if it changes on disk. |
| RulesLoadedCount | int | get | Number of rules successfully loaded from the rules file (read-only). |
| ReplacementsAppliedCount | int | get | Total number of replacements applied since the rules file was loaded (read-only). |
| EnableReplacementStats | bool | get/set | Whether to log replacement statistics to a file in the Reports folder. [default=false]. Records which rules fired and how often, written to a file next to the executable and updated periodically while running. Useful for tuning the rules file by seeing what's actually being matched. |
| ReplacementStatsFileName | string | get | Path of the current replacement-stats file (read-only). |
| ReplacementStatsStatus | string | get | Status of the replacement-stats logger (read-only). |
| EnableTranscription | bool | get/set | Whether to write all recognised speech to a transcript file in the Reports folder. [default=false]. Each speech segment is written as a timestamped line. Useful for archiving live shows and for accessibility post-production. The project must be saved first, since the transcript filename is based on the project name. |
| TranscriptFileName | string | get | Path of the current transcript file (read-only). |
| ScriptCallbackFunction | string | get/set | Name of a Script Engine function to call each time the transcription updates. Receives a JSON payload with the recognised text and PTS timing. Leave empty to disable. Useful for forwarding live captions to chat overlays, third-party caption systems, or moderation pipelines. |
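
The rules file described under TextReplacementFileUrl is a flat JSON object of pattern/replacement pairs. The Python sketch below is one way to preview what a rules file might do to a line of text before loading it into the operator. It is not the operator's implementation: the function names (`compile_rule`, `load_rules`, `apply_rules`) are hypothetical, and the matching details are assumptions, because this reference does not state them. In particular, the sketch assumes whole-word, case-insensitive matching, wildcards that stay within a single word, and rules applied in file order.

```python
import json
import re


def compile_rule(pattern: str) -> re.Pattern:
    """Convert a rule key into a regex ('*' = any run of characters, '?' = one)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")  # assumption: wildcards do not cross word boundaries
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))  # everything else is matched literally
    # Assumption: rules match whole words, ignoring case.
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


def load_rules(json_text: str) -> list[tuple[re.Pattern, str]]:
    """Parse a {"pattern": "replacement"} JSON object into compiled rules."""
    rules = json.loads(json_text)
    return [(compile_rule(key), value) for key, value in rules.items()]


def apply_rules(text: str, rules: list[tuple[re.Pattern, str]]) -> tuple[str, int]:
    """Apply every rule in order; return the new text and the replacement count."""
    total = 0
    for regex, replacement in rules:
        # A callable replacement keeps backslashes in the rule value literal.
        text, count = regex.subn(lambda _m, r=replacement: r, text)
        total += count
    return text, total


if __name__ == "__main__":
    # Example rules file content (hypothetical).
    rules_json = '{"Hugh": "you", "gr?y": "grey", "k8s": "Kubernetes"}'
    rules = load_rules(rules_json)
    print(apply_rules("Thank Hugh for the gray k8s cluster", rules))
```

The count returned by `apply_rules` plays the same role as ReplacementsAppliedCount, and `len(rules)` the same role as RulesLoadedCount, which makes the sketch handy for sanity-checking a rules file offline. If the operator's real matching differs (for example, case-sensitive matching), adjust the flags in `compile_rule` accordingly.
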
See also: Speech To Text in Operators — user-facing introduction, screenshots, and section summaries.