LLM (Ollama)

LLM (Ollama) input properties for Script Engine. Talks to a locally hosted Ollama server and produces text generated by a large language model. Useful for AI hosts and assistants in broadcasts, automatic summarisation of chat or news feeds, on-the-fly translation, scripted Q&A bots, content moderation, and any workflow that needs natural-language text generated by AI on demand. Includes optional chat persistence so conversations are saved to disk and can be resumed later. Response speed and quality depend entirely on the chosen model and the server hardware.

Property Type Access Description
ShowAdvancedOptions bool get/set Whether to reveal advanced configuration in the editor [default=false].
AutoStart bool get/set Whether to connect to the Ollama server automatically when the project loads [default=true]. Saves a manual click when the project is loaded fresh; turn off if you want to connect only on demand from a script or button press.
ServerUrl string get/set URL of the Ollama server to talk to (e.g. http://localhost:11434). Use http:// for direct connections: a local ollama serve only serves plain HTTP. Use https:// for Ollama Cloud (https://ollama.com) or for a self-hosted Ollama sitting behind a reverse proxy that terminates TLS. Vanilla ollama serve has no built-in authentication and accepts every request regardless of any Authorization header. For endpoints that do require a Bearer token (Ollama Cloud, a reverse-proxied Ollama with auth configured, or another Ollama-API-compatible service), set the COMPOSER_OLLAMA_APIKEY environment variable on the Composer machine.
LlmStatus LlmStatus get Current state of the LLM input (read-only). Reports whether the input is disconnected, connecting, connected, receiving a response, thinking, or in an error state.
ConnectCommand Command get Connect to the Ollama server using the current ServerUrl.
DisconnectCommand Command get Disconnect from the Ollama server and cancel any in-flight response.
AvailableModels StringCollectionEnum get/set List of every model installed on the connected Ollama server. Populated after connecting. Pick the one you want to chat with. Choose a text model for conversations; embedding-only or image-only models won't accept prompts, and the Send button is disabled while one is selected. Changing the selection resets the current chat history.
ContextSize ContextWindowSize get/set Maximum tokens the model can attend over per request (Ollama's num_ctx). Set to ModelDefault to use the Modelfile value (typically 2048-4096); any other member forces the corresponding override. Larger values let longer chats fit but increase VRAM usage.
EnableThinking bool get/set Enable the reasoning phase on thinking-capable models. Ignored on models without a thinking capability.
Temperature float get/set Controls how creative or predictable the model's responses are [min=0.0, max=2.0]. Lower values make the model pick the most expected words — best for code, math, and factual answers. Higher values make it take more chances and pick less obvious words — best for creative writing and brainstorming. 0.0 always picks the single most likely word (predictable but boring).
TopP float get/set Probability threshold for filtering word choices [min=0.0, max=1.0]. At each step, the model gathers the most-likely words until their combined probability reaches Top P; those words become the eligible pool. The pool size adapts to the model's confidence: fewer words when one choice dominates, more when probabilities spread evenly. Note that a lower Top P keeps fewer words, not more (Top P = 0.5 may leave just one word). Useful range: 0.9–1.0; below 0.5 is effectively greedy. Lower it if responses contain odd or surprising words.
TopK int get/set Hard cap on how many word choices the model considers at each step [min=0, max=200]. 0 = disabled (no Top K filter; only Top P narrows the pool); 1 = always picks the single most likely word (fully predictable, equivalent to Temperature 0.0); 200 = effectively no cap. Applies before Top P in the sampling pipeline: Top K caps the candidate count first, then Top P narrows further within those K words. The smaller pool always wins — a low Top K can prevent Top P from gathering as many words as it would like.
Seed int get/set Controls the randomness used when picking words [default=-1]. -1 picks a fresh random number every request, so the same prompt produces a different response each time (normal chat behaviour). Setting a specific number (e.g. 42) makes the output reproducible: same seed + same prompt + same options gives the exact same response every time. Useful for regression tests ("did my prompt change actually improve things, or was the difference just random?"), debugging weird responses (reproduce them to investigate), or demos that need consistent output.
MaxOutputTokens int get/set Maximum length of the model's response, measured in tokens (~0.75 words per token, so 100 tokens ≈ 75 words ≈ a short paragraph) [default=-1, unlimited]. -1 = no limit; the model stops when it is naturally done. Set a positive value (e.g. 100) to enforce a hard cap on response length — useful for bounding latency or cost in automated pipelines. This is a guillotine, not a polite request: the model is unaware of the cap and will be cut off mid-sentence. For "respond briefly" behaviour, instruct the model in the System Prompt instead, and use this only as a safety ceiling.
StopSequences string get/set Custom stop sequences, one per line. The model halts immediately when it produces any of these strings (the matched string is excluded from the response). Empty lines are ignored. The model's own chat-template end-of-turn tokens (e.g. <|eot_id|>, <end_of_turn>) are applied automatically and are not shown here — you don't need to add them. Use this field for your own stops, e.g. "User:" to prevent the model from roleplaying both sides of a conversation, or a marker like "---" to halt at a specific boundary.
ResetTuningCommand Command get Resets the six Response Tuning options to the selected model's effective defaults (Modelfile values where present, Ollama floor otherwise). Requires an active connection; disabled otherwise.
SystemPrompt string get/set Optional system prompt that frames every request the model receives. Leave empty to use whatever system prompt the model ships with. Set it to give the model a persona, restrict it to a topic, or override the default. Changes apply immediately to the next prompt. Useful for "act as a sports commentator", "always reply in JSON", or strict moderation rules.
UserPrompt string get/set The user message to send to the model on the next Send Prompt. Set this from a script for fully automated chat flows, or type into the field for interactive use. Cleared on each successful send.
SendPromptCommand Command get Send the current UserPrompt to the connected model. Requires an active connection and a model that supports text completion. Replies are streamed into LastResponseText and surfaced through the script callback if one is configured.
ClearChatCommand Command get Start a new chat — cancels any in-flight response and resets chat history, token counters, and the last response.
LastPrompt string get The user prompt from the most recent exchange (read-only). Mirrors what was sent to the model so a script can correlate prompt and response.
LastResponseText string get The full text of the most recent response from the model (read-only). Updated as the response streams in. Read this from a script to forward AI-generated text to overlays, captions, or external systems.
LastResponseTime int get Time taken to receive the most recent response, in milliseconds (read-only). Useful for monitoring server load and detecting slow responses.
ChatMessageCount int get Number of prompts sent in the current chat session (read-only). Resets on New Chat, model change, or disconnect.
ScriptCallbackFunction string get/set Name of a Script Engine function to call each time a response is received. The function receives an object with prompt, response, model, and messageCount fields. Leave empty to disable. Useful for forwarding AI replies to chat overlays, triggering scene changes, or feeding generated text into other components.
EnableChatPersistence bool get/set Whether to auto-save each exchange to a chat file under Documents/Composer/LLM Chats/ [default=false]. When enabled, conversations are preserved across sessions so you can resume them later. Files are compacted automatically so their size stays bounded by the selected model's context window.
AvailableChats StringCollectionEnum get/set List of saved chats found in the chats folder. Pick an entry to load it immediately; the active chat is pre-selected. Refreshed when chat persistence is enabled and after each auto-save. The top entry is empty; picking it starts a fresh chat.
LastSaved string get Timestamp of the last disk write for the current chat (read-only). Empty until the first auto-save or load.
OpenChatFolderCommand Command get Open the chats folder in the operating system's file manager. Useful for backing up, renaming, or deleting chat files outside of Composer.
RefreshChatListCommand Command get Rescan the chats folder. Use after renaming or deleting chat files outside of Composer to refresh AvailableChats.
PlaybackState PlaybackState get Connection state — drives the enable/disable state of commands (read-only).
ModelVram string get Memory consumed by the model currently loaded in Ollama (read-only).
ModelProcessor string get How much of the model is running on the GPU vs the CPU (read-only). Format like "100% GPU" or "60% GPU/40% CPU". Lower GPU percentages mean slower responses — Ollama spills layers to CPU when the model doesn't fit in VRAM.
ModelContextLength int get How much text (in tokens) the model can hold in context (read-only). Populated after the first prompt is sent. Compare with TokensUsed to gauge how full the chat is.
TokensUsed int get Tokens consumed by the chat history on the most recent request (read-only). Compare with ModelContextLength to see how full the chat history is — click New Chat to reset when getting close to the limit. Reset by New Chat, model change, or disconnect.
TruncationCount int get Number of times Ollama has silently truncated the chat history this session (read-only). Non-zero means the model has lost earlier context and responses may be degraded (shorter or less detailed) — click New Chat to reset, or raise Context Size to give the model more room (this can also be configured server-side in Ollama).
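
Several rows in the table above correspond to fields in the options object of Ollama's REST API (num_ctx, temperature, top_p, top_k, seed, num_predict, stop). The Python sketch below shows one way an equivalent request body for POST /api/chat could be assembled outside Composer, for example to reproduce a response from the command line. It is an illustration only: the helper names, the placeholder tuning values, the example model name, and the choice to omit a field when its property is -1 are assumptions, and nothing here is taken from Composer's own implementation.

```python
import json
import os


def parse_stop_sequences(text):
    """Split a StopSequences-style string: one stop per line, blank lines ignored."""
    return [line for line in text.splitlines() if line.strip()]


def build_chat_request(model, user_prompt, system_prompt="", temperature=0.8,
                       top_p=0.9, top_k=40, seed=-1, max_output_tokens=-1,
                       stop_sequences="", context_size=None):
    """Assemble a body for POST /api/chat (hypothetical helper, not Composer code)."""
    options = {"temperature": temperature, "top_p": top_p, "top_k": top_k}
    if seed != -1:                 # assumption: -1 means "leave the seed unset"
        options["seed"] = seed
    if max_output_tokens != -1:    # assumption: -1 means "send no cap"
        options["num_predict"] = max_output_tokens
    if context_size is not None:   # None stands in for ModelDefault
        options["num_ctx"] = context_size
    stops = parse_stop_sequences(stop_sequences)
    if stops:
        options["stop"] = stops

    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return {"model": model, "messages": messages, "options": options, "stream": False}


def build_headers():
    """Add a Bearer token only when COMPOSER_OLLAMA_APIKEY is set."""
    headers = {"Content-Type": "application/json"}
    api_key = os.environ.get("COMPOSER_OLLAMA_APIKEY", "")
    if api_key:
        headers["Authorization"] = "Bearer " + api_key
    return headers


# "llama3.2" is only an example model name; use one installed on your server.
body = build_chat_request("llama3.2", "Summarise the last five chat messages.",
                          system_prompt="Reply in one sentence.",
                          seed=42, max_output_tokens=100,
                          stop_sequences="User:\n\n---\n")
print(json.dumps(body, indent=2))
```

The printed body can be posted to ServerUrl + /api/chat with any HTTP client. Leaving system_prompt empty sends no system message, which matches the SystemPrompt row's "use whatever system prompt the model ships with" behaviour.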

Inherits from: AbstractInput, AbstractAudioProcessing, AbstractAudioMetering.
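
The Temperature, TopK, TopP, and Seed rows in the table describe how the candidate pool for each word is narrowed. The toy Python sketch below makes that concrete on a four-word vocabulary. It is a simplification under stated assumptions: real samplers work on token IDs, may apply these steps in a different order, and the function names and scores here are invented for illustration.

```python
import math
import random


def filter_candidates(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token: probability} after Temperature, Top K, then Top P.

    logits: dict mapping token -> raw score (hypothetical toy input).
    """
    if temperature <= 0.0:
        # Temperature 0.0: keep only the single most likely token.
        best = max(logits, key=logits.get)
        return {best: 1.0}

    # Softmax with temperature (subtract the max for numerical stability).
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((tok, w / total) for tok, w in weights.items()),
                    key=lambda pair: pair[1], reverse=True)

    # Top K: hard cap on the number of candidates (0 disables the cap),
    # then renormalise so Top P works within the surviving K tokens.
    if top_k > 0:
        ranked = ranked[:top_k]
        k_total = sum(p for _, p in ranked)
        ranked = [(tok, p / k_total) for tok, p in ranked]

    # Top P: keep the most likely tokens until their combined
    # probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


def sample(logits, seed=-1, **tuning):
    """Draw one token; a fixed seed makes the draw reproducible."""
    rng = random.Random(None if seed == -1 else seed)
    pool = filter_candidates(logits, **tuning)
    return rng.choices(list(pool), weights=list(pool.values()), k=1)[0]


logits = {"the": 2.0, "a": 1.0, "this": 0.5, "zebra": -1.0}
print(filter_candidates(logits, top_p=0.5))            # pool shrinks to "the" only
print(filter_candidates(logits, top_k=2, top_p=1.0))   # pool is "the" and "a"
print(sample(logits, seed=42) == sample(logits, seed=42))
```

With these scores the most likely word already carries more than half of the probability mass, so Top P = 0.5 leaves a pool of one, as the TopP row notes. Setting top_k=1 or temperature=0.0 produces the same single-word pool.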

See also: LLM (Ollama) in Inputs — user-facing introduction, screenshots, and section summaries.