AI Pipeline

A DataClass of type BASIC can declare a "pipeline" block to orchestrate multi-phase background processing. The platform executes phases sequentially, supporting both JavaScript scripts and parallel AI extraction tasks.

Declaration

    "pipeline": {
      "phaseField": "pipelinePhase",
      "phases": [
        { "name": "init", "script": "convertToMarkdown" },
        {
          "name": "extraction",
          "tasks": [
            { "name": "extractInfo", "promptFile": "prompts/extractInfo", "schemaFile": "schemas/extractInfo" }
          ]
        },
        { "name": "aggregate", "script": "aggregateDeed" }
      ]
    }

| Property | Type | Description |
|---|---|---|
| phaseField | string | The field on the class that holds the current phase name. Must be a TEXT value type. |
| phases | Phase[] | Ordered list of phases to execute. |

Each phase has a name and either a script (script-phase) or tasks (task-phase):

| Property | Type | Description |
|---|---|---|
| name | string | Phase name. The value stored in phaseField must match this name to trigger the phase. |
| script | string | Script name to invoke (script-phase). References a gateway or JavaScript script defined on the class. |
| tasks | Task[] | Parallel AI extraction tasks to run (task-phase). |

Each task in a task-phase:

| Property | Type | Description |
|---|---|---|
| name | string | Unique task name within the pipeline. Used as the key in params in the aggregation script. |
| promptFile | string | Path to a Handlebars prompt template file. |
| schemaFile | string | Path to a Gemini responseSchema JSON file. |
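
The rules in the tables above can be checked before a declaration is deployed. The following is a minimal sketch, not a platform API: `validatePipeline` is a hypothetical helper, and the checks for duplicate phase names and for mandatory promptFile and schemaFile values are assumptions that go slightly beyond what the tables state.

```javascript
// Hypothetical helper, not a platform API: checks a "pipeline" declaration
// against the rules listed in the tables above and returns a list of problems.
function validatePipeline(pipeline) {
  const errors = [];
  if (typeof pipeline.phaseField !== "string" || pipeline.phaseField === "") {
    errors.push("phaseField must be a non-empty string");
  }
  if (!Array.isArray(pipeline.phases) || pipeline.phases.length === 0) {
    errors.push("phases must be a non-empty array");
    return errors;
  }

  const phaseNames = new Set();
  const taskNames = new Set();
  for (const phase of pipeline.phases) {
    if (typeof phase.name !== "string" || phase.name === "") {
      errors.push("every phase needs a name");
      continue;
    }
    // Assumption: phases are looked up by name, so duplicates would be ambiguous.
    if (phaseNames.has(phase.name)) errors.push(`duplicate phase name: ${phase.name}`);
    phaseNames.add(phase.name);

    // A phase is either a script-phase or a task-phase, never both or neither.
    const hasScript = typeof phase.script === "string";
    const hasTasks = Array.isArray(phase.tasks);
    if (hasScript === hasTasks) {
      errors.push(`phase "${phase.name}" must declare either script or tasks`);
    }

    for (const task of hasTasks ? phase.tasks : []) {
      // Task names must be unique within the whole pipeline.
      if (taskNames.has(task.name)) errors.push(`duplicate task name: ${task.name}`);
      taskNames.add(task.name);
      // Assumption: both file paths are required for every task.
      if (!task.promptFile || !task.schemaFile) {
        errors.push(`task "${task.name}" needs promptFile and schemaFile`);
      }
    }
  }
  return errors;
}
```

An empty array means the declaration passed every check; each string in a non-empty result names one problem.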

Execution flow

  1. DataParsingService — on create (or PATCH with an empty phaseField), automatically sets phaseField to the name of the first phase.
  2. DataService — queues a RUN_PIPELINE background task whenever phaseField changes.
  3. PipelineService.runPhase() — reads the current value of phaseField and executes the matching phase:
    • Script-phase: invokes the named script; on success, advances phaseField to the next phase name.
    • Task-phase: creates parallel commons.aiTask objects (max 2 concurrent Gemini calls), stores responseJson and token usage. On success → advances to next phase. On error → sets status = ERROR on the intake object.
    • Final phase: leaves phaseField set to the last phase name and sets status = READY on the intake object.
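
The cap of two concurrent Gemini calls in a task-phase can be pictured as a small worker pool. This is an illustrative sketch only: `runWithLimit` is a hypothetical name, the platform's internal scheduling is not documented here, and the per-job result shape (status, value, errorMessage) is an assumption chosen to resemble the aiTask statuses.

```javascript
// Hypothetical sketch: runs async jobs with at most `limit` in flight at once.
// `jobs` is an array of functions that each return a Promise.
async function runWithLimit(jobs, limit) {
  const results = new Array(jobs.length);
  let next = 0; // index of the next job that has not been started yet

  async function worker() {
    // Each worker keeps pulling the next unstarted job until none remain.
    while (next < jobs.length) {
      const index = next++;
      try {
        results[index] = { status: "DONE", value: await jobs[index]() };
      } catch (err) {
        // A failed job is recorded and does not stop the remaining jobs.
        results[index] = { status: "ERROR", errorMessage: String((err && err.message) || err) };
      }
    }
  }

  // Start `limit` workers (or fewer when there are fewer jobs) and wait for all.
  const workers = Array.from({ length: Math.min(limit, jobs.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

Calling `runWithLimit(jobs, 2)` mirrors the documented limit: a third job starts only once one of the first two has settled, and results keep the order of the input jobs.
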

Warning: Never set phaseField to null after the pipeline completes. DataParsingService treats an empty phaseField as a fresh start and will re-initialise the pipeline, causing an infinite loop.
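
The phase transitions described in this section, including the re-initialisation hazard, can be modelled as a small pure function. This is a sketch under stated assumptions: `resolveNextStep` and its action labels are hypothetical, it models only the outcome after a phase succeeds, and the handling of an unknown phase name is a guess.

```javascript
// Hypothetical model of what happens to phaseField once the phase named
// `currentPhase` has completed successfully.
function resolveNextStep(pipeline, currentPhase) {
  const names = pipeline.phases.map((phase) => phase.name);

  // An empty phaseField is treated as a fresh start: back to the first phase.
  if (currentPhase === null || currentPhase === undefined || currentPhase === "") {
    return { action: "INITIALISE", phaseField: names[0] };
  }

  const index = names.indexOf(currentPhase);
  if (index === -1) {
    // Assumption: a value that matches no phase triggers nothing.
    return { action: "NONE", phaseField: currentPhase };
  }
  if (index === names.length - 1) {
    // Final phase: phaseField keeps the last phase name, status becomes READY.
    return { action: "FINISH", phaseField: currentPhase, status: "READY" };
  }
  return { action: "ADVANCE", phaseField: names[index + 1] };
}

const examplePipeline = {
  phaseField: "pipelinePhase",
  phases: [{ name: "init" }, { name: "extraction" }, { name: "aggregate" }],
};
```

With `examplePipeline`, passing "aggregate" yields the FINISH step and leaves phaseField untouched, while passing null yields INITIALISE with "init", which is the restart that the warning describes.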

commons.aiTask data model

Each parallel Gemini call produces a commons.aiTask object linked to the intake.

| Field | Type | Description |
|---|---|---|
| taskName | TEXT | Unique task name within the pipeline. |
| phase | TEXT | Name of the phase that created this task. |
| status | LIST | PENDING / RUNNING / DONE / ERROR |
| responseJson | JSON | Raw JSON response from Gemini. |
| errorMessage | TEXT | Set when status = ERROR. |
| promptTokens | INTEGER | Prompt token count from Gemini usageMetadata. |
| responseTokens | INTEGER | Response token count. |
| totalTokens | INTEGER | Total token count. |

Relation: commons.aiTask → commons.intake (required, cascade delete).
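
Because every call stores its status and token usage, the tasks of one intake can be rolled up for monitoring or cost tracking. The sketch below assumes the aiTask objects have already been loaded as plain JavaScript objects carrying the fields from the table; how they are fetched is not covered here, and `summarizeAiTasks` is a hypothetical helper.

```javascript
// Hypothetical helper: `tasks` is assumed to be an array of plain objects
// carrying the commons.aiTask fields from the table above.
function summarizeAiTasks(tasks) {
  const summary = {
    byStatus: { PENDING: 0, RUNNING: 0, DONE: 0, ERROR: 0 },
    promptTokens: 0,
    responseTokens: 0,
    totalTokens: 0,
    errors: [],
  };
  for (const task of tasks) {
    summary.byStatus[task.status] = (summary.byStatus[task.status] || 0) + 1;
    // Missing token counts are treated as 0.
    summary.promptTokens += task.promptTokens || 0;
    summary.responseTokens += task.responseTokens || 0;
    summary.totalTokens += task.totalTokens || 0;
    if (task.status === "ERROR") {
      summary.errors.push({ taskName: task.taskName, errorMessage: task.errorMessage });
    }
  }
  return summary;
}
```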

Prompt templates

promptFile references a Handlebars template. The context available in the template:

  • All fields of the intake object as flat key-value pairs (e.g. {{markdownContent}}).
  • inputTasks — a map of taskName → responseJson string for previously completed tasks in the same pipeline run.

    ## Document
    {{markdownContent}}

    ## Known shareholders
    {{inputTasks.extractShareholders}}

    Extract the share allocation for each shareholder listed above.
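
To make the template context concrete, the sketch below assembles an object with the shape described above: the intake fields flattened at the top level, plus an inputTasks map of task name to responseJson string. `buildPromptContext` is a hypothetical helper and not the platform's implementation; in particular, skipping tasks whose status is not DONE is an assumption about what "previously completed" means.

```javascript
// Hypothetical helper: builds the object a prompt template would be rendered with.
// `intake` is a plain object of field values; `tasks` are aiTask-like objects.
function buildPromptContext(intake, tasks) {
  const inputTasks = {};
  for (const task of tasks) {
    // Assumption: only tasks that finished successfully are exposed.
    if (task.status !== "DONE") continue;
    // Templates receive the response as a JSON string, not as an object.
    inputTasks[task.taskName] =
      typeof task.responseJson === "string"
        ? task.responseJson
        : JSON.stringify(task.responseJson);
  }
  // Intake fields sit at the top level; an intake field that happens to be
  // named inputTasks would be shadowed by the map.
  return { ...intake, inputTasks };
}
```

In this model, {{markdownContent}} resolves against the top-level intake fields and {{inputTasks.extractShareholders}} resolves against the nested map.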

Aggregation script

When a script-phase follows a task-phase, the completed task results are available in the script's params argument as parsed JavaScript objects — no JSON.parse required.

    function execute(api, dataObject, params) {
      const shareholders = params.extractShareholders; // direct JS object
      shareholders.items.forEach(s => {
        // write back to dataObject fields
      });
    }

The key in params matches the name of the task as declared in the pipeline definition.
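
An aggregation script can be exercised outside the platform by building params the same way: one entry per task, keyed by task name, with the response already parsed. The sketch below is illustrative only. `buildScriptParams` is a hypothetical stand-in for what the platform does, the fields shareholderCount and shareholderNames and the name property on each item are invented for the example, and writing to dataObject by plain property assignment is an assumption.

```javascript
// Hypothetical stand-in for the platform: turns completed aiTask-like objects
// into the params object, keyed by task name, with parsed responses.
function buildScriptParams(tasks) {
  const params = {};
  for (const task of tasks) {
    params[task.taskName] =
      typeof task.responseJson === "string"
        ? JSON.parse(task.responseJson)
        : task.responseJson;
  }
  return params;
}

// A defensive variant of the aggregation script: tolerates a missing task
// result or a result without an items array.
function execute(api, dataObject, params) {
  const shareholders = params.extractShareholders;
  const items = shareholders && Array.isArray(shareholders.items) ? shareholders.items : [];
  // Hypothetical target fields on the data object.
  dataObject.shareholderCount = items.length;
  dataObject.shareholderNames = items.map((s) => s.name).join(", ");
}
```

Guarding against a missing key keeps the script from throwing when a task was renamed in the pipeline definition but not in the script.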

Schema files

schemaFile references a JSON file containing the Gemini responseSchema. Define the expected response structure here.

Warning: Use "type": "string" for monetary amounts and other decimal values. Using "type": "number" activates Gemini's constraint solver and causes timeouts of 90 seconds or more.

    {
      "type": "object",
      "properties": {
        "shareCapital": { "type": "string" },
        "currency": { "type": "string" }
      }
    }
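
Since amounts arrive as strings, the aggregation script has to convert them before doing arithmetic. The following is one possible approach, not something the platform prescribes: `parseAmountToMinorUnits` is a hypothetical helper that assumes the model returns a plain decimal string with "." as the separator and no grouping characters, and it converts to integer minor units (for example cents) as a BigInt to avoid floating-point rounding.

```javascript
// Hypothetical helper: converts a decimal string such as "1250000.50" into
// integer minor units (125000050n for two decimals). Returns null when the
// text is not a plain decimal or has more fraction digits than `decimals`.
function parseAmountToMinorUnits(text, decimals = 2) {
  const match = /^\s*(-?)(\d+)(?:\.(\d+))?\s*$/.exec(String(text));
  if (!match) return null;
  const [, sign, whole, fraction = ""] = match;
  if (fraction.length > decimals) return null;
  // Pad the fraction so "12.5" and "12.50" yield the same number of units.
  const units = BigInt(whole + fraction.padEnd(decimals, "0"));
  return sign === "-" ? -units : units;
}
```

Returning null instead of throwing lets the script decide whether an unparseable amount should fail the phase or simply be left blank.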