AI Pipeline
A DataClass of type BASIC can declare a "pipeline" block to orchestrate multi-phase background processing. The platform executes phases sequentially, supporting both JavaScript scripts and parallel AI extraction tasks.
Declaration
"pipeline": {
"phaseField": "pipelinePhase",
"phases": [
{ "name": "init", "script": "convertToMarkdown" },
{ "name": "extraction", "tasks": [ { "name": "extractInfo", "promptFile": "prompts/extractInfo", "schemaFile": "schemas/extractInfo" } ] },
{ "name": "aggregate", "script": "aggregateDeed" }
]
}
| Property | Type | Description |
|---|---|---|
| phaseField | string | The field on the class that holds the current phase name. Must be a TEXT value type. |
| phases | Phase[] | Ordered list of phases to execute. |
Each phase has a name and either a script (script-phase) or tasks (task-phase):
| Property | Type | Description |
|---|---|---|
| name | string | Phase name. The value stored in phaseField must match this name to trigger the phase. |
| script | string | Script name to invoke (script-phase). References a gateway or JavaScript script defined on the class. |
| tasks | Task[] | Parallel AI extraction tasks to run (task-phase). |
Each task in a task-phase:
| Property | Type | Description |
|---|---|---|
| name | string | Unique task name within the pipeline. Used as the key in params in the aggregation script. |
| promptFile | string | Path to a Handlebars prompt template file. |
| schemaFile | string | Path to a Gemini responseSchema JSON file. |
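
The rules in the tables above can be checked mechanically before a class is deployed. The following is a minimal sketch, assuming a hypothetical helper named validatePipeline that is not part of the platform. It only covers what the tables state: a string phaseField, exactly one of script or tasks per phase, and task names unique across the pipeline. Treating an empty phases list as an error is an added assumption.

```javascript
// Hypothetical helper, not a platform API: returns a list of problems found
// in a "pipeline" block, or an empty array when the block looks consistent.
function validatePipeline(pipeline) {
  const errors = [];
  if (typeof pipeline.phaseField !== "string" || pipeline.phaseField === "") {
    errors.push("phaseField must be a non-empty string");
  }
  // Assumption: a pipeline with no phases is treated as invalid.
  if (!Array.isArray(pipeline.phases) || pipeline.phases.length === 0) {
    errors.push("phases must be a non-empty array");
    return errors;
  }
  const taskNames = new Set();
  pipeline.phases.forEach((phase, index) => {
    const label = phase.name || `#${index}`;
    if (typeof phase.name !== "string" || phase.name === "") {
      errors.push(`phase ${label}: name is required`);
    }
    const hasScript = typeof phase.script === "string";
    const hasTasks = Array.isArray(phase.tasks);
    // A phase is either a script-phase or a task-phase, never both or neither.
    if (hasScript === hasTasks) {
      errors.push(`phase ${label}: declare exactly one of script or tasks`);
    }
    for (const task of hasTasks ? phase.tasks : []) {
      // Task names must be unique across the whole pipeline.
      if (taskNames.has(task.name)) {
        errors.push(`task ${task.name}: duplicate task name`);
      }
      taskNames.add(task.name);
    }
  });
  return errors;
}
```

Run against the declaration shown earlier, the helper returns an empty array. A phase that declares both script and tasks yields one error.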
Execution flow
1. DataParsingService: on create (or PATCH with an empty phaseField), automatically sets phaseField to the name of the first phase.
2. DataService: queues a RUN_PIPELINE background task whenever phaseField changes.
3. PipelineService.runPhase(): reads the current value of phaseField and executes the matching phase:
   - Script-phase: invokes the named script; on success, advances phaseField to the next phase name.
   - Task-phase: creates parallel commons.aiTask objects (max 2 concurrent Gemini calls), stores responseJson and token usage. On success, advances to the next phase. On error, sets status = ERROR on the intake object.
   - Final phase: leaves phaseField set to the last phase name and sets status = READY on the intake object.
Never set phaseField to null after the pipeline completes. DataParsingService treats an empty phaseField as a fresh start and will re-initialise the pipeline, causing an infinite loop.
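
The state transitions described in this section can be modelled in a few lines. This is a simplified sketch, not the platform's implementation: the function names initialisePhase and advancePhase, and the plain-object intake, are assumptions made for illustration. It shows why the final phase name must stay in phaseField: an empty value is indistinguishable from a freshly created object.

```javascript
// Simplified model only; the real DataParsingService and PipelineService are
// not shown in this document, so names and shapes here are assumptions.

// An empty phaseField is treated as a fresh start and set to the first phase.
function initialisePhase(pipeline, intake) {
  const current = intake[pipeline.phaseField];
  if (current === null || current === undefined || current === "") {
    return { ...intake, [pipeline.phaseField]: pipeline.phases[0].name };
  }
  return intake;
}

// Called after the current phase succeeded: move on, or mark the intake READY.
function advancePhase(pipeline, intake) {
  const names = pipeline.phases.map((phase) => phase.name);
  const current = intake[pipeline.phaseField];
  const index = names.indexOf(current);
  if (index === -1) {
    throw new Error(`Unknown phase: ${current}`);
  }
  if (index === names.length - 1) {
    // Final phase: keep phaseField on the last phase name, never null.
    return { ...intake, status: "READY" };
  }
  return { ...intake, [pipeline.phaseField]: names[index + 1] };
}
```

In this model, clearing phaseField on a READY object and passing it back through initialisePhase yields the first phase name again, which is the restart loop the warning above describes.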
commons.aiTask data model
Each parallel Gemini call produces a commons.aiTask object linked to the intake.
| Field | Type | Description |
|---|---|---|
| taskName | TEXT | Unique task name within the pipeline. |
| phase | TEXT | Name of the phase that created this task. |
| status | LIST | PENDING / RUNNING / DONE / ERROR |
| responseJson | JSON | Raw JSON response from Gemini. |
| errorMessage | TEXT | Set when status = ERROR. |
| promptTokens | INTEGER | Prompt token count from Gemini usageMetadata. |
| responseTokens | INTEGER | Response token count. |
| totalTokens | INTEGER | Total token count. |
Relation: commons.aiTask → commons.intake (required, cascade delete).
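
To make the status lifecycle and the two-call concurrency cap concrete, here is a rough sketch of a task-phase runner. It is not the platform's code: runTasks is a hypothetical name, and callModel is a hypothetical stand-in for the Gemini call, assumed to resolve to `{ json, usage: { promptTokens, responseTokens, totalTokens } }`. The records mirror the fields in the table above.

```javascript
// Sketch only: runTasks and callModel are hypothetical, not platform APIs.
async function runTasks(phaseName, tasks, callModel, maxConcurrent = 2) {
  const records = tasks.map((task) => ({
    taskName: task.name,
    phase: phaseName,
    status: "PENDING",
    responseJson: null,
    errorMessage: null,
    promptTokens: 0,
    responseTokens: 0,
    totalTokens: 0,
  }));
  let next = 0;
  // Each worker pulls the next pending task until none are left, so at most
  // maxConcurrent calls are in flight at any time.
  async function worker() {
    while (next < tasks.length) {
      const i = next++;
      const record = records[i];
      record.status = "RUNNING";
      try {
        const result = await callModel(tasks[i]);
        record.responseJson = result.json;
        Object.assign(record, result.usage);
        record.status = "DONE";
      } catch (err) {
        record.status = "ERROR";
        record.errorMessage = String(err && err.message ? err.message : err);
      }
    }
  }
  const workerCount = Math.min(maxConcurrent, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return records;
}
```

A caller would treat the phase as failed if any returned record has status ERROR, matching the error branch in the execution flow.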
Prompt templates
promptFile references a Handlebars template. The context available in the template:
- All fields of the intake object as flat key-value pairs (e.g. {{markdownContent}}).
- inputTasks: a map of taskName → responseJson string for previously completed tasks in the same pipeline run.
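
The shape of that context can be illustrated with a small sketch. buildPromptContext is a hypothetical helper, and the assumption that only DONE tasks are exposed is an interpretation of "previously completed"; the platform assembles the real context internally.

```javascript
// Hypothetical helper: assembles the template context described above.
// completedTasks is assumed to be an array of commons.aiTask-like records.
function buildPromptContext(intake, completedTasks) {
  const inputTasks = {};
  for (const task of completedTasks) {
    if (task.status !== "DONE") continue; // assumption: only finished tasks
    // Values are JSON strings, so a template prints them as-is.
    inputTasks[task.taskName] =
      typeof task.responseJson === "string"
        ? task.responseJson
        : JSON.stringify(task.responseJson);
  }
  // Intake fields stay flat at the top level, next to inputTasks.
  return { ...intake, inputTasks };
}
```

A template would then reference an earlier result as `{{inputTasks.extractInfo}}`. Standard Handlebars HTML-escapes double-brace output, so if the platform keeps that default (not stated here), the triple-brace form `{{{inputTasks.extractInfo}}}` is the one that leaves quotes in the JSON string intact.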
Aggregation script
When a script-phase follows a task-phase, the completed task results are available in the script's params argument as parsed JavaScript objects — no JSON.parse required.
function execute(api, dataObject, params) {
  const shareholders = params.extractShareholders; // direct JS object
  shareholders.items.forEach(s => {
    // write back to dataObject fields
  });
}
The key in params matches the name of the task as declared in the pipeline definition.
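
A slightly more defensive variant is sketched below for a pipeline that declares a task named extractShareholders. The items[].name shape, the shareholderNames field, and writing to dataObject by direct assignment are all hypothetical, chosen only to make the example complete. How the platform reacts to an exception thrown from a script is not covered in this section.

```javascript
// Sketch of an aggregation script with a guard for a missing task result.
// Hypothetical: the items[].name shape and the shareholderNames field.
function execute(api, dataObject, params) {
  const shareholders = params.extractShareholders; // already a JS object
  if (!shareholders || !Array.isArray(shareholders.items)) {
    throw new Error("extractShareholders result is missing or has no items array");
  }
  // Assumption: fields are written by plain assignment on dataObject.
  dataObject.shareholderNames = shareholders.items.map((s) => s.name);
}
```

The guard fails fast with a readable message when the task name in the script and the name in the pipeline definition drift apart, instead of surfacing as a TypeError on `shareholders.items`.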
Schema files
schemaFile references a JSON file containing the Gemini responseSchema. Define the expected response structure here.
Use "type": "string" for monetary amounts and other decimal values. Using "type": "number" activates Gemini's constraint solver and causes timeouts of 90 seconds or more.
{
  "type": "object",
  "properties": {
    "shareCapital": { "type": "string" },
    "currency": { "type": "string" }
  }
}
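
Because amounts arrive as strings, the aggregation script has to convert them before doing arithmetic. The following is one possible sketch, assuming the model returns plain decimal strings such as "25000.50" with at most two fraction digits; toMinorUnits is a hypothetical helper, and currencies with a different number of minor units would need a different scale.

```javascript
// Hypothetical helper: converts a decimal string such as "25000.50" into
// integer minor units (cents) without parsing it as a float.
// Returns null when the input is not a plain decimal string.
function toMinorUnits(amount) {
  const match = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(String(amount).trim());
  if (!match) return null;
  const [, sign, whole, fraction = ""] = match;
  // Pad "5" to "50" so that "3.5" means 3 units and 50 cents.
  const cents = Number(whole) * 100 + Number(fraction.padEnd(2, "0"));
  return sign === "-" ? -cents : cents;
}
```

Returning null for anything unexpected (thousands separators, currency symbols, empty strings) lets the script decide whether to flag the value for review rather than silently storing a wrong number.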