Poiesis Architecture
Database
Poiesis uses MongoDB to store task data. Instead of storing the task document as-is, it adds extra fields and some redundancy to make the data easier to query and analyze.
{
  "id": "123",
  "task_name": "test",
  "task_status": "RUNNING",
  "data": {
    "task_id": "123",
    "task_name": "test",
    "task_status": "RUNNING"
  }
}
INFO
Fields such as task_name and task_status are redundant, but they are useful because they are used to filter tasks in the database.
Poiesis also stores extra information per task, such as user_id, which is the unique ID from the OIDC provider, and service_hash, which is the hash of the service document at the time the task is created. To learn more about the fields, please refer to the TaskSchema.
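The redundancy described above can be illustrated with a minimal Python sketch. It is not the Poiesis implementation: the function names are hypothetical, and placing user_id and service_hash at the top level of the document is an assumption. The TaskSchema remains the authoritative reference.

```python
from typing import Any


def to_stored_document(
    task: dict[str, Any], user_id: str, service_hash: str
) -> dict[str, Any]:
    """Wrap a task in a document with redundant top-level fields (hypothetical helper)."""
    return {
        "id": task["task_id"],
        # Redundant copies of fields that also live under "data"; they exist
        # only so that filters do not have to reach into the nested document.
        "task_name": task["task_name"],
        "task_status": task["task_status"],
        # Assumption: the extra per-task fields sit at the top level.
        "user_id": user_id,
        "service_hash": service_hash,
        "data": dict(task),
    }


def running_tasks_filter(user_id: str) -> dict[str, Any]:
    """Build a MongoDB-style equality filter on the redundant fields."""
    return {"user_id": user_id, "task_status": "RUNNING"}


def matches(document: dict[str, Any], query: dict[str, Any]) -> bool:
    """Tiny in-memory stand-in for an equality-only filter match."""
    return all(document.get(key) == value for key, value in query.items())
```

A filter built this way only touches top-level keys, which is the benefit the redundant fields are meant to provide.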
Task creation
INFO
The above diagram shows the flow of task creation in Poiesis. It is not exact, but it gives a high-level overview of the process.
Initialization:
- The User submits a task request to the API.
- The API generates a unique ID (UUID) for the task and creates a corresponding record in MongoDB (TaskDB). This database entry is the central source of information for the task state.
- Once the task is persisted in the database, the API triggers the creation of the main Torc Job in Kubernetes.
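The ordering in the initialization steps (persist first, then launch) can be sketched as follows. This is an illustration under stated assumptions, not the actual API code: InMemoryTaskDB stands in for the MongoDB collection, launch_torc is a hypothetical callback standing in for the Kubernetes Job creation, and the initial status value "QUEUED" is assumed.

```python
import uuid
from typing import Any, Callable


class InMemoryTaskDB:
    """Stand-in for the MongoDB collection (TaskDB); not the real driver."""

    def __init__(self) -> None:
        self.records: dict[str, dict[str, Any]] = {}

    def insert(self, record: dict[str, Any]) -> None:
        self.records[record["id"]] = record


def create_task(
    request: dict[str, Any],
    db: InMemoryTaskDB,
    launch_torc: Callable[[str], None],
) -> str:
    """Generate an ID, persist the task, and only then trigger the main job."""
    task_id = str(uuid.uuid4())
    # Persist before launching, so the record exists by the time the job
    # starts reporting state. "QUEUED" is an assumed initial status.
    db.insert({"id": task_id, "task_status": "QUEUED", "data": dict(request)})
    launch_torc(task_id)  # hypothetical hook for creating the Torc Job
    return task_id
```

Persisting before launching means a component that starts early never looks up a task that is not yet in the database.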
Data Preparation:
- Torc first requests the creation of a Persistent Volume Claim (PVC) as specified by the user.
- Torc then launches a sub-job/pod called TIF (Task Input Fetcher).
- TIF downloads the necessary input data and mounts/places it onto the PVC.
- Upon completion, TIF sends a message to a task-specific Redis Channel indicating that the input data is ready. Torc listens to this channel and proceeds once notified.
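The notify-and-wait handshake between TIF and Torc can be sketched in plain Python. To keep the example self-contained, a queue.Queue stands in for the task-specific Redis channel and a thread stands in for the TIF pod. The message shape and the function names are hypothetical.

```python
import queue
import threading
from typing import Any


def run_tif(channel: "queue.Queue[dict[str, Any]]", task_id: str) -> None:
    """Stand-in for TIF: fetch inputs (omitted), then announce completion."""
    # Hypothetical message shape; the real payload is not specified here.
    channel.put({"task_id": task_id, "stage": "TIF", "status": "done"})


def wait_for_stage(
    channel: "queue.Queue[dict[str, Any]]", stage: str, timeout: float = 5.0
) -> dict[str, Any]:
    """Block until a message for the given stage arrives.

    The timeout applies to each individual wait; queue.Empty is raised
    if no message arrives within it.
    """
    while True:
        message = channel.get(timeout=timeout)
        if message.get("stage") == stage:
            return message


def prepare_data(task_id: str) -> dict[str, Any]:
    """Stand-in for Torc: start TIF, then proceed only once notified."""
    channel: "queue.Queue[dict[str, Any]]" = queue.Queue()
    worker = threading.Thread(target=run_tif, args=(channel, task_id))
    worker.start()
    message = wait_for_stage(channel, "TIF")
    worker.join()
    return message
```

One difference from the stand-in is worth noting: a queue buffers messages, whereas Redis Pub/Sub delivers only to clients that are subscribed at the moment of publishing, so a real listener has to subscribe before the publisher can finish.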
Execution:
- Torc launches TExAM (Task Executor And Monitor).
- TExAM is responsible for creating and launching the actual Task Executor pods (TE).
- TExAM ensures the data from the PVC (both input and space for output) is correctly mounted into the Task Executor pods.
- TExAM monitors the lifecycle of all Task Executor pods.
- The Task Executor pods perform the core work, reading input from and writing output to the PVC.
- Once all Task Executors have finished, TExAM signals completion via the Redis Channel. Torc receives this notification.
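The monitoring step can be pictured as a loop that polls each executor pod until all of them reach a terminal phase. The sketch below is an assumption about how such a loop might look, not TExAM's actual code: get_phase is a hypothetical callable standing in for a Kubernetes API lookup, and polling (rather than watching events) is a simplification. The phase names "Succeeded" and "Failed" are the standard Kubernetes terminal pod phases.

```python
import time
from typing import Callable, Iterable

# Kubernetes pod phases after which a pod will not run again.
TERMINAL_PHASES = {"Succeeded", "Failed"}


def wait_for_executors(
    pod_names: Iterable[str],
    get_phase: Callable[[str], str],
    poll_interval: float = 1.0,
    max_polls: int = 100,
) -> dict[str, str]:
    """Poll every executor pod until all of them reach a terminal phase."""
    pending = list(pod_names)
    final: dict[str, str] = {}
    for _ in range(max_polls):
        still_running = []
        for name in pending:
            phase = get_phase(name)  # hypothetical lookup of the pod phase
            if phase in TERMINAL_PHASES:
                final[name] = phase
            else:
                still_running.append(name)
        pending = still_running
        if not pending:
            # All executors are done; this is the point at which completion
            # would be signalled on the channel.
            return final
        time.sleep(poll_interval)
    raise TimeoutError(f"executors still running: {pending}")
```

The returned mapping records the final phase per pod, so a caller can distinguish a fully successful run from one where an executor failed.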
Data Output:
- Torc launches the final sub-job/pod, TOF (Task Output Fetcher).
- TOF reads the resulting output data generated by the Task Executors from the PVC.
- TOF uploads this data to the final User Output Location specified in the initial request.
Status and Logging (Ongoing):
- Throughout the process, both Torc and TExAM periodically update the task's status and relevant logs (system logs, executor logs) in the central MongoDB (TaskDB).
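Because the stored document duplicates the status at the top level and under data, a status update has to write both copies to keep them consistent. The sketch below builds a MongoDB-style filter and update document using the standard $set and $push operators. The helper name and the data.system_logs field path are hypothetical, and whether Poiesis structures its updates this way is an assumption.

```python
from typing import Any, Optional


def status_update(
    task_id: str, status: str, log_line: Optional[str] = None
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Build a (filter, update) pair for a MongoDB-style update call."""
    query: dict[str, Any] = {"id": task_id}
    update: dict[str, Any] = {
        # Write both copies of the status so the redundant top-level field
        # and the nested field never disagree.
        "$set": {"task_status": status, "data.task_status": status}
    }
    if log_line is not None:
        # Hypothetical field path for appended system log lines.
        update["$push"] = {"data.system_logs": log_line}
    return query, update
```

With a driver such as PyMongo, the pair would typically be passed as the two arguments of an update call on the task collection.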