Tasks data model

SCHEMA api's data model can be described with the following ER diagram, in crow's foot notation:

[Figure: SCHEMA api ER diagram]

In essence, the ER diagram can be reduced to the following assertions:

  • Task is the first-class citizen of the Tasks API
  • A Task can have zero or more Executors; conversely, each Executor is related to exactly one Task
  • An Executor can use zero or more Environment variables, while each Environment variable is bound to exactly one Executor
  • An Executor can be related to at most one ExecutorOutputLog, and each ExecutorOutputLog is related to exactly one Executor
  • A Task can have zero or more MountPoints, while each MountPoint is related to exactly one Task
  • A Task can use zero or more Volumes, while each Volume definition is connected to exactly one Task
  • A Task can be tagged with zero or more Tags, and every Tag is related to exactly one Task
  • A Task can optionally request one ResourceSet, and each ResourceSet is claimed by exactly one Task

Note

Based on the above relationship statements, a Task can exist in the database without a corresponding Executor. This means that a submitted task can exist that does not actually run a containerized process. This happens because an "at-least-one" relationship cannot be enforced in the database layer without relying on database triggers.

This is remedied at the API level: incoming task requests are considered invalid if they do not define at least one executor.
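
The API-level check described in the note can be sketched as a small validation function. This is only an illustration under stated assumptions, not the project's actual code: the function name `validate_task_request`, the exception `InvalidTaskRequest` and the `executors` payload key are all hypothetical.

```python
class InvalidTaskRequest(ValueError):
    """Hypothetical error for task requests that the API should reject."""


def validate_task_request(payload: dict) -> dict:
    """Reject a task request that does not define at least one executor.

    The "executors" key is an assumed name for the payload field.
    """
    executors = payload.get("executors")
    if not isinstance(executors, list) or len(executors) == 0:
        raise InvalidTaskRequest("A task must define at least one executor")
    return payload
```

Rejecting the request before anything is stored keeps executor-less Task rows out of the database without requiring triggers.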

Relational model

This page provides details about the database tables created for the Tasks API. Each database column is described by whether it is used as the primary key (PK), its name and database type, whether its values are constrained to be unique, whether it can accept null values, its default value, and a brief description of its use. Column-specific database constraints are listed with their corresponding columns, while multi-column database constraints are listed at the end of each relation's section.

Task

| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | uuid | UUID | | | A random UUID value | | A UUID that users use to refer to the task |
| | api_task_id | VARCHAR(255) | | | '' | | An ID assigned by the underlying task submission API (TESK in this case) |
| | name | VARCHAR(255) | | | '' | a) Not empty or whitespace | Name of the task |
| | description | TEXT | | | '' | | A description of the task |
| | pending | BOOLEAN | | | True | | Whether the task execution is pending or has completed |
| | status | VARCHAR(30) | | | SUBMITTED status is assigned to a new task by default | a) Value must always be one of Schema API's status descriptors | A string literal describing the current status of a task |
| | submitted_at | TIMESTAMP WITH TIMEZONE | | | By default, Django always passes the creation time of the record to this field | | The time when the request was accepted and stored in the database |
| | latest_update | TIMESTAMP WITH TIMEZONE | | | null | a) Must always be later than the time at which the task was submitted (submitted_at) | The time when the data and status of the task request were last updated |

Each row of the table Task is also subject to these multi-column constraints:

  • If the status is SUBMITTED, api_task_id can be blank. Otherwise, api_task_id must not be blank or whitespace.
  • pending must be false when the status is CANCELED, COMPLETED or ERROR. Otherwise, it must be true.
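
The two multi-column rules above can be expressed as table-level CHECK constraints. The following is a minimal sketch using Python's built-in `sqlite3` module rather than the project's actual Django models. The table is reduced to the three columns involved, nullability is left out, the helper `try_insert` and the sample ID `tesk-1` are hypothetical, and SQLite's `trim` strips only spaces.

```python
import sqlite3

# Reduced, assumed version of the Task table: only the columns that take
# part in the multi-column constraints are included.
TASK_DDL = """
CREATE TABLE task (
    id INTEGER PRIMARY KEY,
    api_task_id TEXT DEFAULT '',
    pending INTEGER DEFAULT 1,
    status TEXT DEFAULT 'SUBMITTED',
    -- api_task_id may be blank only while the task is SUBMITTED
    CHECK (status = 'SUBMITTED' OR trim(api_task_id) <> ''),
    -- pending is false exactly for CANCELED, COMPLETED and ERROR
    CHECK (pending = (status NOT IN ('CANCELED', 'COMPLETED', 'ERROR')))
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(TASK_DDL)


def try_insert(conn, api_task_id, pending, status):
    """Return True if the row is accepted, False if a CHECK rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO task (api_task_id, pending, status) VALUES (?, ?, ?)",
                (api_task_id, int(pending), status),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, `try_insert(conn, "", True, "SUBMITTED")` is accepted, while a COMPLETED row that is still marked as pending, or an ERROR row with a blank api_task_id, is rejected.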

Executor

| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | order | SMALLINT | | | null | | A number that indicates the precedence of this executor among the other executors of the corresponding task |
| | command | JSON | | | null | | A JSON array that corresponds to the command array that can be run in a container |
| | image | VARCHAR(255) | | | '' | a) Not empty or whitespace | Docker image definition |
| | stderr | VARCHAR(255) | | | '' | | Path to a file in the container to redirect stderr |
| | stdin | VARCHAR(255) | | | '' | | Value to pass into container's stdin |
| | stdout | VARCHAR(255) | | | '' | | Path to a file in the container to redirect stdout |
| | workdir | VARCHAR(255) | | | '' | | Path to the working directory in the container |
| | task_id | BIGINT | | | null | Foreign key to Task.id | Task to which this executor belongs |

Each row of the table Executor is also subject to the multi-column constraint described below:

  • task_id and order are unique together
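
A "unique together" rule of this kind corresponds to a composite UNIQUE constraint. The sketch below uses Python's built-in `sqlite3` module with a reduced, assumed version of the Executor table; the helper `add_executor` and the image names are hypothetical. Note that `order` is a reserved word in SQL, so the column name has to be quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE executor (
        id INTEGER PRIMARY KEY,
        "order" INTEGER,
        image TEXT,
        task_id INTEGER,
        -- no two executors of the same task may share an order value
        UNIQUE (task_id, "order")
    )
    """
)


def add_executor(task_id, order, image):
    """Return True if the executor row is stored, False if it is rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                'INSERT INTO executor (task_id, "order", image) VALUES (?, ?, ?)',
                (task_id, order, image),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Two executors of task 1 with orders 0 and 1 are both stored, and a second executor with order 0 for task 1 is rejected, while order 0 can still be reused by an executor of task 2.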

Env

| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | key | VARCHAR(255) | | | '' | a) Not empty or whitespace | Environment variable name |
| | value | TEXT | | | '' | | Environment variable value |
| | executor_id | BIGINT | | | null | Foreign key to Executor.id | Corresponding executor in whose container this variable should exist |

ExecutorOutputLog

| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | stdout | TEXT | | | '' | | Stdout recorded from a specific executor execution |
| | stderr | TEXT | | | '' | | Stderr recorded from a specific executor execution |
| | executor_id | BIGINT | | | null | Foreign key to Executor.id | Corresponding Executor that produced this output log |

MountPoint

| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | name | VARCHAR(255) | | | '' | | Optional name of the mount point |
| | description | VARCHAR(255) | | | '' | | Optional description of the mount point |
| | url | VARCHAR(255) | | | '' | | URL of the file or directory on the user's public file system home |
| | path | VARCHAR(255) | | | '' | a) Not empty or whitespace | Corresponding path inside the container |
| | type | VARCHAR(255) | | | '' | a) Value must always be one of Schema API's filesystem entity descriptors | Indicates whether a file or a directory is being mounted |
| | is_input | BOOLEAN | | | True | | Indicates whether it is an input mount point or an output mount point |
| | content | TEXT | | | '' | | Value to set to an input mount point file |
| | task_id | BIGINT | | | null | Foreign key to Task.id | Corresponding task that uses this mount point |

Each row of the table MountPoint is also subject to the following multi-column constraint:

  • url can be blank or whitespace only if the record is an input (is_input=True) file (type=FILE) mount point.
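
The rule above can be read as a simple predicate over three columns. The function below is a hedged sketch of that logic in plain Python; the name `mount_point_url_is_valid` and the parameter `type_` are hypothetical, and only the `FILE` descriptor is taken from this page.

```python
def mount_point_url_is_valid(url: str, is_input: bool, type_: str) -> bool:
    """Check the MountPoint multi-column rule on url, is_input and type.

    A non-blank url is always acceptable. A blank or whitespace-only url
    is acceptable only for an input mount point of type FILE.
    """
    if url.strip():
        return True
    return is_input and type_ == "FILE"
```

A blank url is presumably tolerated for input files because their data can be supplied through the content column instead.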

Volume

| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | path | VARCHAR(255) | | | '' | a) Not empty or whitespace | Path in the container to mount the volume |
| | task_id | BIGINT | | | null | Foreign key to Task.id | Corresponding task that owns this volume |

Tag

| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | key | VARCHAR(255) | | | '' | a) Not empty or whitespace | Tag name |
| | value | VARCHAR(255) | | | '' | | Tag value |
| | task_id | BIGINT | | | null | Foreign key to Task.id | Corresponding Task tagged with this tag |

ResourceSet

| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | cpu_cores | INTEGER | | | null | a) CPU cores claim is greater than 0 | CPU cores claim |
| | ram_gb | FLOAT | | | null | a) RAM in GBs is greater than 0 | RAM claim in GBs |
| | disk_gb | FLOAT | | | null | a) Disk in GBs is greater than 0 | Disk claim in GBs |
| | preemptible | BOOLEAN | | | null | | Whether the task is allowed to run on preemptible compute instances. Exclusively used by TESK. Included for compatibility purposes |
| | zones | JSON | | | null | | Compute zones in which to run the task. Exclusively used by TESK. Included for compatibility purposes |
| | task_id | BIGINT | | | null | Foreign key to Task.id | Corresponding task for this resource set |
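
The "greater than 0" constraints on the resource claims can be illustrated with a short check in plain Python. This is a sketch only: the function name `resource_set_violations` is hypothetical, and treating an unset (null) claim as acceptable is an assumption based on the null defaults listed for these columns.

```python
from typing import List, Optional


def resource_set_violations(
    cpu_cores: Optional[int] = None,
    ram_gb: Optional[float] = None,
    disk_gb: Optional[float] = None,
) -> List[str]:
    """Return the names of the claims that are set but not greater than 0."""
    claims = {"cpu_cores": cpu_cores, "ram_gb": ram_gb, "disk_gb": disk_gb}
    # A claim left as None is treated as "not requested" and is skipped.
    return [
        name
        for name, value in claims.items()
        if value is not None and value <= 0
    ]
```

An empty list means that every claim that was provided satisfies its column constraint.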