Tasks data model
SCHEMA api's data model can be described with the following ER diagram, in crow's foot notation:
In essence, the ER diagram can be reduced to the following assertions:
- Task is the first-class citizen of the Tasks API
- A Task can have zero or more Executors and, conversely, each Executor is related to exactly one Task
- An Executor can use zero or more Environment variables, while each Environment variable is bound to exactly one Executor
- An Executor can be related to at most one ExecutorOutputLog and each ExecutorOutputLog is related to exactly one Executor
- A Task can have zero or more MountPoints. Each MountPoint on the other hand is related to exactly one Task
- A Task can utilize zero or more Volumes while each Volume definition is connected to exactly one Task
- A Task can be tagged with zero or more Tags. Every Tag is related to exactly one Task
- A Task can optionally request one ResourceSet and each ResourceSet is claimed by exactly one Task
Based on the above relationship statements, a Task can exist in the database without a corresponding Executor. This means that a submitted task may exist that does not actually run a containerized process. This happens because an "at-least-one" relationship cannot be enforced in the database layer without relying on database triggers.
Nevertheless, this is remedied at the API level: incoming task requests are considered invalid if they do not define at least one executor.
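The API-level check described above could look roughly like the following. This is a minimal sketch, not the project's actual validation code: the function name `validate_task_request`, the exception `InvalidTaskRequest` and the payload key `executors` are assumptions made for illustration.

```python
from typing import Any


class InvalidTaskRequest(ValueError):
    """Raised when an incoming task request fails validation (hypothetical name)."""


def validate_task_request(payload: dict[str, Any]) -> None:
    """Reject task requests that do not define at least one executor.

    The "executors" key is an assumed field name for the list of executors.
    """
    executors = payload.get("executors")
    # A missing key, a non-list value and an empty list all count as "no executors".
    if not isinstance(executors, list) or len(executors) == 0:
        raise InvalidTaskRequest("A task must define at least one executor.")


# A request with one executor passes silently.
validate_task_request({"name": "demo", "executors": [{"image": "alpine", "command": ["true"]}]})
```

Performing the check before anything is written keeps executor-less tasks out of the database even though the schema itself would accept them.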
Relational model
This page provides details about the database tables created for the Tasks API. Each database column is described by whether it is used as primary key (PK), its name and database type, whether its values are constrained to be unique, whether it accepts null values, its default value, and a brief description of its use. Column-specific database constraints are listed with their corresponding columns, while multi-column database constraints are listed at the end of each relation's section.
Task
| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | ✅ | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | uuid | UUID | ✅ | | A random UUID value | | A UUID to be used by the users to refer to the task |
| | api_task_id | VARCHAR(255) | | | '' | | An ID assigned by the underlying task submission API (TESK in this case) |
| | name | VARCHAR(255) | | | '' | a) Not empty or whitespace | Name of the task |
| | description | TEXT | | | '' | | A description of the task |
| | pending | BOOLEAN | | | True | | Whether the task execution is pending or it has completed |
| | status | VARCHAR(30) | | | SUBMITTED status is assigned to a new task by default | a) Value must always be one of Schema API's status descriptors | A string literal describing the current status of a task |
| | submitted_at | TIMESTAMP WITH TIMEZONE | | | By default, Django will always pass to this field the time of the creation of the record | | The time when the request was accepted and stored in the database |
| | latest_update | TIMESTAMP WITH TIMEZONE | | ✅ | null | a) Must always correspond to a time later than the time at which the task was submitted (submitted_at) | The time when the corresponding data and status of the task request were last updated |
Each row of the table Task is also subject to these multi-column constraints:
- If the `status` is `SUBMITTED` then the `api_task_id` can be blank. Otherwise, `api_task_id` must not be blank or whitespace.
- `pending` must be `false` when the status is `CANCELED`, `COMPLETED` or `ERROR`. Otherwise, it must be `true`.
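The two multi-column rules above map naturally onto table-level CHECK constraints. The sketch below uses an in-memory SQLite database purely for illustration; the table layout is reduced to the three columns involved, and the actual project may declare these constraints differently (for example through the Django ORM). Note that SQLite's `TRIM` strips spaces only, so this is an approximation of "blank or whitespace".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        api_task_id TEXT NOT NULL DEFAULT '',
        status TEXT NOT NULL DEFAULT 'SUBMITTED',
        pending INTEGER NOT NULL DEFAULT 1,
        -- api_task_id may be blank only while the task is still SUBMITTED
        CHECK (status = 'SUBMITTED' OR TRIM(api_task_id) <> ''),
        -- pending is false exactly when the status is one of the final ones
        CHECK (pending = (status NOT IN ('CANCELED', 'COMPLETED', 'ERROR')))
    )
    """
)


def try_insert(api_task_id: str, status: str, pending: bool) -> bool:
    """Return True if the row satisfies both CHECK constraints."""
    try:
        conn.execute(
            "INSERT INTO task (api_task_id, status, pending) VALUES (?, ?, ?)",
            (api_task_id, status, pending),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("", "SUBMITTED", True))       # blank ID is fine for a new task
print(try_insert("", "COMPLETED", False))      # blank ID on a completed task
print(try_insert("abc", "COMPLETED", True))    # completed task still marked pending
print(try_insert("abc", "COMPLETED", False))   # consistent completed task
```

Only the first and last rows are accepted; the other two each violate one of the constraints.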
Executor
| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | ✅ | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | order | SMALLINT | | | null | | A number that indicates the precedence of this executor among the rest of the executors of the corresponding task |
| | command | JSON | | | null | | A JSON array that corresponds to the command array that can be run in a container |
| | image | VARCHAR(255) | | | '' | a) Not empty or whitespace | Docker image definition |
| | stderr | VARCHAR(255) | | | '' | | Path to a file in the container to redirect stderr |
| | stdin | VARCHAR(255) | | | '' | | Value to pass into container's stdin |
| | stdout | VARCHAR(255) | | | '' | | Path to a file in the container to redirect stdout |
| | workdir | VARCHAR(255) | | | '' | | Path to the working directory in the container |
| | task_id | BIGINT | | | null | Foreign key to Task.id | Corresponding task to which this executor belongs |
Each row of the table Executor is also subject to the multi-column constraint described below:
- `task_id` and `order` are unique together
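To illustrate what "unique together" means in practice, here is a small sketch using an in-memory SQLite database. The reduced table definition is an assumption for demonstration purposes only; `order` is quoted because it is a reserved SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE executor (
        id INTEGER PRIMARY KEY,
        task_id INTEGER NOT NULL,
        "order" INTEGER NOT NULL,
        image TEXT NOT NULL,
        -- the same order value may repeat across tasks, but not within one task
        UNIQUE (task_id, "order")
    )
    """
)

insert_sql = 'INSERT INTO executor (task_id, "order", image) VALUES (?, ?, ?)'

# Two executors for task 1 and one for task 2: all combinations are distinct.
conn.executemany(insert_sql, [(1, 0, "alpine"), (1, 1, "busybox"), (2, 0, "alpine")])

# A second executor with order 1 for task 1 violates the composite constraint.
try:
    conn.execute(insert_sql, (1, 1, "ubuntu"))
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

executor_count = conn.execute("SELECT COUNT(*) FROM executor").fetchone()[0]
```

The constraint guarantees that the execution order of a task's executors is unambiguous: no two executors of the same task can claim the same position.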
Env
| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | ✅ | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | key | VARCHAR(255) | | | '' | a) Not empty or whitespace | Environment variable name |
| | value | TEXT | | | '' | | Environment variable value |
| | executor_id | BIGINT | | | null | Foreign key to Executor.id | Corresponding executor in whose container this variable should exist |
ExecutorOutputLog
| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | ✅ | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | stdout | TEXT | | | '' | | Stdout recorded from a specific executor execution |
| | stderr | TEXT | | | '' | | Stderr recorded from a specific executor execution |
| | executor_id | BIGINT | ✅ | | null | Foreign key to Executor.id | Corresponding Executor that produced this output log |
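A unique foreign key is the usual way to express an "at most one" relationship such as the one between Executor and ExecutorOutputLog. The following sketch demonstrates the idea with an in-memory SQLite database; the table names `executor` and `executor_output_log` and the reduced column set are assumptions for illustration, not the project's actual DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.execute("CREATE TABLE executor (id INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE executor_output_log (
        id INTEGER PRIMARY KEY,
        stdout TEXT NOT NULL DEFAULT '',
        stderr TEXT NOT NULL DEFAULT '',
        -- UNIQUE turns the one-to-many foreign key into a one-to-one link
        executor_id INTEGER NOT NULL UNIQUE REFERENCES executor (id)
    )
    """
)
conn.execute("INSERT INTO executor (id) VALUES (1)")
conn.execute("INSERT INTO executor (id) VALUES (2)")
conn.execute("INSERT INTO executor_output_log (stdout, executor_id) VALUES ('done', 1)")


def can_insert_log(executor_id: int) -> bool:
    """Return True if a new output log row for executor_id is accepted."""
    try:
        conn.execute(
            "INSERT INTO executor_output_log (executor_id) VALUES (?)", (executor_id,)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this setup, `can_insert_log(1)` is rejected because executor 1 already has a log, `can_insert_log(99)` is rejected because no such executor exists, and `can_insert_log(2)` succeeds.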
MountPoint
| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | ✅ | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | name | VARCHAR(255) | | | '' | | Optional name of the mount point |
| | description | VARCHAR(255) | | | '' | | Optional description of the mount point |
| | url | VARCHAR(255) | | | '' | | URL of the file or directory on the user's public file system home |
| | path | VARCHAR(255) | | | '' | a) Not empty or whitespace | Corresponding path inside the container |
| | type | VARCHAR(255) | | | '' | a) Value must always be either one of Schema API's filesystem entity descriptors | Indicates whether a file or a directory is being mounted |
| | is_input | BOOLEAN | | | True | | Indicates whether it is an input mount point or an output mount point |
| | content | TEXT | | | '' | | Value to set to an input mount point file |
| | task_id | BIGINT | | | null | Foreign key to Task.id | Corresponding task that uses this mount point |
Each row of the table MountPoint is also subject to the following multi-column constraint:
- `url` can be blank or whitespace only if the record is an input (`is_input=True`) file (`type=FILE`) mount point.
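The MountPoint rule above can be read as a simple predicate over three columns. The sketch below is one possible way to express it; the function name `url_is_acceptable` is hypothetical, and `"DIRECTORY"` is an assumed name for the non-file descriptor, since only `FILE` is spelled out above.

```python
def url_is_acceptable(url: str, is_input: bool, type_: str) -> bool:
    """Return True if url satisfies the MountPoint multi-column constraint."""
    if url.strip():
        # A non-blank URL is acceptable for every kind of mount point.
        return True
    # A blank URL is tolerated only for input file mount points.
    return is_input and type_ == "FILE"


# Input file with blank URL: allowed.
print(url_is_acceptable("", True, "FILE"))
# Output file with blank URL: not allowed.
print(url_is_acceptable("", False, "FILE"))
# Input directory ("DIRECTORY" is an assumed descriptor) with blank URL: not allowed.
print(url_is_acceptable("   ", True, "DIRECTORY"))
```

A plausible reading of the exception is that an input file can take its contents from the `content` column instead of a URL, though the text above does not state this explicitly.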
Volume
| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | ✅ | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | path | VARCHAR(255) | | | '' | a) Not empty or whitespace | Path in the container to mount the volume |
| | task_id | BIGINT | | | null | Foreign key to Task.id | Corresponding task that owns this volume |
Tag
| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | ✅ | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | key | VARCHAR(255) | | | '' | a) Not empty or whitespace | Tag name |
| | value | VARCHAR(255) | | | '' | | Tag value |
| | task_id | BIGINT | | | null | Foreign key to Task.id | Corresponding Task tagged with this tag |
ResourceSet
| PK | Name | Type | Unique? | Can be NULL? | Default | Additional constraints | Description |
|---|---|---|---|---|---|---|---|
| 🔑 | id | BIGINT | ✅ | | Next value in corresponding sequence | | Primary key column defined by Django ORM |
| | cpu_cores | INTEGER | | ✅ | null | a) CPU cores claim is greater than 0 | CPU cores claim |
| | ram_gb | FLOAT | | ✅ | null | a) RAM in GBs is greater than 0 | RAM claim in GBs |
| | disk_gb | FLOAT | | ✅ | null | a) Disk in GBs is greater than 0 | Disk claim in GBs |
| | preemptible | BOOLEAN | | ✅ | null | | Whether the task is allowed to run on preemptible compute instances. Exclusively used by TESK. Included for compatibility purposes |
| | zones | JSON | | ✅ | null | | Compute zones in which to run the task. Exclusively used by TESK. Included for compatibility purposes |
| | task_id | BIGINT | ✅ | | null | Foreign key to Task.id | Corresponding task for this resource set |
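The "greater than 0" constraints on ResourceSet interact with nullable columns in a way that is easy to overlook: in SQL, a CHECK whose expression evaluates to NULL is not treated as a violation. The sketch below shows this with an in-memory SQLite database. It assumes that the resource claim columns are nullable and that `task_id` is unique, and it reduces the table to the columns needed for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE resource_set (
        id INTEGER PRIMARY KEY,
        -- a NULL claim passes the CHECK; only actual values must be positive
        cpu_cores INTEGER CHECK (cpu_cores > 0),
        ram_gb REAL CHECK (ram_gb > 0),
        disk_gb REAL CHECK (disk_gb > 0),
        -- UNIQUE: a task can claim at most one resource set
        task_id INTEGER NOT NULL UNIQUE
    )
    """
)


def accepts(cpu_cores, ram_gb, disk_gb, task_id) -> bool:
    """Return True if the resource set row is accepted by the constraints."""
    try:
        conn.execute(
            "INSERT INTO resource_set (cpu_cores, ram_gb, disk_gb, task_id) "
            "VALUES (?, ?, ?, ?)",
            (cpu_cores, ram_gb, disk_gb, task_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Under these assumptions, `accepts(None, None, None, 1)` succeeds because unspecified claims are left as NULL, whereas `accepts(0, 1.5, 10.0, 2)` is rejected because a claim of zero CPU cores is not positive.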