Tasks data model

The SCHEMA API's data model is described by the following ER diagram, in crow's foot notation:

In essence, the ER diagram can be reduced to the following assertions:

Task is the first-class citizen of the Tasks API
A Task can have zero or more Executors and, conversely, each Executor is related to exactly one Task
An Executor can use zero or more Environment variables, while each Environment variable is bound to exactly one Executor
An Executor can be related to at most one ExecutorOutputLog and each ExecutorOutputLog is related to exactly one Executor
A Task can have zero or more MountPoints, while each MountPoint is related to exactly one Task
A Task can utilize zero or more Volumes, while each Volume definition is connected to exactly one Task
A Task can be tagged with zero or more Tags, while each Tag is related to exactly one Task
A Task can optionally request one ResourceSet and each ResourceSet is claimed by exactly one Task
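
The cardinalities above can be illustrated with a minimal sketch, assuming an in-memory SQLite database rather than the project's actual database. The table definitions are hypothetical and reduced to their key columns: a plain foreign key allows many Executors per Task, while a unique foreign key limits a Task to at most one ResourceSet.

```python
import sqlite3

# Hypothetical, stripped-down tables: only the keys needed to show cardinality.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE task (id INTEGER PRIMARY KEY);
-- Zero or more executors per task: plain (non-unique) foreign key.
CREATE TABLE executor (
    id INTEGER PRIMARY KEY,
    task_id INTEGER NOT NULL REFERENCES task(id)
);
-- At most one resource set per task: unique foreign key.
CREATE TABLE resourceset (
    id INTEGER PRIMARY KEY,
    task_id INTEGER NOT NULL UNIQUE REFERENCES task(id)
);
""")

conn.execute("INSERT INTO task (id) VALUES (1)")
conn.execute("INSERT INTO executor (task_id) VALUES (1)")
conn.execute("INSERT INTO executor (task_id) VALUES (1)")  # a second executor is allowed
conn.execute("INSERT INTO resourceset (task_id) VALUES (1)")

try:
    # A second resource set for the same task violates the unique foreign key.
    conn.execute("INSERT INTO resourceset (task_id) VALUES (1)")
    second_resource_set_rejected = False
except sqlite3.IntegrityError:
    second_resource_set_rejected = True

executor_count = conn.execute(
    "SELECT COUNT(*) FROM executor WHERE task_id = 1"
).fetchone()[0]
```

The same unique-foreign-key pattern applies to ExecutorOutputLog, whose executor_id column is marked as unique in the relational model.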

note

Based on the above relationship statements, a Task can exist in the database without a corresponding Executor. In other words, the database can hold a submitted task that does not actually run a containerized process. This is because an "at-least-one" relationship cannot be enforced in the database layer without relying on database triggers.

Nevertheless, this is remedied on the level of the API. Incoming task requests are considered invalid if they do not define at least one executor.
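
An API-level check of this kind could look like the sketch below. It is only an illustration: the function name and the `executors` payload key are assumptions, not taken from the Tasks API itself.

```python
def validate_task_request(payload: dict) -> list:
    """Return validation errors for a hypothetical task request payload."""
    errors = []
    # "executors" is an assumed key name for the list of executor definitions.
    executors = payload.get("executors")
    if not isinstance(executors, list) or len(executors) == 0:
        errors.append("A task must define at least one executor.")
    return errors
```

A request without executors yields one error message, while a request with at least one executor definition yields an empty list.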

Relational model

This page describes the database tables created for the Tasks API. For each database column, the tables below state whether the column is the primary key (PK), its name and database type, whether its values must be unique, whether it accepts null values, and a brief description of its use. Column-specific database constraints are listed with their corresponding columns, while multi-column database constraints are listed at the end of each relation's section.

Task

PK	Name	Type	Unique?	Can be NULL?	Default	Additional constraints	Description
🔑	id	BIGINT	✅		Next value in corresponding sequence		Primary key column defined by Django ORM
	uuid	UUID	✅		A random UUID value		A UUID to be used by the users to refer to the task
	api_task_id	VARCHAR(255)			`''`		An ID assigned by the underlying task submission API (TESK in this case)
	name	VARCHAR(255)			`''`	a) Not empty or whitespace	Name of the task
	description	TEXT			`''`		A description of the task
	pending	BOOLEAN			`True`		Whether the task execution is pending or it has completed
	status	VARCHAR(30)			`SUBMITTED` (assigned to every new task)	a) Value must be one of Schema API's status descriptors	A string literal describing the current status of a task
	submitted_at	TIMESTAMP WITH TIME ZONE			Time at which the record is created (set by Django)		The time when the request was accepted and stored in the database
	latest_update	TIMESTAMP WITH TIME ZONE		✅	`null`	a) Must be later than the task's submission time (`submitted_at`)	The time when the data and status of the task were last updated

Each row of the table Task is also subject to these multi-column constraints:

If the status is SUBMITTED then the api_task_id can be blank. Otherwise, api_task_id must not be blank or whitespace.
pending must be false when the status is CANCELED, COMPLETED or ERROR. Otherwise, it must be true.
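
The two rules above can be expressed as a small check. This is a sketch in plain Python, not the project's actual constraint definitions; the function name is hypothetical and only statuses named on this page are used.

```python
# Statuses for which a task is no longer pending, as listed above.
TERMINAL_STATUSES = {"CANCELED", "COMPLETED", "ERROR"}


def task_row_violations(status: str, api_task_id: str, pending: bool) -> list:
    """Return the multi-column constraints that a Task row would violate."""
    violations = []
    # api_task_id may be blank only while the task is still SUBMITTED.
    if status != "SUBMITTED" and not api_task_id.strip():
        violations.append("api_task_id must not be blank unless status is SUBMITTED")
    # pending must be False for terminal statuses and True for all others.
    if pending == (status in TERMINAL_STATUSES):
        violations.append("pending does not match status")
    return violations
```

For example, a freshly submitted task with a blank api_task_id and pending set to true passes both checks, whereas a task in ERROR status with a blank api_task_id and pending still true violates both.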

Executor

PK	Name	Type	Unique?	Default	Additional constraints	Description
🔑	id	BIGINT	✅	Next value in corresponding sequence		Primary key column defined by Django ORM
	order	SMALLINT		`null`		A number that indicates the precedence of this executor among the rest of the executors of the corresponding task
	command	JSON		`null`		A JSON array that corresponds to the command array that can be run in a container
	image	VARCHAR(255)		`''`	a) Not empty or whitespace	Docker image definition
	stderr	VARCHAR(255)		`''`		Path to a file in the container to redirect stderr
	stdin	VARCHAR(255)		`''`		Value to pass into container's stdin
	stdout	VARCHAR(255)		`''`		Path to a file in the container to redirect stdout
	workdir	VARCHAR(255)		`''`		Path to the working directory in the container
	task_id	BIGINT		`null`	Foreign key to Task.`id`	Task to which this executor belongs

Each row of the table Executor is also subject to the multi-column constraint described below:

task_id and order are unique together
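
The effect of this constraint can be sketched with an in-memory SQLite table. The table below is a hypothetical reduction of Executor to the columns involved; `order` is quoted because it is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced Executor table with a composite uniqueness constraint.
conn.execute("""
    CREATE TABLE executor (
        id INTEGER PRIMARY KEY,
        task_id INTEGER,
        "order" INTEGER,
        UNIQUE (task_id, "order")
    )
""")

insert = 'INSERT INTO executor (task_id, "order") VALUES (?, ?)'
conn.execute(insert, (1, 0))
conn.execute(insert, (1, 1))  # same task, different order: allowed
conn.execute(insert, (2, 0))  # same order, different task: allowed

try:
    conn.execute(insert, (1, 0))  # same task and same order: rejected
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM executor").fetchone()[0]
```

Two executors may share an order value as long as they belong to different tasks. Only the (task_id, order) pair has to be unique.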

Env

PK	Name	Type	Unique?	Default	Additional constraints	Description
🔑	id	BIGINT	✅	Next value in corresponding sequence		Primary key column defined by Django ORM
	key	VARCHAR(255)		`''`	a) Not empty or whitespace	Environment variable name
	value	TEXT		`''`		Environment variable value
	executor_id	BIGINT		`null`	Foreign key to Executor.`id`	Corresponding executor in whose container this variable should exist

ExecutorOutputLog

PK	Name	Type	Unique?	Default	Additional constraints	Description
🔑	id	BIGINT	✅	Next value in corresponding sequence		Primary key column defined by Django ORM
	stdout	TEXT		`''`		Stdout recorded from a specific executor execution
	stderr	TEXT		`''`		Stderr recorded from a specific executor execution
	executor_id	BIGINT	✅	`null`	Foreign key to Executor.`id`	Corresponding Executor that produced this output log

MountPoint

PK	Name	Type	Unique?	Default	Additional constraints	Description
🔑	id	BIGINT	✅	Next value in corresponding sequence		Primary key column defined by Django ORM
	name	VARCHAR(255)		`''`		Optional name of the mount point
	description	VARCHAR(255)		`''`		Optional description of the mount point
	url	VARCHAR(255)		`''`		URL of the file or directory on the user's public file system home
	path	VARCHAR(255)		`''`	a) Not empty or whitespace	Corresponding path inside the container
	type	VARCHAR(255)		`''`	a) Value must be one of Schema API's filesystem entity descriptors	Indicates whether a file or a directory is being mounted
	is_input	BOOLEAN		`True`		Indicates whether it is an input mount point or an output mount point
	content	TEXT		`''`		Value to set to an input mount point file
	task_id	BIGINT		`null`	Foreign key to Task.`id`	Corresponding task that uses this mount point

Each row of the table MountPoint is also subject to the following multi-column constraint:

url can be blank or whitespace only if the record is an input (is_input=True) file (type=FILE) mount point.
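
As a rough illustration, the rule can be written as a predicate. The function name is hypothetical, and the `"DIRECTORY"` descriptor mentioned in the usage note is an assumption, since only `FILE` is named on this page.

```python
def mount_point_url_is_valid(url: str, is_input: bool, type_: str) -> bool:
    """Check the MountPoint rule: url may be blank only for input file mount points."""
    if url.strip():
        # A non-blank url is always acceptable as far as this rule is concerned.
        return True
    # A blank url is acceptable only for an input mount point of type FILE.
    return is_input and type_ == "FILE"
```

Under these assumptions, `mount_point_url_is_valid("", True, "FILE")` holds, while a blank url on an output mount point or on a `"DIRECTORY"` mount point does not. The exemption plausibly exists because an input file can take its data from the content column instead of a url.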

Volume

PK	Name	Type	Unique?	Default	Additional constraints	Description
🔑	id	BIGINT	✅	Next value in corresponding sequence		Primary key column defined by Django ORM
	path	VARCHAR(255)		`''`	a) Not empty or whitespace	Path in the container to mount the volume
	task_id	BIGINT		`null`	Foreign key to Task.`id`	Corresponding task that owns this volume

Tag

PK	Name	Type	Unique?	Default	Additional constraints	Description
🔑	id	BIGINT	✅	Next value in corresponding sequence		Primary key column defined by Django ORM
	key	VARCHAR(255)		`''`	a) Not empty or whitespace	Tag name
	value	VARCHAR(255)		`''`		Tag value
	task_id	BIGINT		`null`	Foreign key to Task.`id`	Corresponding Task tagged with this tag

ResourceSet

PK	Name	Type	Unique?	Can be NULL?	Default	Additional constraints	Description
🔑	id	BIGINT	✅		Next value in corresponding sequence		Primary key column defined by Django ORM
	cpu_cores	INTEGER		✅	`null`	a) CPU cores claim is greater than 0	CPU cores claim
	ram_gb	FLOAT		✅	`null`	a) RAM in GBs is greater than 0	RAM claim in GBs
	disk_gb	FLOAT		✅	`null`	a) Disk in GBs is greater than 0	Disk claim in GBs
	preemptible	BOOLEAN		✅	`null`		Whether the task is allowed to run on preemptible compute instances. Exclusively used by TESK. Included for compatibility purposes
	zones	JSON		✅	`null`		Compute zones in which to run the task. Exclusively used by TESK. Included for compatibility purposes
	task_id	BIGINT	✅		`null`	Foreign key to Task.`id`	Corresponding task for this resource set
