Commit Graph

40 Commits

Author SHA1 Message Date
Girish Ramakrishnan
12e073e8cf use node: prefix for requires
mostly because code is being autogenerated by all the AI stuff using
this prefix. it's also used in the stack trace.
2025-08-14 12:55:35 +05:30
Girish Ramakrishnan
7e0803c4b4 clean up task locks 2025-07-18 18:12:07 +02:00
Girish Ramakrishnan
0aca6c2588 locks: rename lock types to make it clearer 2025-07-18 13:40:15 +02:00
Girish Ramakrishnan
17ce6f7291 apptaskmanager: release lock and clear active task on task failure 2025-07-17 14:54:47 +02:00
Girish Ramakrishnan
4d94700375 remove dead comment 2025-06-19 16:12:46 +02:00
Girish Ramakrishnan
d9c104613c tasks: rework the startTask API
it is now async. change was required to reset the pending flag
2025-06-17 19:32:46 +02:00
Girish Ramakrishnan
9c7e9e25ca scheduler: respect cloudron timezone setting 2025-01-02 10:11:14 +01:00
Girish Ramakrishnan
d456f91921 tasks: fix active status 2024-12-12 19:09:55 +01:00
Girish Ramakrishnan
35be854997 apptaskmanager: do not schedule tasks until infra ready 2024-12-09 14:46:03 +01:00
Girish Ramakrishnan
bb392207ea remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc

The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.

Various inter process locks were explored:

* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.

* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js

* sqlite/redis were options but introduce additional deps

* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.

* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.

* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.

Phew!
2024-12-07 20:41:22 +01:00
Girish Ramakrishnan
a8059c49e9 lint 2024-06-27 16:50:31 +02:00
Girish Ramakrishnan
79af6c1a68 On dashboard or email location change, reconfigure immediately 2023-08-21 18:34:07 +05:30
Girish Ramakrishnan
070f6e5de3 move startup logic to platform.js 2023-08-12 22:25:46 +05:30
Girish Ramakrishnan
7709e155e0 more async'ification 2021-09-07 11:21:06 -07:00
Girish Ramakrishnan
8d3790d890 Fix grammar 2021-08-24 09:38:51 -07:00
Girish Ramakrishnan
e59d0e878d merge taskdb into tasks.js 2021-07-14 10:37:12 -07:00
Girish Ramakrishnan
097a7d6b60 sftp: rework appdata and volume mounting logic
this tries to solve two issues:

* the current approach mounts the data directories of apps/volumes individually.
this causes a problem with volume mounts that mount after the container is started i.e not
network time/delay but systemd ordering. With CIFS, the mount is a hostname. This requires
unbound to be running but unbound can only start after docker because it wants to bind to
the docker network. one way to fix is to not start sftp automatically and only start sftp
container in the box code. This results in the sftp container attaching itself of the
directory before mounting and it appears empty. (on the host, the directory will appear
to have mount data!)

* every time apptask runs we keep rebuilding this sftp container. this results in much race.

the fix is: mount the parent directory of apps and volumes. in addition, then any specialized appdata
paths and volume paths are mounted individually. this greatly minimized rebuilding and also since we don't rely
on binding to the mount point itself. the child directories can mount in leisure. this limits the race
issue to only no-op volume mounts.

part of #789
2021-06-24 16:51:58 -07:00
Girish Ramakrishnan
4cba5ca405 sftp: only rebuild when app task queue is empty
when multiple apptasks are scheduled, we end up with a sequence like this:
    - task1 finishes
    - task2 (uninstall) removes  appdata directory
    - sftp rebuild (from task1 finish)
    - task2 fails because sftp rebuild created empty appdata directory

a fix is to delay the sftp rebuild until all tasks are done. of course,
the same race is still there, if a user initiated another task immediately
but this seems unlikely. if that happens often, we can further add a sftpRebuildInProgress
flag inside apptaskmanager.
2021-03-16 18:29:01 -07:00
Girish Ramakrishnan
e2c342f242 apptaskmanager: Fix crash 2021-01-30 21:16:41 -08:00
Girish Ramakrishnan
9f9575f46a Fixes to service configuration
restart service does not rebuild automatically, we should add a route
for that. we need to figure where to scale services etc if we randomly
create containers like that.
2021-01-21 17:41:22 -08:00
Girish Ramakrishnan
a184012205 apptask: set the memory limit based on the backup config
fixes #759
2021-01-06 15:26:51 -08:00
Girish Ramakrishnan
1b92ce08aa scheduler: suspend/resume jobs when apptask is active
the cron job container was holding on to the volume any container changes.
2020-11-25 22:16:38 -08:00
Girish Ramakrishnan
2c9efea733 Use debug instead of console.error 2020-10-30 11:07:51 -07:00
Johannes Zellner
4f9cb9a8a1 sftp.rebuild does not need options anymore 2020-08-19 19:08:12 +02:00
Johannes Zellner
ec5129d25b Rebuild sftp addon after an apptask 2020-08-19 18:23:44 +02:00
Girish Ramakrishnan
885e90e810 add a todo 2020-08-11 12:57:37 -07:00
Girish Ramakrishnan
99f989c384 run apptask and backup task with a nice
A child process inherits whatever nice value is held by the parent at the time that it is forked
2020-08-06 16:46:39 -07:00
Girish Ramakrishnan
887cbb0b22 make percent non-zero 2019-12-18 09:33:44 -08:00
Girish Ramakrishnan
30eccfb54b Use BoxError instead of Error in all places
This moves everything other than the addon code and some 'done' logic
2019-12-04 11:02:54 -08:00
Girish Ramakrishnan
1856fc05d9 Add timeout for apptask as well 2019-10-14 14:16:15 -07:00
Girish Ramakrishnan
85c13cae58 Fix platform update logic 2019-09-24 21:21:49 -07:00
Girish Ramakrishnan
dde81ee847 lint 2019-09-24 19:50:24 -07:00
Girish Ramakrishnan
1a061e4446 Only check installationState to resume tasks
also, make resumeTasks go via app logic to capture end of task
2019-09-24 00:37:29 -07:00
Girish Ramakrishnan
579eacb644 Better pending state check 2019-09-19 16:42:49 -07:00
Girish Ramakrishnan
f52c5b584e Fix crash when resuming stopped apps 2019-09-19 16:40:38 -07:00
Girish Ramakrishnan
b1380819ba debug taskId 2019-09-03 16:06:28 -07:00
Girish Ramakrishnan
dd0fb8292c Move state enums to the model code 2019-08-30 13:21:51 -07:00
Girish Ramakrishnan
1faee00764 Better progress text when waiting for other tasks
Fixes #630
2019-08-28 22:13:50 -07:00
Girish Ramakrishnan
a40505e2ee Remove pause flag, we already have platform lock 2019-08-28 22:13:50 -07:00
Girish Ramakrishnan
9f1210202a port taskmanager to use tasks 2019-08-28 15:17:53 -07:00