75 Commits

Author SHA1 Message Date
Girish Ramakrishnan
1cac2f6170 add timestamp to the log 2025-09-16 17:58:30 +02:00
Girish Ramakrishnan
19682ec21b tgz: integrity check 2025-08-15 21:23:39 +05:30
Girish Ramakrishnan
12e073e8cf use node: prefix for requires
mostly because code is being autogenerated by all the AI stuff using
this prefix. it's also used in the stack trace.
2025-08-14 12:55:35 +05:30
Girish Ramakrishnan
fc4da4408c backups: fix app restore with tgz 2025-07-25 13:39:09 +02:00
Girish Ramakrishnan
4f608bdc5f Fix tasks test 2025-07-18 20:55:54 +02:00
Girish Ramakrishnan
48559d3358 tasks: distinguish runtime crash vs task error in worker 2025-07-18 20:02:06 +02:00
Girish Ramakrishnan
c10593e4ac tasks: remove the prefix for invocation lookup 2025-07-18 14:33:26 +02:00
Girish Ramakrishnan
0fa281083e apps: backup is not a state anymore
this is launched as a separate task
2025-07-18 14:14:54 +02:00
Girish Ramakrishnan
7047ee9391 shell: add timeout logic and rework error handling
what's important:

* if task code ran, it exits with 0. this exit code is independent of (error, result)
  * when it exited cleanly, we will get the values from the database

* if task timed out, the box code kills it and it has a flag tracking timedOut. we can
  ignore exit code in this case.

* if task code was stopped, box code will send SIGTERM which ideally it will handle and end with 70.

* if task code crashed and it caught the exception, it will return 50

* if task code crashed and node nuked us, it will exit with 1

* if task code was killed with some unhandleable signal, taskworker.sh will return the signal (9=SIGKILL)
2025-07-17 12:44:24 +02:00
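The exit-code rules in the commit message above can be read as a small decision table. The following is only a sketch of that reading, not the code from the commit: the function name `classifyTaskExit`, the shape of its argument and the returned labels are all hypothetical. Only the numeric codes (0, 70, 50, 1, and the signal number such as 9) come from the message.

```javascript
// Sketch only: names and return labels are hypothetical, not taken from the box code.
// The numeric codes are the ones listed in the commit message.
const EXIT_OK = 0;            // task code ran; (error, result) are read from the database
const EXIT_STOPPED = 70;      // task handled SIGTERM and ended itself
const EXIT_CAUGHT_CRASH = 50; // task crashed but caught the exception
const EXIT_NODE_CRASH = 1;    // node terminated the process on an uncaught error

function classifyTaskExit({ code, timedOut }) {
    // A timed-out task was killed by the caller, so its exit code carries no meaning.
    if (timedOut) return 'timeout';

    switch (code) {
    case EXIT_OK: return 'ran';
    case EXIT_STOPPED: return 'stopped';
    case EXIT_CAUGHT_CRASH: return 'crashed-caught';
    case EXIT_NODE_CRASH: return 'crashed-runtime';
    default:
        // Assumption: any other code is a signal number passed through
        // by the wrapper script (for example 9 for SIGKILL).
        return `signal-${code}`;
    }
}
```

Checking `timedOut` first mirrors the note that the exit code can be ignored in that case.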
Girish Ramakrishnan
5539f74bea system: add disk usage task 2025-07-17 00:09:50 +02:00
Girish Ramakrishnan
b42be9899e tasks: add completed flag
in some cases, the tasks are setting percent to 100 and crashing later
2025-07-16 15:40:46 +02:00
Girish Ramakrishnan
2db99e7807 refactor: rename updater functions to have box in them 2025-06-20 19:04:55 +02:00
Girish Ramakrishnan
a9f474b24d taskworker: better logs on signal 2025-06-17 22:30:34 +02:00
Girish Ramakrishnan
4770b32287 tasks: add pending field
this indicates if a task is scheduled. previously, we relied
on task.progress being 1
2025-06-17 17:00:21 +02:00
Girish Ramakrishnan
69cb8c5a0a task: can use env -S in ubuntu 20 and above 2025-06-16 14:18:49 +02:00
Girish Ramakrishnan
d2e3b80517 taskworker: add debug 2024-12-16 15:17:35 +01:00
Girish Ramakrishnan
bb392207ea remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc

The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.

Various inter process locks were explored:

* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.

* File lock using O_EXCL. The basic idea is to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built into node.js

* sqlite/redis were options but introduce additional deps

* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.

* Instead of rows, we can just lock the table and have a json blob in it. This hit a road
block: LOCK TABLE is per session and our db layer cannot handle this easily! i.e.
when issuing two db.query() calls, it might use two different connections from the pool. We have to
expose the connection, release connection etc.

* Next idea was an atomic update of the blob, checking that the old blob was unchanged. This approach
was finally refined into a version field.

Phew!
2024-12-07 20:41:22 +01:00
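The version-field idea that the commit message above settles on is a form of optimistic locking: read the blob and its version, modify the blob, and write it back only if the version has not changed in the meantime. The sketch below illustrates that pattern with an in-memory stand-in for the database row. Every name in it (`LockStore`, `acquire`, `releaseAllOf`, the `locks` table in the comment) is hypothetical and not taken from the actual implementation.

```javascript
// Sketch only: an in-memory stand-in for a single row holding { version, data }.
class LockStore {
    constructor() {
        this.row = { version: 0, data: {} };
    }

    read() {
        return { version: this.row.version, data: { ...this.row.data } };
    }

    // Mimics a conditional update such as (hypothetical table and columns):
    //   UPDATE locks SET data=?, version=version+1 WHERE version=?
    // Returns true only when the expected version still matched.
    write(data, expectedVersion) {
        if (this.row.version !== expectedVersion) return false;
        this.row = { version: expectedVersion + 1, data: { ...data } };
        return true;
    }
}

// Try to take the named lock for the given owner; retry if another writer got in first.
function acquire(store, name, owner, maxAttempts = 5) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const { version, data } = store.read();
        if (name in data) return false; // already held
        data[name] = owner;
        if (store.write(data, version)) return true;
    }
    return false;
}

// Drop every lock held by an owner, e.g. after its task crashed.
function releaseAllOf(store, owner, maxAttempts = 5) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const { version, data } = store.read();
        for (const [name, holder] of Object.entries(data)) {
            if (holder === owner) delete data[name];
        }
        if (store.write(data, version)) return true;
    }
    return false;
}
```

Recording the owner next to each lock name is one way to address the cleanup concern raised for lock files: the locks held by a crashed task can be found and released in a single pass.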
Girish Ramakrishnan
fc2786b07f taskworker: fix programming error 2024-11-01 16:15:32 +01:00
Girish Ramakrishnan
4a207395ca middleground in timeout
DO BLR droplets still fail with 1s timeout!
2024-10-31 10:22:55 +01:00
Girish Ramakrishnan
2df983a1cf lower timeout 2024-10-31 09:50:20 +01:00
Girish Ramakrishnan
03e17aea22 taskworker: refactor 2024-10-31 09:46:36 +01:00
Girish Ramakrishnan
aefa481c43 network: fix premature connection closures with node 20 and above
the happy eyeballs implementation in node is buggy. ipv4 and ipv6 connections
are made in parallel and whichever responds first is chosen. when there is no
ipv6 (immediately errors with ENETUNREACH/EHOSTUNREACH) and when ipv4 is > 250ms,
the code erroneously times out.

see also https://github.com/nodejs/node/issues/54359

reproduction for those servers:

const options = {
  hostname: 'www.cloudron.io', port: 80, path: '/', method: 'HEAD',
  // family: 4, // uncomment to make it work
};

const req = require('http').request(options, (res) => {
  console.log('statusCode:', res.statusCode);
  res.on('data', () => {}); // drain
});

req.on('socket', (socket) => console.log('Socket assigned to request', socket));
req.on('error', (e) => console.error(e));
req.end();
2024-10-31 09:38:40 +01:00
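Besides forcing `family: 4` per request as in the reproduction above, Node.js exposes a process-wide setting for how long each address family attempt may take. The sketch below shows that setting. Whether this is what the commit actually changed is an assumption, and the helper name `raiseFamilyAttemptTimeout` is hypothetical.

```javascript
const net = require('node:net');

// Sketch only: raise the per-attempt timeout that the address family
// auto-selection uses, and return the previous value so it can be restored.
// Assumption: a larger value gives slow IPv4 peers enough time to connect.
function raiseFamilyAttemptTimeout(ms) {
    const previous = net.getDefaultAutoSelectFamilyAttemptTimeout();
    net.setDefaultAutoSelectFamilyAttemptTimeout(ms);
    return previous;
}
```

Calling `raiseFamilyAttemptTimeout(2500)` once at startup would apply to connections made afterwards. The value 2500 is only an illustration.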
Girish Ramakrishnan
3dcd0975f7 test: fix various routes tests
* system/disks route is gone
* provision routes now return 405 instead of 409 when re-setup/re-activated
2024-06-03 19:27:23 +02:00
Girish Ramakrishnan
4acbb7136a proper task name for dashboard change 2023-08-14 10:45:12 +05:30
Girish Ramakrishnan
e723c3c19b move dashboard change routes under dashboard/ 2023-08-13 10:06:01 +05:30
Girish Ramakrishnan
eee49a8291 move dashboard setting into dashboard.js 2023-08-11 21:04:10 +05:30
Girish Ramakrishnan
946e5caacb split mail and mailserver
mail = all the per-domain code
mailserver = all the mail server level code
2023-08-04 20:54:39 +05:30
Girish Ramakrishnan
23f0eba1bd dyndns: run as a task
this lets us display logs
2023-07-08 21:21:06 +05:30
Girish Ramakrishnan
f83295372b updater: combine installer logs into the task file 2023-05-15 19:09:40 +02:00
Girish Ramakrishnan
edb6ed91fe add disk usage task 2022-10-12 10:26:21 +02:00
Girish Ramakrishnan
dd8f710605 Fix failing test 2022-04-28 18:03:36 -07:00
Girish Ramakrishnan
26f9635a38 taskworker: only support async workers 2022-04-15 17:40:46 -05:00
Girish Ramakrishnan
ad3dbe8daa mail: keep mail backups separately from box backups
part of #717
2021-09-26 21:47:24 -07:00
Girish Ramakrishnan
3090307c1d tasks: remove superfluous update code 2021-09-23 17:44:41 -07:00
Girish Ramakrishnan
9a2ed4f2c8 apptask: asyncify 2021-09-16 17:25:05 -07:00
Girish Ramakrishnan
22231a93c0 Ensure logs are flushed before crash 2021-08-30 22:01:34 -07:00
Girish Ramakrishnan
77f5cb183b merge appdb.js into apps.js 2021-08-23 15:35:38 -07:00
Girish Ramakrishnan
1052889795 taskworkers can be async or take a callback 2021-08-23 15:20:14 -07:00
Girish Ramakrishnan
5bcf1bc47b merge domaindb.js into domains.js 2021-08-16 14:41:42 -07:00
Girish Ramakrishnan
0b8d9df6e7 taskworker: print exceptions 2021-07-26 22:11:25 -07:00
Girish Ramakrishnan
004e812d60 merge backupdb into backups.js 2021-07-14 15:10:45 -07:00
Girish Ramakrishnan
ac70350531 tasks.get returns null on not found 2021-07-14 10:59:49 -07:00
Girish Ramakrishnan
e59d0e878d merge taskdb into tasks.js 2021-07-14 10:37:12 -07:00
Girish Ramakrishnan
a5e34cf775 delete certs that have long expired (6 months)
fixes #783
2021-05-18 13:37:35 -07:00
Johannes Zellner
a43e804ee2 Revert "taskworker: put the arg in shebang line"
Not supported on ubuntu 18

This reverts commit e6edc4e999.
2021-05-14 10:51:37 +02:00
Johannes Zellner
170efbcb5e Remove unused require 2021-05-14 10:47:54 +02:00
Girish Ramakrishnan
f927b9b5b2 make taskworker console.* log to file and not stdout
this is similar to code in box.js
2021-05-13 22:49:47 -07:00
Girish Ramakrishnan
e6edc4e999 taskworker: put the arg in shebang line
otherwise, it gets passed as an arg to the script and is visible in process.argv!
2021-05-13 22:49:15 -07:00
Girish Ramakrishnan
131711ef5c mysql: bump connection limit to 200 2021-04-09 10:55:31 -07:00
Girish Ramakrishnan
70fbcf8ce4 add route to sync dns records
merge the mail dns route with this one as well

fixes #737
2021-02-24 22:37:59 -08:00