Migrate codebase from CommonJS to ES Modules
- Convert all require()/module.exports to import/export across 260+ files
- Add "type": "module" to package.json to enable ESM by default
- Add migrations/package.json with "type": "commonjs" to keep db-migrate compatible
- Convert eslint.config.js to ESM with sourceType: "module"
- Replace __dirname/__filename with import.meta.dirname/import.meta.filename
- Replace require.main === module with process.argv[1] === import.meta.filename
- Remove 'use strict' directives (implicit in ESM)
- Convert dynamic require() in switch statements to static import lookup maps
(dns.js, domains.js, backupformats.js, backupsites.js, network.js)
- Extract self-referencing exports.CONSTANT patterns into standalone const
declarations (apps.js, services.js, locks.js, users.js, mail.js, etc.)
- Lazify SERVICES object in services.js to avoid circular dependency TDZ issues
- Add clearMailQueue() to mailer.js for ESM-safe queue clearing in tests
- Add _setMockApp() to ldapserver.js for ESM-safe test mocking
- Add _setMockResolve() wrapper to dig.js for ESM-safe DNS mocking in tests
- Convert backupupload.js to use dynamic imports so --check exits before
loading the module graph (which requires BOX_ENV)
- Update check-install to use ESM import for infra_version.js
- Convert scripts/ (hotfix, release, remote_hotfix.js, find-unused-translations)
- All 1315 tests passing
Migration stats (AI-assisted using Cursor with Claude):
- Wall clock time: ~3-4 hours
- Assistant completions: ~80-100
- Estimated token usage: ~1-2M tokens
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-14 09:53:14 +01:00
import assert from 'node:assert' ;
import BoxError from './boxerror.js' ;
2026-02-14 15:43:24 +01:00
import database from './database.js' ;
2026-03-12 22:55:28 +05:30
import logger from './logger.js' ;
2026-03-27 11:39:38 +01:00
import retry from './retry.js' ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
2026-03-12 23:23:23 +05:30
const { log } = logger ( 'locks' ) ;
Migrate codebase from CommonJS to ES Modules
- Convert all require()/module.exports to import/export across 260+ files
- Add "type": "module" to package.json to enable ESM by default
- Add migrations/package.json with "type": "commonjs" to keep db-migrate compatible
- Convert eslint.config.js to ESM with sourceType: "module"
- Replace __dirname/__filename with import.meta.dirname/import.meta.filename
- Replace require.main === module with process.argv[1] === import.meta.filename
- Remove 'use strict' directives (implicit in ESM)
- Convert dynamic require() in switch statements to static import lookup maps
(dns.js, domains.js, backupformats.js, backupsites.js, network.js)
- Extract self-referencing exports.CONSTANT patterns into standalone const
declarations (apps.js, services.js, locks.js, users.js, mail.js, etc.)
- Lazify SERVICES object in services.js to avoid circular dependency TDZ issues
- Add clearMailQueue() to mailer.js for ESM-safe queue clearing in tests
- Add _setMockApp() to ldapserver.js for ESM-safe test mocking
- Add _setMockResolve() wrapper to dig.js for ESM-safe DNS mocking in tests
- Convert backupupload.js to use dynamic imports so --check exits before
loading the module graph (which requires BOX_ENV)
- Update check-install to use ESM import for infra_version.js
- Convert scripts/ (hotfix, release, remote_hotfix.js, find-unused-translations)
- All 1315 tests passing
Migration stats (AI-assisted using Cursor with Claude):
- Wall clock time: ~3-4 hours
- Assistant completions: ~80-100
- Estimated token usage: ~1-2M tokens
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-14 09:53:14 +01:00
const TYPE _APP _TASK _PREFIX = 'app_task_' ;
const TYPE _APP _BACKUP _PREFIX = 'app_backup_' ;
const TYPE _BOX _UPDATE = 'box_update' ;
const TYPE _BOX _UPDATE _TASK = 'box_update_task' ;
const TYPE _FULL _BACKUP _TASK _PREFIX = 'full_backup_task_' ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
let gTaskId = null ;
function setTaskId ( taskId ) {
assert . strictEqual ( typeof taskId , 'string' ) ;
gTaskId = taskId ;
}
async function read ( ) {
const result = await database . query ( 'SELECT version, dataJson FROM locks' ) ;
return { version : result [ 0 ] . version , data : JSON . parse ( result [ 0 ] . dataJson ) } ;
}
async function write ( value ) {
assert . strictEqual ( typeof value . version , 'number' ) ;
assert . strictEqual ( typeof value . data , 'object' ) ;
const result = await database . query ( 'UPDATE locks SET dataJson=?, version=version+1 WHERE id=? AND version=?' , [ JSON . stringify ( value . data ) , 'platform' , value . version ] ) ;
if ( result . affectedRows !== 1 ) throw new BoxError ( BoxError . CONFLICT , 'Someone updated before we did' ) ;
2026-03-12 22:55:28 +05:30
log ( ` write: current locks: ${ JSON . stringify ( value . data ) } ` ) ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
}
function canAcquire ( data , type ) {
assert . strictEqual ( typeof data , 'object' ) ;
assert . strictEqual ( typeof type , 'string' ) ;
2025-07-25 14:46:55 +02:00
if ( type in data ) return new BoxError ( BoxError . BAD _STATE , ` Locked by ${ data [ type ] } ` ) ;
Migrate codebase from CommonJS to ES Modules
- Convert all require()/module.exports to import/export across 260+ files
- Add "type": "module" to package.json to enable ESM by default
- Add migrations/package.json with "type": "commonjs" to keep db-migrate compatible
- Convert eslint.config.js to ESM with sourceType: "module"
- Replace __dirname/__filename with import.meta.dirname/import.meta.filename
- Replace require.main === module with process.argv[1] === import.meta.filename
- Remove 'use strict' directives (implicit in ESM)
- Convert dynamic require() in switch statements to static import lookup maps
(dns.js, domains.js, backupformats.js, backupsites.js, network.js)
- Extract self-referencing exports.CONSTANT patterns into standalone const
declarations (apps.js, services.js, locks.js, users.js, mail.js, etc.)
- Lazify SERVICES object in services.js to avoid circular dependency TDZ issues
- Add clearMailQueue() to mailer.js for ESM-safe queue clearing in tests
- Add _setMockApp() to ldapserver.js for ESM-safe test mocking
- Add _setMockResolve() wrapper to dig.js for ESM-safe DNS mocking in tests
- Convert backupupload.js to use dynamic imports so --check exits before
loading the module graph (which requires BOX_ENV)
- Update check-install to use ESM import for infra_version.js
- Convert scripts/ (hotfix, release, remote_hotfix.js, find-unused-translations)
- All 1315 tests passing
Migration stats (AI-assisted using Cursor with Claude):
- Wall clock time: ~3-4 hours
- Assistant completions: ~80-100
- Estimated token usage: ~1-2M tokens
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-14 09:53:14 +01:00
if ( type === TYPE _BOX _UPDATE ) {
if ( Object . keys ( data ) . some ( k => k . startsWith ( TYPE _APP _TASK _PREFIX ) ) ) return new BoxError ( BoxError . BAD _STATE , 'One or more app tasks are active' ) ;
if ( Object . keys ( data ) . some ( k => k . startsWith ( TYPE _APP _BACKUP _PREFIX ) ) ) return new BoxError ( BoxError . BAD _STATE , 'One or more app backups are active' ) ;
} else if ( type . startsWith ( TYPE _APP _TASK _PREFIX ) ) {
if ( TYPE _BOX _UPDATE in data ) return new BoxError ( BoxError . BAD _STATE , 'Update is active' ) ;
} else if ( type . startsWith ( TYPE _FULL _BACKUP _TASK _PREFIX ) ) {
if ( TYPE _BOX _UPDATE _TASK in data ) return new BoxError ( BoxError . BAD _STATE , 'Update task is active' ) ;
} else if ( type === TYPE _BOX _UPDATE _TASK ) {
if ( Object . keys ( data ) . some ( k => k . startsWith ( TYPE _FULL _BACKUP _TASK _PREFIX ) ) ) return new BoxError ( BoxError . BAD _STATE , 'One or more backup tasks is active' ) ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
}
2025-07-25 14:46:55 +02:00
// TYPE_APP_BACKUP_PREFIX , TYPE_MAIL_SERVER_RESTART can co-run with everything except themselves
2025-07-18 10:56:52 +02:00
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
return null ;
}
async function acquire ( type ) {
assert . strictEqual ( typeof type , 'string' ) ;
2026-03-27 11:39:38 +01:00
await retry ( { times : Number . MAX _SAFE _INTEGER , interval : 100 , log , retry : ( error ) => error . reason === BoxError . CONFLICT } , async ( ) => {
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
const { version , data } = await read ( ) ;
const error = canAcquire ( data , type ) ;
if ( error ) throw error ;
data [ type ] = gTaskId ;
await write ( { version , data } ) ;
2026-03-12 22:55:28 +05:30
log ( ` acquire: ${ type } ` ) ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
} ) ;
}
async function wait ( type ) {
assert . strictEqual ( typeof type , 'string' ) ;
2026-03-27 11:39:38 +01:00
await retry ( { times : Number . MAX _SAFE _INTEGER , interval : 10000 , log } , async ( ) => await acquire ( type ) ) ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
}
async function release ( type ) {
assert . strictEqual ( typeof type , 'string' ) ;
2026-03-27 11:39:38 +01:00
await retry ( { times : Number . MAX _SAFE _INTEGER , interval : 100 , log , retry : ( error ) => error . reason === BoxError . CONFLICT } , async ( ) => {
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
const { version , data } = await read ( ) ;
if ( ! ( type in data ) ) throw new BoxError ( BoxError . BAD _STATE , ` Lock ${ type } was never acquired ` ) ;
if ( data [ type ] !== gTaskId ) throw new BoxError ( BoxError . BAD _STATE , ` Task ${ gTaskId } attempted to release lock ${ type } acquired by ${ data [ type ] } ` ) ;
delete data [ type ] ;
await write ( { version , data } ) ;
2026-03-12 22:55:28 +05:30
log ( ` release: ${ type } ` ) ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
} ) ;
}
async function releaseAll ( ) {
await database . query ( 'DELETE FROM locks' ) ;
await database . query ( 'INSERT INTO locks (id, dataJson) VALUES (?, ?)' , [ 'platform' , JSON . stringify ( { } ) ] ) ;
2026-03-12 22:55:28 +05:30
log ( 'releaseAll: all locks released' ) ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
}
2025-07-18 18:11:56 +02:00
// identify programming errors in tasks that forgot to clean up locks
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
async function releaseByTaskId ( taskId ) {
assert . strictEqual ( typeof taskId , 'string' ) ;
2026-03-27 11:39:38 +01:00
await retry ( { times : Number . MAX _SAFE _INTEGER , interval : 100 , log , retry : ( error ) => error . reason === BoxError . CONFLICT } , async ( ) => {
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
const { version , data } = await read ( ) ;
for ( const type of Object . keys ( data ) ) {
if ( data [ type ] === taskId ) {
2026-03-12 22:55:28 +05:30
log ( ` releaseByTaskId: task ${ taskId } forgot to unlock ${ type } ` ) ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
delete data [ type ] ;
}
}
await write ( { version , data } ) ;
2026-03-12 22:55:28 +05:30
log ( ` releaseByTaskId: ${ taskId } ` ) ;
remove global lock
Currently, the update/apptask/fullbackup/platformstart take a
global lock and cannot run in parallel. This causes situations
where when a user tries to trigger an apptask, it says "waiting for
backup to finish..." etc
The solution is to let them run in parallel. We need a lock at the
app level as app operations running in parallel would be bad (tm).
In addition, the update task needs a lock just for the update part.
We also need multi-process locks. Running tasks as processes is core
to our "kill" strategy.
Various inter process locks were explored:
* node's IPC mechanism with process.send(). But this only works for direct node.js
children. taskworker is run via sudo and the IPC does not work.
* File lock using O_EXCL. Basic ideas to create lock files. While file creation
can be done atomically, it becomes complicated to clean up lock files when
the tasks crash. We need a way to know what locks were held by the crashing task.
flock and friends are not built-into node.js
* sqlite/redis were options but introduce additional deps
* Settled on MySQL based locking. Initial plan was to have row locks
or table locks. Each row is a kind of lock. While implementing, it was found that
we need many types of locks (and not just update lock and app locks). For example,
we need locks for each task type, so that only one task type is active at a time.
* Instead of rows, we can just lock table and have a json blob in it. This hit a road
block that LOCK TABLE is per session and our db layer cannot handle this easily! i.e
when issing two db.query() it might use two different connections from the pool. We have to
expose the connection, release connection etc.
* Next idea was atomic blob update of the blob checking if old blob was same. This approach,
was finally refined into a version field.
Phew!
2024-12-07 14:35:45 +01:00
} ) ;
}
2026-02-14 15:43:24 +01:00
export default {
setTaskId ,
acquire ,
wait ,
release ,
releaseAll ,
releaseByTaskId ,
TYPE _APP _TASK _PREFIX ,
TYPE _APP _BACKUP _PREFIX ,
TYPE _BOX _UPDATE ,
TYPE _BOX _UPDATE _TASK ,
TYPE _FULL _BACKUP _TASK _PREFIX ,
TYPE _MAIL _SERVER _RESTART : 'mail_restart' ,
} ;