when we stop an app, the service containers are stopped. they
start running again on reboot.
correct restart policy is "unless-stopped" for all the containers.
with our new retagging approach, the Digest ID remains <null> because
this is only set by docker if truly fetched from the registry.
this means that redis container always gets removed...
Currently, we allocate 50% as RAM and 50% as swap. The manifest is
usually quite conservative on memory values. This means that we set
up a system where the app is applying memory pressure almost immediately.
This then swaps things randomly and increases cpu usage (kswapd shows
up in the profile).
To rethink the whole situation: we should not cap apps with a swap limit at all.
The memory hard limit is what is important. By redefining memoryLimit , we are
doubling every container's memory and it's good that we over allocate this.
docker is using a extra udp port for every container. when there is
a lot of containers, a lot of random udp ports get used up. this causes
problems when installing apps that require contiguous port ranges
this tries to solve two issues:
* the current approach mounts the data directories of apps/volumes individually.
this causes a problem with volume mounts that mount after the container is started i.e not
network time/delay but systemd ordering. With CIFS, the mount is a hostname. This requires
unbound to be running but unbound can only start after docker because it wants to bind to
the docker network. one way to fix is to not start sftp automatically and only start sftp
container in the box code. This results in the sftp container attaching itself of the
directory before mounting and it appears empty. (on the host, the directory will appear
to have mount data!)
* every time apptask runs we keep rebuilding this sftp container. this results in much race.
the fix is: mount the parent directory of apps and volumes. in addition, then any specialized appdata
paths and volume paths are mounted individually. this greatly minimized rebuilding and also since we don't rely
on binding to the mount point itself. the child directories can mount in leisure. this limits the race
issue to only no-op volume mounts.
part of #789
when multiple apptasks are scheduled, we end up with a sequence like this:
- task1 finishes
- task2 (uninstall) removes appdata directory
- sftp rebuild (from task1 finish)
- task2 fails because sftp rebuild created empty appdata directory
a fix is to delay the sftp rebuild until all tasks are done. of course,
the same race is still there, if a user initiated another task immediately
but this seems unlikely. if that happens often, we can further add a sftpRebuildInProgress
flag inside apptaskmanager.