this tries to solve two issues:
* the current approach mounts the data directories of apps/volumes individually.
this causes a problem with volume mounts that mount after the container is started i.e not
network time/delay but systemd ordering. With CIFS, the mount is a hostname. This requires
unbound to be running but unbound can only start after docker because it wants to bind to
the docker network. one way to fix is to not start sftp automatically and only start sftp
container in the box code. This results in the sftp container attaching itself of the
directory before mounting and it appears empty. (on the host, the directory will appear
to have mount data!)
* every time apptask runs we keep rebuilding this sftp container. this results in much race.
the fix is: mount the parent directory of apps and volumes. in addition, then any specialized appdata
paths and volume paths are mounted individually. this greatly minimized rebuilding and also since we don't rely
on binding to the mount point itself. the child directories can mount in leisure. this limits the race
issue to only no-op volume mounts.
part of #789
when multiple apptasks are scheduled, we end up with a sequence like this:
- task1 finishes
- task2 (uninstall) removes appdata directory
- sftp rebuild (from task1 finish)
- task2 fails because sftp rebuild created empty appdata directory
a fix is to delay the sftp rebuild until all tasks are done. of course,
the same race is still there, if a user initiated another task immediately
but this seems unlikely. if that happens often, we can further add a sftpRebuildInProgress
flag inside apptaskmanager.
restart service does not rebuild automatically, we should add a route
for that. we need to figure where to scale services etc if we randomly
create containers like that.
bd9c664b1a tried to remove it and use
the system resolver. However, we found that debian has a quirk that it adds
it adds the fqdn as 127.0.1.1. This means that the docker containers
resolve the my.example.com domain to that and can't connect.
This affects any apps doing a turn test (CLOUDRON_TURN/STUN_SERVER)
and also apps like SOGo which use the mail server hostname directly (since
they require proper certs).
https://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_hostname_resolution
So, the solution is to go back to unbound, now that port 53 binding is specially
handled anyway in docker.js
It's all very complicated.
Approach 1: Simple move unbound to not listen on 0.0.0.0 and only the internal
ones. However, docker has no way to bind only to the "public" interface.
Approach 2: Move the internal unbound to some other port. This required a PR
for haraka - https://github.com/haraka/Haraka/pull/2863 . This works and we use
systemd-resolved by default. However, it turns out systemd-resolved with hog the
lo and thus docker cannot bind again to port 53.
Approach 3: Get rid of systemd-resolved and try to put the dns server list in
/etc/resolv.conf. This is surprisingly hard because the DNS listing can come from
DHCP or netplan or wherever. We can hardcode some public DNS servers but this seems
not a good idea for privacy.
Approach 4: So maybe we don't move the unbound away to different port after all.
However, all the work for approach 2 is done and it's quite nice that the default
resolver is used with the default dns server of the network (probably a caching
server + also maybe has some home network firewalled dns).
So, the final solution is to bind to the make docker bind to the IP explicity.
It's unclear what will happen if the IP changes, maybe it needs a restart.
when restoring, the platform starts first and the sftp container
goes and creates app data dirs with root permission. this prevents
the app restore logic from downloading the backup since it expects
yellowtent perm