not 100% sure because of missing log timestamps, but mysql restarts after the box
has started up. As seen from logs below, we try to mark the apps for restart on
platform update. But this failed because mysql was restarting at that time.
This ended up with e2e test failing.
box:apps restartAppsUsingAddons: marking nc4801.autoupdatetest.domain.io for restart
box:apps restartAppsUsingAddons: error marking nc4801.autoupdatetest.domain.io for restart: {"name":"BoxError","reason":"Database Error","details":{"fatal":true,"code":"PROTOCOL_CONNECTION_LOST"},"message":"Connection lost: The server closed the connection.","nestedError":{"fatal":true,"code":"PROTOCOL_CONNECTION_LOST"}}
box:apps restartAppsUsingAddons: marking wekan1398.autoupdatetest.domain.io for restart
box:database Connection 51 error: Connection lost: The server closed the connection. PROTOCOL_CONNECTION_LOST
box:database Connection 52 error: Connection lost: The server closed the connection. PROTOCOL_CONNECTION_LOST
Box GET /api/v1/cloudron/status 500 Internal Server Error connect ECONNREFUSED 127.0.0.1:3306 41.251 ms - 217
bd9c664b1a tried to remove it and use
the system resolver. However, we found that debian has a quirk that it adds
it adds the fqdn as 127.0.1.1. This means that the docker containers
resolve the my.example.com domain to that and can't connect.
This affects any apps doing a turn test (CLOUDRON_TURN/STUN_SERVER)
and also apps like SOGo which use the mail server hostname directly (since
they require proper certs).
https://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_hostname_resolution
So, the solution is to go back to unbound, now that port 53 binding is specially
handled anyway in docker.js
We removed httpPort with the assumption that docker allocated IPs
and kept them as long as the container is around. This turned out
to be not true because the IP changes on even container restart.
So we now allocate IPs statically. The iprange makes sure we don't
overlap with addons and other CI app or JupyterHub apps.
https://github.com/moby/moby/issues/6743https://github.com/moby/moby/pull/19001
It's all very complicated.
Approach 1: Simple move unbound to not listen on 0.0.0.0 and only the internal
ones. However, docker has no way to bind only to the "public" interface.
Approach 2: Move the internal unbound to some other port. This required a PR
for haraka - https://github.com/haraka/Haraka/pull/2863 . This works and we use
systemd-resolved by default. However, it turns out systemd-resolved with hog the
lo and thus docker cannot bind again to port 53.
Approach 3: Get rid of systemd-resolved and try to put the dns server list in
/etc/resolv.conf. This is surprisingly hard because the DNS listing can come from
DHCP or netplan or wherever. We can hardcode some public DNS servers but this seems
not a good idea for privacy.
Approach 4: So maybe we don't move the unbound away to different port after all.
However, all the work for approach 2 is done and it's quite nice that the default
resolver is used with the default dns server of the network (probably a caching
server + also maybe has some home network firewalled dns).
So, the final solution is to bind to the make docker bind to the IP explicity.
It's unclear what will happen if the IP changes, maybe it needs a restart.