Commit Graph

26 Commits

Author SHA1 Message Date
Girish Ramakrishnan
ca4e1e207c return cpuCount from app/service graphs as well 2022-10-13 22:38:44 +02:00
Girish Ramakrishnan
1872cea763 graphs: do not average cpu use
Show like htop/top: cpu core count * 100
2022-10-13 22:36:20 +02:00
Girish Ramakrishnan
4015afc69c graphs: send service graphs 2022-10-13 20:52:22 +02:00
Girish Ramakrishnan
b5da4143c9 graphs: add app response in system graphs 2022-10-12 22:08:10 +02:00
Girish Ramakrishnan
62d68e2733 graphs: remove the disk info 2022-10-12 10:30:02 +02:00
Johannes Zellner
cbaf86b8c7 Use counter values for docker stats in collectd and grafana queries 2022-10-11 19:06:40 +02:00
Girish Ramakrishnan
9d35756db5 graphs: just query graphite IP instead of localhost mapping 2022-10-11 12:44:37 +02:00
Girish Ramakrishnan
22790fd9b7 system: include only ram info for graphs
app graphs only contain ram info since that is what docker stats provides
2022-10-11 11:54:06 +02:00
Girish Ramakrishnan
3caffdb4e1 Rework app stats
Previously, the du plugin was collecting data every 20 seconds but
carbon was configured to only keep data every 12 hours causing much
confusion.

In the process of reworking this, it was determined:

* No need to collect disk usage info over time. Not sure how that is useful
* Instead, collect CPU/Network/Block info over time. We get this now from docker stats
* We also collect info about the services (addon containers)
* No need to reconfigure collectd for each app change anymore since there is no per
app collectd configuration anymore.
2022-10-10 21:13:26 +02:00
Johannes Zellner
1c07ec219c Do not query disk usage for apps without localstorage 2022-09-16 17:10:07 +02:00
Johannes Zellner
554dec640a Rework system graphs api 2022-09-15 16:07:08 +02:00
Girish Ramakrishnan
d176ff2582 graphs: move system graph queries to the backend 2022-09-15 12:40:52 +02:00
Girish Ramakrishnan
bd7ee437a8 collectd: fix memory stat collection configuration
https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/memory.html#usage-in-bytes says
this is the most efficient approach for v1. It says RSS+CACHE(+SWAP) is the more accurate value.
Elsewhere in the note in https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/memory.html#stat-file,
it says "‘rss + mapped_file” will give you resident set size of cgroup." Overall, it's not clear how
to compute the values so we just use the file.

https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html is better. https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory
says the values are separated out.
2022-09-14 18:15:26 +02:00
Johannes Zellner
bead9589a1 Move app graphs graphite query to backend 2022-09-14 14:39:28 +02:00
Girish Ramakrishnan
9f9575f46a Fixes to service configuration
restart service does not rebuild automatically, we should add a route
for that. we need to figure where to scale services etc if we randomly
create containers like that.
2021-01-21 17:41:22 -08:00
Girish Ramakrishnan
cd1b46848e Fix bug where graphite and sftp are not incrementally upgraded 2021-01-21 12:00:23 -08:00
Girish Ramakrishnan
1363e02603 graphite: bump up memory limit 2020-12-04 10:59:06 -08:00
Girish Ramakrishnan
e511b70d8f bring back resolvconf and unbound DNS
bd9c664b1a tried to remove it and use
the system resolver. However, we found that debian has a quirk that it adds
it adds the fqdn as 127.0.1.1. This means that the docker containers
resolve the my.example.com domain to that and can't connect.

This affects any apps doing a turn test (CLOUDRON_TURN/STUN_SERVER)
and also apps like SOGo which use the mail server hostname directly (since
they require proper certs).

https://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_hostname_resolution

So, the solution is to go back to unbound, now that port 53 binding is specially
handled anyway in docker.js
2020-11-25 10:02:43 -08:00
Girish Ramakrishnan
bd9c664b1a Free up port 53
It's all very complicated.

Approach 1: Simple move unbound to not listen on 0.0.0.0 and only the internal
ones. However, docker has no way to bind only to the "public" interface.

Approach 2: Move the internal unbound to some other port. This required a PR
for haraka - https://github.com/haraka/Haraka/pull/2863 . This works and we use
systemd-resolved by default. However, it turns out systemd-resolved with hog the
lo and thus docker cannot bind again to port 53.

Approach 3: Get rid of systemd-resolved and try to put the dns server list in
/etc/resolv.conf. This is surprisingly hard because the DNS listing can come from
DHCP or netplan or wherever. We can hardcode some public DNS servers but this seems
not a good idea for privacy.

Approach 4: So maybe we don't move the unbound away to different port after all.
However, all the work for approach 2 is done and it's quite nice that the default
resolver is used with the default dns server of the network (probably a caching
server + also maybe has some home network firewalled dns).

So, the final solution is to bind to the make docker bind to the IP explicity.
It's unclear what will happen if the IP changes, maybe it needs a restart.
2020-11-18 23:25:56 -08:00
Girish Ramakrishnan
d089444441 db upgrade: stop containers only after exporting
we cannot export if the containers were nuked in the platform logic.
for this reason, move the removal near the place where they get started.
2020-07-30 15:28:53 -07:00
Girish Ramakrishnan
7c24d9c6c6 Give graphite more memory 2020-06-22 09:55:01 -07:00
Girish Ramakrishnan
4c1e967dad give containers a hostname
this only affects the hostname and not the network name/alias
2019-06-01 10:02:26 -07:00
Girish Ramakrishnan
9b4fffde29 Use shell.exec instead of shell.execSync 2018-11-23 11:18:45 -08:00
Girish Ramakrishnan
1b1945e1f5 Move out graphite from port 8000
Port 8000 is used by esxi management service (!)
2018-11-17 19:14:21 -08:00
Girish Ramakrishnan
78ac1d2a12 Add isCloudronManaged label to containers managed by cloudron 2018-11-10 19:00:03 -08:00
Girish Ramakrishnan
045cfeeb0d Move the addon startup logic to addons.js
Moved the graphite logic to new graphs.js

The settings code now does change notification itself. Over time,
it makes sense to just having settings code do this for everything
and not have this change listener logic. This lets us:
* Maybe the settings can only return based on final handler result
* All dependant modules otherwise have to "init"ed to listen on startup
* Easier to test those handlers without having to actually change the
  setting (since they will now be in "exports" naturally)

Also, maybe someday with this abstraction we can allow apps to have their
own isolated databases etc
2018-10-16 14:40:29 -07:00