12 commits

Author SHA1 Message Date
Sanne Raymaekers
3b3ffe0d08 internal/prometheus: more human-readable time buckets
Let's make them slightly easier to query and reason about. From 100 ms
to 24 hours.
2024-07-04 11:19:25 +02:00
Sanne Raymaekers
a25e0f4adb prometheus: add arch label to dequeue metrics
Only add the arch label for osbuild job types, as the finish metrics
behave similarly. Having arch labels on dequeue metrics for any other
job type (but not on the finish metrics) would produce weird results.
2023-03-09 18:47:57 +01:00
Jakub Rusz
3cdfa9d7f0 internal/prometheus: add more buckets for job durations
We were hitting the limit on stage, let's increase it.
2023-02-08 12:33:10 +01:00
Gianluca Zuccarelli
8e82b223af prometheus: move constants to a single file
Move the constants to a single file and export them.
They can later be used externally with the ocm metrics.
2022-11-30 11:14:29 +01:00
Gianluca Zuccarelli
9f4e765657 metrics: build jobs arch label
Add the architecture label to build jobs
which will enable filtering and monitoring
build jobs by architecture. Build job results
contain the `arch` field in the results struct.
Where that field has a value, it is passed to
the metrics; otherwise the label is set to an
empty string.
2022-07-27 13:37:14 +02:00
Chloe Kaubisch
873798514b prometheus: add tenant label
Include a tenant label for all prometheus metrics. Modify the
jobstatus function in the worker accordingly to return the channel
so it can be passed to prometheus.
2022-06-07 16:35:03 +02:00
Tom Gundersen
4eeaebd40b prometheus/job: measure time spent pending rather than queued
We are interested in the time from when a job could be dequeued until
it actually is, but a job whose dependencies are not yet finished
cannot be dequeued.

Change the logic to measure the time since the last dependency was
dequeued rather than when the job was queued.

The purpose of this metric is to have an alert fire in case we have too
few workers processing jobs.
2022-05-14 17:47:38 +01:00
Gianluca Zuccarelli
80f24dbd61 metrics: change job metrics namespace
Currently the job metrics are namespaced with the composer
subsystem, i.e. `composer_worker`. Since we plan to split
the components to their own namespaces in app interface,
the worker subsystem should be split too.
2022-02-08 15:57:12 +01:00
Gianluca Zuccarelli
290472dfdf metrics: add worker error metrics
This commit introduces the collection of error
metrics since it is now possible to differentiate
between internal errors and user input errors.
Additionally, the error status is reported for
job duration metrics.
2022-02-03 23:40:42 +00:00
Gianluca Zuccarelli
bce12b7bea metrics: extract metric collection
Refactor the current metric collection to make use
of re-usable functions, since some of the same queries
are repeated. This will also make it easier to move
the collection of metrics from the job queue.
2022-02-03 23:40:42 +00:00
Gianluca Zuccarelli
e165db63ea metrics: add additional buckets
The change between the 32s bucket and the 64s bucket is too drastic
for measuring the duration of depsolve jobs. At present, 90% of the
depsolve jobs have a duration between 32s and 64s, making the 32s
bucket too sensitive and the 64s bucket not sensitive enough.
2021-12-15 19:53:11 +00:00
Gianluca Zuccarelli
1a709eda5c metrics: add initial job metrics
Add job metrics to track the number of
pending/running jobs, the duration of
the jobs and how long the jobs spent in
the job queue.
2021-12-08 21:49:43 +00:00
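
Two of the commits above describe reasoning that is easier to follow with a small example: e165db63ea (doubling buckets are too coarse between 32s and 64s) and 4eeaebd40b (pending time is measured from the last dependency rather than from when the job was queued). The following is only a minimal sketch of those two ideas and is not taken from the repository. All names (`upper_bound`, `pending_since`, `DOUBLING`, `FINER`) and the intermediate bucket values 40, 48 and 56 are hypothetical, and which dependency timestamp the real code reads is an assumption.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

# Hypothetical doubling buckets (seconds), with 32 and 64 as neighbours.
DOUBLING = [1, 2, 4, 8, 16, 32, 64, 128]
# Hypothetical finer layout: extra boundaries between 32 and 64.
# The commit does not say which values were actually added.
FINER = sorted(DOUBLING + [40, 48, 56])


def upper_bound(buckets, seconds):
    """Return the smallest bucket boundary that is >= seconds.

    This mirrors the "less than or equal" semantics of histogram
    buckets; observations above the last boundary land in +Inf.
    """
    idx = bisect_left(buckets, seconds)
    return buckets[idx] if idx < len(buckets) else float("inf")


def pending_since(queued_at, dependency_times):
    """Return the moment from which a job counts as pending.

    Assumption: a job can only be picked up once its last dependency
    is out of the way, so the clock starts at the latest dependency
    timestamp, or at queue time if there are no dependencies.
    """
    return max([queued_at, *dependency_times])


def pending_duration(queued_at, dependency_times, dequeued_at):
    """Time spent pending, as opposed to time spent queued."""
    return dequeued_at - pending_since(queued_at, dependency_times)
```

With the doubling layout, observations of 35s, 45s and 60s all fall under the 64s boundary and cannot be told apart; with the finer layout they fall under 40s, 48s and 64s. Likewise, a job queued at 12:00 whose last dependency timestamp is 12:05 and which is dequeued at 12:06 has been pending for one minute, not six.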