Florian Schüller
30198922a5
templates/dashboards: increase timespan readability
...
Also introduces "50min." as we use this now, and
shortens some titles to show which charts are affected by
the `target_duration`.
2025-03-05 10:27:54 +01:00
Florian Schüller
1f8da3bd83
templates/dashboards/grafana: enable shared crosshair
...
A shared crosshair makes it easier to see the affected time
in other charts/panels.
2024-11-06 11:55:22 +01:00
Florian Schüller
3ff8308389
templates/dashboards/grafana: introduce number of pending jobs
2024-11-06 11:55:22 +01:00
Sanne Raymaekers
786f44e7e7
templates/dashboards: human readable job duration targets
...
Also makes the default 40m, which is the new SLO target for osbuild
jobs.
2024-07-04 12:46:19 +02:00
Sanne Raymaekers
55439fc6d3
templates/dashboards: remove active worker count
...
It's misleading since it counts the number of workers that have
registered to the current composer pods; it doesn't actually keep track
of the active workers.
Remove it and keep the worker-api stats as a proxy for active workers.
2024-06-12 17:20:01 +02:00
Sanne Raymaekers
c886d6c1f5
templates/dashboards: fix community-stage tenant variable
...
A space is necessary before and after the colon separating the key and
the value.
[skip ci]
2024-05-08 12:59:34 +02:00
Sanne Raymaekers
e607f3b629
dashboards/worker-general: bump version
2024-04-22 13:05:39 +02:00
Sanne Raymaekers
f6acb31dd8
dashboards/worker-general: add community-stage tenant
2024-04-22 13:05:39 +02:00
Sanne Raymaekers
2eea99d008
dashboards/worker-general: min intervals and multi tooltip mode
2024-04-22 13:05:39 +02:00
Sanne Raymaekers
10d2e272a4
dashboards/worker-general: add active worker count
2024-04-22 13:05:39 +02:00
Sanne Raymaekers
95ae8ed917
dashboards/worker-general: fix tenant query
2024-04-22 13:05:39 +02:00
Sanne Raymaekers
ac9f4a2c81
dashboards/worker-general: update schema
2024-04-22 13:05:39 +02:00
Sanne Raymaekers
44426bb48f
templates/dashboards: add community stage service to orgs
2024-02-05 11:38:53 +01:00
Sanne Raymaekers
bf3ff40a65
dashboards: drop interval from composer dashboard and fix slo
...
The latency budget remaining used $__range instead of the 28d constant.
2023-10-03 11:48:37 +02:00
Sanne Raymaekers
f05a5b59f3
dashboards: drop API section from worker job stats dashboard
...
Renames the worker dashboard to worker job stats dashboard.
Drops the interval variable and relies solely on $__range and
$__rate_interval.
2023-10-03 11:48:37 +02:00
Sanne Raymaekers
1475e216d2
dashboards: add worker api dashboard
...
Also this one is made without a separate interval variable, instead
relying on $__rate_interval and $__interval.
2023-10-03 11:48:37 +02:00
Sanne Raymaekers
33f9a6726e
dashboards: fix composer dash request rate errors
2023-10-02 18:50:37 +02:00
Sanne Raymaekers
715bdba1bf
dashboards/worker: default to showing the past 6 hours
...
The worker dashboards contain slow queries; running these on 28 days of
data takes a very long time (and they often time out).
2023-08-24 17:01:23 +02:00
Sanne Raymaekers
a2c07ea83a
templates/dashboards: rework composer dashboard
...
Splits the board into 3 sections:
- SLO
- API throughput
- API latency
It's also possible to filter by tenant. And some colours were adjusted
to improve readability.
2023-06-30 11:06:51 +02:00
Sanne Raymaekers
a2a3a2602c
templates/dashboards/worker: add arch label to job wait duration
...
Display the wait duration of jobs per architecture.
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
b13865d361
templates/dashboards/worker: edit thresholds
...
95th percentile duration is now a fixed colour, as it's tricky to get
dynamic thresholds based on the job type.
Budget remaining thresholds are now only green at infinity, turn yellow
below 4 weeks, and turn red when budget consumption would only last 3
weeks (out of 4).
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
63d5132aa6
templates/dashboards/worker: change panel alignment
...
This aligns vertical dividers between panels across rows.
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
865bb98034
templates/dashboards/worker: bump version
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
5a9f8d3457
templates/dashboards/worker: show request throughput per path
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
26a521f54d
templates/dashboards/worker: use jobtype variable for job stats
...
This removes the rows of panels per job type, and uses the jobtype
variable.
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
5d2f84cb9e
templates/dashboards/worker: add target duration
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
0b7e94b097
templates/dashboard/worker: add job type variable
2023-03-21 12:34:09 +01:00
Gianluca Zuccarelli
5aae10c951
templates/dashboards: update worker queries
...
The workers now use a new metric to record all
HTTP requests. This commit updates the worker dashboard
to use the new `image_builder_worker_request_count`
metric.
2023-01-09 16:52:16 +01:00
Gianluca Zuccarelli
50237e3797
templates/dashboards: update composer queries
...
osbuild-composer now uses a new metric to record all
HTTP requests. This commit updates the composer dashboard
to use the new `image_builder_composer_request_count`
metric.
2023-01-09 16:52:16 +01:00
Sanne Raymaekers
b5d1c8866a
templates/dashboards: Bump worker dashboard version
2022-09-14 19:43:47 +02:00
Sanne Raymaekers
db978c32bd
templates/dashboards: Fix tenant name to org id mapping
...
The crc stage tenant and fedora stage tenant were mixed up.
2022-09-14 19:43:47 +02:00
Sanne Raymaekers
cb38a92a39
templates/dashboards: Expand job wait duration panels
2022-09-14 19:43:47 +02:00
Gianluca Zuccarelli
1fb6a574cb
templates: filter worker dashboard on arch
...
Add the ability to filter the build job
types by architecture using the `arch`
dropdown.
2022-08-03 13:38:52 +02:00
Sanne Raymaekers
14208d872b
templates/dashboards: Add brew tenants
...
Also:
- Gives tenants a nice display name.
- Makes "All" the default
2022-08-01 21:45:06 +01:00
Sanne Raymaekers
9347a30775
templates/dashboards: Drop arch from osbuild jobtype
...
This changed in #2845, and the dashboards stopped working properly as
they were looking for `osbuild+:arch`.
Keep the glob however, to also capture older metrics. The glob can be
removed after 1 month, as that's how long metrics are stored.
2022-08-01 13:37:28 +02:00
Chloe Kaubisch
86971ca312
templates: update dashboards to include tenant
...
Add a tenant variable to the composer dashboard, with the option
to select multiple tenants. Add tenant filter to queries accordingly.
Link to dashboard: https://grafana.stage.devshift.net/d/image-builder-worker-with-tenant/image-builder-worker?orgId=1
2022-07-18 18:55:13 +02:00
Sanne Raymaekers
edcc0866b3
templates/dashboards: Bump dashboard versions
...
[skip ci]
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
01e2caf95e
templates/dashboards: Set default timerange to 28 days
...
All our SLOs apply to a 28d period. The default state of the board
should reflect that.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
be6f6f04b8
templates/dashboards: Rename composer latency titles
...
These measure latency across all requests, not just compose requests.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
c4d529be5c
templates/dashboards: Add thresholds to duration/latency graphs
...
Show the threshold where we have an SLO target.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
2da910d3e4
templates/dashboards: Bump duration/latency gauges to 95p
...
This reflects the SLO target of 95%.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
4eb4894c3a
templates/dashboards: Reverse order in duration/latency graphs
...
In these graphs p99 isn't very important. If 1% of jobs are slow that's
fine. The p50 and p95 slices are the important ones, so reorder and
recolor the duration graphs to reflect this.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
060d3ae85d
templates/dashboards: Bump worker latency slo variable to 0.95
...
This reflects the actual SLO target of 95%.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
16491149fc
templates/dashboards: Reduce the interval
...
The interval dictates the granularity of the graphs. As the interval
decreases, spikes and dips become more pronounced. 28 days as an
interval doesn't actually show much; reduce this to 6h by default, which
is a happy medium.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
8a51b5db39
templates/dashboards: Remove max from compose req success budget
...
Values over 100% are useful as those actually impact the error budget.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
eded793788
templates/dashboards: Remove max from build error rate budget
...
Values over 100% are useful as those actually impact the error budget.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
c1a44b6813
templates/dashboards: Bump grafana schema version
...
This makes the following diffs smaller.
2022-05-17 19:06:25 +02:00
Gianluca Zuccarelli
19e2fb7fb5
template: composer dashboard queries
...
Tidy up the queries for the composer dashboard
and make them more readable in Grafana. Additionally,
add some fallback values for when empty query results
are returned from Prometheus.
2022-03-14 16:11:05 +01:00
Gianluca Zuccarelli
1f2fd8cb76
templates: worker depsolve error display
...
Fix the display of the depsolve error rate
panel. The panel had an incorrect min value of
3 (or 300%).
2022-03-14 16:11:05 +01:00
Gianluca Zuccarelli
8e8d99336f
templates/worker: fix depsolve error rate
...
The depsolve error rate had the incorrect query
and was returning the error rate for the build
jobs. This has now been fixed.
2022-02-22 19:55:14 +00:00