debian-forge-composer

Author	SHA1	Message	Date
Sanne Raymaekers	2837b2a3ad	prometheus: split off request timing information into a separate middleware. Tracks the worker api in addition to the composer api.	2023-06-28 15:08:37 +02:00
Sanne Raymaekers	06038b2af6	internal/prometheus: add tenant to http and status metrics	2023-06-28 15:08:37 +02:00
Sanne Raymaekers	a25e0f4adb	prometheus: add arch label to dequeue metrics. Only add the arch label for osbuild job types, as the finish metrics behave similarly. Having arch labels on dequeue metrics for any other job type (but not on the finish metrics) would produce weird results.	2023-03-09 18:47:57 +01:00
Jakub Rusz	3cdfa9d7f0	internal/prometheus: add more buckets for job durations. We were hitting the limit on stage, so let's increase it.	2023-02-08 12:33:10 +01:00
Gianluca Zuccarelli	25faf5ab60	internal/prometheus: remove compose fail metrics. We have switched how 5xx errors are being recorded internally and we are now recording all failures for all endpoints. As a result, a dedicated metric only for compose failures is no longer required.	2023-01-12 12:55:01 +01:00
Gianluca Zuccarelli	5457b9fba2	metrics: update status metrics label. OpenShift overrides the `service` label for all metrics in the cluster. Update the label from `service` to `subsystem` for the status metrics query. This helps us differentiate between requests from composer and the worker server.	2022-12-02 09:25:40 +01:00
Gianluca Zuccarelli	8756ea717d	prometheus: middleware to record 5xx errors. Create a custom middleware function to measure 5xx requests for all composer & worker routes and not just the `/composer` endpoint. The result is a prometheus metric that contains info on the request status code, path & method. A helper function has been added to clean the dynamic parameters in the path routes to reduce metric cardinality.	2022-11-30 11:14:29 +01:00
Gianluca Zuccarelli	33e53398a6	prometheus: add status metrics. Add a helper function to register the same metrics for both the worker and composer; the only difference is the subsystem name. The function checks if the metric has already been registered and, if so, returns the already registered metric.	2022-11-30 11:14:29 +01:00
Gianluca Zuccarelli	8e82b223af	prometheus: move constants to a single file. Move the constants to a single file and export them. They can later be used externally with the ocm metrics.	2022-11-30 11:14:29 +01:00
Gianluca Zuccarelli	9f4e765657	metrics: build jobs arch label. Add the architecture label to build jobs, which will enable filtering and monitoring build jobs by architecture. Build job results contain the `arch` field in the results struct; its value is passed to the metrics when present, otherwise the label is set to an empty string.	2022-07-27 13:37:14 +02:00
Chloe Kaubisch	873798514b	prometheus: add tenant label. Include a tenant label for all prometheus metrics. Modify the jobstatus function in the worker accordingly to return the channel so it can be passed to prometheus.	2022-06-07 16:35:03 +02:00
Tom Gundersen	4eeaebd40b	prometheus/job: measure time spent pending rather than queued. We are interested in the time from when a job could be dequeued until it actually is, but if a job has dependencies that are not yet finished, it cannot be dequeued. Change the logic to measure the time since the last dependency was dequeued rather than when the job was queued. The purpose of this metric is to have an alert fire in case we have too few workers processing jobs.	2022-05-14 17:47:38 +01:00
Gianluca Zuccarelli	80f24dbd61	metrics: change job metrics namespace. Currently the job metrics are namespaced with the composer subsystem, i.e. `composer_worker`. Since we plan to split the components to their own namespaces in app interface, the worker subsystem should be split too.	2022-02-08 15:57:12 +01:00
Gianluca Zuccarelli	290472dfdf	metrics: add worker error metrics. This commit introduces the collection of error metrics since it is now possible to differentiate between internal errors and user input errors. Additionally, the error status is reported for job duration metrics.	2022-02-03 23:40:42 +00:00
Gianluca Zuccarelli	bce12b7bea	metrics: extract metric collection. Refactor the current metric collection to make use of re-usable functions, since some of the same queries are repeated. This will also make it easier to move the collection of metrics from the job queue.	2022-02-03 23:40:42 +00:00
Gianluca Zuccarelli	e165db63ea	metrics: add additional buckets. The change between the 32s bucket and the 64s bucket is too drastic for measuring the duration of depsolve jobs. At present, 90% of the depsolve jobs have a duration between 32s and 64s, making the 32s bucket too sensitive and the 64s bucket not sensitive enough.	2021-12-15 19:53:11 +00:00
Gianluca Zuccarelli	1a709eda5c	metrics: add initial job metrics. Add job metrics to track the number of pending/running jobs, the duration of the jobs and how long the jobs spent in the job queue.	2021-12-08 21:49:43 +00:00
Gianluca Zuccarelli	91f2457363	metrics: add prometheus namespaces. Make use of the prometheus namespace and subsystem to give the metrics consistent namespaces in OpenShift.	2021-11-19 22:48:25 +01:00
Gianluca Zuccarelli	f8199ec41d	prometheus: add middleware function. Add a middleware function to track request count and measure the latency of compose requests.	2021-10-29 20:36:18 +01:00
Gianluca Zuccarelli	dfa6a48f5d	prometheus: compose latency metric. Add a metric to measure the latency of requests made to the composer cloud api.	2021-10-29 20:36:18 +01:00
Chloe Kaubisch	f749078b0d	prometheus: update metrics. Change the name of total https requests to be more specific. Add a new counter for failed compose requests.	2021-10-29 17:09:45 +01:00
Chloe Kaubisch	4c800f29a7	worker: add metrics. Use prometheus to gather metrics.	2021-07-23 21:54:28 +02:00

22 commits