We are interested in the time from when a job could be dequeued
until it actually is. A job whose dependencies have not yet finished
cannot be dequeued.
Change the logic to measure the time since the last dependency was
dequeued rather than when the job was queued.
The purpose of this metric is to have an alert fire in case we have too
few workers processing jobs.
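
As a rough illustration of the new measurement, the pending time can be
thought of as starting at the later of the job's queue time and the time
its last dependency was dequeued. The Python sketch below uses hypothetical
names (`pending_since`, `pending_seconds`, `dependency_times`) and assumes
that taking the maximum of those timestamps is an adequate model. It is not
the code from this commit.

```python
from datetime import datetime, timedelta


def pending_since(queued_at, dependency_times):
    """Return the moment from which a job counts as pending.

    Assumption: a job becomes eligible once it is queued and the last of
    its dependencies has been dequeued, whichever happens later.
    """
    return max([queued_at, *dependency_times])


def pending_seconds(queued_at, dependency_times, dequeued_at):
    """Seconds the job spent eligible for dequeuing but not yet dequeued."""
    start = pending_since(queued_at, dependency_times)
    return (dequeued_at - start).total_seconds()


t0 = datetime(2022, 1, 1, 12, 0, 0)
# Job queued at t0, its only dependency dequeued 40s later,
# and the job itself dequeued at t0 + 45s.
waited = pending_seconds(
    t0,
    [t0 + timedelta(seconds=40)],
    t0 + timedelta(seconds=45),
)
print(waited)  # 5.0
```

Measured from the queue time alone, the same job would appear to have
waited 45 seconds, which says little about whether enough workers are
available.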
Currently the job metrics are namespaced under the composer
subsystem, i.e. `composer_worker`. Since we plan to split
the components into their own namespaces in app interface,
the worker subsystem should be split out too.
This commit introduces the collection of error
metrics since it is now possible to differentiate
between internal errors and user input errors.
Additionally, the error status is reported for
job duration metrics.
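
A minimal sketch of the idea, in Python with only the standard library:
errors are counted per kind, and every duration sample carries a status.
The kind names (`internal`, `user_input`), the `success` status, and the
`record_job` helper are hypothetical stand-ins. The actual label values
and metric types used by the worker may differ.

```python
from collections import Counter

# Hypothetical error kinds; the real classification lives in the worker code.
INTERNAL = "internal"
USER_INPUT = "user_input"

# Stand-in for an error counter labelled by job type and error kind.
error_counts = Counter()
# Stand-in for a duration histogram labelled by job type and status.
durations = []


def record_job(job_type, seconds, error_kind=None):
    """Record a finished job; error_kind is None when the job succeeded."""
    status = "success" if error_kind is None else error_kind
    if error_kind is not None:
        error_counts[(job_type, error_kind)] += 1
    durations.append((job_type, status, seconds))


record_job("depsolve", 41.0)
record_job("depsolve", 3.5, error_kind=USER_INPUT)
record_job("osbuild", 600.0, error_kind=INTERNAL)
```

Keeping the status on the duration samples makes it possible to look at
the durations of failed and successful jobs separately.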
Refactor the current metric collection to use
reusable functions, since some of the same queries
are repeated. This will also make it easier to move
the collection of metrics out of the job queue.
The jump from the 32s bucket to the 64s bucket is too coarse
for measuring the duration of depsolve jobs. At present, 90% of
depsolve jobs have a duration between 32s and 64s, making the 32s
bucket too sensitive and the 64s bucket not sensitive enough.
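
The effect can be sketched with a small Python example. The sample
durations and the intermediate bounds (40s, 48s, 56s) are made up for
illustration and are not necessarily the buckets chosen in this commit.
Counts are shown per bucket rather than cumulatively, which is a
simplification of how Prometheus histograms report them.

```python
from bisect import bisect_left


def bucket_counts(durations, upper_bounds):
    """Count observations per bucket.

    Each bucket holds values greater than the previous bound and less than
    or equal to its own upper bound. The final slot is the overflow bucket.
    """
    counts = [0] * (len(upper_bounds) + 1)
    for d in durations:
        counts[bisect_left(upper_bounds, d)] += 1
    return counts


# Made-up sample: 9 of 10 durations fall between 32s and 64s.
sample = [10, 34, 36, 40, 44, 48, 52, 56, 60, 63]

coarse = [16, 32, 64]
finer = [16, 32, 40, 48, 56, 64]  # hypothetical intermediate bounds

print(bucket_counts(sample, coarse))  # [1, 0, 9, 0]
print(bucket_counts(sample, finer))   # [1, 0, 3, 2, 2, 2, 0]
```

With only the coarse bounds, nearly every observation lands in a single
bucket, so a threshold at 32s or at 64s cannot distinguish a 34s job from
a 63s one.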