Commit graph

4112 commits

Author SHA1 Message Date
Sanne Raymaekers
71c78991a6 cloudapi: Drop bucket from composer config
This value is set in the worker config. In future it might also be
passed through the api to upload into target accounts, but it should
never be set in composer.
2022-06-01 12:03:12 +02:00
Christian Kellner
c039a91b61 distro/rhel90: enable and configure NetworkManager-cloud-setup
Package was already installed, but we needed to enable the timer and
service and set the correct env variable via a drop-in to enable the
Azure cloud.
2022-05-31 10:22:22 +01:00
Christian Kellner
5c1530ee53 distro/rhel90: blacklist skx_edac,intel_cstate kernel modules on azure
Disabled by the MSFT images, we follow suit (really means we don't
exactly know why and should find out).
2022-05-31 10:22:22 +01:00
Christian Kellner
3b798edecb distro/rhel86: install and enable NetworkManager-cloud-setup
Install the package, enable timer and service and set the correct
env variable via a drop-in to enable the Azure cloud.
2022-05-31 10:22:22 +01:00
Christian Kellner
dc0ee05bc3 distro/rhel86: blacklist skx_edac,intel_cstate kernel modules on azure
Disabled by the MSFT images, we follow suit (really means we don't
exactly know why and should find out).
2022-05-31 10:22:22 +01:00
Christian Kellner
921c67cf1b distro/rhel90: compress azure-rhui images
Those images are forced to be 64GiB in size but mostly consist of zeros.
This makes them hard to handle, e.g. uploading to brew takes a forever.
The vhdPipelines is converted to a function returning the pipelinesFunc
and it has a single argument `compress` that will add the compression
pipeline bits if `true`. Will return exactly the old pipeline in case
of `false`.
2022-05-27 18:19:51 +02:00
Christian Kellner
5c90abdd0a distro/rhel86: compress azure-rhui images
Those images are forced to be 64GiB in size but mostly consist of zeros.
This makes them hard to handle, e.g. uploading to brew takes a forever.
The vhdPipelines is converted to a function returning the pipelinesFunc
and it has a single argument `compress` that will add the compression
pipeline bits if `true`. Will return exactly the old pipeline in case
of `false`.
2022-05-27 18:19:51 +02:00
Ondřej Budai
34fb2b6001 templates: add Fedora prod tenant to the ACL
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-27 17:19:19 +01:00
Sanne Raymaekers
973b209060 templates/composer: Add resources requests/limits to db migration 2022-05-27 15:09:42 +02:00
Sanne Raymaekers
b91400fd92 templates/composer: Add podAntiAffinity rule based on hostname
Linter output:
Specify anti-affinity in your pod specification to ensure that the
orchestrator attempts to schedule replicas on different nodes. Using
podAntiAffinity, specify a labelSelector that matches pods for the
deployment, and set the topologyKey to kubernetes.io/hostname.
2022-05-27 15:09:42 +02:00
Sanne Raymaekers
2208cb1122 .github: Add kube-linter check 2022-05-27 15:09:42 +02:00
Ygal Blum
8407c97d96 Upload to HTTPS S3 - Support self signed certificate
API
---
Allow the user to pass the CA public certification or skip the verification

AWSCloud
--------
Restore the old version of newAwsFromCreds for access to AWS
Create a new method newAwsFromCredsWithEndpoint for Generic S3 which sets the endpoint and optionally overrides the CA Bundle or skips the SSL certificate verification

jobimpl-osbuild
---------------
Update with the new parameters

osbuild-upload-generic-s3
-------------------------
Add ca-bunlde and skip-ssl-verification flags

tests
-----
Split the tests into http, https with certificate and https skip certificate check
Create a new base test for S3 over HTTPS for secure and insecure
Move the generic S3 test to tools to reuse for secure and insecure connections
All S3 tests now use the aws cli tool
Update the libvirt test to be able to download over HTTPS
Update the RPM spec

Kill container with sudo
2022-05-26 13:46:00 +03:00
Achilleas Koutsou
cd49c932a2 test: add prominent message in test script cleanup functions
When a test script fails in CI, it's often difficult to pinpoint the
exact line in the log where the script failed and the cleanup() function
(trapped on EXIT) begins.

Adding a prominent line (with greenprint where available) at the start
of the cleanup function will make reading logs of failed jobs a lot
easier.
2022-05-25 22:10:27 +02:00
Achilleas Koutsou
3667766661 test/old-worker: change user and package verification check
Something odd is happening with the package check and it keeps failing
mysteriously even though the package is clearly in the list.
Changing the verification method to extract `passwd` and `packages` from
the image info file into separate files and grepping those seems to
work.
2022-05-25 13:23:20 +02:00
Ondřej Budai
ea36377925 terraform: bump to a version that does spot fleets
This should help with errors that we are seeing recently about not enough
capacity.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-25 11:59:32 +02:00
Tomas Hozza
31ff2a2283 tests/gcp: pick machine type from those available in the zone
Do not rely on the default machine type when creating a GCE instance,
but rather list the available machine types in the zone and pick from
them. Test cases will pick the smallest machine type which name matches
the `^n\d-standard-\d$` regular expression.

This should prevent CI failures like
https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/jobs/2497043942#L2930
2022-05-25 09:51:37 +02:00
Christian Kellner
4c7bf735fe distro/rhel90: install nm-cloud-setup for azure-rhui
Install the "NetworkManager-cloud-setup" on Azure Marketplace images.
2022-05-23 11:02:18 +02:00
Christian Kellner
ec8a8bb22a distro/rhel90: properly set grub2 config from ImageConfig
We need to actually set the grub2 configuration if there is one. Doh.
2022-05-23 11:02:18 +02:00
schutzbot
d493f3b510 Post release version bump
[skip ci]
2022-05-19 20:26:12 +00:00
Sanne Raymaekers
7529382890 go.mod: Update openshift-online/ocm-sdk-go
This requires golang-jwt/jwt/v4.
2022-05-19 22:18:42 +02:00
Achilleas Koutsou
56a7059b40 gitlab: limit old-worker-new-composer to 8.5 GA
The test script stops if it's not running on GA, so let's not deploy the
rest of the machines anyway.
2022-05-19 20:03:24 +02:00
Achilleas Koutsou
472d550227 test: use cloud API for old-worker-new-composer
Reasons for this change:
- Mixed versions of composer and worker aren't a realistic use-case for
  the weldr API (on prem) but we do run mixed versions in hosted IB, so
  this test is closer to real world scenarios.
- The cloud API runs depsolve jobs in the worker, whereas the weldr API
  runs them in composer.  By testing the cloud API we also test the
  backwards compatibility of the depsolve job.

The change requires osbuild-worker v51 or newer to be able to handle
depsolve and manifest jobs on the worker as well as depsolve chains.
2022-05-19 20:03:24 +02:00
Achilleas Koutsou
b38e5f85c3 test/regression-old-worker-new-composer: clean whitespace
Clean trailing whitespace from test script.
2022-05-19 20:03:24 +02:00
Simon Steinbeiss
da453062e1 Post release version bump
[skip ci]
2022-05-19 11:48:42 +02:00
Christian Kellner
4e9e438b75 distro/rhel90: add support for azure marketplace
Add support for building images for the Azure marketplace: add a
new image type "azure-rhui" that can be used to build images
tailored to the Azure marketplace.
This code is based on the corresponding image type in 8.6.

NB: does not have systemd-resovled (following RHEL 9 defaults)
2022-05-19 11:22:47 +02:00
Christian Kellner
8ee19af1d0 test-case-generators/repos: remove duplicated rhel-91 block
We really only can have one. The one that was used for the generation
of the manifests is kept and the other one removed (although it has
newer repositories).
2022-05-19 11:22:47 +02:00
Christian Kellner
6e2cb208bf test/data/manifests: regenerate rhel-91
It seems rhel-91 qcow2 customize images are out of sync because commit
2beb707 removed the core group from the `format-request-map.json` and
some these said manifests were generated between that commit and the
one that added it back 1ff36bce9.
2022-05-19 11:22:47 +02:00
Sanne Raymaekers
5658cadcae shutzbot: Add sanne@redhat.com ssh key to CI's authorized_keys
[skip ci]
2022-05-18 13:28:11 +02:00
Sanne Raymaekers
edcc0866b3 templates/dashboards: Bump dashboard versions
[skip ci]
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
01e2caf95e templates/dashboards: Set default timerange to 28 days
All our SLOs apply to a 28d period. The default state of the board
should reflect that.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
be6f6f04b8 templates/dashboards: Rename composer latency titles
These measure latency across all requests, not just compose requests.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
c4d529be5c templates/dashboards: Add thresholds to duration/latency graphs
Show the threshold where we have an SLO target.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
2da910d3e4 templates/dashboards: Bump duration/latency gauges to 95p
This reflects the SLO target of 95%.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
4eb4894c3a templates/dashboards: Reverse order in duration/latency graphs
In these graphs p99 isn't very important. If 1% of jobs are slow that's
fine. The p50 and p95 slices are the important ones, so reorder and
recolor the duration graphs to reflect this.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
060d3ae85d templates/dashboards: Bump worker latency slo variable to 0.95
This reflects the actual SLO target of 95%.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
16491149fc templates/dashboards: Reduce the interval
The interval dictates the granularity of the graphs. As the interval
decreases, spikes and dips become more pronounced. 28 days as an
interval doesn't actually show much, reduce this to 6h by default which
is a happy medium.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
8a51b5db39 templates/dashboards: Remove max from compose req success budget
Values over 100% are useful as those actually impact the error budget.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
eded793788 templates/dashboards: Remove max from build error rate budget
Values over 100% are useful as those actually impact the error budget.
2022-05-17 19:06:25 +02:00
Sanne Raymaekers
c1a44b6813 templates/dashboards: Bump grafana schema version
This makes the following diffs smaller.
2022-05-17 19:06:25 +02:00
Juan Abia
031b67566b scheduled-cloud-cleaner: remove storage account skip
scheduled cloud cleaner is skipping the default storage account for a
resource group, as this images should get removed. There can be a
situation where this images are not removed and forgotten here. Remove
this skip condition so scc checks also in this storage account.
2022-05-17 16:37:18 +02:00
Xiaofeng Wang
a6e2755fad test: Add running podman with non-root test
Bug BZ#2078937 has been fixed by osbuild PR#1013. Test should be
updated to test the fix and avoid regression
2022-05-17 21:25:49 +08:00
Tomas Hozza
1017aee438 cloud-cleaner: clean up GCE instances in all regions and zones
Since the `api.sh` test case is using random GCE zone from a random GCE
region which name starts with the `GCP_REGION` CI environment variable.
Since the used region name is not known to the `cloud-cleaner`, it has
to iterate over all potential GCE regions and their zones. We can not
simply filter the VM instance name a list of instances, because any
`instances` API call requires a zone name to be provided.

Add a new internal `cloud/gcp` package method to list existing GCE
regions based on a provided filter.
2022-05-17 12:18:12 +02:00
Tomas Hozza
18dfa9d9c9 Improve GCP test cases to pick regions with available quota
We currently use a single GCP Compute region when spinning up VMs using
the imported GCE image. As a result, we are often hitting the
'IN_USE_ADDRESSES' quota limit when there are multiple CI jobs running.
Google does not allow us to increase the quota limit any more.

Change the GCP test cases to use the CI `GCP_REGION` variable to list
all GCE regions with available quota and pick a random one from the
list. The `GCP_REGION` value is used as the region name prefix when
filtering available regions. This means that if you specify an exact GCE
region, such as `us-west1`, you'll always get the same region, but if a
GCP multi-region is used, such as `us`, then a random region prefixed
with 'us' will be used.
2022-05-17 12:18:12 +02:00
Jakub Rusz
f0f0873d6e ci: run all scripts in after_script regarless of failure
We want to run all of the scripts in after_script even if some of them
fail. In aws we have rhui repos in the images and we don't use them on
GA RHEL so ci_details.sh fails there and cloud_cleaner does not run.
2022-05-17 11:20:57 +02:00
Christian Kellner
5983c295b3 distro/rhel86: ignore SRIOV interface via new udev rule on azure-rhui
Add a new udev rule that ignores the SRIOV network interface. See the
supplied comment for details why.
2022-05-16 15:46:46 +02:00
Christian Kellner
9d5787a475 distro: add support udev rules to image config
Add support for defining udev rules via the recently added udev.rules
stage to the image configs and all pipelines support it.
2022-05-16 15:46:46 +02:00
Christian Kellner
e08fd989ed osbuild2: add udev.rules stage
The `org.osbuild.udev.rules` stage creates custom udev rules files.
This is a full implementation of the stage and includes information
about valid operators and keys.
A small test suit to test the basic functionality and validation is
included.
2022-05-16 15:46:46 +02:00
Chloe Kaubisch
13c79294b6 cloudapi: validate input
Validate incoming requests with openapi3. Remove unsupported
uuid format from the openapi spec. Similarly, change url to uri as
uri is a supported format and url is not.

Co-authored-by: Ondřej Budai <obudai@redhat.com>
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-16 13:20:46 +02:00
Ondřej Budai
f616becf39 cloudapi/test: add task_id to the compose request
It's actually required by the schema.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-16 13:20:46 +02:00
Ondřej Budai
00d602efc3 cloudapi: make UploadOptions anyOf
oneOf means that the body is valid against exactly ONE schema. There's an
issue with AWS EC2 upload options though: It requires region and
share_with_accounts fields. Such a request is also valid AWS S3 upload though
(this one only require region). This means that AWS EC2 upload options will be
always valid against two schemas which violates the oneOf rule.

Let's switch to anyOf and explain this in the openAPI spec.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-16 13:20:46 +02:00