Commit graph

57 commits

Author SHA1 Message Date
Sanne Raymaekers
2a621521a8 osbuildexecutor/aws.ec2: set hostname of executor via cloud-init
This way much more of the journal will be captured under the new
hostname.
2024-06-25 10:58:10 +02:00
Sanne Raymaekers
ae4467ab0d internal/awscloud: retry CreateFleet
When receiving the "UnfillableCapacity" error from CreateFleet, retry
the request with an OnDemand instance.
2024-06-24 12:50:37 +02:00
Sanne Raymaekers
2e31ea50aa cloud/awscloud: use instance requirements when creating secure instance 2024-06-14 10:59:58 +02:00
Sanne Raymaekers
314ed4b527 cloud/awscloud: allow internet access on secure instance again
The executor is timing out and there are no logs. This will require some
further work. Remove the restriction for now.
2024-03-20 14:58:25 +01:00
Sanne Raymaekers
79b5b736e9 cloud/awscloud: restrict network egress for secure instance
The security instance should no longer have any internet access.
2024-03-19 17:07:30 +01:00
Tomáš Hozza
e7743f17ec Worker: allow configuring executor CloudWatch group
We need the ability to use different CloudWatch group for the
osbuild-executor on Fedora workers in staging and production
environment.

Extend the worker confguration to allow configuring the CloudWatch group
name used by the osbuild-executor. Extend the secure instance code to
instruct cloud-init via user data to create /tmp/cloud_init_vars file
with the CloudWatch group name in the osbuild-executor instance, to make
it possible for the executor to configure its logging differently based
on the value.

Cover new changes by unit tests.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-03-08 13:13:44 +01:00
Sanne Raymaekers
040eec4089 osbuild-worker: allow adding key to aws.ec2 executor
This is useful during testing to set up the executor machine.
2024-03-01 19:20:51 +01:00
Amelia Crate
b3bb851863 Tag rhel 9.2+ with SEV_LIVE_MIGRATABLE_V2
SEV-SNP capable kernels containing commit ac3f9c9f are compatible.
SEV_LIVE_MIGRATABLE indicated compatibility with an older version of SEV live migration, without ac3f9c9f.
See: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=ac3f9c9f1b37edaa7d1a9b908bc79d843955a1a2
2024-02-22 15:45:39 +01:00
Sanne Raymaekers
5025ec31d3 cloud/awscloud: describe security groups using filters
Using the group names option only works for the default VPC, the workers
are not running in the default VPC. For non-default VPCs filters should
be used.
2024-02-20 15:23:52 +01:00
Sanne Raymaekers
7fce482baa cloud/awscloud: create secure instance in the same subnet
This reduces network costs as transferring data between AZs is not free.
2024-02-16 15:21:20 +01:00
Sanne Raymaekers
ee6b198b0a cloud/awscloud: remove restricting egress rule from SG
The machine still needs to be able to fetch sources, so just keep the
default 0.0.0.0/0 rule.
2024-02-15 14:23:18 +01:00
Sanne Raymaekers
8e6717fa1b cloud/awscloud: take instance type from host
InstanceRequirements is very flakey, the create fleet request fails
almost consistently with the same error.

To continue with testing use a fixed instance type for now. As a
followup we can expand the instance type selection logic or figure out
what was wrong with the InstanceRequirements.
2024-02-14 18:15:25 +01:00
Sanne Raymaekers
8a1d66a0bd cloud/awscloud: max 4 overrides are allowed when creating a fleet
```
InvalidParameterValue: Your request contains more than the maximum allowed number of InstanceRequirements (4)
```
2024-02-14 15:24:42 +01:00
Sanne Raymaekers
7fd150b938 cloud/awscloud: specify subnets when creating secure instance
For non-default VPCs, AWS needs the subnets it can launch the instance
in, otherwise it will try to launch the instance in the default VPC,
even if the supplied security groups are attached to a non-default VPC.

Furthermore there can only be 1 subnet specified per availability zone,
so query the subnets in the VPC of the host (as the instance needs to be
launched in the same network), and pick 1 of the VPC's subnets per AZ.
2024-02-14 13:45:52 +01:00
Sanne Raymaekers
a2fb1bfc61 cloud/awscloud: add userdata to secure instance
This way the `worker-initialization.service` knows to spin up the
builder instead of the worker.
2024-02-14 09:54:11 +01:00
Sanne Raymaekers
3db88960c2 cloud/awscloud: add ability to run a secure instance to awscloud
This instance can only contact the host, and requires this host to be
running on AWS itself with the appropriate IAM role.
2024-02-14 09:54:11 +01:00
Sanne Raymaekers
05a45ed233 cloud/awscloud: add ec2metadata client 2024-02-14 09:54:11 +01:00
Tomáš Hozza
8ba1976b02 internal/cloud/gcp/compute: keep legacy Guest OS Features for el9.0
The SEV-SNP support was added since RHEL-9.1, so we need to keep the
original Guest OS Feature set when importing RHEL-9.0 images to GCP.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2023-08-21 16:57:33 +02:00
Timothée Ravier
4173e5d768 internal/cloud/gcp/compute: Add SEV_SNP_CAPABLE Guest OS Feature
See: https://github.com/coreos/coreos-assembler/pull/3547
See: https://cloud.google.com/blog/products/identity-security/rsa-snp-vm-more-confidential
See: https://issues.redhat.com/browse/COS-2343
2023-08-21 16:57:33 +02:00
Ondřej Budai
cac9327b44 update to go 1.19
UBI and the oldest support Fedora (37) now all have go 1.19, so we are
cleared to switch.

gofmt now reformats comments in certain cases, so that explains the formatting
changes in this commit.
See https://go.dev/doc/go1.19#go-doc

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2023-07-21 19:18:00 +02:00
Tomáš Hozza
0292725ce4 internal/GCP: remove all remaining uses of cloudbuild
Some uses of `cloudbuild` GCP API have been left in our internal cloud
API implementation for GCP. We do not use `cloudbuild` to import GCE
images into GCP any more.

Do not request the `cloudbuild` authentication scope when getting new
GCP client.

Update vendored packages accordingly.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2023-05-24 19:28:06 +02:00
dependabot[bot]
60e55b5ed3 build(deps): bump cloud.google.com/go/compute from 1.10.0 to 1.19.3
Bumps [cloud.google.com/go/compute](https://github.com/googleapis/google-cloud-go) from 1.10.0 to 1.19.3.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/documentai/CHANGES.md)
- [Commits](https://github.com/googleapis/google-cloud-go/compare/kms/v1.10.0...compute/v1.19.3)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/compute
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Migrated to the new version by following
https://github.com/googleapis/google-cloud-go/blob/main/migration.md

Co-authored-by: Tomáš Hozza <thozza@redhat.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2023-05-22 11:51:42 +02:00
Tomáš Hozza
e13f0a1ae2 AWS: allow specifying the AMI boot mode when registering the image
When the AMI is being registered from a snapshot, the caller can
optionally specify the boot mode of the AMI. If no boot mode is
specified, then the default behavior is to use the boot type of the
instance that is launched from the AMI.

The default behavior (no boot type specified) is preserved after this
change.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2023-05-19 13:24:39 +02:00
Brian C. Lane
7a4bb863dd Update deprecated io/ioutil functions
ioutil has been deprecated since go 1.16, this fixes all of the
deprecated functions we are using:

ioutil.ReadFile -> os.ReadFile
ioutil.ReadAll -> io.ReadAll
ioutil.WriteFile -> os.WriteFile
ioutil.TempFile -> os.CreateTemp
ioutil.TempDir -> os.MkdirTemp

All of the above are a simple name change, the function arguments and
results are exactly the same as before.

ioutil.ReadDir -> os.ReadDir

now returns a os.DirEntry but the IsDir and Name functions work the
same. The difference is that the FileInfo must be retrieved with the
Info() function which can also return an error.

These were identified by running:
golangci-lint run --build-tags=integration ./...
2023-03-07 09:22:23 -08:00
Sanne Raymaekers
2e3dd16220 osbuild-service-maintenance: clean up all regions
Since we started cloning images to different regions, the maintenance
script should clean up all of these regions.
2023-01-25 14:20:51 +01:00
Ondřej Budai
b997142db0 common: merge all *ToPtr methods to one generic ToPtr
After introducing Go 1.18 to a project, it's required by law to convert at
least one method to a generic one.

Everyone hates IntToPtr, StringToPtr, BoolToPtr and Uint64ToPtr, so let's
convert them to the ultimate generic ToPtr one.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2023-01-09 14:03:18 +01:00
Colin Walters
a3a733a638 gcp: Cross-reference to coreos-assembler code
At the moment we have duplicate logic here; ideally of course
we consolidate (since both codebases are Go, perhaps we could
create a tiny little Go library for "RHEL GCP stuff"?) but
for now let's just cross-link for awareness.
2022-11-30 11:13:31 +01:00
Ondřej Budai
0e6c132ee6 awscloud: add option to mark S3 object as public
By setting the object's ACL to "public-read", anyone can download the object
even without authenticating with AWS.

The osbuild-upload-generic-s3 command got a new -public argument that
uses this new feature.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-09-19 22:56:36 +02:00
Ondřej Budai
381bce9ac0 awscloud: close the file after it's uploaded to S3
Oops, this was forgotten.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-09-19 22:56:36 +02:00
Brian C. Lane
7bae91eb9a gcp: Placeholder credential string is not a hardcoded credential 2022-09-15 03:57:40 -07:00
Sanne Raymaekers
099b34b301 worker: Define new jobs to handle copying and resharing of images
The copy job copies from one region to another. It does not preserve the
sharing on the ami and it's snapshot, that needs to be queued
separately.
2022-08-30 16:14:52 +02:00
Juan Abia
f6fa5ccca1 remove cloud cleaner
scheduled cloud cleaner now uses a new method to remove azure resources.
So cloud cleaner code is not used
2022-07-01 17:47:44 +02:00
Ygal Blum
8407c97d96 Upload to HTTPS S3 - Support self signed certificate
API
---
Allow the user to pass the CA public certification or skip the verification

AWSCloud
--------
Restore the old version of newAwsFromCreds for access to AWS
Create a new method newAwsFromCredsWithEndpoint for Generic S3 which sets the endpoint and optionally overrides the CA Bundle or skips the SSL certificate verification

jobimpl-osbuild
---------------
Update with the new parameters

osbuild-upload-generic-s3
-------------------------
Add ca-bunlde and skip-ssl-verification flags

tests
-----
Split the tests into http, https with certificate and https skip certificate check
Create a new base test for S3 over HTTPS for secure and insecure
Move the generic S3 test to tools to reuse for secure and insecure connections
All S3 tests now use the aws cli tool
Update the libvirt test to be able to download over HTTPS
Update the RPM spec

Kill container with sudo
2022-05-26 13:46:00 +03:00
Tomas Hozza
1017aee438 cloud-cleaner: clean up GCE instances in all regions and zones
Since the `api.sh` test case is using random GCE zone from a random GCE
region which name starts with the `GCP_REGION` CI environment variable.
Since the used region name is not known to the `cloud-cleaner`, it has
to iterate over all potential GCE regions and their zones. We can not
simply filter the VM instance name a list of instances, because any
`instances` API call requires a zone name to be provided.

Add a new internal `cloud/gcp` package method to list existing GCE
regions based on a provided filter.
2022-05-17 12:18:12 +02:00
Tomas Hozza
249661a948 worker: rework GCP credentials handling
Refactor the handling of GCP credentials in the worker to be equivalent
to what is done for AWS. The main idea is that the code decides which
credentials to use when processing each job. This change will allow
preferring credentials passed via upload `TargetOptions` with the job,
over the credentials configured in worker's configuration or the default
way of authenticating implemented by the Google library.

Move loading of GCP credentials to the internal `gcp` library into
`NewFromFile()` function accepting path to the file with credentials.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2022-04-14 19:07:31 +01:00
Tomas Hozza
82a0bfc46d internal/cloud/gcp: delete unused internal API
Delete all internal `cloud/gcp` API related to importing virtual images
to GCP using Cloud Build API. This API is no longer needed.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2022-04-14 19:07:31 +01:00
Tomas Hozza
264e554971 cloud/gcp: introduce ComputeImageInsert() method
Introduce a new `ComputeImageInsert()` method for importing images into
GCP. It uses the `compute.Images.Insert()` API [1], which has many
advantages over the currently used way of importing images using the
CloudBuild API. The advantages are mainly that the image is imported as
is and no additional cache files or VMs are created as part of the
import process. Therefore there is no need to do additional cleanup of
cache files after importing the image.

In addition, the import itself is approximately 30% faster for RHEL
images when using the `Insert()` call.

Nevertheless the `Insert()` call accepts only gzip-ed tarball with a RAW
disk, unlike the `Import()` call, which accepts basically any virtual
disk format.

[1] https://cloud.google.com/compute/docs/reference/rest/v1/images/insert

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2022-04-14 19:07:31 +01:00
Ygal Blum
bee14bf392 OSBuild - add support for generic S3 services
jobimpl-osbuild
---------------
Add GenericS3Creds to struct
Add method to create AWS with Endpoint for Generic S3 (with its own credentials file)
Move uploading to S3 and result handling to a separate method (along with the special VMDK handling)
adjust the AWS S3 case to the new method
Implement a new case for uploading to a generic S3 service

awscloud
--------
Add wrapper methods for endpoint support
Set the endpoint to the AWS session
Set s3ForcePathStyle to true if endpoint was set

Target
------
Define a new target type for the GenericS3Target and Options
Handle unmarshaling of the target options and result for the Generic S3

Weldr
-----
Add support for only uploading to AWS S3
Define new structures for AWS S3 and Generic S3 (based on AWS S3)
Handle unmarshaling of the providers settings' upload settings

main
----
Add a section in the main config for the Generic S3 service for credentials
If provided pass the credentials file name to the osbuild job implementation

Upload Utility
--------------
Add upload-generic-s3 utility

Makefile
------
Do not fail if the bin directory already exists

Tests
-----
Add test cases for both AWS and a generic S3 server
Add a generic s3_test.sh file for both test cases and add it to the tests RPM spec
Adjust the libvirt test case script to support already created images
GitLabCI - Extend the libvirt test case to include the two new tests
2022-04-07 15:01:01 +02:00
Sanne Raymaekers
c4ecbea510 internal/cloud: Allow aws creds from defaults
Defaults according to https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config:
Defaults to a chain of credential providers to search for credentials in
environment variables, shared credential file, and EC2 Instance Roles.

If nothing is specified fall back to whatever instance role.
2022-02-21 15:43:53 +01:00
Sanne Raymaekers
d492e8f702 cmd/osbuild-service-maintenance: GCP deletes by image name 2022-02-15 18:22:39 +01:00
Tomas Hozza
07a5745875 internal/cloud/gcp: use pkg.go.dev/cloud.google.com/go for Compute Engine
The internal GCP package used `pkg.go.dev/google.golang.org/api` [1] to
interact with Compute Engine API. Modify the package to use the new and
idiomatic `pkg.go.dev/cloud.google.com/go` [2] library for interacting
with the Compute Engine API. The new library have been already used to
interact with the Cloudbuild and Storage APIs. The new library was not
used for Compute Engine since the beginning, because at that time, it
didn't support Compute Engine.

Update go.mod and vendored packages.

[1] https://github.com/googleapis/google-api-go-client
[2] https://github.com/googleapis/google-cloud-go

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2022-02-03 15:35:28 +01:00
Diaa Sami
487e2d0669 internal/cloud: use logrus for logging
and log upload & sharing failures
2021-12-16 11:58:41 +00:00
Juan Abia
610db6563a gosec: G601 - Implicit memory aliasing in for loop
G601 warning doen't mean there's a vulnerabilty. But this code could
have unintended bugs. Disabling warnings locally.
2021-12-13 12:17:30 +02:00
Juan Abia
c8cf835db3 gosec: G401, G501 - Weak cryptographic primitive
azure, koji and gcp use md5 hashes. Gosec is not happy with it, so we
create exceptions for them (G401, G501).
2021-12-13 12:17:30 +02:00
sanne
c43ad2b22a osbuild-service-maintenance: Clean up expired images 2021-12-03 00:14:09 +00:00
dependabot[bot]
3ccdf85295 build(deps): bump github.com/golang/protobuf from 1.4.3 to 1.5.2
Bumps [github.com/golang/protobuf](https://github.com/golang/protobuf) from 1.4.3 to 1.5.2.
- [Release notes](https://github.com/golang/protobuf/releases)
- [Commits](https://github.com/golang/protobuf/compare/v1.4.3...v1.5.2)

---
updated-dependencies:
- dependency-name: github.com/golang/protobuf
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Ondřej: I also fixed a deprecated call.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2021-09-03 18:23:54 +02:00
Tomas Hozza
3a0540dff0 test/api.sh: randomize used GCP zone from the region
The `api.sh` test currently always defaults to "<REGION>-a" zone when
creating instance using the built image. The resources in a zone may get
exhausted and the solution is to use a different zone. Currently even a
CI job retry won't help with mitigation of such error during a CI run.

Modify `api.sh` to pick random GCP zone for a given region when creating
a compute instance. Use only GCP zones which are "UP".

The `cloud-cleaner` relied on the behavior of `api.sh` to always choose
the "<REGION>-a" zone. Guessing the chosen zone in `cloud-cleaner` is
not viable, but thankfully the instance name is by default unique for
the whole GCP project. Modify `cloud-cleaner` to iterate over all
available zones in the used region and try to delete the specific
instance in each of them.

Make `ComputeZonesInRegion` method from the `internal/cloud/gcp` package
exported and use it in `cloud-cleaner` for getting the list of available
zones in a region.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-07-16 10:14:30 +02:00
Tomas Hozza
5e591ccc3d GCP: Fix panic while parsing a specific build job log
The `cloudbuildResourcesFromBuildLog()` function from the internal GCP
package could cause panic while parsing Build job log which failed early
and didn't create any Compute Engine resources. The function relied on
the `Regexp.FindStringSubmatch()` method to always return a match
while being used on the build log. Accessing a member of a nil slice
would cause a panic in `osbuild-worker`, such as:

Stack trace of thread 185316:
 #0  0x0000564e5393b5e1 runtime.raise (osbuild-worker)
 #1  0x0000564e5391fa1e runtime.sigfwdgo (osbuild-worker)
 #2  0x0000564e5391e354 runtime.sigtrampgo (osbuild-worker)
 #3  0x0000564e5393b953 runtime.sigtramp (osbuild-worker)
 #4  0x00007f37e98e3b20 __restore_rt (libpthread.so.0)
 #5  0x0000564e5393b5e1 runtime.raise (osbuild-worker)
 #6  0x0000564e5391f5ea runtime.crash (osbuild-worker)
 #7  0x0000564e53909306 runtime.fatalpanic (osbuild-worker)
 #8  0x0000564e53908ca1 runtime.gopanic (osbuild-worker)
 #9  0x0000564e53906b65 runtime.goPanicIndex (osbuild-worker)
 #10 0x0000564e5420b36e github.com/osbuild/osbuild-composer/internal/cloud/gcp.cloudbuildResourcesFromBuildLog (osbuild-worker)
 #11 0x0000564e54209ebb github.com/osbuild/osbuild-composer/internal/cloud/gcp.(*GCP).CloudbuildBuildCleanup (osbuild-worker)
 #12 0x0000564e54b05a9b main.(*OSBuildJobImpl).Run (osbuild-worker)
 #13 0x0000564e54b08854 main.main (osbuild-worker)
 #14 0x0000564e5390b722 runtime.main (osbuild-worker)
 #15 0x0000564e53939a11 runtime.goexit (osbuild-worker)

Add a unit test testing this scenario.

Make the `cloudbuildResourcesFromBuildLog()` function more robust and
not blindly expect to find matches in the build log. As a result the
`cloudbuildBuildResources` struct instance returned from the function
may be empty. Subsequently make sure that the `CloudbuildBuildCleanup()`
method handles an empty `cloudbuildBuildResources` instance correctly.
Specifically the `storageCacheDir.bucket` may be an empty string and
thus won't exist. Ensure that this does not result in infinite loop by
checking for `storage.ErrBucketNotExist` while iterating the bucket
objects.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-04-29 14:48:50 +02:00
Tomas Hozza
27c5aafeca GCP: Specify and randomize GCE region used for image import
The GCP image import method currently use the Cloud Build API with
Google's Daisy workflow. This workflow creates multiple GCE resources
during its execution. Although the desired Region for the imported image
is specified as a workflow argument, this has no effect on the GCE
Zone used by the workflow for created resources. By default it seems
to default to "us-central1-a" Zone. As a result, there are common cases
of resources being exhausted in the default zone.

Add a method, which translates provided Google Storage Region to a GCE
Region, which is needed mainly for multi and dual Storage Regions.

Add a method, which returns a list of available GCE Zones for a given
GCE Region.

Modify the ComputeImageImport() method to translate the provided Google
Storage Region to list of corresponding GCE Regions. If the provided
Storage Region is not multi or dual Region, then the list contains only
a single item, the provided Region. Then pick a random Region from the
list. Subsequently get available GCE Zones within the Region and pick a
random one for use by the workflow. Specify the GCE Zone to use as a
build step argument.

This change should be completely transparent to the API user.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-04-29 09:53:29 +02:00
Tomas Hozza
c91f3b11f6 Rename all occurrences of "Compute Node" to "Compute Engine"
This is an error, there is no such thing as "Compute Node" in GCP.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-04-01 20:12:39 +02:00