Commit graph

10 commits

Tomas Hozza
5e591ccc3d GCP: Fix panic while parsing a specific build job log
The `cloudbuildResourcesFromBuildLog()` function from the internal GCP
package could cause a panic while parsing the log of a Build job that
failed early and did not create any Compute Engine resources. The
function relied on the `Regexp.FindStringSubmatch()` method always
returning a match when used on the build log. Indexing the resulting
nil slice would then panic in `osbuild-worker`, for example:

Stack trace of thread 185316:
 #0  0x0000564e5393b5e1 runtime.raise (osbuild-worker)
 #1  0x0000564e5391fa1e runtime.sigfwdgo (osbuild-worker)
 #2  0x0000564e5391e354 runtime.sigtrampgo (osbuild-worker)
 #3  0x0000564e5393b953 runtime.sigtramp (osbuild-worker)
 #4  0x00007f37e98e3b20 __restore_rt (libpthread.so.0)
 #5  0x0000564e5393b5e1 runtime.raise (osbuild-worker)
 #6  0x0000564e5391f5ea runtime.crash (osbuild-worker)
 #7  0x0000564e53909306 runtime.fatalpanic (osbuild-worker)
 #8  0x0000564e53908ca1 runtime.gopanic (osbuild-worker)
 #9  0x0000564e53906b65 runtime.goPanicIndex (osbuild-worker)
 #10 0x0000564e5420b36e github.com/osbuild/osbuild-composer/internal/cloud/gcp.cloudbuildResourcesFromBuildLog (osbuild-worker)
 #11 0x0000564e54209ebb github.com/osbuild/osbuild-composer/internal/cloud/gcp.(*GCP).CloudbuildBuildCleanup (osbuild-worker)
 #12 0x0000564e54b05a9b main.(*OSBuildJobImpl).Run (osbuild-worker)
 #13 0x0000564e54b08854 main.main (osbuild-worker)
 #14 0x0000564e5390b722 runtime.main (osbuild-worker)
 #15 0x0000564e53939a11 runtime.goexit (osbuild-worker)
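
The failure mode boils down to the following pattern (a minimal sketch
with a hypothetical regular expression, not the actual parsing code):

    re := regexp.MustCompile(`Created instance: (\S+)`) // hypothetical pattern
    for _, line := range strings.Split(buildLog, "\n") {
        match := re.FindStringSubmatch(line)
        // FindStringSubmatch() returns nil when the line does not match;
        // indexing match[1] unconditionally is what triggered the
        // runtime.goPanicIndex frame in the trace above.
        if match == nil {
            continue
        }
        instances = append(instances, match[1])
    }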

Add a unit test covering this scenario.

Make the `cloudbuildResourcesFromBuildLog()` function more robust, so
that it does not blindly expect to find matches in the build log. As a
result, the `cloudbuildBuildResources` struct instance returned from
the function may be empty. Subsequently make sure that the
`CloudbuildBuildCleanup()` method handles an empty
`cloudbuildBuildResources` instance correctly. Specifically,
`storageCacheDir.bucket` may be an empty string, in which case the
bucket won't exist. Ensure that this does not result in an infinite
loop by checking for `storage.ErrBucketNotExist` while iterating over
the bucket objects.
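
A minimal sketch of the guarded iteration, assuming the
cloud.google.com/go/storage client and the google.golang.org/api/iterator
package (object deletion and error handling simplified):

    it := client.Bucket(resources.storageCacheDir.bucket).Objects(ctx, nil)
    for {
        attrs, err := it.Next()
        if err == iterator.Done {
            break
        }
        // An empty or nonexistent bucket name makes Next() keep
        // returning storage.ErrBucketNotExist; bail out instead of
        // looping forever.
        if err == storage.ErrBucketNotExist {
            break
        }
        if err != nil {
            return err
        }
        // delete the object attrs.Name from the bucket here
    }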

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-04-29 14:48:50 +02:00
Tomas Hozza
27c5aafeca GCP: Specify and randomize GCE region used for image import
The GCP image import method currently uses the Cloud Build API with
Google's Daisy workflow. This workflow creates multiple GCE resources
during its execution. Although the desired Region for the imported image
is specified as a workflow argument, this has no effect on the GCE Zone
used by the workflow for created resources, which seems to default to
the "us-central1-a" Zone. As a result, resources are commonly exhausted
in the default Zone.

Add a method that translates the provided Google Storage Region to a
GCE Region; this is needed mainly for multi and dual Storage Regions.

Add a method that returns a list of available GCE Zones for a given
GCE Region.

Modify the ComputeImageImport() method to translate the provided Google
Storage Region to a list of corresponding GCE Regions. If the provided
Storage Region is not a multi or dual Region, the list contains only a
single item: the provided Region. Then pick a random Region from the
list, get the available GCE Zones within it, and pick a random one for
use by the workflow. Specify the chosen GCE Zone as a build step
argument.
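
Sketched out, the selection could look roughly like this (the helper
names are illustrative, not the real implementation):

    // Translate the Storage Region to candidate GCE Regions; a regular
    // Region maps to itself, multi and dual Storage Regions to several.
    regions, err := g.computeRegionsForStorageRegion(ctx, storageRegion) // assumed helper
    if err != nil {
        return err
    }
    region := regions[rand.Intn(len(regions))]

    // Pick a random available Zone within the chosen Region and pass it
    // to the Daisy workflow as a build step argument.
    zones, err := g.computeZonesInRegion(ctx, region) // assumed helper
    if err != nil {
        return err
    }
    zone := zones[rand.Intn(len(zones))]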

This change should be completely transparent to the API user.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-04-29 09:53:29 +02:00
Tomas Hozza
c91f3b11f6 Rename all occurrences of "Compute Node" to "Compute Engine"
This was an error: there is no such thing as a "Compute Node" in GCP.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-04-01 20:12:39 +02:00
Tomas Hozza
e799f752be GCP: clean up resources after canceled image import
Add a method to fetch the Cloudbuild job log.

Add a method to parse the Cloudbuild job log for created resources.
Parsing is specific to the image import Cloudbuild job and its log
format. Add unit tests for the parsing function.

Add a method to clean up all resources (instances, disks, storage
objects) after a Cloudbuild job.

Modify the worker osbuild job implementation and the GCP upload CLI
tool to use the new cleanup method CloudbuildBuildCleanup().

Keep the StorageImageImportCleanup() method, because it is still used
by the cloud-cleaner tool. There is no way for cloud-cleaner to
determine the Cloudbuild job ID needed to call CloudbuildBuildCleanup()
instead.

Add methods to delete a Compute instance and disk.

Add a method to get Compute instance information. This is useful for
checking whether the instance has already been deleted or whether it
still exists.
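
For the existence check, a 404 from the Compute API can be treated as
"already deleted", along these lines (a sketch using the
google.golang.org/api/compute/v1, googleapi and net/http packages):

    instance, err := computeService.Instances.Get(project, zone, name).Context(ctx).Do()
    var apiErr *googleapi.Error
    if errors.As(err, &apiErr) && apiErr.Code == http.StatusNotFound {
        // The instance no longer exists; there is nothing to clean up.
        return nil, nil
    }
    if err != nil {
        return nil, err
    }
    return instance, nil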

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-04-01 20:12:39 +02:00
Tomas Hozza
6d51d285cf GCP: accept context from the caller in all methods
Modify all relevant methods in the internal GCP library to accept
context from the caller.

Modify all places that call the internal GCP library methods to pass
the context.
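
In practice this means threading a context.Context through every
signature, for example (parameters illustrative):

    // before
    func (g *GCP) StorageObjectUpload(bucket, object string) error
    // after
    func (g *GCP) StorageObjectUpload(ctx context.Context, bucket, object string) error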

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-04-01 20:12:39 +02:00
Tomas Hozza
e04b75f3df cloud-cleaner: clean up GCP Storage objects based on metadata
Add StorageListObjectsByMetadata() to the internal GCP library. The
method allows one to search a specific Storage bucket for objects based
on the provided metadata.

Extend cloud-cleaner to search for any Storage objects related to the
image import, using custom metadata set on the object. Delete all found
objects.
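
The matching could look roughly like this while iterating the bucket
(a sketch; wantMetadata stands for whatever key/value pairs the caller
provides):

    it := client.Bucket(bucket).Objects(ctx, nil)
    for {
        attrs, err := it.Next()
        if err == iterator.Done {
            break
        }
        if err != nil {
            return nil, err
        }
        // Keep only objects whose custom metadata contains all of the
        // requested key/value pairs.
        matched := true
        for k, v := range wantMetadata {
            if attrs.Metadata[k] != v {
                matched = false
                break
            }
        }
        if matched {
            found = append(found, attrs)
        }
    }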

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-03-15 16:48:40 +00:00
Tomas Hozza
e698080bc7 GCP: Set image name as custom metadata on uploaded image object
Extend StorageObjectUpload() to allow setting custom metadata on the
uploaded object.

Modify the worker's osbuild job implementation and the GCP CLI upload
tool to set the chosen image name as custom metadata on the uploaded
object. This makes it possible to connect Storage objects to specific
images.
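
With the cloud.google.com/go/storage client, custom metadata can be
attached via the object writer, roughly as follows (the metadata key
name is hypothetical):

    w := client.Bucket(bucket).Object(object).NewWriter(ctx)
    w.Metadata = map[string]string{"image-name": imageName} // hypothetical key
    if _, err := io.Copy(w, imageFile); err != nil {
        w.Close()
        return err
    }
    // Close() flushes the upload and commits the object attributes.
    return w.Close()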

Add a News entry about the image name being added as metadata to the
uploaded GCP Storage object as part of the worker job.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-03-15 16:48:40 +00:00
Tomas Hozza
aa1d038b59 cloud-cleaner: clean up image and vm after GCP integration test
Extend the internal GCP library to allow deleting a Compute Node image
and instance. In addition, provide a function to load the service
account credentials file content from the environment.

Change the names used for the GCP image and instance in the `api.sh`
integration test to make them predictable. This is important so that
cloud-cleaner can identify potentially left-over resources and clean
them up. Use the same approach for generating a predictable, but
run-specific, test ID as in GenerateCIArtifactName() from
internal/test/helpers.go. Use SHA-224 to generate a hash from the
string, because the string can contain characters not allowed by GCP in
resource names (specifically "_", e.g. in "x86_64"). SHA-224 was picked
because it generates short enough output and is future-proof for use in
RHEL (unlike MD5 or SHA-1).
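
For illustration, deriving such a name could look like this (a sketch
using crypto/sha256, which also provides SHA-224; the name prefix is
made up):

    sum := sha256.Sum224([]byte(testID)) // run-specific test ID
    // The hex-encoded digest contains only characters GCP accepts in
    // resource names, unlike e.g. the "_" in "x86_64".
    imageName := fmt.Sprintf("image-%x", sum)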

Refactor cloud-cleaner to clean up GCP resources and to run the cleanup
for each cloud in a separate goroutine.

Modify run_cloud_cleaner.sh to be able to run in an environment in
which AZURE_CREDS is not defined.

Always run cloud-cleaner after integration tests for rhel8, rhel84 and
cs8, which test GCP.

Define DISTRO_CODE for each integration testing stage in the
Jenkinsfile.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-03-15 16:48:40 +00:00
Tomas Hozza
f9fe699564 GCP: split internal library based on functionality
Split the GCP library into multiple files:
- compute.go - code interacting mainly with the Compute Node resources
- storage.go - code interacting mainly with the Cloud Storage resources
- gcp.go - common code (e.g. authentication with GCP)

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-03-15 16:48:40 +00:00
Tomas Hozza
075373a51e internal: Move GCP library to internal/cloud
The internal GCP library was originally placed in the `internal/upload`
directory, since its purpose was mainly to upload and import built
images to GCP.

The functionality of other cloud-provider-specific libraries is
broader, but scattered around the `internal/` directory based on
purpose (e.g. in `internal/boot` and `internal/upload`). Since all
parts of a provider-specific library usually share some common pieces
(e.g. authentication), it makes sense to consolidate them into a single
package (e.g. `internal/cloud/<provider>`).

Create `internal/cloud` directory, where all cloud-provider-specific
internal libraries should be consolidated. Start with GCP.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-03-15 16:48:40 +00:00