#248 Need a way to clean up dangling test environments
Closed: Fixed 5 years ago Opened 5 years ago by rayson.

In our C3I pipelines, every time a pull request is created or merged, a temporary test environment is deployed with the PR change to verify whether it can be delivered. Although we have pipeline code to destroy the environment after a test run stops, sometimes network or other issues prevent the environment from being deleted successfully (e.g. https://jenkins-waiverdb-test.cloud.paas.upshift.redhat.com/job/waiverdb-test/job/waiverdb-test-waiverdb-dev-integration-test/17/console).

How do you deal with this issue when using rcm-tools Jenkins? Should we create a cron job or something similar to regularly clean up unused test environments?


How about retrying the cleanup after an interval?
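
A minimal Python sketch of that retry idea, for illustration only. The `retry_cleanup` helper, its parameters, and the `cleanup` callable are all hypothetical; in the real pipeline `cleanup` would be whatever step tears down the test environment.

```python
import time
from typing import Callable


def retry_cleanup(
    cleanup: Callable[[], None],
    attempts: int = 3,
    interval_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call cleanup() until it succeeds or the attempts run out.

    Returns True if one attempt succeeded, False otherwise.
    """
    for attempt in range(1, attempts + 1):
        try:
            cleanup()
            return True
        except Exception as error:  # assumption: any exception means "try again"
            print(f"cleanup attempt {attempt}/{attempts} failed: {error}")
            if attempt < attempts:
                # Wait before the next attempt; no wait after the last one.
                sleep(interval_seconds)
    return False
```

This only helps while the process running the cleanup stays alive, so it does not cover the case where the worker itself goes away.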

@lholecek The Jenkins slave runs as a pod on OpenShift. It seems that I can't catch the exception raised by that error. After the disconnection the pod is terminated, and I'm not sure whether I can launch a new pod.

I will investigate that further.

Another thought is to add some labels to the OpenShift resources (DeploymentConfigs, Routes, Services, etc.). Then, each time before creating a new test environment, the pipeline would clean up the resources left over from previous builds.
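
A rough Python sketch of how the label idea could look. The label keys `c3i-pr` and `c3i-build` and the `cleanup_command` helper are made up for illustration, and the sketch only builds the `oc delete` command line rather than running it. It assumes every resource of a test environment carries both labels.

```python
from typing import List

# Hypothetical label keys; the real pipeline may use different ones.
PR_LABEL = "c3i-pr"
BUILD_LABEL = "c3i-build"


def cleanup_command(pr_id: str, current_build: str) -> List[str]:
    """Build an `oc delete` command for leftovers of earlier builds of one PR.

    The selector matches resources labelled with this PR but with a
    build label different from the build that is about to start.
    """
    selector = f"{PR_LABEL}={pr_id},{BUILD_LABEL}!={current_build}"
    return [
        "oc",
        "delete",
        "deploymentconfig,route,service",
        "--selector",
        selector,
    ]


if __name__ == "__main__":
    # Example: PR 42 is starting build 17, so earlier builds of PR 42 go away.
    print(" ".join(cleanup_command("42", "17")))
```

Because the selector includes the PR label, environments that belong to other pull requests are not matched.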

Can there be more pipelines for PRs running concurrently? If so, it could be more difficult to find unused test pods.

@lholecek Yes, it is possible.
I am checking whether there is an API to tell if a build is still running. Alternatively, we can check the creation time of the environment: for example, if it was created more than 2 hours ago, we delete it due to 'timeout'.
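
A small Python sketch of the creation-time check, assuming timestamps in the `2019-01-01T10:00:00Z` form that OpenShift reports as `metadata.creationTimestamp`. The function names and the environment names in the example are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as '2019-01-01T10:00:00Z'."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def expired_environments(
    created_at: Dict[str, str],
    now: datetime,
    max_age: timedelta = timedelta(hours=2),
) -> List[str]:
    """Return the names of environments older than max_age."""
    return [
        name
        for name, created in created_at.items()
        if now - parse_timestamp(created) > max_age
    ]


if __name__ == "__main__":
    environments = {
        "test-env-a": "2019-01-01T09:00:00Z",  # 3 hours old at `now`
        "test-env-b": "2019-01-01T11:30:00Z",  # 30 minutes old at `now`
    }
    now = datetime(2019, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(expired_environments(environments, now))
```

The 2-hour limit would need to be longer than the slowest expected test run, otherwise an environment that is still in use could be deleted.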

Metadata
Related Pull Requests
  • #251 Merged 5 years ago