#1 Limit caching for repomd.xml file
Merged 3 years ago by dustymabe. Opened 3 years ago by dustymabe.
releng/dustymabe/archive-repo-manager dusty-caching into main

file modified
+1
@@ -8,6 +8,7 @@
                     findutils        \
                     fedora-messaging \
                     koji             \
+                    awscli           \
                     s3fs-fuse        \
                     rsync            \
                     createrepo_c     \

file modified
+9 -2
@@ -44,8 +44,12 @@
   export S3BUCKET=dustymabe-archive-repo-poc
   export AWSACCESSKEYID=
   export AWSSECRETACCESSKEY=
+  export AWS_ACCESS_KEY_ID=
+  export AWS_SECRET_ACCESS_KEY=
  podman build -t archive-repo-manager .
  podman run -it --rm             \
+     -e AWS_ACCESS_KEY_ID        \
+     -e AWS_SECRET_ACCESS_KEY    \
      -e AWSACCESSKEYID           \
      -e AWSSECRETACCESSKEY       \
      -e S3BUCKET                 \
@@ -54,14 +58,17 @@
      archive-repo-manager
  ```
  

+ The two sets of env vars for AWS credentials are needed because the
+ `aws` CLI uses one form and s3fs uses another.
+ 

  If you'd like you can add `--entrypoint=/bin/bash`. Then you can do
  the s3fs mount and run /usr/local/lib/archive_repo_manager.py directly.
  
  
  # Rough notes for creating a bucket in S3
  
- Set up credentials. One way is to use the AWS_ACCESS_KEY_ID and
- AWS_SECRET_ACCESS_KEY environment variables.
+ Set up credentials. One way is to use the `AWS_ACCESS_KEY_ID` and
+ `AWS_SECRET_ACCESS_KEY` environment variables.
  
  Then create the bucket:
  

file modified
+21 -2
@@ -135,8 +135,27 @@
                  ./
          echo "createrepo read $(wc -l < $readpkglist) rpms from disk (s3)"
          rm $readpkglist
-         # Copy the resulting repodata back to the target
-         rsync -avh --delete "${outputdir}/repodata/" ./repodata/
+ 
+         # Copy the new repodata files to the target
+         rsync -avh --exclude 'repomd.xml' "${outputdir}/repodata/" './repodata/'
+         # Copy the repomd separately in order to force no caching for that object
+         if [ -n "${S3BUCKET:-}" -a -n "${AWS_ACCESS_KEY_ID:-}" -a -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
+             aws s3 cp                              \
+                 --cache-control max-age=60         \
+                 "${outputdir}/repodata/repomd.xml" \
+                 "s3://${S3BUCKET}/fedora/${release}/${arch}/repodata/repomd.xml"
+         else
+             if [ -n "${S3BUCKET:-}" ]; then
+                 echo -n "ERROR: You're running against an S3 BUCKET, " 1>&2
+                 echo    "but no creds for 'aws s3 cp' operation."      1>&2
+             fi
+             # If no creds then maybe we're doing development on a local filesystem
+             # Just copy the file.
+             cp -av "${outputdir}/repodata/repomd.xml" './repodata/repomd.xml'
+         fi
+         # Delete files that are no longer needed
+         rsync -avh --exclude 'repomd.xml' --delete "${outputdir}/repodata/" './repodata/'
+ 
          popd >/dev/null
      done
  

We hit issues where CloudFront had cached the repomd.xml file and the
cached copy had become stale. That caused failures because the repo
metadata files it pointed to had already been deleted (a new set of
metadata files had been created to replace them).
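For what it's worth, here is a quick way to sanity-check that the new Cache-Control metadata actually lands on the object. This is just a sketch reusing the `S3BUCKET`/`release`/`arch` variables from the script above; the CDN hostname is a made-up placeholder, not the real distribution:

```
# Inspect the object metadata in S3; after the new `aws s3 cp` runs this
# should report CacheControl "max-age=60".
aws s3api head-object \
    --bucket "${S3BUCKET}" \
    --key "fedora/${release}/${arch}/repodata/repomd.xml"

# Confirm the header is also returned through the CDN
# (hostname below is a placeholder).
curl -I "https://cdn.example.com/fedora/${release}/${arch}/repodata/repomd.xml"
```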

That seems like an expensive way to do cp :)
There's --remove-destination which would be useful here.
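In case it helps, a sketch of what that suggestion could look like in the no-credentials fallback branch (GNU `cp` only; `--remove-destination` unlinks the existing file before copying instead of overwriting it in place):

```
# Unlink the old repomd.xml first rather than overwriting it in place,
# so readers never see a partially written file (GNU coreutils cp).
cp -av --remove-destination \
    "${outputdir}/repodata/repomd.xml" './repodata/repomd.xml'
```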

Minor: inconsistent quoting of that last arg compared to the earlier invocations.

Is there a way to make sure we always hit this case in production? E.g. error out if the bucket name matches the prod bucket and these conditions fail.
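Something like the following could work; just a sketch, and the production bucket name here is a placeholder since it isn't defined anywhere in this PR:

```
# Hypothetical guard near the top of the script: refuse to run against the
# production bucket unless the aws CLI credentials are present.
PROD_S3BUCKET='prod-archive-repo-bucket'   # placeholder name
if [ "${S3BUCKET:-}" = "${PROD_S3BUCKET}" ] && \
   { [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; }; then
    echo "ERROR: targeting the production bucket without aws CLI creds" 1>&2
    exit 1
fi
```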

rebased onto 8846b96 (3 years ago)

rebased onto 22ca919 (3 years ago)

pushed up a new version trying to address comments

rebased onto 454b36e (3 years ago)

Pull-Request has been merged by dustymabe (3 years ago)