#2619 cli: use main requests.Session in download_file()
Closed 3 years ago by ktdreyer. Opened 3 years ago by ktdreyer.
ktdreyer/koji download-file-session  into  master

file modified
+5 -2
@@ -6849,10 +6849,13 @@ 

      # run the download

      for rpm in rpms:

          download_rpm(info, rpm, suboptions.topurl, sigkey=suboptions.key,

-                      quiet=suboptions.quiet, noprogress=suboptions.noprogress)

+                      quiet=suboptions.quiet, noprogress=suboptions.noprogress,

+                      session=session.rsession)

      for archive in archives:

          download_archive(info, archive, suboptions.topurl,

-                          quiet=suboptions.quiet, noprogress=suboptions.noprogress)

+                          quiet=suboptions.quiet,

+                          noprogress=suboptions.noprogress,

+                          session=session.rsession)

  

  

  def anon_handle_download_logs(options, session, args):

file modified
+18 -7
@@ -537,7 +537,7 @@ 

  

  

  def download_file(url, relpath, quiet=False, noprogress=False, size=None,

-                   num=None, filesize=None):

+                   num=None, filesize=None, session=None):

      """Download files from remote

  

      :param str url: URL to be downloaded
@@ -549,10 +549,17 @@ 

      :param int num: download index (printed in verbose mode)

      :param int filesize: expected file size, used for appending to file, no

                           other checks are performed, caller is responsible for

-                          checking, that resulting file is valid."""

+                          checking, that resulting file is valid.

+     :param session: requests.Session object to use for downloading. If None,

+                     this method uses a new session.

+     """

  

      if '/' in relpath:

          koji.ensuredir(os.path.dirname(relpath))

+ 

+     if not session:

+         session = requests.Session()

+ 

      if not quiet:

          if size and num:

              print(_("Downloading [%d/%d]: %s") % (num, size, relpath))
@@ -579,7 +586,7 @@ 

  

      try:

          # closing needs to be used for requests < 2.18.0

-         with closing(requests.get(url, headers=headers, stream=True)) as response:

+         with closing(session.get(url, headers=headers, stream=True)) as response:

              if response.status_code in (200, 416):  # full content provided or reaching behind EOF

                  # rewrite in such case

                  f.close()
@@ -603,7 +610,8 @@ 

          print('')

  

  

- def download_rpm(build, rpm, topurl, sigkey=None, quiet=False, noprogress=False):

+ def download_rpm(build, rpm, topurl, sigkey=None, quiet=False,

+                  noprogress=False, session=None):

      "Wrapper around download_file, do additional checks for rpm files"

      pi = koji.PathInfo(topdir=topurl)

      if sigkey:
@@ -615,7 +623,8 @@ 

      url = os.path.join(pi.build(build), fname)

      path = os.path.basename(fname)

  

-     download_file(url, path, quiet=quiet, noprogress=noprogress, filesize=filesize)

+     download_file(url, path, quiet=quiet, noprogress=noprogress,

+                   filesize=filesize, session=session)

  

      # size - we have stored size only for unsigned copies

      if not sigkey:
@@ -640,7 +649,8 @@ 

          error("Downloaded rpm %s doesn't match db, deleting" % path)

  

  

- def download_archive(build, archive, topurl, quiet=False, noprogress=False):

+ def download_archive(build, archive, topurl, quiet=False, noprogress=False,

+                      session=None):

      "Wrapper around download_file, do additional checks for archive files"

  

      pi = koji.PathInfo(topdir=topurl)
@@ -658,7 +668,8 @@ 

          # can't happen

          assert False  # pragma: no cover

  

-     download_file(url, path, quiet=quiet, noprogress=noprogress, filesize=archive['size'])

+     download_file(url, path, quiet=quiet, noprogress=noprogress,

+                   filesize=archive['size'], session=session)

  

      # check size

      if os.path.getsize(path) != archive['size']:

Prior to this change, we used new requests.Session objects in the download_file() method. This meant that we did not respect the serverca profile setting (for example), and we would inefficiently recreate a new HTTP session for every file we downloaded.

Update the CLI code to pass the main Koji session's rsession (requests Session) through to download_file(), so we can re-use our HTTP settings and connection.

Fixes: https://pagure.io/koji/issue/2618

maybe try a new topurl_ca option then fallback to serverca?

My purpose is to make Koji easier for newcomers.
Adding another configuration option is shifting the complexity to users. I would rather make "serverca" apply to all HTTPS connections that the Koji client performs.

Using session.rsession feels wrong here. This is an implementation detail of ClientSession and not really meant to be used this way. There is nowhere else in the code where this has been done.

The parameter name session is potentially confusing in the context of Koji, where "session" normally means a ClientSession instance. However, I really don't think that passing in a requests session is the right thing here. There is no cached session that it makes sense to use.

maybe try a new topurl_ca option then fallback to serverca?

I concur with Yu Ming here. Our download connections are fundamentally distinct from our xmlrpc connections. While using the same server for both is certainly a common case, it is not the only case and we need to handle the general case here.

Adding another configuration option is shifting the complexity to users.

I don't believe that is the case here. If the default for an unset topurl_ca is to fall back to the serverca value, then the new user with a simple set up only has to set the one value.

The name "topurl_ca" is a good example of how hard this will be for users and admins: it has an underscore, when "serverca" does not.

There are already dozens of configuration options in a Koji deployment, and I want to avoid adding more complexity to Koji configuration. If we don't support non-system-wide CAs for topurl, I think it's better to simply keep the feature out of Koji altogether and require that users trust the CA system-wide.

I don't think it's worth falling back here because we would need to catch the specific error from requests and then parse the error string from the exception to make sure that it was the exact CA error that we wanted to handle. That is some incredible complexity for QE to test in an integration test.

What is the resolution here? (I'm in favour of just documenting this as @ktdreyer writes here. I think that situation when system-wide CAs don't trust download server is not that common. Workaround with REQUESTS_CA_BUNDLE is ok to me, but not sure if it is a majority opinion.

Yeah, I'll close this, and we'll document it.

kojid has similar problems with HTTPS and non-system-wide certs. kojid uses koji.openRemoteFile(), and that method uses a temporary requests session as well. It's not possible to use an HTTPS topurl with kojid if the builder host does not trust the CA system-wide.

Pull-Request has been closed by ktdreyer

3 years ago