#13 Don't upload small text files to lookaside cache
Opened 8 years ago by araszka. Modified 7 years ago
araszka/rpkg bz767264  into  master

file modified
+14
@@ -22,8 +22,10 @@ 

  import xmlrpclib

  import pwd

  import koji

+ import subprocess

  

  OSBS_DEFAULT_CONF_FILE = "/etc/osbs/osbs.conf"

+ LOOKASIDE_MAX_FILE_SIZE = 100000

  

  class cliClient(object):

      """This is a client class for rpkg clients."""
@@ -613,6 +615,11 @@ 

                          'cache and remove any existing ones. The "sources" '

                          'and .gitignore files will be updated with the new '

                          'uploaded file(s).')

+         self.new_sources_parser.add_argument('--force',

+                                              default=False,

+                                              action='store_true',

+                                              help='Force the upload of small '

+                                                   'textual files')

          self.new_sources_parser.add_argument('files', nargs='+')

          self.new_sources_parser.set_defaults(

              command=self.new_sources, replace=True)
@@ -1133,6 +1140,13 @@ 

              if not os.path.isfile(file):

                  raise Exception('Path does not exist or is '

                                  'not a file: %s' % file)

+             if not self.args.force \

+                     and os.stat(file).st_size < LOOKASIDE_MAX_FILE_SIZE  \

+                     and 'text' in subprocess.check_output(['file', file]):

+                 raise Exception('The file %s is small and textual, '

+                                 'it should be added '

+                                 'to the git tree directly' % file)

+ 

          self.cmd.upload(self.args.files, replace=self.args.replace)

          self.log.info("Source upload succeeded. Don't forget to commit the "

                        "sources file")

no initial comment

Move the constant to a file-level variable (constant) so it can be redefined in the clients (like fedpkg) if needed.

I agree with pavol here

@lsedlar is this change still needed?

It is a nice to have feature. The whole split of dist-git and lookaside cache is that files that git can work with well (textual files, usually rather small) should go to dist-git and large binary blobs go to lookaside.

I'm not sure though if this patch in current form satisfies Pavol's requirement: how can the constant be redefined in e.g. fedpkg?

I'm not sure though if this patch in current form satisfies Pavol's requirement: how can the constant be redefined in e.g. fedpkg?

How about move it to fedpkg.conf?

It would be kind of difficult to determine a reasonable limitation, as some tarball may be small enough. For example, in this patch, LOOKASIDE_MAX_FILE_SIZE is set to 100000, whereas size of rpkg-1.45.tar.gz tarball is 70004. So, packager has to use --force, however, this tarball should be uploaded naturally.

The tarball is not a text file, so it should be possible to upload it without any extra arguments. The patch adds a call to file to determine what kind of file it is.

However, the call is not ideal. I think it should also use --brief argument to not output the input filename, and also maybe using --mime-type would be more reliable than the human friendly description.

Also, subprocess.check_output is new in Python 2.7. Assuming we care about 2.6 (do we?), it will need to be changed to something else.

--mime-type will not print ASCII text in output, instead, it outputs text/x-diff only. So, if use --mime-type, we have to list all file types that should be limited.

Also, subprocess.check_output is new in Python 2.7. Assuming we care about 2.6 (do we?), it will need to be changed to something else.

Yes, we need to care about 2.6.

I don't think listing everything is necessary: if should be enough to match against text/*.

I'm thinking that, if allowing to upload a small text with --force whatever, why it's necessary to limit. Some package maintainers sometimes put patches into lookaside, e.g. a comment to a similar limitation to fedpkg-push.

What problems this change could solve?

If the result would be "you are not allowed to upload a small text file to lookaside, but you can always bypass with option --force", it doesn't make sense probably most of the time.

I don't think listing everything is necessary: if should be enough to match against text/*.

Oh, right. :smile:

Metadata