#321 specfiles: rpkg srpm creates tarballs an insecure way
Closed 5 years ago Opened 5 years ago by praiskup.

I tried to create the tarball, and extract it, and this is the result:

$ du --max-depth=1 . -h
12K     ./conf
57M     ./coprs_frontend
16K     ./documentation
317M    ./docker-testing
4.0K    ./data
76K     ./docs
706M    .

Currently being in in pr_278, these are the results for me:

16K ./documentation
5.9M ./coprs_frontend
76K ./docs
12K ./conf
6.2M .

Isn't docker-testing something that we don't actually have in our repository? coprs-frontend as well seems to contain some extra bits.

Hello, did you figure out the cause? Is it a bug in rpkg or our specs or was that additional content some debugging "leftover"?

Isn't docker-testing something that we don't actually have in our repository?

Yes, but it is not tracked by git. It is neither .gitignored, simply temporary directory.

Is it a bug in rpkg or our specs or was that additional content some debugging "leftover"?

I haven't had a time to analyze, I don't know how it works... but I'd expect that only git-tracked files are packaged + maybe some whitelist.

Isn't docker-testing something that we don't actually have in our repository?

Yes, but it is not tracked by git. It is neither .gitignored, simply temporary directory.

Is it a bug in rpkg or our specs or was that additional content some debugging "leftover"?

I haven't had a time to analyze, I don't know how it works... but I'd expect that only git-tracked files are packaged + maybe some whitelist.

The git pack rpkg's macro packs even dirty stuff, which is not a part of the git index. Such srpm/rpm/tgz builds are then marked with the .wtree. suffix basically saying: this comes from dirty tree, anything can be there. It's mainly good just for local testing.

Metadata Update from @clime:
- Issue status updated to: Closed (was: Open)

5 years ago

IMO it is security related issue, and it should be fixed. Certainly we can pack files which are tracked but are "dirty" (changed locally and not yet committed) ... but this is different. So it's not possible to create sane tarball without cleaning up the directory tree?

IMO files not in git should not be present in tar file. Not if the dirty ones. BTW this is how tito behaved. It called git-archive, which packaged only things in git.

We could use git_archive macro which requires the tree to be clean before continuing. I prefer git_pack due to less restrictions on a user.

On Wednesday, June 27, 2018 4:18:33 PM CEST clime wrote:

clime added a new comment to an issue you are following:
` We could usegit_archivemacro which requires the tree to be clean before continuing. I prefergit_pack` due to less restrictions on a user.

If that's configurale, please use git_archive for security reasons. And for the
same reasons, please consider putting huge warning around the git_pack in
docs...

IMVHO, the actual behavior would be OK if that was triggered by rpkg option.

I think that git_pack is a useful feature for local testing, so I prefer it to stay in rpkg. However such packages shouldn't be uploaded/installed anywhere and people should know that. Some warning is a good idea.

Also, a thing that personally annoys me on git_pack is that git stash is not enough to clean the dirty changes and rpkg generates a .wtree. package, because I usually have some untracked trash files in the project directory (I know, that is my fault, but ...).

I think that by default some safer method should be used, but it would be nice to use some CLI parameter to switch to unsafe git_pack.

First of all, it is a really stupid thing to put sensitive data into your Git working tree. If you do that, you are asking for problems. I think we should make that clear from the beginning.

rpkg gives user a lot of freedom and yes, if you do things like described above, then you might get into trouble.

I personally only add files into the project that should be part of the project, which is pretty sane thing to do.

If you have some dirty files that you don't want to be tracked and that doesn't belong to .gitignore at the same time, then I would be curious what is the rationale for those files and having them there.

I think we should clear that use-case out first before changing the macro.

@frostyx Note that you can use git stash -u.

We still need to fix this. rpkg srpm can package security related stuff from git working directory, and in my case it also creates a ridiculously large tarballs.

I'm rather reopening to not forget about this problem.

Metadata Update from @praiskup:
- Issue status updated to: Open (was: Closed)

5 years ago

We still need to fix this. rpkg srpm can package security related stuff from git working directory, and in my case it also creates a ridiculously large tarballs.

I don't think there is really a problem. If rpkg srpm creates too large tarballs for you, it's likely that you have placed some large dirty files into a directory being packed. Otherwise, the srpm should be sized ~1.7MB e.g. for copr-frontend. Please, put the files into .gitignore if you regularly have them there or simply move them outside of the copr git tree.

I disagree. This is a really serious problem that anyone can face one day. Any sane
build system creates tarballs/srpms with precisely picked set of files (tito, autotools,
fedpkg, git archive, setuptools, etc.). The question is how do we plan to solve this.
I wouldn't mind if that was optional, and we changed the default.

I don't think that adding dirty files (otherwise tracked by git) is a huge problem, but
even for this we should have an option.

I disagree. This is a really serious problem that anyone can face one day. Any sane
build system creates tarballs/srpms with precisely picked set of files (tito, autotools,
fedpkg, git archive, setuptools, etc.).

The set is precisely defined. It is a working tree subdir for git_dir_pack macro.

The question is how do we plan to solve this.
I wouldn't mind if that was optional, and we changed the default.

rpkg does not offer such functionality.

I don't think that adding dirty files (otherwise tracked by git) is a huge problem, but
even for this we should have an option.

If you have some idea for new options, please, open an issue at https://pagure.io/rpkg-util.

On Copr side, we pretty much only have a choice between git_dir_archive and git_dir_pack macros unless we want to introduce a new custom macro.

I don't mind rpkg util, I tried to report there and submit PRs, but that project
is not friendly to me. I need to solve issue here in Copr (even if the solution was
migration back to tito).

The set is precisely defined. It is a working tree subdir for git_dir_pack macro.

It would be fine if the tool packaged only files which are tracked by git. The way it
works ATM is just insecure hack.

On Copr side, we pretty much only have a choice between git_dir_archive and git_dir_pack macros unless we want to introduce a new custom macro.

I do not know the difference, would git_dir_archive help? I can not use that:

On branch libmodulemd
Your branch is ahead of 'frostyx/libmodulemd' by 1 commit.
  (use "git push" to publish your local commits)

Untracked files:
  (use "git add <file>..." to include in what will be committed)
...
$ rpkg srpm
git_dir_archive: Your working tree is dirty. Commit first.
git_dir_archive failed with value 2

I don't mind rpkg util, I tried to report there and submit PRs, but that project
is not friendly to me.

I don't believe that would be the case.

I do not know the difference, would git_dir_archive help? I can not use that.

git_dir_archive: Your working tree is dirty. Commit first.
git_dir_archive failed with value 2

You need to commit your changes first.

I don't believe that would be the case.

Been there, tried that, and never more. That kind of project leadership sucks.

You need to commit your changes first.

As shown in the example, all the changes are committed (not pushed), except for
untracked files.

Even if we required to have everything .gitignored (which would be ridiculous), it doesn't work:

$ git status .
On branch libmodulemd
Your branch is ahead of 'frostyx/libmodulemd' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
$ rpkg srpm
git_dir_archive: Your working tree is dirty. Commit first.
git_dir_archive failed with value 2

Even if we required to have everything .gitignored (which would be ridiculous), it doesn't work:
$ git status .
On branch libmodulemd
Your branch is ahead of 'frostyx/libmodulemd' by 1 commit.
(use "git push" to publish your local commits)

nothing to commit, working tree clean
$ rpkg srpm
git_dir_archive: Your working tree is dirty. Commit first.
git_dir_archive failed with value 2

Can you post output of git status --porcelain? What's your version of git?

On Wednesday, September 12, 2018 1:24:10 PM CEST clime wrote:

Can you post output of git status --porcelain? What's your version of git?

$ git status --porcelain
?? backend/tags
?? builder/copr-builder-0/
...
and many others, but nothing in ./frontend
...
$ rpm -q git
git-2.17.1-3.fc28.x86_64

On Wednesday, September 12, 2018 1:24:10 PM CEST clime wrote:

Can you post output of git status --porcelain? What's your version of git?

$ git status --porcelain
?? backend/tags
?? builder/copr-builder-0/
...
and many others, but nothing in ./frontend

that's the reason then. Your working tree is still dirty. Recognition of dirtiness for git_dir_archive is not aware of subdirs. It could be done though.

It is still very inconvenient, we need two things then:
- switch to git_dir_archive
- fix rpkg so it doesn't care about not-tracked content

I mean, even untracked content under ./frontend shouldn't block me to do a proper
release tarball/srpm.

I do believe that this should be solved in rpkg-util project. Not here.

And yes, if tool aims to a general audience then it should do its best to count with every situation and dirty git is IMHO the basic situation. E.g. that is reason why tito build (even test builds) from commit. If you pass to git-archive the commitsh then it builds from that tree and there cannot be any dirty file(s). If you allow to build from uncommited tree, then you can get anything and there is no way how to distinguish between wanted and unwanted stuff.

After reading through the code - the situation is that if we have in copr-frontend.spec:
Source0: {{{ git_dir_pack }}}
it will pack everything. Uncommitted files, dirty files. Everything.
Changing it to
Source0: {{{ git_dir_archive }}}
will pack only committed files. The commit does not need to be pushed.

@frostyx @dturecek What are your opinions? What do you prefer?

My opinion is still the same as in my previous comment. If we have to pick, then I prefer the safer macro.

But I still imagine that the usage could be like

rpkg local # git_dir_archive is used
rpkg local --some-attribute-to-use-pack # git_dir_pack is used

IDK whether it is possible (or desired) or not.

There is nothing unsafe about git_dir_pack. You just need to know it packs working tree, which is properly documented. I don't get your arguments about git_dir_pack being unsafe. Simply don't put shit into code source tree and you are ok.

Metadata Update from @clime:
- Issue status updated to: Closed (was: Open)

5 years ago

Sorry for impoliteness. I don't actually mind switching to git_dir_archive but I also don't accept the title of this issue, nor the comments. git_dir_pack is completely fine macro, simple and easy to use. It uses work tree, which means you need to keep the work tree clean. fedpkg srpm and others work on work tree as well when you build srpm.

  • it is insecure (it's not first time I can not explain to you what has security potential, and how to deal with that), git_dir_pack is a way to serious troubles and needs to be fixed
  • fedpkg srpm only packages predefined set of files (spec file + Sources + Patches + ... + maybe some rpm macro hacks that I'm sure nobody of us have seen in real Fedora package)

Login to comment on this ticket.

Metadata