#210 Taskotron creates huge numbers of ResultsDB groups with duplicate names
Opened 7 years ago by adamwill. Modified 6 years ago

Browsing through groups in ResultsDB, we find that Taskotron seems to create a new group for each of its tests (dist.rpmgrill, dist.rpmlint etc.) for each test execution. So there's a dist.rpmgrill group for every single dist.rpmgrill run on any package, and a dist.rpmlint group, and so on. It makes browsing the webUI more or less impossible and you can't tell what each group actually is from its name.

Are these groups actually useful? If so, could they get better names so we can at least tell what they are?


It occurs to me that this is exactly the sort of thing resultsdb_conventions could help with, so if we can agree on a convention, I can create some new classes for handling package results, update results etc. =)

The groups actually are useful, but I agree that the naming could be better. The thing is that the group UUIDs are provided by ExecDB, so we can tie the whole execution of a taskotron task together in a sensible way, i.e. all the results from one taskotron job are in the same group.
We sure could name them better (right now we just take the task name as the identifier), but it really was not a priority, as the only thing we really care about is being able to have a link like this: https://taskotron.fedoraproject.org/resultsdb/results?groups=32682314-e970-11e6-91d5-5254008e42f6 in execdb (the relevant execdb job is here, although its link to the results is broken, as I forgot to update it with apiv2.0 in mind: https://taskotron.fedoraproject.org/execdb/jobs/32682314-e970-11e6-91d5-5254008e42f6 )

The group names are not really meant for human consumption, and they serve their purpose (they group all results reported by a single depcheck execution, for example). But since there's a Groups button in the #resultsdb frontend, I guess we could name them in a somewhat prettier way. We talked about this with @jsedlak and our idea is to use a testcasename-timestamp approach, so e.g. dist.depcheck-2017-02-03T10:28:17. That makes it pretty understandable that this is a group of depcheck results reported at that time. Thoughts? If there are no objections, I'll prepare a patch.
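
For illustration, a minimal Python sketch of the name-timestamp format proposed above. The function name `group_name` is hypothetical and not part of Taskotron; the prefix could be either the testcase name or the task name.

```python
from datetime import datetime


def group_name(prefix: str, when: datetime) -> str:
    """Build a '<prefix>-<timestamp>' group description.

    `prefix` is whatever name is agreed on (testcase or task name);
    the timestamp uses second precision, as in the example in the thread.
    """
    return f"{prefix}-{when.strftime('%Y-%m-%dT%H:%M:%S')}"


print(group_name("dist.depcheck", datetime(2017, 2, 3, 10, 28, 17)))
# prints: dist.depcheck-2017-02-03T10:28:17
```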

@kparal whatever works for you. I'd rather store the task name than the testcase name, though, as tasks can produce multiple testcases.

Is there a specific target for the test execution? If so, we could have a group named for that, like we did for the openQA / Autocloud results...

BTW: "The group names are not really meant for human consumption" - I don't see why they can't be? People need ways to query the data, after all. 'All the results in this sensibly-named group' seems like something that would be a useful query in various situations.

@adamwill the thing is that the group's description is not unique (and IMO should not be), so even though I don't disagree with the need for sensible naming, nor with the fact that searching the names should be possible, I'm pretty certain that a direct "all the results in the group with this pretty name" is not a good query. The "name" is really just the human-readable description, and the identifier is the UUID.
From my POV, users should not need to query the results by group name, but by what was tested in the result (the item); I'm open to discussion, though. If a more complex dashboard is needed, it IMO should be another piece of code than resultsdb itself. ResultsDB should stay pretty agnostic to what the data "mean", because once we start to heavily rely on a scheme meaning something, it stops being agnostic and becomes a targeted solution.
This is once again not disputing the fact that targeted solutions are great. I just firmly believe that those should be a different system consuming the data from ResultsDB and presenting them in a more sensible way.
Does this make sense? I'm not absolutely sure we are talking about the same thing here, so I may be commenting on something completely different than you had in mind.

I'm not thinking about dashboards, exactly. The case I've been keeping in mind is the releng 'release gating' case: it should be very easy for releng to find the information needed to make the decision, without much need for clever 'systems' in the middle.

But I understand what you're saying. This is all firmly in the realm of the 'conventions' taskotron uses to report results, for sure, not policy about what group names mean enforced by ResultsDB (which is why it's filed on taskotron not RDB :>)

Well, we could make groups follow the same pattern as test cases, so instead of UUIDs we would use unique group IDs like compose.openqa.Fedora-Rawhide-20170207.n.0 and compose.autocloud.Fedora-Rawhide-20170207.n.0, etc. You could say that if you have the permissions to submit results into the dist.depcheck or compose.openqa namespace, you can create a group with the same prefix. If you don't request a specific name, you'll get a random UUID, as currently.

The question is how much we want/need this. Adam seems to prefer human-readable names as IDs. That's definitely useful when developing/debugging, but is it useful for production purposes? I'd be quite worried if releng folks manually queried resultsdb to see which packages they can push. They should have a tool for that, and that tool can contain a well-known group UUID (documented by openqa task). Thoughts on this? Adam, can you elaborate on the expected releng use case?

! In #904#12930, @kparal wrote:
Well, we could make groups follow the same pattern as test cases, so instead of UUIDs we would use unique group IDs like compose.openqa.Fedora-Rawhide-20170207.n.0 and compose.autocloud.Fedora-Rawhide-20170207.n.0, etc. You could say that if you have the permissions to submit results into the dist.depcheck or compose.openqa namespace, you can create a group with the same prefix. If you don't request a specific name, you'll get a random UUID, as currently.

Nah, I would not do that, honestly. On top of that, and maybe a missing piece of information for you, @kparal, is that you can create a "known" UUID based on two seed values (namespace + value), and this is what @adamwill and @jsedlak are doing in OpenQA (and AFAIK the conventions) right now.

I'd be quite worried if releng folks manually queried resultsdb to see which packages they can push. They should have a tool for that, and that tool can contain a well-known group UUID (documented by openqa task).

This.

I'd even go so far to say that the tool should not rely on the group uuids, but on the actual results. The grouping is nice and all, but the data is in the result. The groups are a nice "shortcut" to "related results", but I would be worried if we started to put more meaning there. At least with how resultsdb works now.

Well, there isn't going to be a single known group releng wants to look at. Say they want to decide whether to release the 20170207.n.0 compose - they need all the test results for 20170207.n.0 , stat. They can't easily know what the UUID for that group is, but if it has a predictable name, they can easily find it. The way I made conventions work, anything that uses conventions to submit results for that compose will add them to a specific group with a predictable name. Well, in fact, now I think of it, the group also has a predictable UUID, but you need to do much more work to figure out what UUID to look for than you do to figure out what group name to look for.

If the answer is that processes like this shouldn't rely on the groups but instead just use targeted searches based on extradata keys or whatever, that's totally fine. But then I'm not sure what problem the groups are solving in the first place?

I understand now why you want to be able to query by group name, have them unique, and use consistent and predictable naming. And it makes sense. Originally we never imagined such a use case, I believe, and simply used groups to tie results from a single task run together (which is sometimes useful when a human inspects the results, that's all), and then waited for more use cases (Josef, correct me if I'm wrong). The way you want to use groups is exactly such a new use case, and I would personally probably have solved it with extradata if I hadn't read this. But using groups can be a better way to tackle this, and maybe even less demanding on the database (Josef?).

There's one thing that I realized now, however, and I think it should be mentioned. Using testcase names is secure in the sense that only you can be configured to be allowed to write into your namespace, which prevents both malicious intent and errors. More concretely, the namespace idea is supposed to be safe; the current implementation is very rudimentary (but that can be improved). For example, task-depcheck is supposed to be able to write only to dist.depcheck* and nowhere else. No other task can submit results to dist.depcheck*. The same can be done with compose.openqa*, etc.
However, groups are currently a free-for-all. Anyone can create a group with any UUID and any name. There are no namespaces, no permissions, nothing. That's because we didn't expect such a need. So if releng started to rely on groups for update gating/compose building/etc., there are risks. I wouldn't exactly expect malicious intent, because there's little to gain by submitting fake results, but errors and unintended consequences are more likely. You might not be the only one wanting to group your results in a group named 20170207.n.0; it's quite an obvious name to use for compose-related tasks. Any taskotron task can do that, and releng will then find several groups with that name. It's also possible that some task will create a group with the same UUID you want to use. That's not very likely, unless they e.g. copy&paste your code into their own task, not suspecting it can cause issues.
I'm trying to demonstrate that groups are not currently completely reliable to be just your own, because they are not namespaced, unlike testcase names. It's a thing to consider (and perhaps improve, if we want to rely heavily on well-known group UUIDs/names). Do you consider this a problem?
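
To make the namespace restriction described above concrete, here is a rough Python sketch of a prefix-based permission check. The `ALLOWED_PATTERNS` table, the submitter names in it, and `may_write` are all hypothetical; this is not how ResultsDB or Taskotron actually implement it, just one way the same rule could be applied to group names as well as testcase names.

```python
from fnmatch import fnmatchcase

# Hypothetical permission table: submitter -> glob patterns it may write to.
ALLOWED_PATTERNS = {
    "task-depcheck": ["dist.depcheck*"],
    "openqa": ["compose.openqa*"],
}


def may_write(submitter: str, name: str) -> bool:
    """Return True if `submitter` may create a testcase or group called `name`."""
    patterns = ALLOWED_PATTERNS.get(submitter, [])
    return any(fnmatchcase(name, pattern) for pattern in patterns)


print(may_write("task-depcheck", "dist.depcheck"))  # True
print(may_write("task-depcheck", "compose.openqa.Fedora-Rawhide-20170207.n.0"))  # False
```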

Well, not a 'problem', no, not exactly. I mean, it's actually one of the things resultsdb_conventions is specifically intended to achieve: it allows (almost requires) different systems to report results to at least some of the same group(s). It establishes the convention 'all test results for a compose should go into a group named for that compose', and any reporter that buys into the conventions 'system' and uses the 'compose' convention (or a child of it) will put its results into that group. It's really almost the opposite of what you're talking about: conventions is explicitly a system for enabling different systems/submitters to collate / relate related results (in groups, and also by having similarly formatted extradata). Further, at least in my head, the conventions aren't tied to resultsdb_conventions-the-codebase, that's just a current implementation detail; it seems entirely reasonable to me that someone might build another submitter that doesn't actually use resultsdb_conventions (write it in Go or whatever you like), but does respect the same conventions.

The question of trust applies equally to the extradata, right? Even if we say 'you should search for all results with the appropriate extradata key and value', not 'you can take all the results from the group with the expected name', that's no more 'secure', right? Any result can apply any extradata key, I think.

I've been working all along with the implicit assumption that further filtering of the results will be necessary once you've found 'all the results for the compose' (or whatever it is you're trying to gate), however you do that. At a minimum, you're going to need to do dupe filtering (which is an interesting problem I have a whole thing about, but off-topic here). You're very likely also to only want to take tests that are considered 'significant' enough for the particular gating task (or whatever) you're trying to achieve; whether you do that via Policy Engine or whatever, it's still a further filtering step.
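
As a rough illustration of the dupe-filtering step mentioned above, here is a Python sketch of one possible policy: keep only the newest result per (testcase, item) pair. The policy itself and the dictionary keys (`testcase`, `item`, `submit_time`, `outcome`) are assumptions made for the example, not a description of what ResultsDB or any gating tool does.

```python
def latest_per_test(results):
    """Keep the most recently submitted result for each (testcase, item) pair.

    `results` is an iterable of dicts. `submit_time` is assumed to be an
    ISO 8601 string in a single consistent format, so plain string
    comparison orders the results chronologically.
    """
    latest = {}
    for result in results:
        key = (result["testcase"], result["item"])
        if key not in latest or result["submit_time"] > latest[key]["submit_time"]:
            latest[key] = result
    return list(latest.values())


sample = [
    {"testcase": "dist.depcheck", "item": "foo-1.0-1.fc26",
     "submit_time": "2017-02-03T10:28:17", "outcome": "FAILED"},
    {"testcase": "dist.depcheck", "item": "foo-1.0-1.fc26",
     "submit_time": "2017-02-03T12:00:00", "outcome": "PASSED"},
    {"testcase": "dist.rpmlint", "item": "foo-1.0-1.fc26",
     "submit_time": "2017-02-03T10:30:00", "outcome": "PASSED"},
]
filtered = latest_per_test(sample)
```

Whether the newest result should win at all (rather than, say, any failure) is exactly the kind of decision that belongs to the consumer of the data.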

If we're only considering factors like 'not-important-enough-tests', 'dupes' and 'mistakes', I don't think we need to worry about how 'secure' group names or extradata values are. My initial inclination is not to worry about it unless it actually becomes a problem. I was kinda also assuming that once ResultsDB grows real auth, consumers will be able to know who (or what system) submitted a given result, which provides another opportunity for filtering.

I'm kinda sceptical that we're really going to have a big problem with malicious results. There are only going to be so many systems reporting results that we ultimately care about, I think, and we should be able to keep reasonable track of them and make sure they're doing the right things, so long as we have things like conventions to make it relatively easy. The most likely source of 'malicious' or completely-screwed-up results is probably human testers, if we manage to build the relval-ng thing (and counterparts for Test Day testing etc). If that proves to be a real problem, though, I think we're going to have reasonable mechanisms for filtering those results.

! In #904#12935, @adamwill wrote:
If we're only considering factors like 'not-important-enough-tests', 'dupes' and 'mistakes', I don't think we need to worry about how 'secure' group names or extradata values are. My initial inclination is not to worry about it unless it actually becomes a problem.

+1 I know where @kparal's worries come from (he loves defensive design), but I think that we can safely assume no malice here.

On top of that, it is important to understand how the uuids are created (in conventions) - the cool thing here is that since you are providing the same 'seeds' to the uuid5() call, you can replicate the uuid based on inputs any time. That is how I advised @jsedlak to do it in OpenQA's reporting, and where @adamwill took it from for the conventions.
With this (and this new use case) in mind, I'd love to add something like a .../groups?uuid=(openqa, compose, foobar) query to the API, so that instead of providing the "raw" UUID you can just give it the human-readable components from which the UUID is actually created, and you'd be given the right group "automagically". With how the conventions work, this is basically the human-readable unique identifier.
I'll also add a way to search the group names (descriptions) similarly to how you can search results now (something along the lines of .../groups?description:like=openqa.compose.*), for when you just want to do a "soft" search.
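
A minimal Python sketch of the reproducible-UUID idea described above, using the standard library's `uuid.uuid5()`. The thread does not spell out the exact seeds that OpenQA or the conventions use, so the helper name `group_uuid`, the choice of `uuid.NAMESPACE_DNS` as the root, and the example strings are all assumptions.

```python
import uuid


def group_uuid(namespace_name: str, value: str) -> uuid.UUID:
    """Derive a reproducible group UUID from two human-readable seeds.

    The namespace UUID is itself derived from a name, so anyone who knows
    both strings can recompute the same group UUID at any time.
    """
    namespace = uuid.uuid5(uuid.NAMESPACE_DNS, namespace_name)
    return uuid.uuid5(namespace, value)


first = group_uuid("compose", "Fedora-Rawhide-20170207.n.0")
second = group_uuid("compose", "Fedora-Rawhide-20170207.n.0")
other = group_uuid("compose", "Fedora-Rawhide-20170208.n.0")
print(first == second)  # True: same seeds, same UUID
print(first == other)   # False: different value, different UUID
```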

Makes sense?

Sure, it makes sense to me. I guess the only question is, does this become the One True Way of creating and identifying groups in resultsdb, or is it merely a courtesy feature in the resultsdb core for the benefit of things that respect this particular convention for naming groups and creating group UUIDs?

If we decide this is the One True Way, it almost feels weird to have the UUIDs at all any more, since they really provide nothing.

@adamwill absolutely just a courtesy. We are still talking conventions (maybe a "you should provide an item" style convention, but still a convention), not hard rules, at least in the scope of the actual implementation.
Also, even if this were "the true way", I'd still see a reason to keep the "I don't care, I just want a random UUID" option. The UUID is also a nice identifier, as it has constant length and other properties that humans don't care about but that matter to machines.
The good thing about the actual implementation is that there won't be collisions between the "random" and "specific" UUIDs, by definition of how UUIDs work (different namespaces), so you don't even need to be concerned about "random group results" mixing up with the "proper" ones.
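
One concrete way to see the no-collision property, assuming the "random" UUIDs come from something like Python's `uuid.uuid4()` and the "specific" ones from `uuid.uuid5()`: the two kinds carry different version fields, so they can never be equal. The seed string below is only an example.

```python
import uuid

random_id = uuid.uuid4()  # "I don't care" random group UUID
named_id = uuid.uuid5(uuid.NAMESPACE_DNS, "compose.openqa.Fedora-Rawhide-20170207.n.0")

# The version is encoded in the UUID itself, so a version 4 value
# can never equal a version 5 value.
print(random_id.version, named_id.version)  # 4 5
```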
