#3005 Speedup untagged_builds query
Merged 3 years ago by tkopecek. Opened 3 years ago by tkopecek.
tkopecek/koji issue1636  into  master

file modified
+16 -17
@@ -8113,24 +8113,23 @@ 

      fields = ('build.id', 'package.name', 'build.version', 'build.release')

      aliases = ('id', 'name', 'version', 'release')

      st_complete = koji.BUILD_STATES['COMPLETE']

-     tables = ('build',)

-     joins = []

-     if name is None:

-         joins.append("""package ON package.id = build.pkg_id""")

-     else:

-         joins.append("""package ON package.name=%(name)s AND package.id = build.pkg_id""")

-     joins.append("""LEFT OUTER JOIN tag_listing ON tag_listing.build_id = build.id

-                     AND tag_listing.active = TRUE""")

-     clauses = ["tag_listing.tag_id IS NULL", "build.state = %(st_complete)i"]

-     # q = """SELECT build.id, package.name, build.version, build.release

-     # FROM build

-     #    JOIN package on package.id = build.pkg_id

-     #    LEFT OUTER JOIN tag_listing ON tag_listing.build_id = build.id

-     #        AND tag_listing.active IS TRUE

-     # WHERE tag_listing.tag_id IS NULL AND build.state = %(st_complete)i"""

-     # return _multiRow(q, locals(), aliases)

+     # following can be achieved with simple query but with

+     # linear complexity while this one will be parallelized to

+     # full number of workers giving at least 2x speedup

+     tables = ('build', 'package')

+     clauses = [

+         """NOT EXISTS

+              (SELECT 1 FROM tag_listing

+               WHERE tag_listing.build_id = build.id

+                 AND tag_listing.active IS TRUE)""",

+         "package.id = build.pkg_id",

+         "build.state = %(st_complete)i",

+     ]

+     if name is not None:

+         clauses.append('package.name = %(name)s')

+ 

      query = QueryProcessor(columns=fields, aliases=aliases, tables=tables,

-                            joins=joins, clauses=clauses, values=locals(),

+                            clauses=clauses, values=locals(),

                             opts=queryOpts)

      return query.iterate()

  

This doesn't honor the name or queryOpts params.

I'm not convinced we have to leave QueryProcessor behind (and I'm not sure we can reasonably honor queryOpts without it). I believe it will currently accept a complicated clause. All QP does is join them with AND and parentheses.

rebased onto 63b4f1b

3 years ago

Ah, you're right - it was older snippet and I've not examined it carefully. Rewritten to QP with all the options.

Metadata Update from @tkopecek:
- Pull-request tagged with: testing-ready

3 years ago

Metadata Update from @jcupova:
- Pull-request tagged with: testing-done

3 years ago

Commit b477fa5 fixes this pull-request

Pull-Request has been merged by tkopecek

3 years ago
Metadata