#1674 buildReferences's SQL has performance problems
Closed: Fixed 4 years ago by tkopecek. Opened 4 years ago by tkopecek.

Split from @1671 - SQL has two performance problems.
1) LIMIT cause forces pg to use inefficient plan with nested loops:

$ explain SELECT rpminfo.id, rpminfo.name, rpminfo.version, rpminfo.release, rpminfo.arch, rpminfo.build_id FROM buildroot_listing JOIN rpminfo ON rpminfo.buildroot_id = buildroot_listing.buildroot_id JOIN build on rpminfo.build_id = build.id  WHERE buildroot_listing.rpm_id = 545715 AND build.state = 2 limit 10;
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.01..117.20 rows=10 width=47)
   ->  Nested Loop  (cost=1.01..16431253.53 rows=1414152 width=47)
         ->  Nested Loop  (cost=0.43..7437976.45 rows=2295765 width=51)
               ->  Seq Scan on build  (cost=0.00..55558.40 rows=305736 width=4)
                     Filter: (state = 2)
               ->  Index Scan using rpminfo_build on rpminfo  (cost=0.43..22.08 rows=207 width=51)
                     Index Cond: (build_id = build.id)
         ->  Index Only Scan using buildroot_listing_buildroot_id_key on buildroot_listing  (cost=0.57..3.91 rows=1 width=4)
               Index Cond: ((buildroot_id = rpminfo.buildroot_id) AND (rpm_id = 545715))
(9 rows)

vs unlimited variant with merges:

$ explain SELECT rpminfo.id, rpminfo.name, rpminfo.version, rpminfo.release, rpminfo.arch, rpminfo.build_id FROM buildroot_listing JOIN rpminfo ON rpminfo.buildroot_id = buildroot_listing.buildroot_id JOIN build on rpminfo.build_id = build.id  WHERE buildroot_listing.rpm_id = 545715 AND build.state = 2 ;
                                                        QUERY PLAN                                                        
--------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=88718.99..528584.51 rows=1414152 width=47)
   Hash Cond: (rpminfo.build_id = build.id)
   ->  Merge Join  (cost=29338.89..397664.93 rows=4174397 width=47)
         Merge Cond: (rpminfo.buildroot_id = buildroot_listing.buildroot_id)
         ->  Index Scan using rpminfo_buildroot on rpminfo  (cost=0.43..334103.70 rows=6776805 width=51)
         ->  Sort  (cost=28860.18..28941.78 rows=32641 width=4)
               Sort Key: buildroot_listing.buildroot_id
               ->  Index Scan using buildroot_listing_rpms on buildroot_listing  (cost=0.57..26413.02 rows=32641 width=4)
                     Index Cond: (rpm_id = 545715)
   ->  Hash  (cost=55558.40..55558.40 rows=305736 width=4)
         ->  Seq Scan on build  (cost=0.00..55558.40 rows=305736 width=4)
               Filter: (state = 2)
(12 rows)

2) Even the unlimited call is almost 1000x slower then SQL with subselect

explain analyze SELECT rpminfo.id, rpminfo.name, rpminfo.version, rpminfo.release, rpminfo.arch, rpminfo.build_id FROM buildroot_listing JOIN rpminfo ON rpminfo.buildroot_id = buildroot_listing.buildroot_id JOIN build on rpminfo.build_id = build.id  WHERE buildroot_listing.rpm_id = 586920 AND build.state = 2 ;
                                                                             QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=88718.99..528584.51 rows=1414152 width=47) (actual time=288.820..291.204 rows=10 loops=1)
   Hash Cond: (rpminfo.build_id = build.id)
   ->  Merge Join  (cost=29338.89..397664.93 rows=4174397 width=47) (actual time=22.371..26.390 rows=115 loops=1)
         Merge Cond: (rpminfo.buildroot_id = buildroot_listing.buildroot_id)
         ->  Index Scan using rpminfo_buildroot on rpminfo  (cost=0.43..334103.70 rows=6776805 width=51) (actual time=0.009..18.889 rows=75816 loops=1)
         ->  Sort  (cost=28860.18..28941.78 rows=32641 width=4) (actual time=0.040..0.049 rows=105 loops=1)
               Sort Key: buildroot_listing.buildroot_id
               Sort Method: quicksort  Memory: 25kB
               ->  Index Scan using buildroot_listing_rpms on buildroot_listing  (cost=0.57..26413.02 rows=32641 width=4) (actual time=0.012..0.031 rows=15 loops=1)
                     Index Cond: (rpm_id = 586920)
   ->  Hash  (cost=55558.40..55558.40 rows=305736 width=4) (actual time=264.584..264.584 rows=296753 loops=1)
         Buckets: 32768  Batches: 1  Memory Usage: 10433kB
         ->  Seq Scan on build  (cost=0.00..55558.40 rows=305736 width=4) (actual time=0.033..203.177 rows=296753 loops=1)
               Filter: (state = 2)
               Rows Removed by Filter: 592671
 Planning time: 0.605 ms
 Execution time: 291.251 ms
(17 rows)

explain analyze SELECT rpminfo.id, rpminfo.name, rpminfo.version, rpminfo.release, rpminfo.arch, rpminfo.build_id FROM buildroot_listing JOIN rpminfo ON rpminfo.buildroot_id = buildroot_listing.buildroot_id JO^C build on rpminfo.build_id = build.id  WHERE buildroot_listing.rpm_id = 586920 AND build.state = 2 ;
brew=> explain analyze select rpminfo.id from rpminfo, build WHERE build.state = 2 AND rpminfo.build_id = build.id AND rpminfo.buildroot_id IN (SELECT DISTINCT buildroot_id FROM buildroot_listing where buildroot_listing.rpm_id = 586920) ;
                                                                             QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=26495.48..26572.04 rows=43 width=4) (actual time=0.095..0.449 rows=10 loops=1)
   ->  Nested Loop  (cost=26495.05..26509.77 rows=128 width=8) (actual time=0.051..0.176 rows=115 loops=1)
         ->  HashAggregate  (cost=26494.62..26494.63 rows=1 width=4) (actual time=0.040..0.041 rows=15 loops=1)
               Group Key: buildroot_listing.buildroot_id
               ->  Index Scan using buildroot_listing_rpms on buildroot_listing  (cost=0.57..26413.02 rows=32641 width=4) (actual time=0.014..0.033 rows=15 loops=1)
                     Index Cond: (rpm_id = 586920)
         ->  Index Scan using rpminfo_buildroot on rpminfo  (cost=0.43..13.85 rows=128 width=12) (actual time=0.005..0.007 rows=8 loops=15)
               Index Cond: (buildroot_id = buildroot_listing.buildroot_id)
   ->  Index Scan using build_pkey on build  (cost=0.42..0.48 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=115)
         Index Cond: (id = rpminfo.build_id)
         Filter: (state = 2)
         Rows Removed by Filter: 1
 Planning time: 0.493 ms
 Execution time: 0.497 ms
(14 rows)

Metadata Update from @tkopecek:
- Custom field Size adjusted to small
- Issue set to the milestone: 1.19

4 years ago

Metadata Update from @jcupova:
- Issue tagged with: testing-done

4 years ago

Metadata Update from @dgregor:
- Issue assigned to tkopecek

4 years ago

Login to comment on this ticket.

Metadata
Related Pull Requests
  • #1675 Merged 4 years ago