#6164 SQL request on db01 (prod) hangs for no obvious reason
Closed: Fixed 6 years ago Opened 6 years ago by abompard.

If I'm running the following query on the hyperkitty database:

hyperkitty=> SELECT date FROM "hyperkitty_email" WHERE mailinglist_id = 'x86@lists.fedoraproject.org' ORDER BY date ASC;

then everything works fine and I get one result. If I add "LIMIT 1" to this query, which becomes:

hyperkitty=> SELECT date FROM "hyperkitty_email" WHERE mailinglist_id = 'x86@lists.fedoraproject.org' ORDER BY date ASC LIMIT 1;

then the query hangs apparently forever and I have to kill it (Ctrl-C).

As if it needed to be weirder, it does not happen with another list (or I haven't found any other yet), if you replace "x86" with "test", it works fine.

No clue what's going on.


This is because there's no index that hits with the limit 1 case.

hyperkitty=# explain SELECT date FROM "hyperkitty_email" WHERE mailinglist_id = 'x86@lists.fedoraproject.org' ORDER BY date ASC;
                                               QUERY PLAN                                                
---------------------------------------------------------------------------------------------------------
 Sort  (cost=10335.24..10343.84 rows=3440 width=8)
   Sort Key: date
   ->  Bitmap Heap Scan on hyperkitty_email  (cost=210.20..10133.17 rows=3440 width=8)
         Recheck Cond: ((mailinglist_id)::text = 'x86@lists.fedoraproject.org'::text)
         ->  Bitmap Index Scan on hyperkitty_email_mailinglist_id  (cost=0.00..209.34 rows=3440 width=0)
               Index Cond: ((mailinglist_id)::text = 'x86@lists.fedoraproject.org'::text)
(6 rows)
hyperkitty=# explain SELECT date FROM "hyperkitty_email" WHERE mailinglist_id = 'x86@lists.fedoraproject.org' ORDER BY date ASC LIMIT 1;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..860.52 rows=1 width=8)
   ->  Index Scan using hyperkitty_email_date on hyperkitty_email  (cost=0.00..2960202.17 rows=3440 width=8)
         Filter: ((mailinglist_id)::text = 'x86@lists.fedoraproject.org'::text)
(3 rows)

I really didn't expect PostgreSQL to change its query plan so much just for adding a LIMIT clause, but yeah the slowdown looks obvious now.
For a reason I don't know, for LIMIT values of 12 and up, PostgreSQL uses the bitmap heap scan on the mailing list name first, but for lower values it scans the date index, which takes ages obviously.

Other people seem to have the same problem:
- https://dba.stackexchange.com/questions/60419/why-does-this-limit-make-the-postgres-planner-use-a-much-slower-index-scan-inste
- http://datamangling.com/2014/01/17/limit-1-and-performance-in-a-postgres-query/
- http://thebuild.com/blog/2014/11/18/when-limit-attacks/

There's apparently a way to trick the query planner by wrapping the query in a re-ordering query:

SELECT date FROM (SELECT date FROM hyperkitty_email WHERE mailinglist_id = 'x86@lists.fedoraproject.org' ORDER BY date ASC) AS q ORDER BY date LIMIT 1 ;

This doesn't look great but works. I'm kinda reluctant to change the code to accomodate PostgreSQL in this non-obvious way. The other option would be to run SET enable_indexscan = OFF; before the query and SET enable_indexscan = ON; after, only if the database driver is PostgreSQL. It doesn't look great either but it's more obvious that we're trying to workaround something in this specific database server.

Alright I'll think about it some more and take my pick.

It should be better now.

ok. Is there anything further to do here? If so, please re-open...

:negative_squared_cross_mark:

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

6 years ago

Login to comment on this ticket.

Metadata