#757 $ copr list-packages --with-latest-succeeded-build <project> takes too long
Closed: Fixed 2 years ago by praiskup. Opened 4 years ago by praiskup.


My guess: Serializing all the build information takes too much time. I'd be happy to just get build IDs.

Metadata Update from @frostyx:
- Issue assigned to frostyx

4 years ago

My guess: Serializing all the build information takes too much time. I'd be happy to just get build IDs.

Actually, the issue is IMHO different. The API endpoint for returning all packages uses the SQLAlchemy ORM, while the monitor page uses a manually written SQL query. We once had similar performance issues on other pages, like

https://copr.fedorainfracloud.org/coprs/<user>/<project>/builds/

and

https://copr.fedorainfracloud.org/coprs/<user>/<project>/packages/

and fixed them, so this should be fixable too.
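To illustrate what a hand-written query buys here, the following is a minimal sketch using SQLite. The schema, the table and column names, and the helper functions are all hypothetical and much simpler than the real Copr database; the point is only that one aggregate query can return every package together with the ID of its latest succeeded build.

```python
import sqlite3

# Hypothetical, heavily simplified schema; the real Copr tables differ.
SCHEMA = """
CREATE TABLE package (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE build (
    id INTEGER PRIMARY KEY,
    package_id INTEGER NOT NULL REFERENCES package(id),
    status TEXT NOT NULL
);
"""

# One query returns every package with the ID of its newest succeeded build
# (NULL if it has none), so no per-package follow-up query is needed.
LATEST_SUCCEEDED = """
SELECT package.name, MAX(build.id) AS latest_succeeded_build_id
FROM package
LEFT JOIN build
    ON build.package_id = package.id AND build.status = 'succeeded'
GROUP BY package.id, package.name
ORDER BY package.name
"""


def latest_succeeded_builds(conn):
    """Return [(package_name, build_id_or_None), ...] using a single query."""
    return conn.execute(LATEST_SUCCEEDED).fetchall()


def demo_connection():
    """Build an in-memory database with a few made-up packages and builds."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO package (id, name) VALUES (?, ?)",
        [(1, "tracer"), (2, "hello"), (3, "never-built")],
    )
    conn.executemany(
        "INSERT INTO build (id, package_id, status) VALUES (?, ?, ?)",
        [(10, 1, "succeeded"), (11, 1, "failed"),
         (12, 2, "succeeded"), (13, 2, "succeeded")],
    )
    return conn


if __name__ == "__main__":
    for name, build_id in latest_succeeded_builds(demo_connection()):
        print(name, build_id)
```

Returning only build IDs, as the original report asks for, also avoids serializing full build objects.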

I started working on this issue. It is not fixed yet but I want to share the progress.

I originally thought that all the API queries were slower than their HTML counterparts. They are not, but ... there is a but.

On a project with less than 100 builds

  • client.build_proxy.get_list("frostyx", "tracer") took 0m0.464s
  • curl http://127.0.0.1:5000/coprs/frostyx/tracer/builds/ took 0m0.137s

We could look into this, but both are under one second, so I am not sure anyone even notices the difference. The results are similar when querying for packages: 0.1s vs. 0.2s.

Now moving to the big projects, I used iucar/cran as an example.

  • client.build_proxy.get_list("iucar", "cran") took minutes (explanation later)
  • curl http://127.0.0.1:5000/coprs/iucar/cran/builds/ took 0m30.205s

This looks like a huge performance issue in the API, but we need to consider the fact that the HTML output is streamed, 1000 results at a time. I believe that curl ends after obtaining the first batch. So I would add these two measurements into consideration.

  • client.build_proxy.get_list("iucar", "cran", pagination={"limit": 1000}) took 0m2.422s
  • Loading http://127.0.0.1:5000/coprs/iucar/cran/builds/ in a browser took 30s to show the first batch of results and way longer to finish loading the entire page.

I wasn't able to get consistent times for loading the complete list, so here are several attempts:

  • Browser: 7m, 1m30s, 8m, 5m40s
  • API without pagination: 4m11.558s, 2m5.137s, 2m6.148s

I am not sure what we can conclude from these results, but in general, the API doesn't seem slower to me.

See how to work with API pagination here https://python-copr.readthedocs.io/en/latest/client_v3/pagination.html
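As a rough illustration of why the paginated request above returns quickly, here is a generic limit/offset loop. It deliberately does not use python-copr: `iter_all`, `fake_fetch_page`, and the fake data are hypothetical names made up for this sketch, and the linked documentation describes the real client helpers.

```python
def iter_all(fetch_page, limit=1000):
    """Yield every item, requesting at most `limit` items per call."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        if not page:
            break
        yield from page
        if len(page) < limit:
            # A short page means there is nothing left to fetch.
            break
        offset += limit


# Hypothetical stand-in for a server endpoint: 2500 fake build IDs.
FAKE_BUILDS = list(range(1, 2501))
REQUESTS = []  # records the (limit, offset) of every simulated request


def fake_fetch_page(limit, offset):
    """Return one slice of the fake data, like a single paginated request."""
    REQUESTS.append((limit, offset))
    return FAKE_BUILDS[offset:offset + limit]


if __name__ == "__main__":
    builds = list(iter_all(fake_fetch_page, limit=1000))
    print(len(builds), "builds fetched in", len(REQUESTS), "requests")
```

Each request stays small, so the caller can start processing the first batch without waiting for the whole project to be serialized.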

The mail archives don't currently work because of the infrastructure migration, but I guess the following mail is linked in this issue:

API feedback: It's faster to grep HTML

It suggests that, for obtaining all packages with their last build, it is much faster to parse the HTML returned from the Monitor page than to run

copr list-packages --with-latest-succeeded-build <copr>

I am not able to load the monitor page for these huge projects, so I am not sure whether you switched to parsing the packages page ... but anyway, the CLI command is indeed excruciatingly slow, and from my POV it is the only API performance issue that needs to be fixed. I know the title of this issue explicitly says that the problem is with just that one command, but I suspected a more general issue.

The reason it is so slow is its naive implementation. It obtains the list of all packages and then sends a separate request for each of them to get its last build. We need to obtain everything in just one or two requests ...
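The difference between the two approaches can be sketched with a fake server that only counts requests. Everything here is hypothetical (the `FakeServer` class, its method names, and the package data); it is not the Copr API, just an illustration of how the request count grows with the number of packages in the naive approach.

```python
class FakeServer:
    """Hypothetical in-memory stand-in for the API; it only counts requests."""

    def __init__(self, latest_builds):
        self._latest = dict(latest_builds)  # package name -> latest build ID
        self.requests = 0

    def list_packages(self):
        self.requests += 1
        return sorted(self._latest)

    def latest_build(self, package):
        self.requests += 1
        return self._latest[package]

    def list_packages_with_latest_build(self):
        # Assumed batched endpoint: packages and build IDs in one response.
        self.requests += 1
        return dict(self._latest)


def naive(server):
    """One request for the package list, then one more per package."""
    return {name: server.latest_build(name) for name in server.list_packages()}


def batched(server):
    """A single request that already carries the latest build IDs."""
    return server.list_packages_with_latest_build()


if __name__ == "__main__":
    data = {"pkg-%d" % i: 1000 + i for i in range(100)}
    slow, fast = FakeServer(data), FakeServer(data)
    assert naive(slow) == batched(fast)
    print("naive requests:", slow.requests)
    print("batched requests:", fast.requests)
```

With N packages the naive variant needs N + 1 round trips, which is what dominates the run time for projects with thousands of packages.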

Metadata Update from @praiskup:
- Issue priority set to: High

3 years ago

It seems that --with-latest-build is still slow for large projects, but I haven't measured it yet ...

Metadata Update from @praiskup:
- Issue status updated to: Open (was: Closed)

2 years ago

Metadata Update from @praiskup:
- Assignee reset

2 years ago

Metadata Update from @praiskup:
- Issue assigned to frostyx

2 years ago

Re-reported here: https://bugzilla.redhat.com/show_bug.cgi?id=2036631
(the monitor CLI command seems to work around this)

Metadata Update from @praiskup:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 years ago

Metadata
Related Pull Requests
  • #1433 Merged 3 years ago