#455 download-task command hits MemoryError on large files
Closed: Fixed 5 years ago. Opened 5 years ago by rjones.

This is actually in "brewkoji", which is supposedly a front end to koji, but there is no specific place to file bugs against brewkoji.

$ brew download-task 13402507 --arch=x86_64
Downloading [1/16]: kernel-tools-libs-devel-3.10.0-679.el7.rwmj4.x86_64.rpm
Downloading [2/16]: kernel-tools-debuginfo-3.10.0-679.el7.rwmj4.x86_64.rpm
Downloading [3/16]: python-perf-3.10.0-679.el7.rwmj4.x86_64.rpm
Downloading [4/16]: kernel-headers-3.10.0-679.el7.rwmj4.x86_64.rpm
Downloading [5/16]: kernel-debuginfo-common-x86_64-3.10.0-679.el7.rwmj4.x86_64.rpm
Downloading [6/16]: kernel-3.10.0-679.el7.rwmj4.x86_64.rpm
Downloading [7/16]: kernel-debuginfo-3.10.0-679.el7.rwmj4.x86_64.rpm
Fault: <Fault 1: "<type 'exceptions.MemoryError'>: ">

This happens every time for that particular task.

The machine has 32 GB of RAM, so it's unlikely to be caused by running out of memory.

kernel 3.10.0-679.el7.x86_64

Looking around, it seems the bug could be caused by koji using an inefficient method to construct very long strings (e.g. using += in a loop). Is there any way to get a more precise or informative stack trace?

Metadata Update from @mikem:
- Issue private status set to: True

5 years ago

Metadata Update from @mikem:
- Issue private status set to: False (was: True)

5 years ago

For future reference, if you want to file bugs specifically against Brew tools like brewkoji, you can use our internal Jira.

However, this is not a brewkoji bug; the error is happening on the hub.

EDIT: the memory error is happening on the hub due to configured limits. No xmlrpc call can return a large file in a single pass. The download-task CLI handler needs to be adjusted.

The problem is that the tool is using the downloadTaskOutput call, which returns the entire contents of the file, base64 encoded, in a single call. For these kernel files, this runs quickly into our configured rlimits on the hub.
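
To give a rough sense of why a single-call transfer is expensive, here is a small sketch of the base64 size overhead. It only illustrates the encoding arithmetic; `encoded_size` is a hypothetical helper, and the sketch says nothing about how the hub actually builds its response or what its rlimits are.

```python
import base64


def encoded_size(raw_len: int) -> int:
    """Length of the padded base64 text for raw_len bytes of input."""
    # base64 maps every 3 input bytes (rounded up) to 4 output characters.
    return 4 * ((raw_len + 2) // 3)


# A 3 MB payload becomes 4 MB of text, and both copies exist in memory
# at the same time while the encoding is being produced.
payload = b"\0" * 3_000_000
encoded = base64.b64encode(payload)
print(len(payload), len(encoded))
```

The encoded form is about a third larger than the file, so returning a whole debuginfo package this way needs well over the file's size in memory on the server side.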

Ah, it looks like the download-task command is simply using downloadTaskOutput on the entire file. That's not going to work in general.

Probably the best solution here is for the client to calculate the url and perform a regular download (as download-build does).
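
A minimal sketch of what a regular streamed download could look like on the client side, using only the Python standard library. `stream_download` is a hypothetical helper, not the code used by download-build, and the URL is taken as a parameter because the path layout on the server is not described in this ticket.

```python
import shutil
import urllib.request


def stream_download(url, dest_path, chunk_size=1024 * 1024):
    """Copy the resource at url to dest_path in fixed-size pieces.

    Only chunk_size bytes are held in memory at any moment, so the
    size of the file does not matter.
    """
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
```

Because the transfer goes through the web server as a plain file download, nothing has to be base64 encoded or assembled into a single xmlrpc response.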

Alternatively, downloadTaskOutput could be used in chunks (as download-logs does).
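
The chunked approach could be sketched roughly as follows. `fetch_chunk` is a hypothetical callable standing in for whatever call returns a byte range of the task output (for example a wrapper around downloadTaskOutput that passes an offset and a size and decodes the result); its exact signature is an assumption, and this is not the download-logs implementation.

```python
def download_in_chunks(fetch_chunk, out, chunk_size=1024 * 1024):
    """Write a remote file to the binary file object out, piece by piece.

    fetch_chunk(offset, size) must return at most size bytes starting
    at offset, and an empty bytes object once offset is past the end.
    Returns the total number of bytes written.
    """
    offset = 0
    while True:
        data = fetch_chunk(offset, chunk_size)
        if not data:
            # An empty result means the end of the file was reached.
            break
        out.write(data)
        offset += len(data)
    return offset
```

Each request then carries at most `chunk_size` bytes, which keeps every individual response small at the cost of more round trips.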

Metadata Update from @mikem:
- Issue set to the milestone: 1.14

5 years ago
