Please set the Content-Encoding response header only when the webserver itself is compressing the response. When serving .tar.gz files from http://repos.fedorapeople.org/, Content-Encoding: gzip is set. Compare the exact same file being served from http://repos.fedorapeople.org/ versus S3.
From http://repos.fedorapeople.org/
[bmbouter@localhost pulp_python]$ curl -I https://repos.fedorapeople.org/pulp/pulp/fixtures/python-pypi/packages/shelf_reader-0.1-py2-none-any.whl
HTTP/1.1 200 OK
Date: Mon, 13 Aug 2018 18:57:54 GMT
Server: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Last-Modified: Mon, 13 Aug 2018 02:53:10 GMT
ETag: "57b7-57348319d1186"
Accept-Ranges: bytes
Content-Length: 22455
Cache-Control: max-age=1800
Expires: Mon, 13 Aug 2018 19:27:54 GMT
Vary: Accept-Encoding,User-Agent
X-GitProject: (null)
AppTime: D=266
AppServer: people02.fedoraproject.org
From S3, notice there is no content-encoding set:
[bmbouter@localhost pulp_python]$ curl -I https://files.pythonhosted.org/packages/77/e0/2156a3da94ee16466a5936394caf7e89873a9b46eed72a9912bc90e42dbf/shelf_reader-0.1-py2-none-any.whl
HTTP/2 200
x-amz-id-2: B6mC7AYwpc9DeSHPmMvUZeAFmdDW2DYXui6R/W8rXZhMGO40eyQLIfB65WB4SVQD0ub4ADhrqXw=
x-amz-request-id: 550F8F66900F994D
last-modified: Thu, 19 May 2016 18:59:09 GMT
etag: "69b867d206f1ff984651aeef25fc54f9"
x-amz-version-id: 4TGoYTc_51lerne.zoXHNsLphLO4s7Xh
content-type: application/octet-stream
server: AmazonS3
cache-control: max-age=365000000, immutable, public
accept-ranges: bytes
date: Mon, 13 Aug 2018 18:58:45 GMT
age: 1806714
x-served-by: cache-sea1025-SEA, cache-dca17732-DCA
x-cache: HIT, HIT
x-cache-hits: 1, 1
x-timer: S1534186725.016088,VS0,VE1
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-frame-options: deny
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
x-permitted-cross-domain-policies: none
x-robots-header: noindex
content-length: 22455
Interestingly, this blog post seems to describe this problem exactly: https://blogs.msdn.microsoft.com/wndp/2006/08/21/content-encoding-content-type/
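To illustrate why the header matters, here is a minimal Python sketch of what a decoding HTTP client does with the body. The function name body_as_saved and the sample payload are made up for this example, and it only models gzip; real clients differ in the details.

```python
import gzip


def body_as_saved(raw_body: bytes, headers: dict) -> bytes:
    """Return the bytes a decoding HTTP client would write to disk.

    Hypothetical helper: models only the gzip case of Content-Encoding.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    encoding = lowered.get("content-encoding", "").strip().lower()
    if encoding == "gzip":
        # The client treats gzip as a transport detail and strips it.
        return gzip.decompress(raw_body)
    return raw_body


# Stand-in for the bytes of a .tar.gz file stored on the server.
archive = gzip.compress(b"pretend this is a tar stream")

# Server labels the stored file as gzip-encoded: the client unpacks it,
# so the file saved as *.tar.gz is no longer gzip data.
mislabeled = body_as_saved(archive, {"Content-Encoding": "gzip"})

# Server describes the file by type only: the bytes arrive untouched.
untouched = body_as_saved(archive, {"Content-Type": "application/x-gzip"})
```

In the first case the saved file has lost its gzip layer even though its name still ends in .tar.gz; in the second case it is byte-for-byte what was uploaded.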
When do you need this? (YYYY/MM/DD)
When you can. Soon would be great because we currently have to sync from production mirrors instead.
When is this no longer needed or useful? (YYYY/MM/DD)
N/A
If we cannot complete your request, what is the impact?
We would have to move off of fedorapeople for our fixture data hosting needs I guess.
Metadata Update from @smooge: - Issue assigned to smooge
Commit 45626dc9e22a476e4fa7bd67705cd743c7e0300e caused .tar.gz files on people to be seen as text files. Commit dbd5d1419cf228e428dd549b42a13df82f9eda96 tries to fix this using the logic from 4039e6bc32429c0e8014ba835f167f074b04d1e3 and others.
The fix has been confirmed by the reporter.
Metadata Update from @smooge: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)
I just tested a change and now I get the correct result:
[bmbouter@localhost pulp_python]$ curl -I https://repos.fedorapeople.org/pulp/pulp/fixtures/python-pypi/packages/shelf-reader-0.1.tar.gz
HTTP/1.1 200 OK
Date: Mon, 13 Aug 2018 20:30:07 GMT
Server: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Last-Modified: Mon, 13 Aug 2018 02:53:10 GMT
ETag: "4a99-57348319d1186"
Accept-Ranges: bytes
Content-Length: 19097
Cache-Control: max-age=1800
Expires: Mon, 13 Aug 2018 21:00:07 GMT
X-GitProject: (null)
AppTime: D=216
AppServer: people02.fedoraproject.org
Content-Type: application/x-gzip
Metadata Update from @bmbouter: - Issue status updated to: Open (was: Closed)
Metadata Update from @bmbouter: - Issue status updated to: Closed (was: Open)
We're experiencing the same (or a similar) issue again. Here are the headers:
FedoraPeople
(pulp) [vagrant@pulp3 pulp]$ curl -I https://repos.fedorapeople.org/repos/pulp/pulp/fixtures/python-pypi/packages/Django-1.10.3-py2.py3-none-any.whl
HTTP/1.1 200 OK
Date: Fri, 04 Jan 2019 15:51:56 GMT
Server: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Last-Modified: Sun, 30 Dec 2018 04:30:35 GMT
ETag: "6844b7-57e35c1ef82a1"
Accept-Ranges: bytes
Content-Length: 6833335
Cache-Control: max-age=1800
Expires: Fri, 04 Jan 2019 16:21:56 GMT
Vary: Accept-Encoding,User-Agent
X-GitProject: (null)
AppTime: D=181
AppServer: people02.fedoraproject.org
PyPI
(pulp) [vagrant@pulp3 pulp]$ curl -I https://files.pythonhosted.org/packages/b5/33/1ab8727270fa6b354545d8100fe15bc23c9b57950c49a72919f34216f167/pulpcore-3.0.0a1.tar.gz
HTTP/2 200
x-amz-id-2: woBRRF6CjzELaipjNMIU6SKoFLuDRVJ6eaqfdfM53uzhYfa9h4OxWCxWc8LdoKqtwaXn6HBl1O8=
x-amz-request-id: 61BE7FDA5AB5D7C3
last-modified: Tue, 26 Sep 2017 15:27:13 GMT
etag: "c65450d831e33ef4fca83a3dc73b5c2d"
x-amz-version-id: MGB_aefFFadyJxhzh0EH1oleOBQfPi0s
content-type: binary/octet-stream
server: AmazonS3
cache-control: max-age=365000000, immutable, public
accept-ranges: bytes
date: Fri, 04 Jan 2019 15:52:11 GMT
age: 2804
x-served-by: cache-sea1030-SEA, cache-mdw17382-MDW
x-cache: HIT, HIT
x-cache-hits: 1, 1
x-timer: S1546617131.183972,VS0,VE2
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-frame-options: deny
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
x-permitted-cross-domain-policies: none
x-robots-header: noindex
content-length: 74559
I am not sure what problem you are running into. The original problem dealt with .tar.gz files, and if I run curl against a .tar.gz file it says:
[smooge@smoogen-laptop tmp]$ curl -I https://repos.fedorapeople.org/pulp/pulp/fixtures/python-pypi/packages/shelf-reader-0.1.tar.gz
HTTP/1.1 200 OK
Date: Sat, 05 Jan 2019 20:15:21 GMT
Server: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Last-Modified: Sun, 30 Dec 2018 04:30:37 GMT
ETag: "4a99-57e35c20c1709"
Accept-Ranges: bytes
Content-Length: 19097
Cache-Control: max-age=1800
Expires: Sat, 05 Jan 2019 20:45:21 GMT
X-GitProject: (null)
AppTime: D=44672
AppServer: people02.fedoraproject.org
Content-Type: application/x-gzip
A .whl file is not listed in /etc/mime.types so it is going to be treated as a general file.
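As a rough sketch of that lookup, assuming the usual mime.types layout (one type followed by its extensions per line): the SAMPLE_MIME_TYPES text below is a made-up excerpt, not the actual file on people02, and the helper names are hypothetical.

```python
# Made-up excerpt in mime.types format; the real file on the server will differ.
SAMPLE_MIME_TYPES = """\
# MIME type               extensions
application/x-gzip        gz tgz
application/zip           zip
text/plain                txt asc
"""


def load_mime_types(text):
    """Parse mime.types-style text into an {extension: type} mapping."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *extensions = line.split()
        for extension in extensions:
            mapping[extension.lower()] = mime_type
    return mapping


def lookup(filename, mapping):
    """Return the type for the file's last extension, or None if unlisted."""
    if "." not in filename:
        return None
    return mapping.get(filename.rsplit(".", 1)[-1].lower())


types = load_mime_types(SAMPLE_MIME_TYPES)
tarball_type = lookup("shelf-reader-0.1.tar.gz", types)
wheel_type = lookup("shelf_reader-0.1-py2-none-any.whl", types)
```

With this sample table the tarball resolves to application/x-gzip, while the wheel gets no match and falls back to whatever default the server applies.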
After more investigation I think you're right: I don't think it has anything to do with the headers. Whereas our previous issue caused the file to be saved in the wrong format, I think what we're actually experiencing here is a corrupted download due to an SSL error.
I went digging in the logs and found these errors:
https://paste.fedoraproject.org/paste/lbsRHzrBH2j77FdoXqBPWg
We will continue investigating the root cause, but I believe this issue can be closed. Thanks @smooge
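For anyone hitting something similar, a quick way to tell a corrupted or truncated download apart from a header problem is to check the saved bytes against Content-Length and, if one is published, a checksum. This is only a sketch; download_problems is a hypothetical helper and not part of pulp.

```python
import hashlib
from typing import List, Optional


def download_problems(
    body: bytes,
    content_length: int,
    expected_sha256: Optional[str] = None,
) -> List[str]:
    """Return human-readable reasons the download looks damaged (empty if none)."""
    problems = []
    if len(body) != content_length:
        problems.append(
            "expected %d bytes, got %d" % (content_length, len(body))
        )
    if expected_sha256 is not None:
        actual = hashlib.sha256(body).hexdigest()
        if actual != expected_sha256.lower():
            problems.append("sha256 mismatch: got %s" % actual)
    return problems


# Toy payload standing in for a downloaded file.
payload = b"wheel bytes" * 4
good_digest = hashlib.sha256(payload).hexdigest()

clean = download_problems(payload, len(payload), good_digest)
truncated = download_problems(payload[:-5], len(payload), good_digest)
```

A clean transfer yields an empty list; a transfer cut short mid-stream reports both a size and a digest mismatch.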
I will close this one. If there is an issue with how the webserver should identify .whl files, please open a separate ticket so we can track it appropriately.
Thanks @smooge for helping make the fedora infra great.