When using spectool -g specfile to download sources, one of the sources is saved as a gzipped file, although the server reports its Content-Type as text/plain and the file is expected to be plain text.
You can use the following spec to reproduce it: https://src.fedoraproject.org/rpms/tor/raw/c734b9e2bd65408ca3df4e591d83e68a22262f6d/f/tor.spec
You can also see with curl that the content is clearly text/plain and that curl fetches it correctly.
With spectool, however, the file ends up as gzip-compressed content.
$ podman run -it fedora:33
# dnf install -y wget rpmdevtools
[...]
[root@22149421e988 /]# wget https://src.fedoraproject.org/rpms/tor/raw/c734b9e2bd65408ca3df4e591d83e68a22262f6d/f/tor.spec
--2021-01-23 10:18:27-- https://src.fedoraproject.org/rpms/tor/raw/c734b9e2bd65408ca3df4e591d83e68a22262f6d/f/tor.spec
Resolving src.fedoraproject.org (src.fedoraproject.org)... 38.145.60.21, 38.145.60.20
Connecting to src.fedoraproject.org (src.fedoraproject.org)|38.145.60.21|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 29935 (29K) [text/plain]
Saving to: ‘tor.spec’
tor.spec 100%[=============================================================>] 29.23K 151KB/s in 0.2s
2021-01-23 10:18:28 (151 KB/s) - ‘tor.spec’ saved [29935/29935]
[root@22149421e988 /]# spectool -g tor.spec
Downloading: https://dist.torproject.org/tor-0.4.5.4-rc.tar.gz
100% of 7.5 MiB |###############################################################################| Elapsed Time: 0:00:00 Time: 0:00:00
Downloaded: tor-0.4.5.4-rc.tar.gz
Downloading: https://dist.torproject.org/tor-0.4.5.4-rc.tar.gz.asc
100% of 670.0 B |#################################################################################| Elapsed Time: 0:00:00 Time: 0:00:00
Downloaded: tor-0.4.5.4-rc.tar.gz.asc
[root@22149421e988 /]# file tor-0.4.5.4-rc.tar.gz.asc
tor-0.4.5.4-rc.tar.gz.asc: gzip compressed data, from Unix, original size modulo 2^32 833
[root@22149421e988 /]# zcat tor-0.4.5.4-rc.tar.gz.asc
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEegKzUh3HXFQroBVFav7m1J6StgEFAmALAOoACgkQav7m1J6S
tgFw4w/6AzmpCbd6r7Xk1XtpXE9MnGoJFdWKAhyIoCWcPLB+LNjRERgFCWcnGXqg
nkr0lPIrhvJ6T0k72Wkn8Tp9v4GlxIGxBew2KA2ImTNDw8Uf0wTDOqHQ5ulVdaEP
fvV8dY91lOnXPK9sMjpobeK9zzFjzg5CQc0fUtrQNy9o4o9D2/gy1dz2ZTEYsPxX
/UgDtyhoAD7T9CG9m3zUO5ORM38pKoPlFn3SGFz2Syv0gGTmaiMUniEZUT2y4Jtq
0S9lg631OVnRF672QkgIqV9Vn1JOSh3Ykhx9V7mEKLSUhgHYNllPP8ooy7C/zVUV
vNi5cZJ4NEXL3kFELGXq85VXHn8yY8LDD2PuxPJz3qFscGSL2TkdZR3QjqP5cica
QEzgT0z3Ga3eZ5GDvlPGrYh4fNpuBPP4pbsn+qSYSQSMz07xssbnDSs9ovF9gedg
tcQhF1FnnV4XBd/m+4RJyBjvo84HRekibaFhSokcE56uw4a2CDU5i0ABdzQTUaXr
lc6GONkxAEdMTCok61r0NlH5bBVwkMYEpw66C99MJtu2ZrrEOL0RNCGQwqsQveDO
qNXL7Uj3JAUZYyBKM9cAWwF2lS6HVAFkaCnnynFSfxBsymHJUFvdJGtZdMBCfhCS
6AK95423J/jqYJHqdhGZPOjaKrtCHRqI7es29njyaFaykQ8juuw=
=ELYZ
-----END PGP SIGNATURE-----
[root@22149421e988 /]# curl -v https://dist.torproject.org/tor-0.4.5.4-rc.tar.gz.asc
* Trying 116.202.120.166:443...
* Connected to dist.torproject.org (116.202.120.166) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
* subject: CN=dist.torproject.org
* start date: Nov 27 00:55:50 2020 GMT
* expire date: Feb 25 00:55:50 2021 GMT
* subjectAltName: host "dist.torproject.org" matched cert's "dist.torproject.org"
* issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
* SSL certificate verify ok.
> GET /tor-0.4.5.4-rc.tar.gz.asc HTTP/1.1
> Host: dist.torproject.org
> User-Agent: curl/7.71.1
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Sat, 23 Jan 2021 10:19:34 GMT
< Server: Apache
< X-Content-Type-Options: nosniff
< X-Frame-Options: sameorigin
< X-Xss-Protection: 1
< Referrer-Policy: no-referrer
< Strict-Transport-Security: max-age=15768000; preload
< Content-Security-Policy: default-src 'self';
< Last-Modified: Fri, 22 Jan 2021 16:45:45 GMT
< ETag: "341-5b97feb7409ec"
< Accept-Ranges: bytes
< Content-Length: 833
< Cache-Control: max-age=3600
< Expires: Sat, 23 Jan 2021 11:19:34 GMT
< Vary: Accept-Encoding
< Content-Type: text/plain
<
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEegKzUh3HXFQroBVFav7m1J6StgEFAmALAOoACgkQav7m1J6S
tgFw4w/6AzmpCbd6r7Xk1XtpXE9MnGoJFdWKAhyIoCWcPLB+LNjRERgFCWcnGXqg
nkr0lPIrhvJ6T0k72Wkn8Tp9v4GlxIGxBew2KA2ImTNDw8Uf0wTDOqHQ5ulVdaEP
fvV8dY91lOnXPK9sMjpobeK9zzFjzg5CQc0fUtrQNy9o4o9D2/gy1dz2ZTEYsPxX
/UgDtyhoAD7T9CG9m3zUO5ORM38pKoPlFn3SGFz2Syv0gGTmaiMUniEZUT2y4Jtq
0S9lg631OVnRF672QkgIqV9Vn1JOSh3Ykhx9V7mEKLSUhgHYNllPP8ooy7C/zVUV
vNi5cZJ4NEXL3kFELGXq85VXHn8yY8LDD2PuxPJz3qFscGSL2TkdZR3QjqP5cica
QEzgT0z3Ga3eZ5GDvlPGrYh4fNpuBPP4pbsn+qSYSQSMz07xssbnDSs9ovF9gedg
tcQhF1FnnV4XBd/m+4RJyBjvo84HRekibaFhSokcE56uw4a2CDU5i0ABdzQTUaXr
lc6GONkxAEdMTCok61r0NlH5bBVwkMYEpw66C99MJtu2ZrrEOL0RNCGQwqsQveDO
qNXL7Uj3JAUZYyBKM9cAWwF2lS6HVAFkaCnnynFSfxBsymHJUFvdJGtZdMBCfhCS
6AK95423J/jqYJHqdhGZPOjaKrtCHRqI7es29njyaFaykQ8juuw=
=ELYZ
-----END PGP SIGNATURE-----
* Connection #0 to host dist.torproject.org left intact
[root@22149421e988 /]# rpm -qi rpmdevtools
Name : rpmdevtools
Version : 9.3
Release : 1.fc33
Architecture: noarch
Install Date: Sat Jan 23 10:17:03 2021
Group : Unspecified
Size : 222089
License : GPLv2+ and GPLv2
Signature : RSA/SHA256, Wed Jan 20 12:26:53 2021, Key ID 49fd77499570ff31
Source RPM : rpmdevtools-9.3-1.fc33.src.rpm
Build Date : Wed Jan 20 12:10:23 2021
Build Host : buildvm-x86-07.iad2.fedoraproject.org
Packager : Fedora Project
Vendor : Fedora Project
URL : https://pagure.io/rpmdevtools
Bug URL : https://bugz.fedoraproject.org/rpmdevtools
Summary : RPM Development Tools
What fresh hell is this? :cry:
Looks like this is fallout from fixing files that should be compressed being uncompressed during download ( #72 ) ... and it leads to files that should not be compressed staying compressed after download. Nice.
I see that the server sets Content-Encoding: gzip when downloading this .asc file ...
But how is spectool supposed to know what to do?
decode_content=True
I see that in the first case, Content-Encoding: gzip and Content-Type: application/x-gzip are set, and in the second, Content-Encoding: gzip and Content-Type: text/plain are set.
I hope there's a smarter way to distinguish those than to check if the Content-Type header is set to a hard-coded list of known plain-text or compressed file formats :(
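One conceivable alternative to a hard-coded Content-Type list, sketched here only as an illustration: decode a gzip Content-Encoding unless the target file name itself indicates a gzip file. The function name `body_to_save` and the suffix rule are hypothetical and not taken from spectool or any pull request; the sketch also assumes the raw, undecoded response body is available.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def body_to_save(raw: bytes, content_encoding: str, filename: str) -> bytes:
    """Return the bytes to write to `filename` (hypothetical helper)."""
    claims_gzip = content_encoding.strip().lower() == "gzip"
    looks_gzipped = raw.startswith(GZIP_MAGIC)
    # Assumption: a target named *.gz or *.tgz is meant to stay compressed,
    # so a gzip Content-Encoding on it is treated as server mislabeling.
    wants_gzip_file = filename.endswith((".gz", ".tgz"))
    if claims_gzip and looks_gzipped and not wants_gzip_file:
        return gzip.decompress(raw)
    return raw


signature = b"-----BEGIN PGP SIGNATURE-----\n"
packed = gzip.compress(signature)
# The .asc file is decoded, while a .tar.gz target keeps its compressed bytes.
decoded_asc = body_to_save(packed, "gzip", "tor-0.4.5.4-rc.tar.gz.asc")
kept_tarball = body_to_save(packed, "gzip", "tor-0.4.5.4-rc.tar.gz")
```

This still guesses from the file name, so it would fail for a server that correctly applies gzip transfer compression to a .gz file.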
Does sending the request with Accept-Encoding: identity solve this problem?
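For reference, a minimal sketch of what sending that header could look like with Python's standard library. This is not spectool's actual download code, which is not shown in this thread; the helper name `identity_request` is made up for the example. Nothing is fetched here, the request object is only constructed.

```python
import urllib.request


def identity_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks the server to apply no content coding."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "identity"})


req = identity_request("https://dist.torproject.org/tor-0.4.5.4-rc.tar.gz.asc")
# urllib stores header names via str.capitalize(), hence "Accept-encoding".
header_value = req.get_header("Accept-encoding")
# Passing `req` to urllib.request.urlopen() would perform the download.
```

A server is free to ignore the header and send gzip-encoded content anyway, so the client may still need to check Content-Encoding on the response.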
@churchyard That was the advice I got on #fedora-python, but it did not solve #72 ... so I'm not sure what to do here. On the one hand we apparently need to work around weird server configurations that claim double-gzip-compression, and on the other hand we should successfully decompress gzip-encoded plain text. :cry:
I wonder how curl does this.
I wonder if we should use curl (or wget) to do this :)
urlgrabber is a wrapper around pycurl, and that might work better?
https://pagure.io/rpmdevtools/pull-request/77 fixes the problem for me. However, if the server sends gzipped content anyway, it will stay gzipped.
Isn't urlgrabber abandoned?
No, it's still maintained by @brejoc and myself.
Alright, sorry for the confusion.