PR#1119: Read the files to be hashed as binary to ensure end of lines are not converted - fm-orchestrator

#1119 Read the files to be hashed as binary to ensure end of lines are not converted

Merged 5 years ago by mprahl. Opened 5 years ago by mprahl.

handle-windows-new-lines into master

Read the files to be hashed as binary to ensure end of lines are not converted

mprahl • 5 years ago

67ebe16

module_build_service/builder/KojiContentGenerator.py

file modified

+4 -4

		`@@ -404,8 +404,8 @@`
		`# parse it to get the Modulemd instance.`
		`mmd_path = os.path.join(output_path, mmd_filename)`
		`try:`
		`- with open(mmd_path) as mmd_f:`
		`- data = mmd_f.read()`
		`+ with open(mmd_path, 'rb') as mmd_f:`
		`+ data = mmd_f.read().decode('utf-8')`
		`mmd = Modulemd.Module().new_from_string(data)`
		`ret['filename'] = mmd_filename`
		`ret['filesize'] = len(data)`
		`@@ -452,8 +452,8 @@`

		`try:`
		`log_path = os.path.join(output_path, "build.log")`
		`- with open(log_path) as build_log:`
		`- checksum = hashlib.md5(build_log.read().encode('utf-8')).hexdigest()`
		`+ with open(log_path, 'rb') as build_log:`
		`+ checksum = hashlib.md5(build_log.read()).hexdigest()`
		`stat = os.stat(log_path)`
		`ret.append(`
		`{`

mprahl commented 5 years ago

When encountering a Windows end of line (CRLF, shown as ^M), io.open and
open in Python 3 convert it to a UNIX end of line by default when the file
is opened in text mode. When reading logs to compute the checksum, it's
important that those newlines aren't converted, so that the checksum matches
the bytes actually stored on disk. This caused issues in Fedora staging
because, when cloning a repo, the repoSpanner output had Windows end of
lines, and these ended up in build.log. The solution is to read the file
as binary so that Python doesn't perform these conversions.
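
The effect described above can be reproduced with a small standalone sketch. This is not code from module_build_service; the helper names (`md5_text_mode`, `md5_binary_mode`) and the sample log content are made up for illustration, and the text-mode helper only approximates what the old code did.

```python
import hashlib
import os
import tempfile


def md5_text_mode(path):
    """Hash after a text-mode read (roughly the pre-fix behaviour).

    With the default newline handling, '\r\n' is translated to '\n'
    before the data is re-encoded and hashed.
    """
    with open(path, encoding="utf-8") as f:
        return hashlib.md5(f.read().encode("utf-8")).hexdigest()


def md5_binary_mode(path):
    """Hash the raw bytes, leaving line endings untouched (the fix)."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


# Hypothetical log content containing Windows line endings.
raw = b"Cloning into 'repo'...\r\ndone\r\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "build.log")
    with open(path, "wb") as f:
        f.write(raw)
    text_sum = md5_text_mode(path)
    binary_sum = md5_binary_mode(path)

# Checksum of the bytes as they exist on disk.
expected = hashlib.md5(raw).hexdigest()

print("binary read matches on-disk checksum:", binary_sum == expected)
print("text read matches on-disk checksum:", text_sum == expected)
```

Only the binary read reproduces the checksum of the file as stored; the text-mode read hashes a copy in which every CRLF has already been collapsed to LF.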