#1119 Read the files to be hashed as binary to ensure end of lines are not converted
Merged 5 years ago by mprahl. Opened 5 years ago by mprahl.

@@ -404,8 +404,8 @@ 
          # parse it to get the Modulemd instance.
          mmd_path = os.path.join(output_path, mmd_filename)
          try:
-             with open(mmd_path) as mmd_f:
-                 data = mmd_f.read()
+             with open(mmd_path, 'rb') as mmd_f:
+                 data = mmd_f.read().decode('utf-8')
                  mmd = Modulemd.Module().new_from_string(data)
                  ret['filename'] = mmd_filename
                  ret['filesize'] = len(data)
@@ -452,8 +452,8 @@ 
  
          try:
              log_path = os.path.join(output_path, "build.log")
-             with open(log_path) as build_log:
-                 checksum = hashlib.md5(build_log.read().encode('utf-8')).hexdigest()
+             with open(log_path, 'rb') as build_log:
+                 checksum = hashlib.md5(build_log.read()).hexdigest()
              stat = os.stat(log_path)
              ret.append(
                  {

When encountering a Windows end of line (^M), io.open and Python 3's open
convert it to a UNIX end of line by default in text mode. When reading
logs to compute the checksum, it's important that these line endings
aren't converted, so that the checksum matches the file's actual contents.
This caused issues in Fedora staging because, when cloning a repo, the
repoSpanner output had Windows ends of line, which ended up in build.log.
The solution is to read the file as binary so that Python doesn't perform
these conversions.
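
The effect described above can be reproduced with a minimal sketch. The helper names (md5_text_mode, md5_binary_mode) and the sample log content are hypothetical and are not part of the patched module. The explicit encoding="utf-8" argument is also an addition for determinism; the original code called open() without it.

```python
import hashlib
import os
import tempfile


def md5_text_mode(path):
    # Pre-fix approach: text mode applies universal newlines,
    # so "\r\n" is translated to "\n" before hashing.
    with open(path, encoding="utf-8") as f:
        return hashlib.md5(f.read().encode("utf-8")).hexdigest()


def md5_binary_mode(path):
    # Post-fix approach: the bytes are hashed exactly as stored on disk.
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


# Hypothetical log content with Windows end of lines.
raw = b"Cloning into 'repo'...\r\nremote: done\r\n"

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "build.log")
    with open(log_path, "wb") as f:
        f.write(raw)
    text_checksum = md5_text_mode(log_path)
    binary_checksum = md5_binary_mode(log_path)

expected = hashlib.md5(raw).hexdigest()
print("binary mode matches file contents:", binary_checksum == expected)
print("text mode matches file contents:", text_checksum == expected)
```

In this sketch, the text-mode checksum corresponds to the content with every "\r\n" replaced by "\n", which is why it no longer matches a checksum computed over the file as stored.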

rebased onto 67ebe16

5 years ago

Disregard the rebase, I just fixed a typo in the commit message.

FYI @jkaluza, this is what was causing builds to fail in Fedora stage.

Pull-Request has been merged by mprahl

5 years ago