PR#1160: Fix the way how KojiContentGenerator computes filesize. - fm-orchestrator

#1160 Fix the way how KojiContentGenerator computes filesize.

Merged 5 years ago by jkaluza. Opened 5 years ago by jkaluza.

jkaluza/fm-orchestrator utf into master

Fix the way how KojiContentGenerator computes filesize.

Jan Kaluza • 5 years ago

0408204

module_build_service/builder/KojiContentGenerator.py

file modified

+4 -3

		`@@ -407,11 +407,12 @@`
		`mmd_path = os.path.join(output_path, mmd_filename)`
		`try:`
		`with open(mmd_path, 'rb') as mmd_f:`
		`- data = to_text_type(mmd_f.read())`
		`+ raw_data = mmd_f.read()`
		`+ data = to_text_type(raw_data)`
		`mmd = Modulemd.Module().new_from_string(data)`
		`ret['filename'] = mmd_filename`
		`- ret['filesize'] = len(data)`
		`- ret['checksum'] = hashlib.md5(data.encode('utf-8')).hexdigest()`
		`+ ret['filesize'] = len(raw_data)`
		`+ ret['checksum'] = hashlib.md5(raw_data).hexdigest()`
		`except IOError:`
		`if arch == "src":`
		`# This might happen in case the Module is submitted directly`

tests/test_content_generator.py

file modified

+4 -4

		`@@ -276,7 +276,7 @@`
		`@patch("module_build_service.builder.KojiContentGenerator.open", create=True)`
		`def test_get_arch_mmd_output(self, patched_open):`
		`patched_open.return_value = mock_open(`
		`- read_data=self.cg.mmd).return_value`
		`+ read_data=self.cg.mmd.encode("utf-8")).return_value`
		`ret = self.cg._get_arch_mmd_output("./fake-dir", "x86_64")`
		`assert ret == {`
		`'arch': 'x86_64',`
		`@@ -286,7 +286,7 @@`
		`'components': [],`
		`'extra': {'typeinfo': {'module': {}}},`
		`'filename': 'modulemd.x86_64.txt',`
		`- 'filesize': 1136,`
		`+ 'filesize': 1138,`
		`'type': 'file'`
		`}`

		`@@ -296,7 +296,7 @@`
		`rpm_artifacts = mmd.get_rpm_artifacts()`
		`rpm_artifacts.add("dhcp-libs-12:4.3.5-5.module_2118aef6.x86_64")`
		`mmd.set_rpm_artifacts(rpm_artifacts)`
		`- mmd_data = to_text_type(mmd.dumps())`
		`+ mmd_data = bytes(mmd.dumps())`

		`patched_open.return_value = mock_open(`
		`read_data=mmd_data).return_value`
		`@@ -336,7 +336,7 @@`
		`u'version': '4.3.5'}],`
		`'extra': {'typeinfo': {'module': {}}},`
		`'filename': 'modulemd.x86_64.txt',`
		`- 'filesize': 317,`
		`+ 'filesize': 319,`
		`'type': 'file'`
		`}`
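The test changes above hand bytes rather than text to the mocked `open()`, matching the `'rb'` mode used by the code under test. A minimal sketch of that pattern, assuming Python 3 and patching `builtins.open` instead of the module-level name used in the real test; `read_size` and the sample payload are hypothetical:

```python
from unittest.mock import mock_open, patch


def read_size(path):
    """Hypothetical stand-in for code that opens a file in binary mode."""
    with open(path, "rb") as f:
        return len(f.read())


# Made-up payload with one non-ASCII character, encoded the way the
# updated test encodes self.cg.mmd before handing it to mock_open.
payload = "name: caf\u00e9\n".encode("utf-8")

with patch("builtins.open", mock_open(read_data=payload)):
    size = read_size("./fake-dir/modulemd.x86_64.txt")

# The mocked read() returns the bytes unchanged, so the size is a byte count.
print(size == len(payload))
```

Passing a text string as `read_data` here would make the mocked `read()` return `str`, so `len()` would count characters again and hide the bug the PR fixes.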

tests/test_get_generator_json_expected_output.json

file modified

+1 -1

		`@@ -625,7 +625,7 @@`
		`}`
		`],`
		`"arch": "noarch",`
		`- "filesize": 1136,`
		`+ "filesize": 1138,`
		`"checksum": "96b7739ffa3918e6ac3e3bd422b064ea",`
		`"checksum_type": "md5",`
		`"type": "file",`

tests/test_get_generator_json_expected_output_with_log.json

file modified

+1 -1

		`@@ -625,7 +625,7 @@`
		`}`
		`],`
		`"arch": "noarch",`
		`- "filesize": 1136,`
		`+ "filesize": 1138,`
		`"checksum": "96b7739ffa3918e6ac3e3bd422b064ea",`
		`"checksum_type": "md5",`
		`"type": "file",`

jkaluza commented 5 years ago

The current code reads the data, converts it to a Unicode string, and
then uses the len() of that string as the filesize. This is wrong
because Koji expects filesize to be the number of bytes, not the number
of characters.

Therefore, in this commit, the filesize and the checksum are computed from the raw data (bytes).
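The difference between the two counts can be sketched as follows. This is an illustration only, not the MBS code: `describe_payload` and the sample string are hypothetical, and the sketch assumes the file content is valid UTF-8.

```python
import hashlib


def describe_payload(raw_data: bytes) -> dict:
    """Hypothetical helper: derive size and checksum from raw bytes."""
    return {
        "filesize": len(raw_data),  # number of bytes, not characters
        "checksum": hashlib.md5(raw_data).hexdigest(),
        "checksum_type": "md5",
    }


# Made-up sample containing one non-ASCII character (U+2019).
raw = "summary: A module\u2019s metadata\n".encode("utf-8")
text = raw.decode("utf-8")

info = describe_payload(raw)

# len(text) counts characters; info["filesize"] counts bytes.
print(len(text), info["filesize"])

# For valid UTF-8 input, hashing the raw bytes gives the same digest as
# hashing the re-encoded text.
print(info["checksum"] == hashlib.md5(text.encode("utf-8")).hexdigest())
```

Under that UTF-8 assumption the digest does not change, which is consistent with the expected `checksum` values in the JSON fixtures staying the same while `filesize` changes.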

jkaluza commented 5 years ago

Fixes https://release-engineering.github.io/mbs-ui/module/3538.

lholecek commented on line 12 of module_build_service/builder/KojiContentGenerator.py 5 years ago

ret['checksum'] = hashlib.md5(raw_data).hexdigest()

lholecek commented on line 15 of tests/test_content_generator.py 5 years ago

Ah, I was wondering why the size is different (I would expect only ASCII characters in test data). It's because there is a stray ’ character in tests/staged_data/platform.yaml.
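A quick check of the arithmetic, assuming the stray character is U+2019 (RIGHT SINGLE QUOTATION MARK) and the file is UTF-8 encoded; `non_ascii_chars` is a hypothetical helper for locating such characters in test data:

```python
stray = "\u2019"  # assumed to be the stray character from platform.yaml
encoded = stray.encode("utf-8")

# One character, three bytes in UTF-8, so two extra bytes per occurrence.
extra_bytes = len(encoded) - len(stray)
print(encoded, extra_bytes)


def non_ascii_chars(text):
    """Hypothetical helper: list (index, character) pairs outside ASCII."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


print(non_ascii_chars("module\u2019s name"))
```

A single occurrence accounts for the two-byte increase in both expected sizes in the tests (1136 to 1138, and 317 to 319).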

rebased onto 0408204

5 years ago

jkaluza commented 5 years ago

@lholecek, updated based on your comment. Yes, in the past we added a Unicode character to the test data, and it causes this kind of problem.

lholecek commented 5 years ago