PR#116: RFE: Implement ClientSession.itercall() - koji

#116 RFE: Implement ClientSession.itercall()

Closed 4 years ago by tkopecek. Opened 7 years ago by mizdebsk.

mizdebsk/koji itercall into master

Implement ClientSession.itercall()

Mikolaj Izdebski • 7 years ago

329e727

koji/__init__.py

file modified

+25

		`@@ -2174,6 +2174,31 @@`
		`raise err`
		`return ret`

		`+    def itercall(self, args, fn):`
		`+        """Alternative way of using multiCall. Iterates over list of items,`
		`+        calls a hub API function for each item. Returns generator of`
		`+        call results.`
		`+`
		`+        "args" is list of arbitrary items.`
		`+        "fn" is a function (typically lambda) taking one argument, a`
		`+        single item from the list.`
		`+`
		`+        itercall() will call "fn" for each item in the list. When executed,`
		`+        "fn" is expected to make a single API call on ClientSession.`
		`+        itercall() groups these API calls into chunks of configurable`
		`+        size and handles them with multiCall. It returns generator of`
		`+        replies for each call made.`
		`+`
		`+        """`
		`+        chunk_size = self.opts.get('itercall_chunk_size', 100)`
		`+        while args:`
		`+            self.multicall = True`
		`+            for arg in args[:chunk_size]:`
		`+                fn(arg)`
		`+            for [info] in self.multiCall():`
		`+                yield info`
		`+            args = args[chunk_size:]`
		`+`
		`def __getattr__(self,name):`
		`#if name[:1] == '_':`
		`# raise AttributeError, "no attribute %r" % name`

tests/test_client/test_itercall.py

file added

+31

		`@@ -0,0 +1,31 @@`
		`+import mock`
		`+import unittest`
		`+import koji`
		`+`
		`+`
		`+class TestItercall(unittest.TestCase):`
		`+`
		`+    def test_itercall(self):`
		`+`
		`+        ks = koji.ClientSession('http://dumy.hub/address')`
		`+        ks.multiCall = mock.Mock(return_value=[['ret1'], ['ret2'], ['ret3']])`
		`+`
		`+        args = ['arg1', 'arg2', 'arg3']`
		`+        rets = list(ks.itercall(args, lambda arg: ks.foo(arg)))`
		`+`
		`+        ks.multiCall.assert_called_once_with()`
		`+        self.assertEquals(['ret1', 'ret2', 'ret3'], rets)`
		`+`
		`+`
		`+    def test_itercall_chunk_size(self):`
		`+`
		`+        ks = koji.ClientSession('http://dumy.hub/address',`
		`+                                opts={'itercall_chunk_size': 2})`
		`+        mock_rets = [[[1], [2]], [[3], [4]], [[5], [6]], [[7]]]`
		`+        ks.multiCall = mock.Mock(side_effect=lambda: mock_rets.pop(0))`
		`+`
		`+        args = [111, 222, 333, 444, 555, 666, 777]`
		`+        rets = list(ks.itercall(args, lambda arg: ks.foo(arg)))`
		`+`
		`+        ks.multiCall.assert_has_calls([mock.call()] * 4)`
		`+        self.assertEquals(range(1,8), rets)`

mizdebsk commented 7 years ago

itercall is an alternative way of using the Koji multiCall API from
ClientSession in Python.

multiCall is a feature of Koji XML-RPC which allows multiple RPC calls
to be grouped into a single XML-RPC call and processed as a batch by
the Koji hub.

Typical usage of multicall from Python:

clientSession.multicall = True
for item in itemsToProcess:
    clientSession.someApiCall(item)
for result in clientSession.multiCall():
    processResult(result)

With itercall this can be simplified to:

for result in clientSession.itercall(itemsToProcess,
        lambda item: clientSession.someApiCall(item)):
    processResult(result)
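
The two forms above can be compared without a hub. What follows is a minimal sketch, not Koji code: `FakeSession` is a hypothetical stand-in that only mimics the queue-then-flush behaviour of multicall mode, its `someApiCall` simply echoes the item into a reply string, and `itercall` is the logic from the diff rewritten as a free function.

```python
class FakeSession:
    """Hypothetical stand-in for koji.ClientSession; no network involved."""

    def __init__(self, chunk_size=100):
        self.opts = {'itercall_chunk_size': chunk_size}
        self.multicall = False
        self._queued = []   # calls recorded while multicall mode is on
        self.batches = []   # size of each batch "sent" to the hub

    def someApiCall(self, item):
        # In multicall mode a call is only queued, not executed.
        self._queued.append(item)

    def multiCall(self):
        # Flush the queue. Each reply is wrapped in a one-element list,
        # which is what the `for [info] in ...` unpacking in the diff expects.
        queued, self._queued = self._queued, []
        self.multicall = False
        self.batches.append(len(queued))
        return [['result-%s' % item] for item in queued]


def itercall(session, args, fn):
    """Same logic as the proposed method, written as a free function."""
    chunk_size = session.opts.get('itercall_chunk_size', 100)
    while args:
        session.multicall = True
        for arg in args[:chunk_size]:
            fn(arg)
        for [info] in session.multiCall():
            yield info
        args = args[chunk_size:]


session = FakeSession(chunk_size=2)
results = list(itercall(session, ['a', 'b', 'c'], session.someApiCall))
print(results)          # ['result-a', 'result-b', 'result-c']
print(session.batches)  # [2, 1]
```

With a chunk size of 2, the three calls go out as one batch of two and one batch of one, while the caller still sees a single flat stream of results.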

Pros

itercall makes multicall code easier to write. With itercall,
developers are more likely to use multicall in their code, which
should benefit both their applications and Koji.

itercall splits big multicalls into smaller chunks of configurable
size (the itercall_chunk_size session option, 100 by default), which
helps avoid timeouts and conserves both client and hub memory, because
the whole request and reply don't need to be buffered at once.

itercall can save a lot of hub work if the client wants to exit early
without processing all results, or if a fault occurs, because chunks
that have not been reached yet are never sent to the hub.
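
The early-exit saving follows from itercall() being a generator: a chunk is only sent once the consumer asks for a result from it. A minimal sketch of that behaviour, assuming a hypothetical `FakeSession` stand-in that counts batches and echoes each item back as its reply (neither is part of Koji):

```python
class FakeSession:
    """Hypothetical stand-in for koji.ClientSession; it only counts batches."""

    def __init__(self, chunk_size):
        self.opts = {'itercall_chunk_size': chunk_size}
        self.multicall = False
        self._queued = []
        self.batches_sent = 0

    def someApiCall(self, item):
        self._queued.append(item)           # queued only, as in multicall mode

    def multiCall(self):
        queued, self._queued = self._queued, []
        self.multicall = False
        self.batches_sent += 1
        return [[item] for item in queued]  # echo each item back as its reply


def itercall(session, args, fn):
    """Logic of the proposed method, written as a free function."""
    chunk_size = session.opts.get('itercall_chunk_size', 100)
    while args:
        session.multicall = True
        for arg in args[:chunk_size]:
            fn(arg)
        for [info] in session.multiCall():
            yield info
        args = args[chunk_size:]


session = FakeSession(chunk_size=2)
for result in itercall(session, list(range(10)), session.someApiCall):
    if result == 2:   # stop as soon as the wanted reply shows up
        break

print(session.batches_sent)  # 2: the remaining three batches were never sent
```

Note that the second batch was sent in full, so the call for item 3 ran even though its result was never read.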

Moreover, nested itercall loops are easy to write, which is not
straightforward with the standard multicall API.

tkopecek commented 7 years ago

I have a few concerns/questions, in order of importance:

1. Iterating in chunks hides the underlying logic. When itercall runs the calls in chunks and I iterate through the results, I can hit an error before the chunk is exhausted. With standard multicall I would expect that all of the functions had been called on the hub by then. Here it is some hidden number of calls (chunk_length - chunk_index_of_failed_call), so there could be some unintended side effects. Any idea how this could be done in a safe way?
2. The previous point could be partially solved by a user-defined chunk size, so itercall could be extended with an optional argument, e.g. 'batch'. When I know that the maximum number of calls is reasonable, I would like to run them in one batch, so batch = no_of_calls. In other cases, e.g. when I run 3 operations which only make sense together (create a pkg, add it to two default tags) and these operations will be called many times, it could be done by itercall with batch size 3. In that case I have at least some control over what was done and what failed, and can do some rollback operations.
3. A more cosmetic comment: the itercall(self, args, fn) signature seems a little unnatural to me. itercall(self, fn, args) is what I would expect.
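
One possible shape for these suggestions, as a sketch only: the reordered `itercall(fn, args)` signature and the `batch` keyword are the ideas from this comment, not merged Koji API, and `FakeSession` is a hypothetical stand-in so the example runs without a hub. In this sketch, leaving `batch` unset sends everything in a single multicall.

```python
class FakeSession:
    """Hypothetical stand-in for koji.ClientSession; records each batch."""

    def __init__(self):
        self.multicall = False
        self._queued = []
        self.batches = []

    def someApiCall(self, item):
        self._queued.append(item)      # queued only, as in multicall mode

    def multiCall(self):
        queued, self._queued = self._queued, []
        self.multicall = False
        self.batches.append(queued)
        return [[item] for item in queued]


def itercall(session, fn, args, batch=None):
    """Sketch of the suggested signature; batch=None means one single batch."""
    args = list(args)
    size = batch or len(args) or 1     # "or 1" keeps range() valid for empty input
    for start in range(0, len(args), size):
        session.multicall = True
        for arg in args[start:start + size]:
            fn(arg)
        for [info] in session.multiCall():
            yield info


# Two groups of three operations that only make sense together.
ops = ['create pkg', 'add to tag A', 'add to tag B'] * 2

grouped = FakeSession()
list(itercall(grouped, grouped.someApiCall, ops, batch=3))
print(len(grouped.batches))   # 2: each group of three travels together

single = FakeSession()
list(itercall(single, single.someApiCall, ops))
print(len(single.batches))    # 1: everything in one multicall
```

With `batch=3`, a failure can be attributed to one group of related operations, which is the level of control asked for in the second point.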