FreeIPA's json serialization is slow. In my vault performance analysis #6651, json serialization and its helper function were 3 or 4 functions in the top 15. The other slow functions were mostly I/O and network bound calls (SSL to Dogtag, GSSAPI, LDAP).
ipaserver.rpcserver.jsonserver.marshall() uses json.dumps() with sort_keys=True and indent=4. Both sort_keys and indent are problematic. Either one of the arguments forces Python's json parser to fall back to a Python-only implementation of the iter encoder, https://hg.python.org/cpython/file/tip/Lib/json/encoder.py#l246 .
ipaserver.rpcserver.jsonserver.marshall()
json.dumps()
I noticed the problem while I was analyzing WSGI profiler data with snakeviz. The profile dumps for 100 requests contain over 2.5 mio calls to isinstance() and almost 2.5 mio calls to _iterencode_dict().
2790767 0.811 0.000 0.926 0.000 {isinstance} 2495797/519934 0.776 0.000 1.205 0.000 /usr/lib64/python2.7/json/encoder.py:341(_iterencode_dict)
The function json_encode_binary consumes a lot of time and seems to call isinstance() pretty often, too. The encoder function can be implemented more efficiently as custom JSONEncoder with some additional tricks.
json_encode_binary
138038/299 0.274 0.000 0.523 0.002 /usr/lib/python2.7/site-packages/ipalib/rpc.py:276(json_encode_binary)
ipalib.util.json_serialize is yet another JSON serializer that pops up in my profiling data.
ipalib.util.json_serialize
https://github.com/freeipa/freeipa/pull/459 optimizes the performance of RPC client and server serialization and deserialization.
We might want to consider a faster JSON encoder/decoder, too. ujson is fast but lacks obj_hook feature. https://github.com/python-rapidjson/python-rapidjson is powerful and fast but has no Python 2 support.
master:
Please leave it open until I have investigated ipalib.util.json_serialize.
json_serialize definitely needs some love. ipalib.parameters.Param.__init__, too.
ipalib.parameters.Param.__init__
127.0.0.1 - - [15/Feb/2017 21:26:43] "POST /ipa/session/json HTTP/1.1" 200 - ipa: INFO: [jsonserver_session] admin@IPA.EXAMPLE: json_metadata(None, None, command=u'all', version=u'2.218'): SUCCESS -------------------------------------------------------------------------------- PATH: '/session/json' 2340515 function calls (2131924 primitive calls) in 1.142 seconds Ordered by: internal time, call count List reduced from 432 to 30 due to restriction <30> ncalls tottime percall cumtime percall filename:lineno(function) 3556 0.201 0.000 0.452 0.000 /home/heimes/redhat/freeipa/ipalib/parameters.py:441(__init__) 187490/446 0.148 0.000 1.004 0.002 /home/heimes/redhat/freeipa/ipalib/util.py:58(json_serialize) 718435 0.135 0.000 0.135 0.000 {isinstance} 3999 0.085 0.000 0.256 0.000 /home/heimes/redhat/freeipa/ipalib/parameters.py:954(__json__) 173558 0.076 0.000 0.076 0.000 {hasattr} 143774/143772 0.053 0.000 0.053 0.000 /home/heimes/redhat/freeipa/ipalib/base.py:123(__setattr__) 104200 0.037 0.000 0.072 0.000 {setattr} 1 0.032 0.032 0.032 0.032 {built-in method sasl_interactive_bind_s} 4450/1 0.031 0.000 0.049 0.049 /home/heimes/redhat/freeipa/ipalib/rpc.py:388(_enc_dict) 234842 0.026 0.000 0.027 0.000 {getattr} 4445/446 0.025 0.000 0.460 0.001 /home/heimes/redhat/freeipa/ipalib/util.py:62(<dictcomp>) 234451 0.019 0.000 0.019 0.000 {method 'get' of 'dict' objects} 1 0.018 0.018 0.018 0.018 /usr/lib64/python2.7/json/encoder.py:212(iterencode) 13038/892 0.015 0.000 0.047 0.000 /home/heimes/redhat/freeipa/ipalib/rpc.py:380(_enc_list) 3556 0.012 0.000 0.012 0.000 /home/heimes/redhat/freeipa/ipalib/parameters.py:261(parse_param_spec) 24961 0.012 0.000 0.012 0.000 {_codecs.utf_8_decode} 24961 0.011 0.000 0.030 0.000 {method 'decode' of 'str' objects} 107753 0.010 0.000 0.010 0.000 /home/heimes/redhat/freeipa/ipalib/parameters.py:506(<genexpr>) 7176 0.008 0.000 0.012 0.000 /home/heimes/redhat/freeipa/ipalib/text.py:248(as_unicode) 1973 0.007 0.000 0.273 0.000 /home/heimes/redhat/freeipa/ipalib/parameters.py:725(clone_retype) 24961 0.007 0.000 0.019 0.000 /usr/lib64/python2.7/encodings/utf_8.py:15(decode) 7388 0.006 0.000 0.006 0.000 {method 'match' of '_sre.SRE_Pattern' objects} 9958/9062 0.006 0.000 0.009 0.000 /home/heimes/redhat/freeipa/ipalib/util.py:846(__get__) 77523 0.006 0.000 0.006 0.000 {method 'pop' of 'dict' objects} 1975 0.006 0.000 0.273 0.000 /home/heimes/redhat/freeipa/ipalib/parameters.py:1433(__init__) 7565 0.006 0.000 0.006 0.000 /usr/lib64/python2.7/re.py:230(_compile) 1 0.006 0.006 0.006 0.006 {method 'encode' of 'str' objects} 3364 0.005 0.000 0.010 0.000 /home/heimes/redhat/freeipa/ipalib/plugable.py:288(__enumerate) 7386 0.005 0.000 0.017 0.000 /usr/lib64/python2.7/re.py:138(match) 7475 0.005 0.000 0.018 0.000 /home/heimes/redhat/freeipa/ipalib/text.py:264(__json__)
For Vault use case, wouldn't it be better to measure it with Vault commands? Using json_metadata command is good for measurement of JSON serialization and parameter init. Mostly because json_metadata produce really big list - pretty printed, without compression it has about 0.5MB. But in Vault use case json serialization and param init won't be probably a problem because response is most-likely small(depends on secret size) and the command doesn't iterate through all commands in IPA.
The issue is about JSON performance in general. Vault is merely the main driver to fix some critical paths ASAP. The performance issue ipalib.util.json_serialize should be fixed during the development cycle of 4.5, too. It's a low hanging fruit that will dramatically improve performance for some operations with less than an hour of coding.
How is json metadata used? Does the client call when it bootstraps the API or is it just used in the API browser?
Param.__init___ and gzip compression are unrelated issues. For gzip compression we can simply use Apache mod_gzip to handle compression on the fly. It's just a matter of adding SetOutputFilter DEFLATE to Apache ipa.conf.
Param.__init___
SetOutputFilter DEFLATE
Petr, I've created #6681 for compression.
JFTR, json_metadata command is used in Web UI right after authentication.
There are two calls:
{"method":"json_metadata","params":[[],{"object":"all","version":"2.217"}]}
And:
{"method":"json_metadata","params":[[],{"command":"all","version":"2.217"}]}
First takes 453ms, second 2.10s on my testing vm. It might be viewed as long time, but it happens only once per one browser tab. It is significantly less time then Web UI of other products. Therefore I don't see a need to optimize it now. We have a ticket to cache the metadata though.
I added issue #6693 for json_serialize().
Metadata Update from @cheimes: - Issue assigned to someone - Issue set to the milestone: FreeIPA 4.5
Metadata Update from @mbasti: - Issue close_status updated to: None (was: Fixed)
Log in to comment on this ticket.