For productization of IPA vault as secure storage backend for Custodia, we need fast and scalable vault. At the moment performance of vault_retrieve is too low and vault_retrieve calls take too long.
I created a very simple benchmark script. It simulates a small size setup in which 10 containers are start at the same time and request secrets for databases. The scripts runs 10 parallel subprocesses that request 3 secrets each. The test server is a freeipa-server-4.4.2-1.fc24.x86_64 installation in a Vagrant box with 2 CPUs. During the benchmark, load stays below 0.7.
freeipa-server-4.4.2-1.fc24.x86_64
Parallel request: 10 vault retrieve each: 3 Time per worker: connect: 0.666sec / retrieve: 7.792sec connect: 0.667sec / retrieve: 7.588sec connect: 0.261sec / retrieve: 7.551sec connect: 0.307sec / retrieve: 7.702sec connect: 0.259sec / retrieve: 7.226sec connect: 0.328sec / retrieve: 7.557sec connect: 0.639sec / retrieve: 7.788sec connect: 0.347sec / retrieve: 7.806sec connect: 0.334sec / retrieve: 7.362sec connect: 0.283sec / retrieve: 6.479sec total time 13.464sec
In a later I hooked up werkzeug's WSGI profiler to /usr/share/ipa/wsgi.py
/usr/share/ipa/wsgi.py
# dnf install python-werkzeug # mkdir /var/run/httpd/profile; chown apache:apache /var/run/httpd/profile from werkzeug.contrib.profiler import ProfilerMiddleware application = ProfilerMiddleware(application, profile_dir='/var/run/httpd/profile')
and analyzed performance
>>> import glob, pstats >>> stat = pstats.Stats(*glob.glob('/var/run/httpd/profile/*.prof')) >>> stat.sort_stats('tottime') <pstats.Stats instance at 0x7f3f752baab8> >>> stat.print_stats(50) ... 13420440 function calls (10871945 primitive calls) in 56.277 seconds Ordered by: internal time List reduced from 1428 to 50 due to restriction <50> ncalls tottime percall cumtime percall filename:lineno(function) 4106 17.769 0.004 17.769 0.004 {method 'read' of '_ssl._SSLSocket' objects} 299 11.234 0.038 11.236 0.038 {built-in method sasl_interactive_bind_s} 176 8.710 0.049 8.710 0.049 {method 'do_handshake' of '_ssl._SSLSocket' objects} 2713 2.654 0.001 2.654 0.001 {built-in method result4} 6838/6539 1.026 0.000 1.027 0.000 {repr} 2790767 0.811 0.000 0.926 0.000 {isinstance} 2495797/519934 0.776 0.000 1.205 0.000 /usr/lib64/python2.7/json/encoder.py:341(_iterencode_dict) 598 0.682 0.001 0.685 0.001 {gssapi.raw.ext_cred_store.acquire_cred_from} 1354 0.645 0.000 0.645 0.000 {built-in method search_ext} 11977 0.572 0.000 0.603 0.000 /usr/lib64/python2.7/site-packages/ldap/schema/tokenizer.py:10(split_tokens) 247833/543 0.302 0.000 1.777 0.003 /usr/lib/python2.7/site-packages/ipalib/util.py:58(json_serialize) 613 0.300 0.000 0.300 0.000 {method 'write' of '_ssl._SSLSocket' objects} 138038/299 0.274 0.000 0.523 0.002 /usr/lib/python2.7/site-packages/ipalib/rpc.py:276(json_encode_binary) 3514 0.273 0.000 0.624 0.000 /usr/lib/python2.7/site-packages/ipalib/parameters.py:433(__init__) 601 0.259 0.000 0.259 0.000 {method 'sendall' of '_socket.socket' objects} 62744 0.220 0.000 0.324 0.000 {method 'decode' of 'str' objects} 584832/519934 0.190 0.000 1.376 0.000 /usr/lib64/python2.7/json/encoder.py:417(_iterencode) 536499/486766 0.176 0.000 0.562 0.000 /usr/lib64/python2.7/json/encoder.py:288(_iterencode_list) 896 0.151 0.000 0.153 0.000 {gssapi.raw.creds.inquire_cred} 4733 0.146 0.000 0.462 0.000 /usr/lib/python2.7/site-packages/ipalib/parameters.py:946(__json__) 219189 0.141 0.000 0.151 0.000 {hasattr} 299 0.137 0.000 0.137 0.000 {built-in method unbind_ext} 182 0.130 0.001 0.130 0.001 {method 'connect' of '_socket.socket' objects} 39967/3842 0.125 0.000 0.280 0.000 /usr/lib64/python2.7/copy.py:145(deepcopy) 9842 0.119 0.000 0.178 0.000 /usr/lib64/python2.7/site-packages/ldap/schema/models.py:257(_set_attrs) 368717 0.116 0.000 0.120 0.000 {getattr} 1209 0.111 0.000 0.111 0.000 {method 'recv' of '_socket.socket' objects} 6166 0.099 0.000 14.845 0.002 /usr/lib64/python2.7/site-packages/ldap/ldapobject.py:87(_ldap_call) 1 0.091 0.091 0.190 0.190 /usr/lib/python2.7/site-packages/ipaserver/plugins/schema.py:753(_calculate_fingerprint) 11977 0.089 0.000 0.171 0.000 /usr/lib64/python2.7/site-packages/ldap/schema/tokenizer.py:52(extract_tokens) 472 0.088 0.000 1.479 0.003 /usr/lib64/python2.7/json/encoder.py:186(encode) 9038 0.087 0.000 2.171 0.000 /usr/lib/python2.7/site-packages/ipapython/ipaldap.py:353(_add_attr_name) 370437 0.087 0.000 0.087 0.000 {method 'get' of 'dict' objects} 7 0.087 0.012 1.251 0.179 /usr/lib64/python2.7/site-packages/ldap/schema/subentry.py:81(__init__) 1347/1079 0.087 0.000 5.859 0.005 /usr/lib/python2.7/site-packages/ipapython/ipaldap.py:1314(find_entries) 299 0.076 0.000 12.005 0.040 /usr/lib/python2.7/site-packages/ipaserver/plugins/ldap2.py:123(create_connection) 143579 0.067 0.000 0.067 0.000 /usr/lib/python2.7/site-packages/ipalib/base.py:123(__setattr__) 60658 0.067 0.000 0.067 0.000 {_codecs.utf_8_decode} 23131 0.066 0.000 0.116 0.000 /usr/lib64/python2.7/abc.py:128(__instancecheck__) 394038 0.065 0.000 0.065 0.000 {method 'append' of 'list' objects} 113350 0.064 0.000 0.109 0.000 {setattr} 926/299 0.064 0.000 39.159 0.131 /usr/lib/python2.7/site-packages/ipalib/frontend.py:451(__do_call) 294 0.063 0.000 52.216 0.178 /usr/share/ipa/wsgi.py:61(application) 15529 0.061 0.000 0.193 0.000 /usr/lib/python2.7/site-packages/ipapython/dn.py:1111(_rdns_from_value) 35819 0.061 0.000 0.117 0.000 /usr/lib64/python2.7/site-packages/ldap/schema/subentry.py:206(getoid) 218252 0.058 0.000 0.058 0.000 {method 'lower' of 'str' objects} 8054 0.056 0.000 17.836 0.002 /usr/lib64/python2.7/socket.py:410(readline) 14579 0.055 0.000 0.085 0.000 {sorted} 39967 0.052 0.000 0.074 0.000 /usr/lib64/python2.7/copy.py:267(_keep_alive) 300 0.051 0.000 56.206 0.187 /usr/lib/python2.7/site-packages/ipaserver/rpcserver.py:270(route)
Let's figure out how to improve vault performance.
simple vault perftest script perftest.py
pstat tottime and cumtime 100 pstat_vault.txt
Related tickets:
== FreeIPA
== Dogtag
Some comments:
vault retrieve is my primary concern. vault add, store and delete are also slow but not as performance critical as retrieval. Let's focus on either retrieval or general vault / IPA client performance improvements first. Once retrieval is fast, we can move to store.
Most vault operations depend on the result of a previous operation. This makes it impossible to benefit from batch operations. It might be possible to batch some operations, but not for vault retrieval.
The SSL operations from WSGI profiling are TLS/SSL connections from IPA framework to Dogtag's REST API. SSL and network operations between IPA client and IPA framework do not show up in the profile because they are handled by Apache and mod_nss. Dogtag's Python API can handle HTTP keep-alive. Internally a pki.client.PKIConnection object holds a requests.Session instance. Python requests automatically performs keep-alive for connections from the same session. The Dogtag plugin does not keep PKIConnection alive between framework calls. It might be feasible to keep the connection alive. I'm a bit worried how this plays along with forking WSGI server (see #6563).
pki.client.PKIConnection
requests.Session
Dogtag supports connection keep-alive (look for Re-using existing connection!):
Re-using existing connection!
# curl --keepalive 5 -v --cacert /etc/ipa/ca.crt https://master.ipa.example:8443 https://master.ipa.example:8443 >/dev/null * Rebuilt URL to: 5/ % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 0.0.0.5... * Immediate connect fail for 0.0.0.5: Invalid argument * Closing connection 0 curl: (7) Couldn't connect to server * Rebuilt URL to: https://master.ipa.example:8443/ * Trying 192.168.121.57... * Connected to master.ipa.example (192.168.121.57) port 8443 (#1) * Initializing NSS with certpath: sql:/etc/pki/nssdb * CAfile: /etc/ipa/ca.crt CApath: none * NSS: client certificate not found (nickname not specified) * ALPN/NPN, server did not agree to a protocol * SSL connection using TLS_RSA_WITH_AES_128_CBC_SHA * Server certificate: * subject: CN=master.ipa.example,O=IPA.EXAMPLE * start date: Jan 09 14:50:35 2017 GMT * expire date: Dec 30 14:50:35 2018 GMT * common name: master.ipa.example * issuer: CN=Certificate Authority,O=IPA.EXAMPLE > GET / HTTP/1.1 > Host: master.ipa.example:8443 > User-Agent: curl/7.47.1 > Accept: */* > < HTTP/1.1 200 OK < Server: Apache-Coyote/1.1 < Set-Cookie: JSESSIONID=3010A8CC145EAA9EAC5069689104CEEC; Path=/; Secure; HttpOnly < Content-Type: text/html;charset=ISO-8859-1 < Content-Length: 3183 < Date: Thu, 09 Feb 2017 09:25:57 GMT < { [3183 bytes data] 100 3183 100 3183 0 0 62232 0 --:--:-- --:--:-- --:--:-- 62232 * Connection #1 to host master.ipa.example left intact * Rebuilt URL to: https://master.ipa.example:8443/ * Found bundle for host master.ipa.example: 0x55a9326923e0 [can pipeline] * Re-using existing connection! (#1) with host master.ipa.example * Connected to master.ipa.example (192.168.121.57) port 8443 (#1) > GET / HTTP/1.1 > Host: master.ipa.example:8443 > User-Agent: curl/7.47.1 > Accept: */* > < HTTP/1.1 200 OK < Server: Apache-Coyote/1.1 < Set-Cookie: JSESSIONID=96885044D4FA5A5C5EAE5550C02AEF3B; Path=/; Secure; HttpOnly < Content-Type: text/html;charset=ISO-8859-1 < Content-Length: 3183 < Date: Thu, 09 Feb 2017 09:25:57 GMT < { [3183 bytes data] 100 3183 100 3183 0 0 2396k 0 --:--:-- --:--:-- --:--:-- 2396k * Connection #1 to host master.ipa.example left intact
https://jiffyclub.github.io/snakeviz/ is another useful tool to analyze cprofile stats. It can only analyzse one file. Multiple profile files can be combined with the pstats module:
>>> import glob, pstats >>> stat = pstats.Stats(*glob.glob('profile/*.prof')) >>> stat.dump_stats('combined.prof')
Marking as critical, but only tickets in blocking should have the priority. Not the full list in comment 1.
Metadata Update from @cheimes: - Issue assigned to someone - Issue set to the milestone: FreeIPA 4.5
Metadata Update from @cheimes: - Custom field affects_doc reset - Custom field blockedby reset - Custom field component reset - Custom field keywords reset - Custom field rhbz reset - Custom field type reset - Issue close_status updated to: None - Issue set to the milestone: None (was: FreeIPA 4.5) - Issue tagged with: integration
Metadata Update from @pvoborni: - Custom field affects_doc reset - Custom field tester adjusted to wanted - Issue set to the milestone: FreeIPA 4.5
Metadata Update from @mbasti: - Issue set to the milestone: FreeIPA 4.5.1 (was: FreeIPA 4.5)
Metadata Update from @pvoborni: - Issue set to the milestone: FreeIPA 4.7 (was: FreeIPA 4.5.1)
Vault performance is currently not on the priority list.
Metadata Update from @cheimes: - Issue priority set to: low (was: important) - Issue set to the milestone: Future Releases (was: FreeIPA 4.7)
Login to comment on this ticket.