author | Wilco Dijkstra <wdijkstr@arm.com> | 2017-10-17 18:43:31 +0100 |
---|---|---|
committer | Wilco Dijkstra <wdijkstr@arm.com> | 2017-10-17 18:43:31 +0100 |
commit | e956075a5a2044d05ce48b905b10270ed4a63e87 (patch) | |
tree | 13682cd2c0d29bd7557cfc7c1a446c4d5cd0fc5f /README | |
parent | e4dd4ace56880d2f1064cd787e2bdb96ddacc3c4 (diff) | |
Use relaxed atomics for malloc have_fastchunks
Currently free typically uses 2 atomic operations per call. The have_fastchunks
flag indicates whether there are recently freed blocks in the fastbins. This
is purely an optimization to avoid calling malloc_consolidate too often and to
avoid the overhead of walking all fast bins when they are all empty during a
sequence of allocations. However, using catomic_or to update the flag is
unnecessary: it can be changed into a simple boolean and accessed using relaxed
atomics. There is no change in multi-threaded behaviour given the flag is
already approximate (it may be set when there are no blocks in any fast bins,
or it may be clear when there are free blocks that could be consolidated).
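The idea can be sketched with C11 atomics. This is only an illustration, not
the code in malloc/malloc.c: the struct arena_sketch type, its fastbin_count
field and both sketch_* functions are hypothetical names, and glibc uses its
own internal atomic macros rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the arena state; only the hint flag is modeled. */
struct arena_sketch
{
  atomic_bool have_fastchunks;  /* Approximate hint, not an invariant.  */
  size_t fastbin_count;         /* Toy stand-in for the fastbin contents;
                                   not thread-safe in this sketch.  */
};

/* Free path: record the chunk, then set the hint with a relaxed store.
   No read-modify-write (such as an atomic OR) is needed for a boolean.  */
static void
sketch_free_to_fastbin (struct arena_sketch *av)
{
  av->fastbin_count++;
  atomic_store_explicit (&av->have_fastchunks, true, memory_order_relaxed);
}

/* Consolidation path: skip the walk over the bins when the hint is clear.
   Returns true if a walk was performed.  */
static bool
sketch_consolidate (struct arena_sketch *av)
{
  if (!atomic_load_explicit (&av->have_fastchunks, memory_order_relaxed))
    return false;

  /* Clear the hint before emptying the bins.  A stale value in either
     direction is tolerated because the flag is only an optimization.  */
  atomic_store_explicit (&av->have_fastchunks, false, memory_order_relaxed);
  av->fastbin_count = 0;
  return true;
}
```

Relaxed loads and stores give no ordering guarantees relative to other memory
accesses, which is acceptable here only because a wrong value of the flag costs
at most an unnecessary or a delayed consolidation.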
Performance of malloc/free improves by 27% on a simple benchmark on AArch64
(both single and multithreaded). The number of load/store exclusive instructions
is reduced by 33%. Bench-malloc-thread speeds up by ~3% in all cases.
* malloc/malloc.c (FASTCHUNKS_BIT): Remove.
(have_fastchunks): Remove.
(clear_fastchunks): Remove.
(set_fastchunks): Remove.
(malloc_state): Add have_fastchunks.
(malloc_init_state): Use have_fastchunks.
(do_check_malloc_state): Remove incorrect invariant checks.
(_int_malloc): Use have_fastchunks.
(_int_free): Likewise.
(malloc_consolidate): Likewise.