From fff94fa2245612191123a8015eac94eb04f001e2 Mon Sep 17 00:00:00 2001
From: Siddhesh Poyarekar
Date: Tue, 19 May 2015 06:40:37 +0530
Subject: Avoid deadlock in malloc on backtrace (BZ #16159)

When the malloc subsystem detects some kind of memory corruption,
depending on the configuration it prints the error, a backtrace, a
memory map and then aborts the process.  In this process, the
backtrace() call may result in a call to malloc, resulting in
various kinds of problematic behavior.

In one case, the malloc it calls may detect a corruption and call
backtrace again, and a stack overflow may result due to the infinite
recursion.  In another case, the malloc it calls may deadlock on an
arena lock with the malloc (or free, realloc, etc.) that detected the
corruption.  In yet another case, if the program is linked with
pthreads, backtrace may do a pthread_once initialization, which
deadlocks on itself.

In all these cases, the program exit is not as intended.  This is
avoidable by marking the arena that malloc detected a corruption on,
as unusable.  The following patch does that.  Features of this patch
are as follows:

- A flag is added to the mstate struct of the arena to indicate if the
  arena is corrupt.

- The flag is checked whenever malloc functions try to get a lock on
  an arena.  If the arena is unusable, a NULL is returned, causing the
  malloc to use mmap or try the next arena.

- malloc_printerr sets the corrupt flag on the arena when it detects a
  corruption.

- free does not concern itself with the flag at all.  It is not
  important since the backtrace workflow does not need free.  A free
  in a parallel thread may cause another corruption, but that's not
  new.

- The flag check and set are not atomic and may race.  This is fine
  since we don't care about contention during the flag check.  We want
  to make sure that the malloc call in the backtrace does not trip on
  itself and all that action happens in the same thread and not across
  threads.

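The flag mechanism listed above can be sketched in isolation.  This is
a toy model, not the glibc code: struct toy_arena stands in for mstate,
the toy_* helpers are hypothetical, and the value of
ARENA_CORRUPTION_BIT is an assumption.  Only the three macro names are
taken from the ChangeLog entry for malloc/malloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for mstate; only the fields this sketch needs.  */
struct toy_arena
{
  unsigned int flags;
  struct toy_arena *next;	/* Circular list of arenas.  */
};

/* Macro names follow the ChangeLog; the bit value is an assumption.  */
#define ARENA_CORRUPTION_BIT 4U
#define arena_is_corrupt(a) (((a)->flags & ARENA_CORRUPTION_BIT) != 0)
#define set_arena_corrupt(a) ((a)->flags |= ARENA_CORRUPTION_BIT)

/* Hypothetical helper: walk the ring once starting at START and return
   the first arena that is not marked corrupt, or NULL if none is.  */
static struct toy_arena *
toy_pick_arena (struct toy_arena *start)
{
  assert (start != NULL);
  struct toy_arena *a = start;
  do
    {
      if (!arena_is_corrupt (a))
	return a;
      a = a->next;
    }
  while (a != start);
  return NULL;
}

enum toy_source { TOY_FROM_ARENA, TOY_FROM_MMAP };

/* Hypothetical helper: report where an allocation would be served
   from.  With no usable arena left, the caller falls back to mmap.  */
static enum toy_source
toy_alloc_source (struct toy_arena *preferred)
{
  return toy_pick_arena (preferred) != NULL ? TOY_FROM_ARENA : TOY_FROM_MMAP;
}

/* Hypothetical stand-in for the error path: mark the arena first, so
   that an allocation made while reporting never selects it again.
   The check and set are plain, non-atomic accesses, as in the
   description above.  */
static void
toy_report_corruption (struct toy_arena *a)
{
  set_arena_corrupt (a);
  /* Printing the error and the backtrace would happen here.  */
}
```

With two arenas linked in a ring, marking the first one makes
toy_pick_arena return the second; marking both makes it return NULL,
and toy_alloc_source then reports the mmap fallback.
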
I verified that the test case does not show any regressions due to
this patch.  I also ran the malloc benchmarks and found an
insignificant difference in timings (< 2%).

	* malloc/Makefile (tests): New test case tst-malloc-backtrace.
	* malloc/arena.c (arena_lock): Check if arena is corrupt.
	(reused_arena): Find a non-corrupt arena.
	(heap_trim): Pass arena to unlink.
	* malloc/hooks.c (malloc_check_get_size): Pass arena to
	malloc_printerr.
	(top_check): Likewise.
	(free_check): Likewise.
	(realloc_check): Likewise.
	* malloc/malloc.c (malloc_printerr): Add arena argument.
	(unlink): Likewise.
	(munmap_chunk): Adjust.
	(ARENA_CORRUPTION_BIT): New macro.
	(arena_is_corrupt): Likewise.
	(set_arena_corrupt): Likewise.
	(sysmalloc): Use mmap if there are no usable arenas.
	(_int_malloc): Likewise.
	(__libc_malloc): Don't fail if arena_get returns NULL.
	(_mid_memalign): Likewise.
	(__libc_calloc): Likewise.
	(__libc_realloc): Adjust for additional argument to
	malloc_printerr.
	(_int_free): Likewise.
	(malloc_consolidate): Likewise.
	(_int_realloc): Likewise.
	(_int_memalign): Don't touch corrupt arenas.
	* malloc/tst-malloc-backtrace.c: New test case.
---
 NEWS | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

(limited to 'NEWS')

diff --git a/NEWS b/NEWS
index bfeb3a78f6..877ea34809 100644
--- a/NEWS
+++ b/NEWS
@@ -9,16 +9,16 @@ Version 2.22
 
 * The following bugs are resolved with this release:
 
-  4719, 6792, 13064, 14094, 14841, 14906, 15319, 15467, 15790, 15969, 16339,
-  16351, 16352, 16512, 16560, 16704, 16783, 16850, 17090, 17195, 17269,
-  17523, 17542, 17569, 17588, 17596, 17620, 17621, 17628, 17631, 17692,
-  17711, 17715, 17776, 17779, 17792, 17836, 17912, 17916, 17930, 17932,
-  17944, 17949, 17964, 17965, 17967, 17969, 17978, 17987, 17991, 17996,
-  17998, 17999, 18007, 18019, 18020, 18029, 18030, 18032, 18036, 18038,
-  18039, 18042, 18043, 18046, 18047, 18068, 18080, 18093, 18100, 18104,
-  18110, 18111, 18125, 18128, 18138, 18185, 18196, 18197, 18206, 18210,
-  18211, 18217, 18220, 18221, 18247, 18287, 18319, 18333, 18346, 18397,
-  18409, 18418.
+  4719, 6792, 13064, 14094, 14841, 14906, 15319, 15467, 15790, 15969, 16159,
+  16339, 16351, 16352, 16512, 16560, 16704, 16783, 16850, 17090, 17195,
+  17269, 17523, 17542, 17569, 17588, 17596, 17620, 17621, 17628, 17631,
+  17692, 17711, 17715, 17776, 17779, 17792, 17836, 17912, 17916, 17930,
+  17932, 17944, 17949, 17964, 17965, 17967, 17969, 17978, 17987, 17991,
+  17996, 17998, 17999, 18007, 18019, 18020, 18029, 18030, 18032, 18036,
+  18038, 18039, 18042, 18043, 18046, 18047, 18068, 18080, 18093, 18100,
+  18104, 18110, 18111, 18125, 18128, 18138, 18185, 18196, 18197, 18206,
+  18210, 18211, 18217, 18220, 18221, 18247, 18287, 18319, 18333, 18346,
+  18397, 18409, 18418.
 
 * Cache information can be queried via sysconf() function on s390 e.g. with
   _SC_LEVEL1_ICACHE_SIZE as argument.
-- 
cgit v1.2.3