From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 104413 invoked by alias); 5 Sep 2018 13:56:23 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 104401 invoked by uid 89); 5 Sep 2018 13:56:22 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=0.4 required=5.0 tests=BAYES_00,FORGED_MUA_MOZILLA,KAM_SHORT,SPF_HELO_PASS autolearn=no version=3.3.2 spammy=HDKIM-Filter:Filter, HDKIM-Filter:OpenDKIM, HDKIM-Filter:v2.11.0, H*c:windows-1252 X-HELO: mailout1.w1.samsung.com Received: from mailout1.w1.samsung.com (HELO mailout1.w1.samsung.com) (210.118.77.11) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 05 Sep 2018 13:56:20 +0000 Received: from eucas1p2.samsung.com (unknown [182.198.249.207]) by mailout1.w1.samsung.com (KnoxPortal) with ESMTP id 20180905135618euoutp01b0fcbd20356e95909019359892321c64~RhYqnQyq30637706377euoutp01D for ; Wed, 5 Sep 2018 13:56:18 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout1.w1.samsung.com 20180905135618euoutp01b0fcbd20356e95909019359892321c64~RhYqnQyq30637706377euoutp01D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com; s=mail20170921; t=1536155778; bh=WXcC+Wx2blbRb0YXy4eKFkDIx/R7kwI7MKfg51B4Ygw=; h=Subject:To:Cc:From:Date:In-Reply-To:References:From; b=bt9zTfvWg9WIuMqb/AjnhDFdce1UABgI7wCU75zbv+aGGXO1nRQ5bJfz4gJrNIAI2 BDHg0VbFOmNDaZxmVNO9EniYVWgEqaIQXEfYnll/hsrbo/ObYT9zc4PV/SVFUe2n8c yoFmcKgAQA2IVmPsXkoSWqX451/o9uhkh65t7Hdo= Received: from eusmges2new.samsung.com (unknown [203.254.199.244]) by eucas1p1.samsung.com (KnoxPortal) with ESMTP id 20180905135617eucas1p16f9af69717f2bd2171c68f43a34e48f2~RhYp9g0rL3237832378eucas1p1O; Wed, 5 Sep 2018 13:56:17 +0000 (GMT) Received: from eucas1p1.samsung.com ( [182.198.249.206]) by eusmges2new.samsung.com (EUCPMTA) with SMTP id 84.E2.04294.180EF8B5; Wed, 5 Sep 2018 14:56:17 +0100 (BST) Received: from eusmtrp1.samsung.com (unknown [182.198.249.138]) by eucas1p2.samsung.com (KnoxPortal) with ESMTPA id 20180905135616eucas1p2afa4bcdd31cbe719c778f87f0863e7d5~RhYpGQ6xx3240232402eucas1p2d; Wed, 5 Sep 2018 13:56:16 +0000 (GMT) Received: from eusmgms2.samsung.com (unknown [182.198.249.180]) by eusmtrp1.samsung.com (KnoxPortal) with ESMTP id 20180905135616eusmtrp1dbae7dadf5e9a04589d4c981385247cc~RhYo1HpZk1276112761eusmtrp1d; Wed, 5 Sep 2018 13:56:16 +0000 (GMT) Received: from eusmtip1.samsung.com ( [203.254.199.221]) by eusmgms2.samsung.com (EUCPMTA) with SMTP id 8E.9D.04128.080EF8B5; Wed, 5 Sep 2018 14:56:16 +0100 (BST) Received: from [106.109.129.179] (unknown [106.109.129.179]) by eusmtip1.samsung.com (KnoxPortal) with ESMTPA id 20180905135616eusmtip1c1ec48debaea133bcb887fa11b365b15~RhYoiSWOG0062900629eusmtip1z; Wed, 5 Sep 2018 13:56:16 +0000 (GMT) Subject: Re: [PATCH] Frame pointer for arm with THUMB2 mode To: Wilco Dijkstra , GCC Patches Cc: "v.barinov@samsung.com" , nd From: Denis Khalikov Date: Wed, 05 Sep 2018 13:56:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="windows-1252"; format="flowed" Content-Transfer-Encoding: 7bit Message-Id: <20180905135616eucas1p2afa4bcdd31cbe719c778f87f0863e7d5~RhYpGQ6xx3240232402eucas1p2d@eucas1p2.samsung.com> X-CMS-MailID: 20180905135616eucas1p2afa4bcdd31cbe719c778f87f0863e7d5 X-Msg-Generator: CA X-RootMTR: 20180828004814epcas5p3fbd3af8edd93da263b6780ab3be11fe4 X-EPHeader: CA CMS-TYPE: 201P X-CMS-RootMailID: 20180828004814epcas5p3fbd3af8edd93da263b6780ab3be11fe4 References: <20180828104451eucas1p2f1de5376f7a84b71487d21f41b5fdbb5~PBnOkIagS0753307533eucas1p2n@eucas1p2.samsung.com> X-SW-Source: 2018-09/txt/msg00310.txt.bz2 Hi Wilco, thanks for the answer. > Adding support for a frame chain would require an ABI change. It would have to > work across GCC, LLVM, Arm, Thumb-1 and Thumb-2 - not a trivial amount of > effort. Clang already works that way. Please look at this commit: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMCallingConv.td?r1=269459&r2=269458&pathrev=269459 This is an example of code which clang generates with -mthumb -fno-omit-frame-pointer -O2. @ %bb.0: push {r4, r5, r6, r7, lr} add r7, sp, #12 push.w {r8, r9, r10} sub sp, #56 mov r8, r0 movw r0, :lower16:_ZTVN6sensor14sensor_handlerE movt r0, :upper16:_ZTVN6sensor14sensor_handlerE mov r9, r8 adds r0, #8 str r0, [r9], #4 The only difference is clang sets frame pointer to the r7 on the stack instead gcc sets to lr, but it already handles by the Sanitizers. > > So this sounds like the first thing to do is reducing the size of the stack traces. > The default is 30 which is far larger than useful. Using 1 for example should > always be fast (since you can use __builtin_return_address(0)) and still get > the function that called malloc. Also if the unwinder happens to be too slow, > it should be optimized and caching added etc. > If we change the size of the traces to 2, it could be something like this: 0xb42a50a0 is located 0 bytes inside of 88-byte region [0xb42a50a0,0xb42a50f8) freed by thread T0 here: #0 0xb6a35cc7 in __interceptor_free (/usr/lib/libasan.so+0x127cc7) #1 0xb5fa64e3 in ipc::message::unref() (/lib/libsensord-shared.so+0xe4e3) previously allocated by thread T0 here: #0 0xb6a36157 in malloc (/usr/lib/libasan.so+0x128157) #1 0xb2f8852d in accel_device::get_data(unsigned int, sensor_data_t**, int*) (/usr/lib/sensor/libsensor-hal-tm1.so+0x1052d) Instead this: 0xb42250a0 is located 0 bytes inside of 88-byte region [0xb42250a0,0xb42250f8) freed by thread T0 here: #0 0xb6989cff in __interceptor_free (/usr/lib/libasan.so+0x127cff) #1 0xb5fa64e3 in ipc::message::unref() (/lib/libsensord-shared.so+0xe4e3) #2 0xb6efbf47 in sensor::sensor_handler::notify(char const*, sensor_data_t*, int) (/usr/bin/sensord+0x1ef47) #3 0xb6efad43 in sensor::sensor_event_handler::handle(int, unsigned int) (/usr/bin/sensord+0x1dd43) #4 0xb5fa3bbb (/lib/libsensord-shared.so+0xbbbb) #5 0xb62d9a15 in g_main_context_dispatch (/lib/libglib-2.0.so.0+0x91a15) #6 0xb62da2d9 (/lib/libglib-2.0.so.0+0x922d9) #7 0xb62da9a9 in g_main_loop_run (/lib/libglib-2.0.so.0+0x929a9) #8 0xb5fa4e1b in ipc::event_loop::run(int) (/lib/libsensord-shared.so+0xce1b) #9 0xb6eec9a5 in main (/usr/bin/sensord+0xf9a5) #10 0xb5cb663b in __libc_start_main (/lib/libc.so.6+0x1663b) previously allocated by thread T0 here: #0 0xb698a18f in malloc (/usr/lib/libasan.so+0x12818f) #1 0xb2f8852d in accel_device::get_data(unsigned int, sensor_data_t**, int*) (/usr/lib/sensor/libsensor-hal-tm1.so+0x1052d) #2 0xb6ef848f in sensor::physical_sensor_handler::get_data(sensor_data_t**, int*) (/usr/bin/sensord+0x1b48f) #3 0xb6efaa51 in sensor::sensor_event_handler::handle(int, unsigned int) (/usr/bin/sensord+0x1da51) #4 0xb5fa3bbb (/lib/libsensord-shared.so+0xbbbb) #5 0xb62d9a15 in g_main_context_dispatch (/lib/libglib-2.0.so.0+0x91a15) #6 0xb62da2d9 (/lib/libglib-2.0.so.0+0x922d9) #7 0xb62da9a9 in g_main_loop_run (/lib/libglib-2.0.so.0+0x929a9) #8 0xb5fa4e1b in ipc::event_loop::run(int) (/lib/libsensord-shared.so+0xce1b) #9 0xb6eec9a5 in main (/usr/bin/sensord+0xf9a5) #10 0xb5cb663b in __libc_start_main (/lib/libc.so.6+0x1663b) At the first example we lost the full context, from where the control/data flow comes from. > > The issue is that the frame pointer and frame chain always add a large > overhead even when you do not use any sanitizers. This is especially bad > for the proposed patch - you lose much of the benefit of using Thumb-2... > The stack layout like this enables only with compile time flag (-mthumb-fp and works only together with -mthumb and -fno-omit-frame-pointer). It does not affect other codegen. Thanks. On 09/05/2018 03:11 PM, Wilco Dijkstra wrote: > Hi Denis, > >> We are working on applying Address/LeakSanitizer for the full Tizen OS >> distribution. It's about ~1000 packages, ASan/LSan runtime is installed >> to ld.so.preload. As we know ASan/LSan has interceptors for >> allocators/deallocators such as (malloc/realloc/calloc/free) and so on. >> On every allocation from user space program, ASan calls >> GET_STACK_TRACE_MALLOC; >> which unwinds the stack frame, and by default uses frame based stack >> unwinder. So, it requires to build with "-fno-omit-frame-pointer", >> switching it to default unwinder really hits the performance in our case. > > So this sounds like the first thing to do is reducing the size of the stack traces. > The default is 30 which is far larger than useful. Using 1 for example should > always be fast (since you can use __builtin_return_address(0)) and still get > the function that called malloc. Also if the unwinder happens to be too slow, > it should be optimized and caching added etc. > >>> Doing real unwinding is also far more accurate than frame pointer based >>> unwinding (the latter doesn't handle leaf functions correctly, >> entry/exit in >>> non-leaf functions and shrinkwrapped functions - and this breaks >> callgraph >>> profiling). >> >> I agree, but in our case, all interceptors for allocators are >> leaf functions, so the frame based stack unwinder works well for us. > > Yes a frame chain would work for this case. But it's not currently supported. > >> By default we build packages with ("-marm" "-fno-omit-frame-pointer"), >> because need frame based stack unwinder for every allocation, as I said >> before. As we know GCC sets fp to lr on the stack with >> ("-fno-omit-frame-pointer" and "-marm") and I don't really know why. >> But the binary size is bigger than for thumb, so, we cannot use default >> thumb frame pointer and want to reduce binary size for the full >> sanitized image. > > The issue is that the frame pointer and frame chain always add a large > overhead even when you do not use any sanitizers. This is especially bad > for the proposed patch - you lose much of the benefit of using Thumb-2... > > Using normal unwinding means your code runs at full speed and still can be > used by the sanitizer. > >> In other case clang works the same way, as I offered at the patch. >> It has the same issue, but it was fixed at the end of 2017 >> https://bugs.llvm.org/show_bug.cgi?id=18505 (The topics starts from >> discussion about APCS, but it is not the main point.) >> >> Also, unresolved issue related to this >> https://github.com/google/sanitizers/issues/640 > > Adding support for a frame chain would require an ABI change. It would have to > work across GCC, LLVM, Arm, Thumb-1 and Thumb-2 - not a trivial amount of > effort. > > Wilco > >