From: Andy Lutomirski
Date: Wed, 20 May 2015 20:02:00 -0000
Subject: Re: RFC: Add -mshared option to x86 ELF assembler
To: "H.J. Lu"
Cc: "H. Peter Anvin", Jan Beulich, Binutils, "linux-kernel@vger.kernel.org"

On Wed, May 13, 2015 at 5:59 AM, H.J. Lu wrote:
> On Wed, May 13, 2015 at 4:50 AM, H.J. Lu wrote:
>> On Tue, May 12, 2015 at 5:14 PM, H.J. Lu wrote:
>>> On Fri, May 8, 2015 at 1:16 PM, H.J. Lu wrote:
>>>> On Fri, May 8, 2015 at 5:09 AM, H.J. Lu wrote:
>>>>> On Thu, May 7, 2015 at 8:22 PM, Andy Lutomirski wrote:
>>>>>> On Thu, May 7, 2015 at 9:21 AM, H.J. Lu wrote:
>>>>>>> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich wrote:
>>>>>>>>>>> On 07.05.15 at 08:02, wrote:
>>>>>>>>> AFAICT gas will produce relocations for jumps to global labels in the
>>>>>>>>> same file. This doesn't seem directly harmful to me, except that, on
>>>>>>>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>>>>>>>
>>>>>>>>> This seems especially unfortunate, since even hidden and protected
>>>>>>>>> symbols have this problem.
>>>>>>>>>
>>>>>>>>> Given that many users don't want interposition support (especially the
>>>>>>>>> kernel and anyone using .hidden or .protected), it would be nice to
>>>>>>>>> have a command-line option to turn this off and probably also to turn
>>>>>>>>> it off by default for hidden and protected symbols. Can gas do this?
>>>>>>>>
>>>>>>>> I've been running with the below changes (taken off of a bigger set
>>>>>>>> of changes, so the line numbers may look a little odd) for the last
>>>>>>>> couple of years. I never tried to submit this change because so far
>>>>>>>> I couldn't find the time to check whether this would have any
>>>>>>>> unwanted side effects on cases I don't normally use.
>>>>>>>>
>>>>>>>
>>>>>>> This is the patch I checked in.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> --
>>>>>>> H.J.
>>>>>>> ---
>>>>>>> Branches to global non-weak symbols defined in the same segment with
>>>>>>> non-default visibility can be optimized the same way as branches to
>>>>>>> local symbols.
>>>>>>
>>>>>> Would it make sense to also add a command line option along the lines
>>>>>> of gcc's -fno-semantic-interposition or some way to override the
>>>>>> default visibility? AFAICS this patch helps but only if asm code gets
>>>>>> liberally sprinkled with .hidden or .protected directives.
>>>>>>
>>>>>
>>>>> This is what I checked in. With
>>>>>
>>>>> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
>>>>> index 2fda005..186e6f7 100644
>>>>> --- a/arch/x86/Makefile
>>>>> +++ b/arch/x86/Makefile
>>>>> @@ -107,6 +107,10 @@ else
>>>>>  KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
>>>>>  endif
>>>>>
>>>>> +NO_SHARED_CFLAGS = $(call as-option,-Wa$(comma)-mno-shared)
>>>>> +KBUILD_CFLAGS += $(NO_SHARED_CFLAGS)
>>>>> +KBUILD_AFLAGS += $(NO_SHARED_CFLAGS)
>>>>> +
>>>>>  # Make sure compiler does not have buggy stack-protector support.
>>>>>  ifdef CONFIG_CC_STACKPROTECTOR
>>>>>  cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
>>>>>
>>>>> On kernel master branch, I got
>>>>>
>>>>>     text    data     bss      dec    hex filename
>>>>> 10934167 2275232 1609728 14819127 e21f37 vmlinux.old
>>>>> 10934119 2275232 1609728 14819079 e21f07 vmlinux
>>>>>
>>>>> It saves 48 bytes.
>>>>
>>>> This is before I fixed:
>>>>
>>>> /* This is global to keep gas from relaxing the jumps */
>>>> ENTRY(early_idt_handler)
>>>>         cld
>>>>
>>>> in arch/x86/kernel/head_64.S. With -mno-shared, we must
>>>> make early_idt_handler weak to keep gas from relaxing the jumps.
>>>>
>>>
>>> Here is a patch to change the assembler default to optimize out
>>> relocations to defined non-weak global branch targets with default
>>> visibility. It will generate slightly smaller object files. But Linux
>>> kernel will be broken unless early_idt_handler is marked weak.
>>> I am little uncomfortable with -mshare and I don't like -mno-shared
>>> very much either. I may just simply remove -mno-shared.
>>>
>>
>> I reverted the -mno-shared change.
>>
>
> Here is a patch to add -mshared, which is off by default. On Linux kernel
> with this change:
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index a468c0a..9a10e05 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -339,8 +339,8 @@ early_idt_handlers:
>   i = i + 1
>   .endr
>
> -/* This is global to keep gas from relaxing the jumps */
> -ENTRY(early_idt_handler)
> +/* This is weak to keep gas from relaxing the jumps */
> +WEAK(early_idt_handler)
>   cld
>
>   cmpl $2,(%rsp) # X86_TRAP_NMI
> --
> 2.1.0
>
> I got
>
> [hjl@gnu-tools-1 kernel.org]$ readelf -r old/vmlinux.o | head -5
>
> Relocation section '.rela.text' at offset 0xafea2f0 contains 205717 entries:
>   Offset      Info          Type           Sym. Value       Sym. Name + Addend
> 000000000001  1253100000002 R_X86_64_PC32  0000000000001e70 __fentry__ - 4
> 000000000009  1c8c00000002  R_X86_64_PC32  0000000000000000 .data + 51bc
> [hjl@gnu-tools-1 kernel.org]$ readelf -r new/vmlinux.o | head -5
>
> Relocation section '.rela.text' at offset 0xafea280 contains 205711 entries:
>   Offset      Info          Type           Sym. Value       Sym. Name + Addend
> 000000000001  1253100000002 R_X86_64_PC32  0000000000001e70 __fentry__ - 4
> 000000000009  1c8c00000002  R_X86_64_PC32  0000000000000000 .data + 51bc
> [hjl@gnu-tools-1 kernel.org]$
>
> It removes 6 relocations. On gcc master branch,
>
> [hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1
>     text  data     bss      dec     hex filename
> 21529621 62256 1348312 22940189 15e0a1d build-x86_64-linux.branch/gcc/cc1
> 21529749 62256 1348312 22940317 15e0a9d build-x86_64-linux/gcc/cc1
> [hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1plus
>     text  data     bss      dec     hex filename
> 23713509 62400 1372760 25148669 17fbcfd build-x86_64-linux.branch/gcc/cc1plus
> 23713669 62400 1372760 25148829 17fbd9d build-x86_64-linux/gcc/cc1plus
> [hjl@gnu-tools-1 gcc-misc]$
>
> It is more effective. I will run more tests.

This seems like a sensible idea, but I can imagine it breaking some
weird use cases (like that one Linux thing). Is that okay?

--Andy
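
A minimal sketch of the size difference raised at the top of the thread (five-byte versus two-byte jumps), assuming a much-simplified model of what the assembler does. The function `encode_jmp` and its `needs_relocation` parameter are hypothetical names used only for illustration and are not part of gas. The two opcodes are the standard x86 encodings: `EB rel8` (short jump, 2 bytes) and `E9 rel32` (near jump, 5 bytes).

```python
import struct


def encode_jmp(target, insn_addr, needs_relocation):
    """Return the bytes of an x86 `jmp` located at insn_addr (illustrative only).

    needs_relocation models the case from the thread: the assembler leaves
    the displacement to the linker, so it has to reserve a 32-bit field and
    cannot relax the jump to the short form.
    """
    if needs_relocation:
        # E9 + 4-byte field left as zero. On x86-64 (RELA) the addend is
        # stored in the relocation entry, which is the "- 4" seen in the
        # readelf output quoted above.
        return b"\xe9" + b"\x00\x00\x00\x00"

    # Displacements are relative to the end of the instruction.
    disp8 = target - (insn_addr + 2)
    if -128 <= disp8 <= 127:
        return b"\xeb" + struct.pack("<b", disp8)   # short jump, 2 bytes

    disp32 = target - (insn_addr + 5)
    return b"\xe9" + struct.pack("<i", disp32)      # near jump, 5 bytes


if __name__ == "__main__":
    # Same nearby target, with and without a relocation.
    print(len(encode_jmp(0x10, 0x0, needs_relocation=False)))
    print(len(encode_jmp(0x10, 0x0, needs_relocation=True)))
```

Under this model, every branch to a nearby symbol that keeps its relocation costs three extra bytes plus one relocation entry, which is the kind of saving the `size` and `readelf -r` comparisons above are measuring.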