From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oo1-xc30.google.com (mail-oo1-xc30.google.com [IPv6:2607:f8b0:4864:20::c30]) by sourceware.org (Postfix) with ESMTPS id A07843858D28 for ; Fri, 19 May 2023 10:35:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A07843858D28 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-oo1-xc30.google.com with SMTP id 006d021491bc7-552b65f527dso1671946eaf.3 for ; Fri, 19 May 2023 03:35:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684492541; x=1687084541; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Kc/CKHdP93w4v0hd5UxrHHRkA/g4qHVcCI7A/l8OzXw=; b=ZLdMUKHpssnx2e/XEClNsCwgEfR/XoTvgfe9ZFOthMpW3odJ4DZNsjr1gyOwb+JwgE fb60QGyV03iZkk3sU3lMJm2FJsjXVYM4VAACPvk5v1cMRayJZzQ380jcN8I3yupSM+1d xLL3y2KbjKcwJA+nknLjaLj+eMSo6VA6ZDV63NsmUSrG9UgT7mr5WwCjqNnkHDs87408 fWvdNWgNA+VMcum91i+BJ8Xm2B09Su5HxNTSMYpWLlmNGB9hiaEXmV7sfGInGXZnwxfF ox9XttrF9spcszjn6Wmrs+ImDsihwhwfymIAEooBlhEZgEIuuUD/hl8+dFyZxg3hgboE 3rIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684492541; x=1687084541; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Kc/CKHdP93w4v0hd5UxrHHRkA/g4qHVcCI7A/l8OzXw=; b=hNQQYhMafRNr/LMV5Qfzk3/nfK8zs13FnnJy1SI9y1FhOHWceHZwKOlLSyczvUlkIr Gd0WW435M2cnEJb/O653v7SLkTO5HwGKLD7dKlozZ5Dfik8Qrjnqi6OfmMQzOVqqMNn/ XYLs2F8yixkNG4fNpzvki7NUHV4tpFPCGqc+jHmAL+u8VmkcJ5tNow0oG82geF2uGmqt wfqZKuMh9aLUNxbBOHNpxPLbOACwS/XAijDPc7dXA+q5EpasFX3eYVmRKtiTM0uR38D9 Ur3TbongZaqSXTYXJ33fHwu2v+odx2ms9IEHQ7b+lkP+R/DjrvY0m+N213wskkjuk9u/ deyw== X-Gm-Message-State: AC+VfDyvekSClq9XTLs1msdaNz21354ZLbb2GbMGH4ZmsHVkybc/sUPY bQInYxmAz7XL7n1OBrNKJJGPkUWeWi4zpxSr0ZUWehNTr4s= X-Google-Smtp-Source: ACHHUZ6k8paLK1FG5LcwZs8gYAUdJ98Gsgec5QjijdBx6QW5Dy4cqu8H5m7ajCHnPoTfV42CB17rwc7vqEdfzj9Uyao= X-Received: by 2002:a4a:9191:0:b0:54f:5b4c:d925 with SMTP id d17-20020a4a9191000000b0054f5b4cd925mr466874ooh.7.1684492540872; Fri, 19 May 2023 03:35:40 -0700 (PDT) MIME-Version: 1.0 References: <20230518170648.93316-1-bugaevc@gmail.com> <98620b0e-7251-3781-2935-ce058a5953dc@linaro.org> In-Reply-To: <98620b0e-7251-3781-2935-ce058a5953dc@linaro.org> From: Sergey Bugaev Date: Fri, 19 May 2023 13:35:29 +0300 Message-ID: Subject: Re: [PATCH v2] Mark more functions as __COLD To: Adhemerval Zanella Netto Cc: libc-alpha@sourceware.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, May 18, 2023 at 10:43=E2=80=AFPM Adhemerval Zanella Netto wrote: > The rationale seems ok, some comments below. Thanks. Any thoughts on the .text.{startup,exit} part? > > -void > > +void __COLD > > __libc_fatal (const char *message) > > { > > _dl_fatal_printf ("%s", message); > > } > > rtld_hidden_def (__libc_fatal) > > > > Can't you just add on the function prototype at include/stdio.h? Same > question for the __assert_fail and __assert_perror_fail below. But I did just that (added __COLD to the prototypes in include/stdio.h and include/assert.h), didn't I? If you're saying that it's not worth repeating __COLD on the definition, then sure, I could remove that if you prefer. > > +/* Intentionally not marked __COLD in the header, since this only caus= es GCC > > + to create a bunch of useless __foo_chk.cold symbols containing only= a call > > + to this function; better just keep calling it directly. */ > > extern void __chk_fail (void) __attribute__ ((__noreturn__)); > > libc_hidden_proto (__chk_fail) > > rtld_hidden_proto (__chk_fail) > > Why exactly gcc generates the useless __foo_chk.cold for this case? Is th= is a > bug or a limitation? I don't know; your guess is as good as mine (actually yours would be better than mine). But my guess would be that they just didn't think to add a check that whatever code size savings they're getting by moving the cold path into a separate section outweigh the jump instruction to get there. Here's what I'm getting specifically, on i686-gnu: Dump of assembler code for function __ppoll_chk: Address range 0x198760 to 0x19879e: 0x00198760 <+0>: 56 push %esi 0x00198761 <+1>: 53 push %ebx 0x00198762 <+2>: 83 ec 04 sub $0x4,%esp 0x00198765 <+5>: 8b 44 24 20 mov 0x20(%esp),%eax 0x00198769 <+9>: 8b 54 24 14 mov 0x14(%esp),%edx 0x0019876d <+13>: 8b 4c 24 10 mov 0x10(%esp),%ecx 0x00198771 <+17>: 8b 5c 24 18 mov 0x18(%esp),%ebx 0x00198775 <+21>: c1 e8 03 shr $0x3,%eax 0x00198778 <+24>: 8b 74 24 1c mov 0x1c(%esp),%esi 0x0019877c <+28>: 39 d0 cmp %edx,%eax 0x0019877e <+30>: 0f 82 9d bb e8 ff jb 0x24321 <__ppoll_chk.cold> 0x00198784 <+36>: 89 74 24 1c mov %esi,0x1c(%esp) 0x00198788 <+40>: 89 5c 24 18 mov %ebx,0x18(%esp) 0x0019878c <+44>: 89 54 24 14 mov %edx,0x14(%esp) 0x00198790 <+48>: 89 4c 24 10 mov %ecx,0x10(%esp) 0x00198794 <+52>: 83 c4 04 add $0x4,%esp 0x00198797 <+55>: 5b pop %ebx 0x00198798 <+56>: 5e pop %esi 0x00198799 <+57>: e9 b2 b9 fb ff jmp 0x154150 <__GI_ppoll> Address range 0x24321 to 0x24326: 0x00024321 <-1524799>: e8 5c ff ff ff call 0x24282 <__GI___chk_fai= l> End of assembler dump. It's spending 6 bytes for the 'jb __ppoll_chk.cold', only to jump to 'call __GI___chk_fail' which takes 5 bytes. That's negative space savings, both overall and inside .text. And actually frankly that's bad codegen altogether, unless I'm missing something. Why not mov 20(%esp), %eax shr $3, %eax cmp 8(%esp), %eax jnb __GI_ppoll push %ebp mov %esp, %ebp call __GI___chk_fail Then maybe it'd make sense to move the "push, mov, call" into .text.unlikely, adding a jmp. Sergey