From: Richard Biener
Date: Mon, 17 Oct 2022 15:17:21 +0200
Subject: Re: [PATCH] libgcc: Mostly vectorize CIE encoding extraction for FDEs
To: Florian Weimer
Cc: gcc-patches@gcc.gnu.org
References: <87edv6mwp5.fsf@oldenburg.str.redhat.com>
In-Reply-To: <87edv6mwp5.fsf@oldenburg.str.redhat.com>

On Mon, Oct 17, 2022 at 3:01 PM Florian Weimer via Gcc-patches wrote:
>
> "zR" and "zPLR" are the most common augmentations. Use a simple
> SIMD-with-in-a-register technique to check for both augmentations,
> and that the following variable-length integers have length 1, to
> get more quickly at the encoding field.
>
> libgcc/
>
>         * unwind-dw2-fde.c (get_cie_encoding_slow): Rename from
>         get_cie_encoding. Mark as noinline.
>         (get_cie_encoding): Add fast path for "zR" and "zPLR"
>         augmentations. Call get_cie_encoding_slow as a fall-back.
>
> ---
>  libgcc/unwind-dw2-fde.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 59 insertions(+), 2 deletions(-)
>
> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
> index 3c0cc654ec0..4e3a54c5a1a 100644
> --- a/libgcc/unwind-dw2-fde.c
> +++ b/libgcc/unwind-dw2-fde.c
> @@ -333,8 +333,10 @@ base_from_object (unsigned char encoding, const struct object *ob)
>  /* Return the FDE pointer encoding from the CIE. */
>  /* ??? This is a subset of extract_cie_info from unwind-dw2.c. */
>
> -static int
> -get_cie_encoding (const struct dwarf_cie *cie)
> +/* Disable inlining because the function is only used as a slow path in
> +   get_cie_encoding below. */
> +static int __attribute__ ((noinline))
> +get_cie_encoding_slow (const struct dwarf_cie *cie)
>  {
>    const unsigned char *aug, *p;
>    _Unwind_Ptr dummy;
> @@ -389,6 +391,61 @@ get_cie_encoding (const struct dwarf_cie *cie)
>      }
>  }
>
> +static inline int
> +get_cie_encoding (const struct dwarf_cie *cie)
> +{
> +  /* Fast path for some augmentations and single-byte variable-length
> +     integers. Do this only for targets that align struct dwarf_cie to 8
> +     bytes, which ensures that at least 8 bytes are available starting at
> +     cie->version. */
> +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ \
> +  || __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> +  if (__alignof (*cie) == 8 && sizeof (unsigned long long) == 8)
> +    {
> +      unsigned long long value = *(const unsigned long long *) &cie->version;

TBAA? Maybe use

  unsigned long long value;
  memcpy (&value, &cie->version, 8);

instead?

> +
> +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> +#define C(x) __builtin_bswap64 (x)
> +#else
> +#define C(x) x
> +#endif
> +
> +      /* Fast path for "zR". Check for version 1, the "zR" string and that
> +         the sleb128/uleb128 values are single bytes. In the comments
> +         below, '1', 'c', 'd', 'r', 'l' are version, code alignment, data
> +         alignment, return address column, augmentation length. Note that
> +         with CIE version 1, the return address column is byte-encoded. */
> +      unsigned long long expected =
> +        /* 1 z R 0 c d r l. */
> +        C (0x017a520000000000ULL);
> +      unsigned long long mask =
> +        /* 1 z R 0 c d r l. */
> +        C (0xffffffff80800080ULL);
> +
> +      if ((value & mask) == expected)
> +        return cie->augmentation[7];
> +
> +      /* Fast path for "zPLR". */
> +      expected =
> +        /* 1 z P L R 0 c d. */
> +        C (0x017a504c52000000ULL);
> +      mask =
> +        /* 1 z P L R 0 c d. */
> +        C (0xffffffffffff8080ULL);
> +#undef C
> +
> +      /* Validate the augmentation length, and return the enconding after
> +         it. No check for the return address column because it is
> +         byte-encoded with CIE version 1. */
> +      if (__builtin_expect ((value & mask) == expected
> +                            && (cie->augmentation[8] & 0x80) == 0, 1))
> +        return cie->augmentation[9];
> +    }
> +#endif
> +
> +  return get_cie_encoding_slow (cie);
> +}
> +
>  static inline int
>  get_fde_encoding (const struct dwarf_fde *f)
>  {
>
> base-commit: de84a1e4b107b803ec3b064c3771a6ed8c0e201e
>
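
For illustration, here is a minimal standalone sketch of the memcpy-based load combined with the "zR" mask check from the patch. It is not the libgcc code: load_u64, from_be_constant and zr_fast_path are hypothetical names, and the function works on a plain byte buffer that starts at the CIE version field rather than on struct dwarf_cie. It assumes GCC or a compatible compiler (for __BYTE_ORDER__ and __builtin_bswap64) and a host that is either little- or big-endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load 8 bytes through memcpy instead of a pointer cast, so the access
   does not depend on the declared type of the underlying object.  */
static uint64_t
load_u64 (const unsigned char *p)
{
  uint64_t value;
  assert (p != NULL);
  memcpy (&value, p, sizeof value);
  return value;
}

/* Hypothetical helper: X is written with the first byte in memory as the
   most significant byte; convert it to the host's memory order.
   Assumes the host is little- or big-endian.  */
static uint64_t
from_be_constant (uint64_t x)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return __builtin_bswap64 (x);
#else
  return x;
#endif
}

/* BYTES points at the CIE version field and must have at least 9
   readable bytes.  Return the encoding byte if the CIE has version 1,
   augmentation "zR", and single-byte code alignment, data alignment and
   augmentation length; return -1 otherwise (the caller would then take
   the slow path).  */
static int
zr_fast_path (const unsigned char *bytes)
{
  uint64_t value = load_u64 (bytes);
  /*                                     1 z R 0 c d r l  */
  uint64_t expected = from_be_constant (0x017a520000000000ULL);
  uint64_t mask = from_be_constant (0xffffffff80800080ULL);

  if ((value & mask) == expected)
    return bytes[8];
  return -1;
}
```

The mask keeps the version and augmentation bytes whole, keeps only the LEB128 continuation bit (0x80) of the code alignment, data alignment and augmentation length, and ignores the return address column, so a single comparison covers all of those checks. bytes[8] here corresponds to cie->augmentation[7] in the patch, since the augmentation string starts one byte after the version.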