From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 5F3F63857BB4 for ; Fri, 1 Jul 2022 14:13:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5F3F63857BB4 Received: from mail-ot1-f72.google.com (mail-ot1-f72.google.com [209.85.210.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-237-oullZJYcMY-gpG-GOCZcng-1; Fri, 01 Jul 2022 10:13:06 -0400 X-MC-Unique: oullZJYcMY-gpG-GOCZcng-1 Received: by mail-ot1-f72.google.com with SMTP id l9-20020a05683016c900b00616e99c836dso1236451otr.8 for ; Fri, 01 Jul 2022 07:13:06 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=BkmEmsizFLJRXsOx2ypSyQV/M+WdcbPqeTxyG+R0jl0=; b=GYdvcmQWKIWlGtpbDTNXI+mfdJUwd17r+KuCCQb+wkrDyE8D4PEVCw1sIXVtBaWvdU HHXr2K8oni+Lzq97+V/fIv9sTxlQdxyrQhKbc/5X+7t5PAtmCU6yScXKQjSD1OM3QnHZ wJiENlKUadA/ILnMFsWrNKOFxaqXUAE/mWmVHa0easR8D2VdTIY3nUleIMSpINu9q4+M IgAbO/dkE9wQM3TeOwGZvhMLumC8eRVhe+ZC/ABheQ0Bzzt0nCw4XxhrTDcUsr8MhJ13 o1H+ieGNnJT7440yeCzMDbShcEgkx0BOrLIsreQGEotOeRf2pMHGJ5movIWPHYU83w26 Kzsw== X-Gm-Message-State: AJIora925TPyBRM4ZoHk0EQdTsnR2u/C4mah5WJCtVUay9YOrugPkB96 WqG7qaYKEePkxX36kZUxAILeVbfQoAIg+CnM0YXOl73T0B98YoNrXcZwF9RSZB8FA1AYaO+xyq8 RXG9jn0QzIdmKi6kMTjWg4i1pq+uPKNiLkg== X-Received: by 2002:a05:6870:c38e:b0:ed:f230:1494 with SMTP id g14-20020a056870c38e00b000edf2301494mr9753694oao.36.1656684785487; Fri, 01 Jul 2022 07:13:05 -0700 (PDT) X-Google-Smtp-Source: AGRyM1utNwd5py/h1XlGvL8ItGrq5FegMDGQHysf5HWDPXVhBhYZrcnCniyXnL5idvVFXg3mPXbdLjcv7dl5sqI7eqE= X-Received: by 2002:a05:6870:c38e:b0:ed:f230:1494 with SMTP id g14-20020a056870c38e00b000edf2301494mr9753682oao.36.1656684785254; Fri, 01 Jul 2022 07:13:05 -0700 (PDT) MIME-Version: 1.0 References: <20220629092157.230287-1-aldyh@redhat.com> In-Reply-To: <20220629092157.230287-1-aldyh@redhat.com> From: Aldy Hernandez Date: Fri, 1 Jul 2022 16:12:55 +0200 Message-ID: Subject: Re: [RFC] trailing_wide_ints with runtime variable lengths To: GCC patches Cc: Andrew MacLeod , Richard Sandiford , Jakub Jelinek X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, HTML_MESSAGE, KAM_SHORT, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Jul 2022 14:13:13 -0000 FYI...if no one has anything to say, I'd like to formally post this for review. So.... OK for trunk? Aldy On Wed, Jun 29, 2022, 11:22 Aldy Hernandez wrote: > Currently global ranges are stored in SSA_NAME_RANGE_INFO as a pair of > wide_int-like objects along with the nonzero bits. We frequently lose > precision when streaming out our higher resolution iranges. The plan > was always to store the full irange between passes. However, as was > originally discussed eons ago: > > https://gcc.gnu.org/pipermail/gcc-patches/2017-May/475139.html > > ...we need a memory efficient way of saving iranges, preferably using > the trailing_wide_ints idiom. > > The problem with doing so is that trailing_wide_ints assume a > compile-time specified number of elements. For irange, we need to > determine the size at run-time. > > One solution is to adapt trailing_wide_ints such that N is the maximum > number of elements allowed, and allow setting the actual number at > run-time (defaulting to N). The attached patch does this, while > requiring no changes to existing users. > > It uses a byte to store the number of elements in the > trailing_wide_ints control word. The control word is currently a > 16-bit precision, an 8-bit max-length, and the rest is used for > m_len[N]. On a 64-bit architecture, this allows for 5 elements in > m_len without having to use an extra word. With this patch, m_len[] > would be smaller by one byte (4) before consuming the padding. This > shouldn't be a problem as the only users of trailing_wide_ints use N=2 > for NUM_POLY_INT_COEFFS in aarch64, and N=3 for range_info_def. > > For irange, my plan is to use one more word to fit a maximum of 12 > elements (the above 4 plus 8 more). This would allow for 6 pairs of > sub-ranges which would be more than adequate for our needs. In > previous tests we found that 99% of ranges fit within 3-4 pairs. More > precisely, this would allow for 5 pairs, plus the nonzero bits, plus a > spare wide-int for future development. > > Ultimately this means that streaming an irange would consume one more > word than what we currently do for range_info_def. IMO this is a nice > trade-off considering we started storing a slew of wide-ints directly > ;-). > > I'm not above rolling an altogether different approach, but would > prefer to use the existing trailing infrastructure since it's mostly > what we need. > > Thoughts? > > p.s. Tested and benchmarked on x86-64 Linux. There was no discernible > performance change in our benchmark suite. > > gcc/ChangeLog: > > * wide-int.h (struct trailing_wide_ints): Add m_num_elements. > (trailing_wide_ints::set_precision): Add num_elements argument. > (trailing_wide_ints::extra_size): Same. > --- > gcc/wide-int.h | 42 +++++++++++++++++++++++++++++------------- > 1 file changed, 29 insertions(+), 13 deletions(-) > > diff --git a/gcc/wide-int.h b/gcc/wide-int.h > index 8041b6104f9..f68ccf0a0c5 100644 > --- a/gcc/wide-int.h > +++ b/gcc/wide-int.h > @@ -1373,10 +1373,13 @@ namespace wi > : public int_traits {}; > } > > -/* An array of N wide_int-like objects that can be put at the end of > - a variable-sized structure. Use extra_size to calculate how many > - bytes beyond the sizeof need to be allocated. Use set_precision > - to initialize the structure. */ > +/* A variable-lengthed array of wide_int-like objects that can be put > + at the end of a variable-sized structure. The number of objects is > + at most N and can be set at runtime by using set_precision(). > + > + Use extra_size to calculate how many bytes beyond the > + sizeof need to be allocated. Use set_precision to initialize the > + structure. */ > template > struct GTY((user)) trailing_wide_ints > { > @@ -1387,6 +1390,9 @@ private: > /* The shared maximum length of each number. */ > unsigned char m_max_len; > > + /* The number of elements. */ > + unsigned char m_num_elements; > + > /* The current length of each number. > Avoid char array so the whole structure is not a typeless storage > that will, in turn, turn off TBAA on gimple, trees and RTL. */ > @@ -1399,12 +1405,15 @@ private: > public: > typedef WIDE_INT_REF_FOR (trailing_wide_int_storage) const_reference; > > - void set_precision (unsigned int); > + void set_precision (unsigned int precision, unsigned int num_elements = > N); > unsigned int get_precision () const { return m_precision; } > + unsigned int num_elements () const { return m_num_elements; } > trailing_wide_int operator [] (unsigned int); > const_reference operator [] (unsigned int) const; > - static size_t extra_size (unsigned int); > - size_t extra_size () const { return extra_size (m_precision); } > + static size_t extra_size (unsigned int precision, > + unsigned int num_elements = N); > + size_t extra_size () const { return extra_size (m_precision, > + m_num_elements); } > }; > > inline trailing_wide_int_storage:: > @@ -1457,11 +1466,14 @@ trailing_wide_int_storage::operator = (const T &x) > } > > /* Initialize the structure and record that all elements have precision > - PRECISION. */ > + PRECISION. NUM_ELEMENTS can be no more than N. */ > template > inline void > -trailing_wide_ints ::set_precision (unsigned int precision) > +trailing_wide_ints ::set_precision (unsigned int precision, > + unsigned int num_elements) > { > + gcc_checking_assert (num_elements <= N); > + m_num_elements = num_elements; > m_precision = precision; > m_max_len = ((precision + HOST_BITS_PER_WIDE_INT - 1) > / HOST_BITS_PER_WIDE_INT); > @@ -1484,15 +1496,19 @@ trailing_wide_ints ::operator [] (unsigned int > index) const > m_len[index].len, m_precision); > } > > -/* Return how many extra bytes need to be added to the end of the > structure > - in order to handle N wide_ints of precision PRECISION. */ > +/* Return how many extra bytes need to be added to the end of the > + structure in order to handle N wide_ints of precision PRECISION. > + NUM_ELEMENTS is the number of elements, or N if none specified. */ > template > inline size_t > -trailing_wide_ints ::extra_size (unsigned int precision) > +trailing_wide_ints ::extra_size (unsigned int precision, > + unsigned int num_elements) > { > unsigned int max_len = ((precision + HOST_BITS_PER_WIDE_INT - 1) > / HOST_BITS_PER_WIDE_INT); > - return (N * max_len - 1) * sizeof (HOST_WIDE_INT); > + if (num_elements > N) > + num_elements = N; > + return (num_elements * max_len - 1) * sizeof (HOST_WIDE_INT); > } > > /* This macro is used in structures that end with a trailing_wide_ints > field > -- > 2.36.1 > >