From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 52436 invoked by alias); 5 Oct 2017 19:40:29 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 50842 invoked by uid 89); 5 Oct 2017 19:40:29 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-24.8 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_SPAM,SPF_PASS autolearn=ham version=3.3.2 spammy=REF, inclusive, Dec, LIVE X-HELO: mail-ua0-f177.google.com Received: from mail-ua0-f177.google.com (HELO mail-ua0-f177.google.com) (209.85.217.177) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 05 Oct 2017 19:40:26 +0000 Received: by mail-ua0-f177.google.com with SMTP id b11so9832825uae.12 for ; Thu, 05 Oct 2017 12:40:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=37wXn8GcgxKIa1huGcS5I5O1cADRHWsiADlY2uMVKl8=; b=qy6EjZNMHmVEvJAUct3TubQ3Ms/NNSBelyB7a6N5tbIrDTC/im/OPhsPA4UhtNxA3H cN47eLZk3bd2fgQgvNamVDePEVVkEjszziR2N6dw8+knU6UjShAN6zSdxCHZmtvWRXkN w9DIlbee3uznw0Dr7sKfWvkzkfAjOYYD4v5F1otw6rvI9iH65DJEcdS03rYLLsy0L7xV X9hrwFq7kqTuy+6/9gkUN5DzkKYzDPsvudsFlgKIRNgI+ws5+Siu0zW2TsKTR9u24QDq sF/9CH+5xFLQVnpTQ0GyOsLaW2sbmnhRpIZgG28toIHFxuuP4cV8CZrZuiv4oHoNPHjh xktA== X-Gm-Message-State: AHPjjUhkHTysaY+x3QyEAconEob+4M4d+hCdbuRldgp/OtcDXuHchEjp EkNR39/KldpwEQbDSJx7dXNWxI2OvZaZuG+nDE4h/Q== X-Google-Smtp-Source: AOwi7QB6zzLWhZn2ds0zhpmduHk1sxJgUE1ytHztaTrzED/n59IMS9gS/qGaFEIeYcEyKuP39i8WyZDE6A5RG24tPIU= X-Received: by 10.176.94.175 with SMTP id y47mr14839877uag.95.1507232423983; Thu, 05 Oct 2017 12:40:23 -0700 (PDT) MIME-Version: 1.0 Received: by 10.103.120.3 with HTTP; Thu, 5 Oct 2017 12:40:23 -0700 (PDT) In-Reply-To: <55bc5137-7a62-2e89-d678-addfe8e66079@redhat.com> References: <55bc5137-7a62-2e89-d678-addfe8e66079@redhat.com> From: Christophe Lyon Date: Thu, 05 Oct 2017 19:40:00 -0000 Message-ID: Subject: Re: [RFA] [PATCH 4/4] Ignore reads of "dead" memory locations in DSE To: Jeff Law Cc: Richard Biener , gcc-patches Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2017-10/txt/msg00292.txt.bz2 Hi Jeff, On 7 September 2017 at 00:18, Jeff Law wrote: > Another old patch getting resurrected... > > This patch (r253305) introduces a new FAIL on arm-none-eabi (as opposed arm-linux-gnueabi*): FAIL: gcc.dg/tree-ssa/ssa-dse-26.c scan-tree-dump-times dse1 "Deleted dead store" 2 I'm not familiar with all this, but I quickly looked at the testcase, and noticed it uses enums. One ABI difference between arm-non-eabi and arm-linux-gnueabi is that the bare-metal target defaults to short-enums, while the linux one uses no-short-enums. Could that be the problem? Thanks, Christophe > On 01/04/2017 06:50 AM, Richard Biener wrote: >> On Thu, Dec 22, 2016 at 7:26 AM, Jeff Law wrote: >>> This is the final patch in the kit to improve our DSE implementation. >>> >>> It's based on a observation by Richi. Namely that a read from bytes of >>> memory that are dead can be ignored. By ignoring such reads we can >>> sometimes find additional stores that allow us to either eliminate or trim >>> an earlier store more aggressively. >>> >>> This only hit (by hit I mean the ability to ignore resulted in finding a >>> full or partially dead store that we didn't otherwise find) once during a >>> bootstrap, but does hit often in the libstdc++ testsuite. I've added a test >>> derived from the conversation between myself and Richi last year. >>> >>> There's nothing in the BZ database on this issue and I can't reasonably call >>> it a bugfix. I wouldn't lose sleep if this deferred to gcc-8. >>> >>> Bootstrapped and regression tested on x86-64-linux-gnu. OK for the trunk or >>> defer to gcc-8? >>> >>> >>> >>> * tree-ssa-dse.c (live_bytes_read): New function. >>> (dse_classify_store): Ignore reads of dead bytes. >>> >>> * testsuite/gcc.dg/tree-ssa/ssa-dse-26.c: New test. >>> * testsuite/gcc.dg/tree-ssa/ssa-dse-26.c: Likewise. >>> >>> >>> > [ snip ] > >>> diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c >>> index a807d6d..f5b53fc 100644 >>> --- a/gcc/tree-ssa-dse.c >>> +++ b/gcc/tree-ssa-dse.c >>> @@ -475,6 +475,41 @@ maybe_trim_partially_dead_store (ao_ref *ref, sbitmap >>> live, gimple *stmt) >>> } >>> } >>> >>> +/* Return TRUE if USE_REF reads bytes from LIVE where live is >>> + derived from REF, a write reference. >>> + >>> + While this routine may modify USE_REF, it's passed by value, not >>> + location. So callers do not see those modifications. */ >>> + >>> +static bool >>> +live_bytes_read (ao_ref use_ref, ao_ref *ref, sbitmap live) >>> +{ >>> + /* We have already verified that USE_REF and REF hit the same object. >>> + Now verify that there's actually an overlap between USE_REF and REF. >>> */ >>> + if ((use_ref.offset < ref->offset >>> + && use_ref.offset + use_ref.size > ref->offset) >>> + || (use_ref.offset >= ref->offset >>> + && use_ref.offset < ref->offset + ref->size)) >> >> can you use ranges_overlap_p? (tree-ssa-alias.h) > Yes. Didn't know about it. Done. > >> >>> + { >>> + normalize_ref (&use_ref, ref); >>> + >>> + /* If USE_REF covers all of REF, then it will hit one or more >>> + live bytes. This avoids useless iteration over the bitmap >>> + below. */ >>> + if (use_ref.offset == ref->offset && use_ref.size == ref->size) >>> + return true; >>> + >>> + /* Now iterate over what's left in USE_REF and see if any of >>> + those bits are i LIVE. */ >>> + for (int i = (use_ref.offset - ref->offset) / BITS_PER_UNIT; >>> + i < (use_ref.offset + use_ref.size) / BITS_PER_UNIT; i++) >>> + if (bitmap_bit_p (live, i)) >> >> a bitmap_bit_in_range_p () would be nice to have. And it can be more >> efficient than this loop... > Yea. That likely would help here. I'm testing with a > bitmap_bit_in_range_p implementation (only for sbitmaps since that's > what we're using here). > > That implementation does the reasonably efficient things and is modeled > after the sbitmap implementation of bitmap_set_range. > > >>> @@ -554,6 +589,41 @@ dse_classify_store (ao_ref *ref, gimple *stmt, gimple >>> **use_stmt, >>> /* If the statement is a use the store is not dead. */ >>> else if (ref_maybe_used_by_stmt_p (use_stmt, ref)) >>> { >>> + /* Handle common cases where we can easily build a ao_ref >>> + structure for USE_STMT and in doing so we find that the >>> + references hit non-live bytes and thus can be ignored. */ >>> + if (live_bytes) >>> + { >>> + if (is_gimple_assign (use_stmt)) >>> + { >>> + /* Other cases were noted as non-aliasing by >>> + the call to ref_maybe_used_by_stmt_p. */ >>> + ao_ref use_ref; >>> + ao_ref_init (&use_ref, gimple_assign_rhs1 (use_stmt)); >>> + if (valid_ao_ref_for_dse (&use_ref) >>> + && use_ref.base == ref->base >>> + && use_ref.size == use_ref.max_size >>> + && !live_bytes_read (use_ref, ref, live_bytes)) >>> + { >>> + if (gimple_vdef (use_stmt)) >>> + { >>> + /* If we have already seen a store and >>> + this is also a store, then we have to >>> + fail. */ >>> + if (temp) >>> + { >>> + fail = true; >>> + BREAK_FROM_IMM_USE_STMT (ui); >>> + } >> >> as this case is rather cheap to test please test it together with >> live_bytes. Like >> >> if (live_bytes && (! gimple_vdef (use_stmt) || ! temp)) > Seems reasonable. > > >> >> otherwise the patch looks reasonable for GCC 8. > Given the sbitmap.[ch] change, posting for re-approval > > Bootstrapped and regression tested on x86_64. Earlier version without > the bitmap_bit_in_range_p improvement was also bootstrapped and > regression tested on aarch64. > > Jeff > > > > * sbitmap.c (bitmap_bit_in_range_p): New function. > * sbitmap.h (bitmap_bit_in_range_p): Prototype. > * tree-ssa-dse.c (live_bytes_read): New function. > (dse_classify_store): Ignore reads of dead bytes. > > * testsuite/gcc.dg/tree-ssa/ssa-dse-26.c: New test. > > diff --git a/gcc/sbitmap.c b/gcc/sbitmap.c > index c065d13..4bf13a1 100644 > --- a/gcc/sbitmap.c > +++ b/gcc/sbitmap.c > @@ -316,6 +316,59 @@ bitmap_set_range (sbitmap bmap, unsigned int start, unsigned int count) > bmap->elms[start_word] |= mask; > } > > +/* Return TRUE if any bit between START and END inclusive is set within > + the simple bitmap BMAP. Return FALSE otherwise. */ > + > +bool > +bitmap_bit_in_range_p (const_sbitmap bmap, unsigned int start, unsigned int end) > +{ > + unsigned int start_word = start / SBITMAP_ELT_BITS; > + unsigned int start_bitno = start % SBITMAP_ELT_BITS; > + > + /* Testing within a word, starting at the beginning of a word. */ > + if (start_bitno == 0 && (end - start) < SBITMAP_ELT_BITS) > + { > + SBITMAP_ELT_TYPE mask = ((SBITMAP_ELT_TYPE)1 << (end - start)) - 1; > + return (bmap->elms[start_word] & mask) != 0; > + } > + > + unsigned int end_word = end / SBITMAP_ELT_BITS; > + unsigned int end_bitno = end % SBITMAP_ELT_BITS; > + > + /* Testing starts somewhere in the middle of a word. Test up to the > + end of the word or the end of the requested region, whichever comes > + first. */ > + if (start_bitno != 0) > + { > + unsigned int nbits = ((start_word == end_word) > + ? end_bitno - start_bitno > + : SBITMAP_ELT_BITS - start_bitno); > + SBITMAP_ELT_TYPE mask = ((SBITMAP_ELT_TYPE)1 << nbits) - 1; > + mask <<= start_bitno; > + if (bmap->elms[start_word] & mask) > + return true; > + start_word++; > + } > + > + if (start_word > end_word) > + return false; > + > + /* Now test words at a time until we hit a partial word. */ > + unsigned int nwords = (end_word - start_word); > + while (nwords) > + { > + if (bmap->elms[start_word]) > + return true; > + start_word++; > + nwords--; > + } > + > + /* Now handle residuals in the last word. */ > + SBITMAP_ELT_TYPE mask > + = ((SBITMAP_ELT_TYPE)1 << (SBITMAP_ELT_BITS - end_bitno)) - 1; > + return (bmap->elms[start_word] & mask) != 0; > +} > + > #if GCC_VERSION < 3400 > /* Table of number of set bits in a character, indexed by value of char. */ > static const unsigned char popcount_table[] = > diff --git a/gcc/sbitmap.h b/gcc/sbitmap.h > index ce4d27d..ff52e93 100644 > --- a/gcc/sbitmap.h > +++ b/gcc/sbitmap.h > @@ -51,6 +51,7 @@ along with GCC; see the file COPYING3. If not see > * set_difference : bitmap_and_compl > * set_disjuction : (not implemented) > * set_compare : bitmap_equal_p > + * bit_in_range_p : bitmap_bit_in_range_p > > Some operations on 3 sets that occur frequently in data flow problems > are also implemented: > @@ -253,6 +254,7 @@ extern bool bitmap_and (sbitmap, const_sbitmap, const_sbitmap); > extern bool bitmap_ior (sbitmap, const_sbitmap, const_sbitmap); > extern bool bitmap_xor (sbitmap, const_sbitmap, const_sbitmap); > extern bool bitmap_subset_p (const_sbitmap, const_sbitmap); > +extern bool bitmap_bit_in_range_p (const_sbitmap, unsigned int, unsigned int); > > extern int bitmap_first_set_bit (const_sbitmap); > extern int bitmap_last_set_bit (const_sbitmap); > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c > new file mode 100644 > index 0000000..6605dfe > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c > @@ -0,0 +1,33 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-dse1-details" } */ > + > +enum constraint_expr_type > +{ > + SCALAR, DEREF, ADDRESSOF > +}; > +typedef struct constraint_expr > +{ > + enum constraint_expr_type type; > + unsigned int var; > + long offset; > +} constraint_expr ; > +typedef struct constraint > +{ > + struct constraint_expr lhs; > + struct constraint_expr rhs; > +} constraint; > +static _Bool > +constraint_expr_equal (struct constraint_expr x, struct constraint_expr y) > +{ > + return x.type == y.type && x.var == y.var && x.offset == y.offset; > +} > + > +_Bool > +constraint_equal (struct constraint a, struct constraint b) > +{ > + return constraint_expr_equal (a.lhs, b.lhs) > + && constraint_expr_equal (a.rhs, b.rhs); > +} > + > +/* { dg-final { scan-tree-dump-times "Deleted dead store" 2 "dse1" } } */ > + > diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c > index 70c8b07..1eca740 100644 > --- a/gcc/tree-ssa-dse.c > +++ b/gcc/tree-ssa-dse.c > @@ -468,6 +468,36 @@ maybe_trim_partially_dead_store (ao_ref *ref, sbitmap live, gimple *stmt) > } > } > > +/* Return TRUE if USE_REF reads bytes from LIVE where live is > + derived from REF, a write reference. > + > + While this routine may modify USE_REF, it's passed by value, not > + location. So callers do not see those modifications. */ > + > +static bool > +live_bytes_read (ao_ref use_ref, ao_ref *ref, sbitmap live) > +{ > + /* We have already verified that USE_REF and REF hit the same object. > + Now verify that there's actually an overlap between USE_REF and REF. */ > + if (ranges_overlap_p (use_ref.offset, use_ref.size, ref->offset, ref->size)) > + { > + normalize_ref (&use_ref, ref); > + > + /* If USE_REF covers all of REF, then it will hit one or more > + live bytes. This avoids useless iteration over the bitmap > + below. */ > + if (use_ref.offset <= ref->offset > + && use_ref.offset + use_ref.size >= ref->offset + ref->size) > + return true; > + > + /* Now check if any of the remaining bits in use_ref are set in LIVE. */ > + unsigned int start = (use_ref.offset - ref->offset) / BITS_PER_UNIT; > + unsigned int end = (use_ref.offset + use_ref.size) / BITS_PER_UNIT; > + return bitmap_bit_in_range_p (live, start, end); > + } > + return true; > +} > + > /* A helper of dse_optimize_stmt. > Given a GIMPLE_ASSIGN in STMT that writes to REF, find a candidate > statement *USE_STMT that may prove STMT to be dead. > @@ -547,6 +577,31 @@ dse_classify_store (ao_ref *ref, gimple *stmt, gimple **use_stmt, > /* If the statement is a use the store is not dead. */ > else if (ref_maybe_used_by_stmt_p (use_stmt, ref)) > { > + /* Handle common cases where we can easily build a ao_ref > + structure for USE_STMT and in doing so we find that the > + references hit non-live bytes and thus can be ignored. */ > + if (live_bytes && (!gimple_vdef (use_stmt) || !temp)) > + { > + if (is_gimple_assign (use_stmt)) > + { > + /* Other cases were noted as non-aliasing by > + the call to ref_maybe_used_by_stmt_p. */ > + ao_ref use_ref; > + ao_ref_init (&use_ref, gimple_assign_rhs1 (use_stmt)); > + if (valid_ao_ref_for_dse (&use_ref) > + && use_ref.base == ref->base > + && use_ref.size == use_ref.max_size > + && !live_bytes_read (use_ref, ref, live_bytes)) > + { > + /* If this statement has a VDEF, then it is the > + first store we have seen, so walk through it. */ > + if (gimple_vdef (use_stmt)) > + temp = use_stmt; > + continue; > + } > + } > + } > + > fail = true; > BREAK_FROM_IMM_USE_STMT (ui); > } >