From: Richard Biener
Date: Wed, 29 Jul 2020 09:11:12 +0200
Subject: Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount
To: Stefan Schulze Frielinghaus
Cc: Richard Biener via Gcc-patches, Richard Sandiford
List-Id: Gcc-patches mailing list
References: <20200722151450.1540130-1-stefansf@linux.ibm.com>
 <20200727142019.GA901349@localhost.localdomain>
 <20200728153629.GA5921@localhost.localdomain>
In-Reply-To: <20200728153629.GA5921@localhost.localdomain>

On Tue, Jul 28, 2020 at 5:36 PM Stefan Schulze Frielinghaus wrote:
>
> On Tue, Jul 28, 2020 at 08:55:57AM +0200, Richard Biener wrote:
> > On Mon, Jul 27, 2020 at 4:20 PM Stefan Schulze Frielinghaus
> > wrote:
> > >
> > > On Mon, Jul 27, 2020 at 12:29:11PM +0200, Richard Biener wrote:
> > > > On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
> > > > wrote:
> > > > >
> > > > > Richard Biener writes:
> > > > > > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> > > > > > wrote:
> > > > > >>
> > > > > >> Richard Biener via Gcc-patches writes:
> > > > > >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> > > > > >> > Gcc-patches wrote:
> > > > > >> >>
> > > > > >> >> This is a follow up to commit 5c9669a0e6c respectively discussion
> > > > > >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> > > > > >> >>
> > > > > >> >> In case that an alignment constraint is less than the size of a
> > > > > >> >> corresponding scalar type, ensure that we advance at least by one
> > > > > >> >> iteration. For example, on s390x we have for a long double an alignment
> > > > > >> >> constraint of 8 bytes whereas the size is 16 bytes. Therefore,
> > > > > >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
> > > > > >> >> can be reproduced by the following MWE:
> > > > > >> >
> > > > > >> > But we guard this case with vector_alignment_reachable_p, so we shouldn't
> > > > > >> > have ended up here and the patch looks bogus.
> > > > > >>
> > > > > >> The above sounds like it ought to count as reachable alignment though.
> > > > > >> If a type requires a lower alignment than its size, then that's even
> > > > > >> more easily reachable than a type that requires the same alignment as
> > > > > >> the size. I guess at one extreme, a target alignment of 1 is always
> > > > > >> reachable.
> > > > > >
> > > > > > Well, if the element alignment is 8 but its size is 16 then when presumably
> > > > > > the desired vector alignment is a multiple of 16 we can never reach it.
> > > > > > Isn't this the case here?
> > > > >
> > > > > If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
> > > > > TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> > > > > fixing wouldn't occur. I agree that we might never be able to reach
> > > > > that alignment if the pointer starts out misaligned by 8 bytes.
> > > > >
> > > > > But I think that's why it makes sense for the target to only ask
> > > > > for 8-byte alignment for vectors too, if it can cope with it. 8-byte
> > > > > alignment should always be achievable if the scalars are ABI-aligned.
> > > > > And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> > > > > DR_SIZE would be zero and the loop would never progress, which is the
> > > > > problem that the patch is fixing.
> > > > >
> > > > > It would even make sense for the target to ask for 1-byte alignment,
> > > > > if the target doesn't care about alignment at all.
> > > >
> > > > Hmm, OK. Guess I still think we should detect this somewhere upward
> > > > and avoid this peeling compute at all. Somehow.
> > >
> > > I've been playing around with another solution which works for me by
> > > changing vector_alignment_reachable_p to return also false if the
> > > alignment requirements are already satisfied, i.e., by adding:
> > >
> > >   if (known_alignment_for_access_p (dr_info) && aligned_access_p (dr_info))
> > >     return false;
> >
> > That sounds wrong, instead ...
>
> Can you elaborate on that? A similar test exists for predicate
> vector_alignment_reachable_p where the second conjunct is the same but
> negated in order to test for the case where a misalignment is known:
> https://gcc.gnu.org/git?p=gcc.git;a=blob;f=gcc/tree-vect-data-refs.c;h=e35a215e042478d11d6545f1f829d816d0c3620f;hb=refs/heads/master#l1263
> Therefore, I'm wondering why the non-negated case should be wrong.
>
> > > Though, I'm not entirely sure whether this makes it better or not.
> > > Strictly speaking if the alignment was reachable before peeling, then
> > > reaching alignment with peeling is also possible but probably not what
> > > was intended. So I guess returning false in this case is sensible. Any
> > > comments?
> >
> > ... why is the DR considered for peeling at all? If it is already
> > aligned there's no point to do that.
>
> Isn't the whole point of vector_alignment_reachable_p to check DRs in
> order to decide whether peeling should be done or not? At least this is
> my intuition and the reason why I was suggesting to return false in case
> it is aligned.

Doh, you are right - I mistook the function for a mere wrapper
around the VECTOR_ALIGNMENT_REACHABLE target hook. But yes, it's
exactly what you say.

But with your suggested extra check, wouldn't the code at the point of
the call simply disable peeling? The code looks odd anyway - it does

  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      ...
      do_peeling = vector_alignment_reachable_p (dr_info);
      if (do_peeling)
        {
          /* ... insert into peeling hash for costing - this also
             inserts already aligned accesses, which may become
             unaligned with peeling.  */
        }
      else
        {
          if (!aligned_access_p (dr_info))
            {
              if (dump_enabled_p ())
                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                                 "vector alignment may not be reachable\n");
              break;
            }
        }
    }

So in your case, when do_peeling is false we won't keep it false:
aligned_access_p () holds, so we don't break, and the next DR might
set do_peeling to true again. That simply causes your rejected DR
not to be considered for costing.

So I think the aligned_access_p () handling in the else {} branch is
already broken, and your proposal makes it more likely to be hit. Not
sure if we'd currently survive turning that if (!aligned_access_p ())
into an assert ...

In that light your original patch looks correct.

Thanks,
Richard.

> Cheers,
> Stefan
>
> > If we want to align another DR then the loop you fix
> > should run on that DRs align/size, no?
> >
> > Richard.
> >
> > > Thanks,
> > > Stefan
> > >
> > > >
> > > > Richard.
> > > >
> > > > > Thanks,
> > > > > Richard
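For readers following the arithmetic in this thread, here is a minimal standalone sketch of why TARGET_ALIGN / DR_SIZE == 0 stalls the peeling-amount loop, and what "advance at least by one iteration" does to it. This is not the GCC code: the function name count_peel_candidates, its parameters, and the clamp_step flag are all hypothetical, and the loop shape (stepping a peel amount from 0 up to and including the scalar count) is only an assumption modelled on the description in the patch mail.

```c
/* Hypothetical model, NOT taken from tree-vect-data-refs.c.

   Counts how many candidate peel amounts are visited when stepping
   npeel from 0 up to and including nscalars in increments of
   target_align / dr_size.  Returns -1 when the step is zero, which
   is the situation in which the real loop would never progress.

   Assumes dr_size is nonzero and nscalars is small enough that
   npeel + step does not wrap around.  */
long
count_peel_candidates (unsigned target_align, unsigned dr_size,
                       unsigned nscalars, int clamp_step)
{
  /* Integer division: 8 / 16 == 0 for the s390x long double case.  */
  unsigned step = target_align / dr_size;

  /* The idea of the original patch as described in the thread:
     make sure we advance by at least one iteration.  */
  if (clamp_step && step == 0)
    step = 1;

  /* Without the clamp a zero step means npeel never changes.  */
  if (step == 0)
    return -1;

  long visited = 0;
  for (unsigned npeel = 0; npeel <= nscalars; npeel += step)
    visited++;
  return visited;
}
```

With target_align = 8 and dr_size = 16 the unclamped call reports no progress (-1), while the clamped call walks every peel amount from 0 to nscalars. When the target alignment is a multiple of the scalar size (e.g. 16 and 8) the step is already nonzero and the clamp changes nothing, which matches Richard Sandiford's point above.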