Date: Wed, 29 Jul 2020 09:49:22 +0200
From: Stefan Schulze Frielinghaus
To: Richard Biener
Cc: Richard Biener via Gcc-patches, Richard Sandiford
Subject: Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount
Message-ID: <20200729074922.GA43314@localhost.localdomain>
References: <20200722151450.1540130-1-stefansf@linux.ibm.com> <20200727142019.GA901349@localhost.localdomain> <20200728153629.GA5921@localhost.localdomain>

On Wed, Jul 29, 2020 at 09:11:12AM +0200, Richard Biener wrote:
> On Tue, Jul 28, 2020 at 5:36 PM Stefan Schulze Frielinghaus wrote:
> >
> > On Tue, Jul 28, 2020 at 08:55:57AM +0200, Richard Biener wrote:
> > > On Mon, Jul 27, 2020 at 4:20 PM Stefan Schulze Frielinghaus wrote:
> > > >
> > > > On Mon, Jul 27, 2020 at 12:29:11PM +0200, Richard Biener wrote:
> > > > > On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford wrote:
> > > > > >
> > > > > > Richard Biener writes:
> > > > > > > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford wrote:
> > > > > > >>
> > > > > > >> Richard Biener via Gcc-patches writes:
> > > > > > >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> > > > > > >> > Gcc-patches wrote:
> > > > > > >> >>
> > > > > > >> >> This is a follow up to commit 5c9669a0e6c respectively discussion
> > > > > > >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> > > > > > >> >>
> > > > > > >> >> In case that an alignment constraint is less than the size of a
> > > > > > >> >> corresponding scalar type, ensure that we advance at least by one
> > > > > > >> >> iteration. For example, on s390x we have for a long double an alignment
> > > > > > >> >> constraint of 8 bytes whereas the size is 16 bytes. Therefore,
> > > > > > >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
> > > > > > >> >> can be reproduced by the following MWE:
> > > > > > >> >
> > > > > > >> > But we guard this case with vector_alignment_reachable_p, so we shouldn't
> > > > > > >> > have ended up here and the patch looks bogus.
> > > > > > >>
> > > > > > >> The above sounds like it ought to count as reachable alignment though.
> > > > > > >> If a type requires a lower alignment than its size, then that's even
> > > > > > >> more easily reachable than a type that requires the same alignment as
> > > > > > >> the size. I guess at one extreme, a target alignment of 1 is always
> > > > > > >> reachable.
> > > > > > >
> > > > > > > Well, if the element alignment is 8 but its size is 16 then when presumably
> > > > > > > the desired vector alignment is a multiple of 16 we can never reach it.
> > > > > > > Isn't this the case here?
> > > > > >
> > > > > > If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
> > > > > > TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> > > > > > fixing wouldn't occur. I agree that we might never be able to reach
> > > > > > that alignment if the pointer starts out misaligned by 8 bytes.
> > > > > >
> > > > > > But I think that's why it makes sense for the target to only ask
> > > > > > for 8-byte alignment for vectors too, if it can cope with it. 8-byte
> > > > > > alignment should always be achievable if the scalars are ABI-aligned.
> > > > > > And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> > > > > > DR_SIZE would be zero and the loop would never progress, which is the
> > > > > > problem that the patch is fixing.
> > > > > >
> > > > > > It would even make sense for the target to ask for 1-byte alignment,
> > > > > > if the target doesn't care about alignment at all.
> > > > >
> > > > > Hmm, OK. Guess I still think we should detect this somewhere upward
> > > > > and avoid this peeling compute at all. Somehow.
> > > >
> > > > I've been playing around with another solution which works for me by
> > > > changing vector_alignment_reachable_p to return also false if the
> > > > alignment requirements are already satisfied, i.e., by adding:
> > > >
> > > >   if (known_alignment_for_access_p (dr_info) && aligned_access_p (dr_info))
> > > >     return false;
> > >
> > > That sounds wrong, instead ...
> >
> > Can you elaborate on that? A similar test exists for predicate
> > vector_alignment_reachable_p where the second conjunct is the same but
> > negated in order to test for the case where a misalignment is known:
> > https://gcc.gnu.org/git?p=gcc.git;a=blob;f=gcc/tree-vect-data-refs.c;h=e35a215e042478d11d6545f1f829d816d0c3620f;hb=refs/heads/master#l1263
> > Therefore, I'm wondering why the non-negated case should be wrong.
> >
> > > > Though, I'm not entirely sure whether this makes it better or not.
> > > > Strictly speaking if the alignment was reachable before peeling, then
> > > > reaching alignment with peeling is also possible but probably not what
> > > > was intended. So I guess returning false in this case is sensible. Any
> > > > comments?
> > >
> > > ... why is the DR considered for peeling at all? If it is already
> > > aligned there's no point to do that.
> >
> > Isn't the whole point of vector_alignment_reachable_p to check DRs in
> > order to decide whether peeling should be done or not? At least this is
> > my intuition and the reason why I was suggesting to return false in case
> > it is aligned.
>
> Doh, you are right - I confused the function to be a mere wrapper
> around the VECTOR_ALIGNMENT_REACHABLE target hook. But
> yes, it's exactly what you say. But with your suggested extra check
> the code at the point of the call would simply disable peeling? The
> code looks odd anyway - it does
>
>   FOR_EACH_VEC_ELT (datarefs, i, dr)
>     {
>       ...
>       do_peeling = vector_alignment_reachable_p (dr_info);
>       if (do_peeling)
>         {
>           ... insert into peeling hash for costing - also inserts already aligned
>           accesses which may get unaligned with peeling
>         }
>       else
>         {
>           if (!aligned_access_p (dr_info))
>             {
>               if (dump_enabled_p ())
>                 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>                                  "vector alignment may not be reachable\n");
>               break;
>             }
>         }
>     }
>
> so in your case when do_peeling is false we'll not keep it false because
> aligned_access_p () and then the next DR might make do_peeling true
> again which will simply cause your rejected DR to be not considered for
> costing. So I think in the else {} case the aligned_access_p () case
> is broken already and your proposal makes it more likely to hit. Not
> sure if we'd currently survive turning that if (!aligned_access_p ())
> into an assert ...
>
> In that light your original patch looks correct.

Whoopsy, yes, I forgot to consider a rejected DR for costing in my
second try. The longer I stare at the code the more I tend to the
original patch. Thus if no one objects I would like to commit the
original patch.

Thanks for taking a close look at it!

Cheers,
Stefan

>
> Thanks,
> Richard.
>
> > Cheers,
> > Stefan
> >
> > > If we want to align another DR then the loop you fix
> > > should run on that DRs align/size, no?
> > >
> > > Richard.
> > >
> > > > Thanks,
> > > > Stefan
> > > >
> > > > >
> > > > > Richard.
> > > > >
> > > > > > Thanks,
> > > > > > Richard
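
For readers following the thread, the arithmetic behind the hang can be
sketched in a few lines of C. This is only a simplified model under stated
assumptions, not the GCC code or the patch itself: the helper names
peel_step and count_peel_candidates, the plain unsigned types, and the
inclusive upper bound are all made up for illustration. It shows that
TARGET_ALIGN / DR_SIZE is an integer division which yields zero when the
alignment (8) is smaller than the element size (16), and that advancing by
at least one, as the patch description puts it, lets the loop terminate.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: step between candidate peeling
   amounts, clamped so that it is never zero.  */
unsigned
peel_step (unsigned target_align, unsigned dr_size)
{
  assert (dr_size > 0);
  /* Integer division: 8 / 16 == 0 in the s390x long double case.  */
  unsigned step = target_align / dr_size;
  return step == 0 ? 1 : step;
}

/* Hypothetical model of the candidate loop: counts the peeling amounts
   visited from first_npeel up to and including max_npeel.  With a step
   of zero this loop would never terminate.  */
unsigned
count_peel_candidates (unsigned target_align, unsigned dr_size,
                       unsigned first_npeel, unsigned max_npeel)
{
  unsigned step = peel_step (target_align, dr_size);
  unsigned count = 0;
  for (unsigned npeel = first_npeel; npeel <= max_npeel; npeel += step)
    count++;
  return count;
}
```

With an alignment of 8 and a size of 16 the clamped step is 1, so
count_peel_candidates (8, 16, 0, 4) visits the amounts 0 through 4. Without
the clamp the same call would add zero to npeel forever.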