Date: Fri, 20 Jan 2023 18:39:04 +0100
From: Jakub Jelinek
To: Tobias Burnus
Cc: gcc-patches, fortran
Subject: Re: [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
In-Reply-To: <18c3aed8-71dd-9b7f-6c7c-da529876d3f5@codesourcery.com>

On Thu, Jan 19, 2023 at 03:40:19PM +0100, Tobias Burnus wrote:
> +  gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
> +
> +  gfc_se se;
> +  tree tree_var, a1, a2;
> +  a1 = integer_one_node;
> +  a2 = integer_zero_node;
> +
> +  gfc_init_se (&se, NULL);
> +  gfc_conv_expr_lhs (&se, code->ext.iterator->var);
> +  gfc_add_block_to_block (pblock, &se.pre);
> +  tree_var = se.expr;
> +
> +  {
> +    /* FIXME: Handle non-unity iterations, cf. PR fortran/107424.

I think instead of non-unity etc. it is better to talk about constant
step 1 or -1.

> +       The issue is that for those a 'count' variable is used.  */
> +    dovar_init *di;
> +    unsigned ix;
> +    tree t = tree_var;
> +    while (TREE_CODE (t) == INDIRECT_REF)
> +      t = TREE_OPERAND (t, 0);
> +    FOR_EACH_VEC_ELT (*inits, ix, di)
> +      {
> +        tree t2 = di->var;
> +        while (TREE_CODE (t2) == INDIRECT_REF)
> +          t2 = TREE_OPERAND (t2, 0);

The actual problem with non-simple loops in non-rectangular loop nests
shows up in two cases: when it is an inner loop which uses some outer
loop's iterator, and when it is the outer loop whose iterator is used.
Neither case will be handled properly.  The former because, instead of
having lb and ub expressions in the canonicalized form
var-outer * m + a, lb will be 0 (that is fine) and ub will be
  (var-outer * m2 + a2 + step - var-outer * m1 - a1) / step
or so (sure, we can simplify that to
  (var-outer * (m2 - m1) + (a2 + step - a1)) / step
), but the division remains.  And the latter case is bad because we
need var-outer, but we actually compute some artificial count iterator
and var-outer is only initialized in the body of the loop.
This sorry_at seems to handle just one of those, when the outer
loop whose var-outer is referenced is not simple, no?

I wonder if it wouldn't be cleaner and easier to simply remember for
each loop in an XALLOCAVEC array whether it was simple or not and why
(from the:
      if (VAR_P (dovar))
        {
          if (integer_onep (step))
            simple = 1;
          else if (tree_int_cst_equal (step, integer_minus_one_node))
            simple = -1;
        }
      else
        dovar_decl
          = gfc_trans_omp_variable (code->ext.iterator->var->symtree->n.sym,
                                    false);
remember if it was simple (1/-1), or VAR_P but !simple (then we would,
if needed for non-rect, sorry_at about the step not being constant 1
or -1), or if it is the !VAR_P case).
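
For the inner-loop case described above, the arithmetic can be sketched
in standalone C.  This is not GCC code: trip_count and its parameters
are hypothetical names, and a positive constant step is assumed.  It
only illustrates that the artificial count of a loop with bounds
lb = var-outer * m1 + a1 and ub = var-outer * m2 + a2 involves a
division by step.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: iteration count of
   "do var = lb, ub, step" for step > 0, where both bounds are affine
   in the outer iterator:
     lb = outer * m1 + a1,  ub = outer * m2 + a2.  */
static long
trip_count (long outer, long m1, long a1, long m2, long a2, long step)
{
  assert (step > 0);
  long lb = outer * m1 + a1;
  long ub = outer * m2 + a2;
  /* Same expression as in the review text: (ub - lb + step) / step.  */
  long n = (ub - lb + step) / step;
  /* An empty loop yields zero iterations.  */
  return n > 0 ? n : 0;
}
```

With "do i = 1, 5" outside and "do j = 1, i, step" inside (m1 = 0,
a1 = 1, m2 = 1, a2 = 0), step 1 gives a count of i, which is still of
the form var-outer * m + a.  Step 2 gives (i + 1) / 2, i.e. 1, 1, 2, 2, 3
for i = 1..5, which is not.
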
And then the non-rect sorry can be emitted for both the cases easily
(especially if you precompute the:
      if (VAR_P (dovar))
        {
          if (integer_onep (step))
            simple_loop[i] = 1;
          else if (tree_int_cst_equal (step, integer_minus_one_node))
            simple_loop[i] = -1;
          else
            simple_loop[i] = 0;
        }
      else
        simple_loop[i] = 2;
early) and in this function check it for both loop_n and i.

> +        if (t == t2)
> +          {
> +            HOST_WIDE_INT intval;
> +            if (gfc_extract_hwi (code->ext.iterator->step, &intval, 0) == 0
> +                && intval != 1 && intval != -1)
> +              sorry_at (gfc_get_location (&code->loc),
> +                        "non-rectangular loop nest with non-unit loop iteration"
> +                        " step for %qs", var->name);

I'd say step other than constant 1 or -1.

> +  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops

unit ?

> +  ! Then same, execpt use nonunit stride for 'k'

except, non-unit ?

> +  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops
> +  ! Then same, execpt use nonunit stride for 'k'

2x again (and some more later).

        Jakub
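
The precomputed simple_loop[] idea above could look roughly like the
following standalone C sketch.  It is not GCC code: the enum,
classify_loop and needs_sorry are hypothetical names, and plain
booleans and integers stand in for the VAR_P and tree constant checks.
It also assumes that the !VAR_P case needs a diagnostic as well, which
the review text leaves open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the simple_loop[] values suggested in the
   review; none of these names exist in GCC.  */
enum loop_kind
{
  LOOP_STEP_MINUS_ONE = -1, /* plain variable iterator, constant step -1 */
  LOOP_NOT_SIMPLE = 0,      /* plain variable iterator, any other step */
  LOOP_STEP_ONE = 1,        /* plain variable iterator, constant step 1 */
  LOOP_NOT_VAR = 2          /* iterator is not a plain variable */
};

/* Mirrors the if/else chain from the review with ordinary C values.  */
static enum loop_kind
classify_loop (bool is_var, bool step_is_const, long step)
{
  if (!is_var)
    return LOOP_NOT_VAR;
  if (step_is_const && step == 1)
    return LOOP_STEP_ONE;
  if (step_is_const && step == -1)
    return LOOP_STEP_MINUS_ONE;
  return LOOP_NOT_SIMPLE;
}

static bool
is_simple (enum loop_kind kind)
{
  return kind == LOOP_STEP_ONE || kind == LOOP_STEP_MINUS_ONE;
}

/* True when a bound of loop INNER refers to the iterator of loop OUTER
   and at least one of the two loops is not simple.  Assumption: the
   LOOP_NOT_VAR case is rejected too.  */
static bool
needs_sorry (const enum loop_kind *simple_loop, int outer, int inner)
{
  assert (outer < inner);
  return !(is_simple (simple_loop[outer]) && is_simple (simple_loop[inner]));
}
```

Checking both indices is the point of the sketch: a non-simple inner
loop is rejected just like a non-simple outer one.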