From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id 76F6F3858439 for ; Mon, 12 Jul 2021 08:57:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 76F6F3858439 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 745D91FD55; Mon, 12 Jul 2021 08:57:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1626080234; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=thDpAnF3WGxHHCWksTy4W6Z1popuT5X3M+w7Z7Fi/aU=; b=PpIqej/SAZbh34H2Ogsp6BomfB+mVU2SgwWy2PpalXMaYOhMsGzvx1vxcW1qe1L5JiNOhu /c0tw8h5bcvKCbYhW640hKzSj6Rja/w791/w/0zUrMYwMyZNs1e6v4lbigPbT59cvPCMbo 3B1ghqN/qjYYaO6PMxJhXMMI86mH7OI= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1626080234; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=thDpAnF3WGxHHCWksTy4W6Z1popuT5X3M+w7Z7Fi/aU=; b=/mg/Q7sYHjg8bKWTEScTKjSCMghUxjtsUWhvn9z/rDb4Zs5gwVB/dcNbwk6A0LMoDDLnPz dQaU2gF+oqYI5jBQ== Received: from murzim.suse.de (murzim.suse.de [10.160.4.192]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 45586A3B8D; Mon, 12 Jul 2021 08:57:14 +0000 (UTC) Date: Mon, 12 Jul 2021 10:57:14 +0200 (CEST) From: Richard Biener To: guojiufu cc: Segher Boessenkool , gcc-patches@gcc.gnu.org, wschmidt@linux.ibm.com, dje.gcc@gmail.com, jlaw@tachyum.com, amker.cheng@gmail.com Subject: Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837] In-Reply-To: Message-ID: References: <20210709020718.177302-1-guojiufu@linux.ibm.com> <20210709170832.GW1583@gate.crashing.org> User-Agent: Alpine 2.21 (LSU 202 2017-01-01) MIME-Version: 1.0 X-Spam-Status: No, score=-3.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, PDS_TONAME_EQ_TOLOCAL_HDRS_LCASE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 8BIT X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Jul 2021 08:57:17 -0000 On Mon, 12 Jul 2021, guojiufu wrote: > On 2021-07-12 14:20, Richard Biener wrote: > > On Fri, 9 Jul 2021, Segher Boessenkool wrote: > > > >> On Fri, Jul 09, 2021 at 08:43:59AM +0200, Richard Biener wrote: > >> > I wonder if there's a way to query the target what modes the doloop > >> > pattern can handle (not being too familiar with the doloop code). > >> > >> You can look what modes are allowed for operand 0 of doloop_end, > >> perhaps? Although that is a define_expand, not a define_insn, so it is > >> hard to introspect. > >> > >> > Why do you need to do any checks besides the new type being able to > >> > represent all IV values? The original doloop IV will never wrap > >> > (OTOH if niter is U*_MAX then we compute niter + 1 which will become > >> > zero ... I suppose the doloop might still do the correct thing here > >> > but it also still will with a IV with larger type). > > The issue comes from U*_MAX (original short MAX), as you said: on which > niter + 1 becomes zero. And because the step for doloop is -1; then, on > larger type 'zero - 1' will be a very large number on larger type > (e.g. 0xff...ff); but on the original short type 'zero - 1' is a small value > (e.g. "0xff"). But for the larger type the small type MAX + 1 fits and does not yield zero so it should still work exactly as before, no? Of course you have to compute the + 1 in the larger type. > >> > >> doloop_valid_p guarantees it is simple and doesn't wrap. > >> > >> > I'd have expected sth like > >> > > >> > ntype = lang_hooks.types.type_for_mode (word_mode, TYPE_UNSIGNED > >> > (ntype)); > >> > > >> > thus the decision made using a mode - which is also why I wonder > >> > if there's a way to query the target for this. As you say, > >> > it _may_ be fast, so better check (somehow). > > > I was also thinking of using hooks like type_for_size/type_for_mode. > /* Use type in word size may fast. */ > if (TYPE_PRECISION (ntype) < BITS_PER_WORD > && Wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE > (ntype)))) > { > ntype = lang_hooks.types.type_for_size (BITS_PER_WORD, 1); > base = fold_convert (ntype, base); > } > > As you pointed out, this does not query the mode from targets. > As Segher pointed out "doloop_end" checks unsupported mode, while it seems > not easy to use it in tree-ssa-loop-ivopts.c. > For implementations of doloop_end, tartgets like rs6000/aarch64/ia64 requires > Pmode/DImode; while there are other targets that work on other 'mode' (e.g. > SI). > > > In doloop_optimize, there is code: > > ``` > mode = desc->mode; > ..... > doloop_reg = gen_reg_rtx (mode); > rtx_insn *doloop_seq = targetm.gen_doloop_end (doloop_reg, > start_label); > > word_mode_size = GET_MODE_PRECISION (word_mode); > word_mode_max = (HOST_WIDE_INT_1U << (word_mode_size - 1) << 1) - 1; > if (! doloop_seq > && mode != word_mode > /* Before trying mode different from the one in that # of > iterations is > computed, we must be sure that the number of iterations fits into > the new mode. */ > && (word_mode_size >= GET_MODE_PRECISION (mode) > || wi::leu_p (iterations_max, word_mode_max))) > { > if (word_mode_size > GET_MODE_PRECISION (mode)) > count = simplify_gen_unary (ZERO_EXTEND, word_mode, count, mode); > else > count = lowpart_subreg (word_mode, count, mode); > PUT_MODE (doloop_reg, word_mode); > doloop_seq = targetm.gen_doloop_end (doloop_reg, start_label); > } > if (! doloop_seq) > { > if (dump_file) > fprintf (dump_file, > "Doloop: Target unwilling to use doloop pattern!\n"); > return false; > } > ``` > The above code first tries the mode of niter_desc by call > targetm.gen_doloop_end > to see if the target can generate doloop insns, if fail, then try to use > 'word_mode' against gen_doloop_end. > > > >> > >> Almost all targets just use Pmode, but there is no such guarantee I > >> think, and esp. some targets that do not have machine insns for this > >> (but want to generate different code for this anyway) can do pretty much > >> anything. > >> > >> Maybe using just Pmode here is good enough though? > > > > I think Pmode is a particularly bad choice and I'd prefer word_mode > > if we go for any hardcoded mode. s390x for example seems to handle > > both SImode and DImode (but names the helper gen_doloop_si64 > > for SImode?!). But indeed it looks like somehow querying doloop_end > > is going to be difficult since the expander doesn't have any mode, > > so we'd have to actually try emit RTL here. > > Instead of using hardcode mode, maybe we could add a hook for targets to > return > the preferred mode. That's a possiblity of course. Like the following which just shows the default implementation then (pass in current mode, return a more preferred mode or the mode itself) enum machine_mode prefered_doloop_mode (enum machine_mode mode) { return mode; } > > Thanks for those valuable comments! > > Jiufu Guo > > > > > > > Richard. > > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)