From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-io1-xd32.google.com (mail-io1-xd32.google.com [IPv6:2607:f8b0:4864:20::d32]) by sourceware.org (Postfix) with ESMTPS id 7568A3858D32 for ; Fri, 29 Sep 2023 19:22:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7568A3858D32 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-io1-xd32.google.com with SMTP id ca18e2360f4ac-79fb64b5265so400585539f.1 for ; Fri, 29 Sep 2023 12:22:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1696015326; x=1696620126; darn=gcc.gnu.org; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=WhqnIAJB0stZedqUKK8L9mdiXyO/LBoQJZhjZiykQAY=; b=IUd31KY0Nwt3ZwoOPL1MrqvuOjOMnFapuwBu2XVTwVAfFyCn6UaA9rS6qw76fXZO/X pDUiecV7cgNKa0s0FKwbR5fmHSKlQMJfpl5jnpmyyqnBysmGWr39yymWA7zDSyPF6v8p FsxrCrfax/RJKm93mZGebIFyxJeVLThLqgEHe2qt5lz5cW9ExXQaVKasfSV6MvjVj0qp PlcO8rzOOqRwAKz2cRHDfIgJsygWtZIG1Uo/iMH5xkMsNZVb0JmKhncEFVWRCBc6AFM+ lIehebzqnaYHOnCyk92LFXfa2k92ba1cyMtNMRGxM9/w7ZgmmgYYVyEOBRO0CqZI9KJm kyDA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696015326; x=1696620126; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=WhqnIAJB0stZedqUKK8L9mdiXyO/LBoQJZhjZiykQAY=; b=kpz+/MTd1jT4pAX8qpGD2LC76FM3YpYzeWm2axM/wQsjThZl5+EYHNi73P1oAEOZ9x lhduFYRxJ7rFCiPJdTIorTCM36mcWw5HJ4yddWLwGpCafdGVsutI10GxN7AZFrQdv1v+ kOZ6EQYnTgXCDmSZ4b7RJJNY0Br0JnajM1ETRSbIJv0vXJk9ZQxsIgt7Z0pmG0PB8PvU dIUwFicUP9/31AQgCG0906PKlWn9YAAECEC1VaRSu81ZPisuwkSOuFFZyv2qGtCfRHZz c4LJ9vFC1xS3Sbc0re6J4xFrzsgB74yNn4IR+ilPnii64i26Yi/8N7iPHb2/O6Na8dNW KhFg== X-Gm-Message-State: AOJu0YygQgOaz3RxLX2K2kaEEiqxabJeHwf6j6m7TFn56V/n02wzUD8g cMKw/xzDmCmIJq88F0usbkU= X-Google-Smtp-Source: AGHT+IHAxK/OJVLTS9rIy6ldup3RbyQWv9lTwZPWb1eqTBU3hiXXjq2X06k4CJdRsDikRWvwb0AALA== X-Received: by 2002:a6b:6618:0:b0:790:a010:4c42 with SMTP id a24-20020a6b6618000000b00790a0104c42mr5059015ioc.13.1696015326521; Fri, 29 Sep 2023 12:22:06 -0700 (PDT) Received: from [172.31.0.109] ([136.36.130.248]) by smtp.gmail.com with ESMTPSA id v26-20020a02b91a000000b0042b37dda71asm5210136jan.136.2023.09.29.12.22.05 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 29 Sep 2023 12:22:06 -0700 (PDT) Message-ID: Date: Fri, 29 Sep 2023 13:22:04 -0600 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v5] Implement new RTL optimizations pass: fold-mem-offsets. Content-Language: en-US To: Manolis Tsamis Cc: gcc-patches@gcc.gnu.org, Philipp Tomsich , Vineet Gupta , Richard Biener References: <20230909084652.2655745-1-manolis.tsamis@vrull.eu> From: Jeff Law In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 9/12/23 04:13, Manolis Tsamis wrote: >>> + >>> +/* Get the single reaching definition of an instruction inside a BB. >>> + The definition is desired for REG used in INSN. >>> + Return the definition insn or NULL if there's no definition with >>> + the desired criteria. */ >>> +static rtx_insn* >>> +get_single_def_in_bb (rtx_insn *insn, rtx reg) >>> +{ >>> + df_ref use; >>> + struct df_link *ref_chain, *ref_link; >>> + >>> + FOR_EACH_INSN_USE (use, insn) >>> + { >>> + if (GET_CODE (DF_REF_REG (use)) == SUBREG) >>> + return NULL; >>> + if (REGNO (DF_REF_REG (use)) == REGNO (reg)) >>> + break; >>> + } >>> + >>> + if (!use) >>> + return NULL; >>> + >>> + ref_chain = DF_REF_CHAIN (use); >> So what if there's two uses of REG in INSN? I don't think it's be >> common at all, but probably better safe and reject than sorry, right? Or >> is that case filtered out earlier? >> > If the REG is the same won't the definitions be the same even if that > REG appears multiple times in INSN? Yes. > fold_offsets_1 should be able to handle the folding with multiple uses > of REG just fine, for example add R1, R1 or add (ashift R1, 1), R1. > If there's no other issue here I assume we want to keep that as-is in > order to not reduce the propagation power (Which I assume is similar > to ree which uses the same logic). OK. I was primarily concerned about the folding and rewriting aspects. It probably can only show up on targets with LEA like instructions, and even on such targets it's probably rate. >>> +/* Test if INSN is a memory load / store that can have an offset folded to it. >>> + Return true iff INSN is such an instruction and return through MEM_OUT, >>> + REG_OUT and OFFSET_OUT the RTX that has a MEM code, the register that is >>> + used as a base address and the offset accordingly. >>> + All of the out pointers may be NULL in which case they will be ignored. */ >>> +bool >>> +get_fold_mem_root (rtx_insn* insn, rtx *mem_out, rtx *reg_out, >>> + HOST_WIDE_INT *offset_out) >>> +{ >>> + rtx set = single_set (insn); >>> + rtx mem = NULL_RTX; >>> + >>> + if (set != NULL_RTX) >>> + { >>> + rtx src = SET_SRC (set); >>> + rtx dest = SET_DEST (set); >>> + >>> + /* Don't fold when we have unspec / volatile. */ >>> + if (GET_CODE (src) == UNSPEC >>> + || GET_CODE (src) == UNSPEC_VOLATILE >>> + || GET_CODE (dest) == UNSPEC >>> + || GET_CODE (dest) == UNSPEC_VOLATILE) >>> + return false; >>> + >>> + if (MEM_P (src)) >>> + mem = src; >>> + else if (MEM_P (dest)) >>> + mem = dest; >>> + else if ((GET_CODE (src) == SIGN_EXTEND >>> + || GET_CODE (src) == ZERO_EXTEND) >>> + && MEM_P (XEXP (src, 0))) >> Note some architectures allow both a source and destination memory. It >> looks like your code will prefer the source operand in that case. >> That's fine, just pointing it out. >> > Thanks for pointing that possibility out. I thought for a moment that > this would be a bug with multiple mentions of the address register. > but it should be fine due to: > /* Special case: A foldable memory store is not foldable if it > mentions DEST outside of the address calculation. */ ACK. The other thing I keep pondering is autoincrement style addressing. Though I think at some point I convinced myself they weren't a problem. I think your checks only allow specific kinds of expressions for the memory address and I don't think {PRE,POST}_{INC,DEC.MODIFY} were in the list of valid ops. >>> + >>> + int max_iters = 5; >>> + for (int i = 0; i < max_iters; i++) >>> + { >>> + bool made_changes = false; >>> + for (fold_info_map::iterator iter = fold_info->begin (); >>> + iter != fold_info->end (); ++iter) >>> + { >>> + fold_mem_info *info = (*iter).second; >>> + if (bitmap_intersect_p (&cannot_fold_insns, info->fold_insns)) >>> + made_changes |= bitmap_ior_into (&cannot_fold_insns, >>> + info->fold_insns); >>> + } >>> + >>> + if (!made_changes) >>> + return true; >>> + } >>> + >>> + return false; >> So how was the magic value of "5" determined here? In general we try >> not to have magic #s like that and instead find a better way to control >> iterations, falling back to a PARAM when all else fails. >> > It's indeed just a magic value :) > In general even two iterations of this should be quite rare. One > iteration will handle the majority of interesting cases. I chose 5 to > limit the worst case execution time for degenerate cases. > > I'll be happy to change that to something better but I couldn't > figure out how "falling back to a PARAM when all else fails" should > work here. Is there a similar example somewhere else in the code that > I can look at? If we expect two iterations to be quite rare, then I doubt a param is really worth the effort. I might suggest using a loop like for (pass = 0; pass < flag_expensive_optimizations + 1; pass++) No magic numbers :-) We use similar loops elsewhere for this kind of scenario. If you ever need to create a param, the procedure is to add the option to params.opt. If you look in that file you'll see that you define a variable name in the specification that you can then reference. So for a simple example look for param_avg_loop_niter. Jeff