From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qt1-x82c.google.com (mail-qt1-x82c.google.com [IPv6:2607:f8b0:4864:20::82c]) by sourceware.org (Postfix) with ESMTPS id 84C51395A455 for ; Thu, 2 Jun 2022 10:55:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 84C51395A455 Received: by mail-qt1-x82c.google.com with SMTP id j8so307663qtn.13 for ; Thu, 02 Jun 2022 03:55:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=214kwwWgl4XO4MA9OKgWnda3Laox37/F/l92lxTeiQQ=; b=HYJikv0UB2oFJXyIfXtdbY7HzEPi385zKeMydImSYcvh7a621Pg+gT5BYseOAaRg+c m5PJfTBiEucOyI3d70A9iLlcUgZZTP70CmX9JoXtvmonfNDAHunhd9Kl3MlkWg9dL/1M omNTIR09Dp9Xo/XHRWM1JToQ5dfrcQwiuxGz/L6MquCuZGfZECLI+9BQS3K0X79aWQIc ZekNqmLQsSTgii5jDI+DcgsTDXuk1rDdGo3HT5/BvYXw87X5T4I96YxeirX178OH5O8o E5otU2Qkp/rSD2p2YNAzk0+GgBl+canrQHvlSMeNAXDen1gqtqD3lJbBcpfbDrNjmgn1 IfYQ== X-Gm-Message-State: AOAM530MJGuQGQJBcpChTrbhBTDOhEcQ12oX6mG0eTanBbmOR2DHUNk+ /chG00hSio4/KAbdH0QyiS0AxENRWlt6E8bhr1J/QhoQPc4OSg== X-Google-Smtp-Source: ABdhPJzZBYVsOvKtF6j6nQUav2w7EeDgqRqrpwPQxMavn18d3VJdI/1ecKfSAs1Xk/oIJ6/AmDJtza0gEL2BloakUMQ= X-Received: by 2002:ac8:5ccb:0:b0:304:baf5:1990 with SMTP id s11-20020ac85ccb000000b00304baf51990mr3049321qta.224.1654167345812; Thu, 02 Jun 2022 03:55:45 -0700 (PDT) MIME-Version: 1.0 References: <03c201d8766a$450d9a80$cf28cf80$@nextmovesoftware.com> In-Reply-To: <03c201d8766a$450d9a80$cf28cf80$@nextmovesoftware.com> From: Richard Biener Date: Thu, 2 Jun 2022 12:55:34 +0200 Message-ID: Subject: Re: [PATCH/RFC] cprop_hardreg... Third time's a charm. To: Roger Sayle Cc: GCC Patches Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jun 2022 10:55:48 -0000 On Thu, Jun 2, 2022 at 12:20 PM Roger Sayle wrote: > > > > This middle-end patch proposes the "hard register constant propagation" > > pass be performed up to three times on each basic block (up from the > > current two times) if the second pass successfully made changes. > > > > The motivation for three passes is to handle the "swap idiom" (i.e. > > t = x; x = y; y = t;" sequences) that get generated by register allocation > > (reload). > > > > Consider the x86_64 test case for __int128 addition recently discussed > > on gcc-patches. With that proposed patch, the input to the cprop_hardreg > > pass looks like: > > > > movq %rdi, %r8 > > movq %rsi, %rdi > > movq %r8, %rsi > > movq %rdx, %rax > > movq %rcx, %rdx > > addq %rsi %rax > > adcq %rdi, %rdx > > ret > > > > where the first three instructions effectively swap %rsi and %rdi. > > > > On the first pass of cprop_hardreg, we notice that the third insn, > > %rsi := %r8, is redundant and can eliminated/propagated to produce: > > > > movq %rdi, %r8 > > movq %rsi, %rdi > > movq %rdx, %rax > > movq %rcx, %rdx > > addq %r8 %rax > > adcq %rdi, %rdx > > ret > > > > Because a successful propagation was found, cprop_hardreg then runs > > a second pass/sweep on affected basic blocks (using worklist), and > > on this second pass notices that the second instruction, %rdi := %rsi, > > may now be propagated (%rsi was killed in the before the first transform), > > and after a second pass, we now end up with: > > > > movq %rdi, %r8 > > movq %rdx, %rax > > movq %rcx, %rdx > > addq %r8, %rax > > adcq %rsi, %rdx > > ret > > > > which is the current behaviour on mainline. However, a third and final > > pass would now notice that the first insn, "%r8 := %rdi" is also now > > eliminable, and a third iteration would produce optimal code: > > > > movq %rdx, %rax > > movq %rcx, %rdx > > addq %rdi, %rax > > adcq %rsi, %rdx > > ret > > > > The patch below creates an additional worklist, third_pass, that is > > populated with the basic block id's of blocks that were updated during > > the second pass. Does the motivation for three passes (reload doesn't > > generate more or less than three instructions to swap a pair of registers) > > seem reasonable for all targets? If cprop_hardreg is considered an > > expensive pass, this change could be gated based on basic block count > > or similar. Finally, I should point out that this a regression fix; > > GCC 4.8 generated optimal code with two moves (whereas GCC 12 required > > 5 moves, up from GCC 11's 4 moves). > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > and make -k check, both with and without --target_board=unix{-m32} with > > no new failures. Thoughts? Ok for mainline? Can you instead refactor the code to have auto_vec worklist1, worklist2; vec *worklist = &worklist1; and alternate between worklist1 and worklist2 N times (const int N = 2 is OK for now I guess, OTOH for -O1 we might want to stick to 1). it might be tempting to maintain the number of times we visited a block and just directly iterate on changed BBs. There's also the comment in cprop_hardreg_bb: /* If a block has a single predecessor, that we've already processed, begin with the value data that was live at the end of the predecessor block. */ /* ??? Ought to use more intelligent queuing of blocks. */ which suggests we want to improve on its dataflow (like iterating in RPO order so we can intersect the live value data of predecessors?) That said, calling df_analyze () between iterations is what makes iterating expensive I guess, as well as cprop_hardreg_debug. Richard. > > > > > 2022-06-02 Roger Sayle > > > > gcc/ChangeLog > > * regcprop.cc (pass_cprop_hardreg::execute): Perform a third > > iteration over each basic block that was updated by the second > > iteration. > > > > > > Thanks in advance, > > Roger > > -- > > >