From: Uros Bizjak
Date: Wed, 11 Jan 2023 10:28:02 +0100
Subject: Re: [x86 PATCH] PR rtl-optimization/107991: peephole2 to tweak register allocation.
To: Roger Sayle
Cc: Richard Sandiford, GCC Patches
References: <011401d9243b$3782ce10$a6886a30$@nextmovesoftware.com> <034a01d92504$71b95ad0$552c1070$@nextmovesoftware.com>
In-Reply-To: <034a01d92504$71b95ad0$552c1070$@nextmovesoftware.com>

On Tue, Jan 10, 2023 at 4:01 PM Roger Sayle wrote:
>
> Hi Richard and Uros,
> I believe I've managed to reduce a minimal test case that exhibits the
> underlying problem with reload. The following snippet when compiled on
> x86-64 with -O2:
>
> void ext(int x);
> void foo(int x, int y) { ext(y - x); }
>
> produces the following 5 instructions prior to reload:
> insn 13: r86:SI=di:SI // REG_DEAD di:SI
> insn 14: r87:SI=si:SI // REG_DEAD si:SI
> insn 7: {r85:SI=r87:SI-r86:SI;clobber flags:CC;} // REG_DEAD r86:SI, r87:SI
> insn 8: di:SI=r85:SI // REG_DEAD r85:SI
> insn 9: call [`ext'] argc:0
>
> Hence there are three pseudos (allocnos) to be register allocated; r85, r86
> & r87.
>
> Currently, reload produces the following assignments/colouring using 3 hard
> regs.
> r85 in di
> r86 in ax
> r87 in si
>
> A better (optimal) register allocation requires only 2 hard regs.
> r85 in di
> r86 in si
> r87 in di
>
> Fortunately, this over-allocation is cleaned up later (during
> cprop_hardreg), but as pointed out by Uros, there's little benefit in
> reducing register pressure this late (after peephole2).
>
> As far as I understand it, Richard's patch to handle fully-tied destinations
> looks very reasonable (and is impressively tested/benchmarked):
> https://gcc.gnu.org/pipermail/gcc-patches/2019-September/530743.html
> but in the prototypical 0:"=r", 1:"0", 2:"r" constraint case, as used in the
> problematic subsi3_1 pattern (of insn 7), I'm trying to figure out why r85
> and r87 don't get allocated to the same register [given the local spilling
> of non-eliminable hard regs in insn 7, temporarily introducing a new pseudo
> r89].
>
> In closing, reload is a complex piece of code that's shared between a large
> number of backends; if Richard's patch is a win "statistically", then it's
> not unreasonable to use a peephole2 to clean-up/catch the corner cases
> on class_likely_spilled_p targets [indeed many of the peephole2s in i386.md
> tidy up register allocation issues], and such a "specialized" fix is more
> suitable for stage 3, than a potentially disruptive tweak to reload. At
> worst, the peephole2 becomes dead if/when the problem is fixed upstream.
>
> Or put another way, if reload worked perfectly, i386.md wouldn't need
> many of the peephole2s that it currently has. Oh, for such an ideal world.

I have benchmarked the new peephole a bit: during a build of the Linux
kernel and during a whole GCC bootstrap, it didn't trigger even once. It
looks to me that the compiler produces the problematic sequence only for
specially crafted testcases, when argument setup is involved. These
testcases expose a minor annoyance with reload (which IMO should be fixed
in reload and not papered over with a peephole).

Technically, the pattern is OK, but it really doesn't bring much to the
table. OTOH, the pattern is simple enough that it won't hurt if we have
another specialized pattern in the .md file. I'll leave the decision to
you.

Uros.