From: Uros Bizjak
Date: Thu, 01 Aug 2019 09:38:00 -0000
Subject: Re: [PATCH][RFC][x86] Fix PR91154, add SImode smax, allow SImode add in SSE regs
To: Richard Biener
Cc: Martin Jambor, gcc-patches@gcc.gnu.org, Jakub Jelinek, Vladimir Makarov

On Thu, Aug 1, 2019 at 11:28 AM Richard Biener wrote:
>
> > > So you unconditionally add a smaxdi3 pattern - indeed this looks
> > > necessary even when going the STV route.  The actual regression
> > > for the testcase could also be solved by turning the smaxsi3
> > > back into a compare and jump rather than a conditional move sequence.
> > > So I wonder how you'd do that, given that there's pass_if_after_reload
> > > after pass_split_after_reload, and I'm not sure we can split
> > > as late as pass_split_before_sched2 (there's also a split _after_
> > > sched2 on x86, it seems).
> > >
> > > So how would you go about implementing {s,u}{min,max}{si,di}3 for the
> > > case where STV doesn't end up doing any transform?
> >
> > If STV doesn't transform the insn, then a pre-reload splitter splits
> > the insn back to compare+cmove.
>
> OK, that would work.  But there's no way to force a jumpy sequence then,
> which we know is faster than compare+cmove, because later RTL
> if-conversion passes happily re-discover the smax (or conditional move)
> sequence.
>
> > However, considering that the SImode move
> > from/to an int/xmm register is relatively cheap, the cost function should
> > be tuned so that STV always converts the smaxsi3 pattern.
>
> Note that on both Zen and even more so bdverN the int/xmm transition
> makes it not just unprofitable but a _lot_ slower than the cmp/cmov
> sequence (for the loop in hmmer, which is the only one where I see
> any effect from any of my patches).  So identifying chains that
> start/end in memory is important for cost reasons.

Please note that the cost function also considers the cost of moves
from/to xmm registers, so the cost of the whole chain would disable
the transformation.
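
For illustration, the pre-reload splitter I mention above would look
roughly like the following (a sketch only, not the actual pattern; the
insn condition, the omitted constraints and the use of
ix86_expand_compare to emit the flags-setting compare are placeholders
here):

(define_insn_and_split "smaxsi3"
  [(set (match_operand:SI 0 "register_operand")
        (smax:SI (match_operand:SI 1 "register_operand")
                 (match_operand:SI 2 "register_operand")))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_STV && TARGET_SSE4_1 && can_create_pseudo_p ()"
  ;; Never output as-is: either STV converts the whole chain to
  ;; vector pmaxsd, or the pre-reload split pass takes it apart again.
  "#"
  "&& 1"
  [(set (match_dup 0)
        (if_then_else:SI (match_dup 3)
                         (match_dup 1)
                         (match_dup 2)))]
{
  /* Emit the flags-setting compare; operands[3] becomes the GT test
     on the flags register that the existing cmove patterns expect.  */
  operands[3] = ix86_expand_compare (GT, operands[1], operands[2]);
})

If STV converts the chain, the insn never reaches the splitter;
otherwise split1 takes it apart before register allocation and we are
back to the usual cmp+cmov sequence.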
> So I think the splitting has to happen after the last if-conversion
> pass (and thus we may need to allocate a scratch register for this
> purpose?)

I really hope that the underlying issue will be solved by a
machine-dependent pass inserted somewhere after the pre-reload split.
This way, we can split the unconverted smax to a cmove, and this later
pass would handle jcc and cmove instructions.

Until then... yes, your proposed approach is one of the ways to avoid
unwanted if-conversion, although sometimes we would like to split to a
cmove instead.

Uros.