From: Uros Bizjak
Date: Mon, 31 Jul 2023 12:13:57 +0200
Subject: Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]
To: Richard Biener
Cc: "gcc-patches@gcc.gnu.org", Jan Hubicka, Hongtao Liu

On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote:
>
> On Sun, 30 Jul 2023, Uros Bizjak wrote:
>
> > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
> > named patterns in order to avoid generation of partial vector V4SFmode
> > trapping instructions.
> >
> > The new option is enabled by default, because even with sanitization,
> > a small but consistent speed up of 2 to 3% with Polyhedron capacita
> > benchmark can be achieved vs. scalar code.
> >
> > Using -fno-trapping-math improves Polyhedron capacita runtime 8 to 9%
> > vs. scalar code. This is what clang does by default, as it defaults
> > to -fno-trapping-math.
> >
> I like the new option, note you lack invoke.texi documentation where
> I'd also elaborate a bit on the interaction with -fno-trapping-math
> and the possible performance impact when NaNs or denormals leak
> into the upper halves and cross-reference -mdaz-ftz.

Yes, this is my plan (lack of documentation is due to RFC status of
the patch). OTOH, Hongtao has some other ideas in the PR, so I'll wait
with the patch a bit.

Thanks,
Uros.

> Thanks,
> Richard.
>
> > PR target/110832
> >
> > gcc/ChangeLog:
> >
> >         * config/i386/i386.h (TARGET_MMXFP_WITH_SSE): New macro.
> >         * config/i386/i386.opt (mmmxfp-with-sse): New option.
> >         * config/i386/mmx.md (movq__to_sse): Do not sanitize
> >         upper part of V2SFmode register with -fno-trapping-math.
> >         (v2sf3): Enable for TARGET_MMXFP_WITH_SSE.
> >         (divv2sf3): Ditto.
> >         (v2sf3): Ditto.
> >         (sqrtv2sf2): Ditto.
> >         (*mmx_haddv2sf3_low): Ditto.
> >         (*mmx_hsubv2sf3_low): Ditto.
> >         (vec_addsubv2sf3): Ditto.
> >         (vec_cmpv2sfv2si): Ditto.
> >         (vcondv2sf): Ditto.
> >         (fmav2sf4): Ditto.
> >         (fmsv2sf4): Ditto.
> >         (fnmav2sf4): Ditto.
> >         (fnmsv2sf4): Ditto.
> >         (fix_truncv2sfv2si2): Ditto.
> >         (fixuns_truncv2sfv2si2): Ditto.
> >         (floatv2siv2sf2): Ditto.
> >         (floatunsv2siv2sf2): Ditto.
> >         (nearbyintv2sf2): Ditto.
> >         (rintv2sf2): Ditto.
> >         (lrintv2sfv2si2): Ditto.
> >         (ceilv2sf2): Ditto.
> >         (lceilv2sfv2si2): Ditto.
> >         (floorv2sf2): Ditto.
> >         (lfloorv2sfv2si2): Ditto.
> >         (btruncv2sf2): Ditto.
> >         (roundv2sf2): Ditto.
> >         (lroundv2sfv2si2): Ditto.
> >
> > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >
> > Uros.
> >
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
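
For readers following the thread, the sanitization question can be
illustrated with a small portable C model. This is only a sketch and not
the patch's implementation: the type v4sf_model and the functions
load_v2sf, add_v4sf and upper_overflowed are hypothetical names, and the
model assumes that "sanitizing" means zeroing the upper two lanes. A
V2SF value occupies the low two lanes of a four-lane register, and a
full-width add operates on all four lanes, so stale data in the upper
lanes can overflow (which real hardware would report as a floating-point
exception) unless it is cleared first.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical model of a 128-bit SSE register viewed as four floats.  */
typedef struct { float lane[4]; } v4sf_model;

/* Model of placing a V2SF value into the low half of a register that
   still holds STALE contents.  With SANITIZE nonzero the upper two
   lanes are zeroed (assumed meaning of "sanitize"); with SANITIZE zero
   the stale upper lanes are left in place, as the thread proposes for
   -fno-trapping-math.  */
v4sf_model load_v2sf(v4sf_model stale, const float lo[2], int sanitize)
{
    assert(lo);
    v4sf_model r = stale;
    r.lane[0] = lo[0];
    r.lane[1] = lo[1];
    if (sanitize) {
        r.lane[2] = 0.0f;
        r.lane[3] = 0.0f;
    }
    return r;
}

/* Four-lane add: every lane is computed, including the two upper lanes
   that carry no meaningful V2SF data.  */
v4sf_model add_v4sf(v4sf_model a, v4sf_model b)
{
    v4sf_model r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Nonzero if an upper lane overflowed to infinity, i.e. the unused
   half of the register produced a result that could trap.  */
int upper_overflowed(v4sf_model v)
{
    return v.lane[2] > FLT_MAX || v.lane[3] > FLT_MAX;
}
```

With stale upper lanes holding FLT_MAX, adding two sanitized loads
leaves the upper lanes at zero, while adding two unsanitized loads
overflows them. The low two lanes are identical in both cases, which is
why skipping the sanitization is only safe when trapping math is not
required.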