From mboxrd@z Thu Jan 1 00:00:00 1970
From: Uros Bizjak
Date: Mon, 7 Aug 2023 17:59:58 +0200
Subject: Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]
To: Richard Biener
Cc: "gcc-patches@gcc.gnu.org", Jan Hubicka, Hongtao Liu
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="0000000000004961fd060257575c"

--0000000000004961fd060257575c
Content-Type: text/plain; charset="UTF-8"

On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote:
>
> On Sun, 30 Jul 2023, Uros Bizjak wrote:
>
> > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
> > named patterns in order to avoid generation of partial vector V4SFmode
> > trapping instructions.
> >
> > The new option is enabled by default, because even with sanitization,
> > a small but consistent speed up of 2 to 3% with Polyhedron capacita
> > benchmark can be achieved vs. scalar code.
> >
> > Using -fno-trapping-math improves Polyhedron capacita runtime 8 to 9%
> > vs. scalar code. This is what clang does by default, as it defaults
> > to -fno-trapping-math.
>
> I like the new option, note you lack invoke.texi documentation where
> I'd also elaborate a bit on the interaction with -fno-trapping-math
> and the possible performance impact when NaNs or denormals leak
> into the upper halves and cross-reference -mdaz-ftz.

The attached doc patch adds the invoke.texi entry for the
-mmmxfp-with-sse option. It is written so that it also covers
half-float vectors. WDYT?

Uros.

--0000000000004961fd060257575c
Content-Type: text/plain; charset="US-ASCII"; name="d.diff.txt"
Content-Disposition: attachment; filename="d.diff.txt"
Content-Transfer-Encoding: 7bit
X-Attachment-Id: f_ll1266w50

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index fa765d5a0dd..99093172abe 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1417,6 +1417,7 @@ See RS/6000 and PowerPC Options.
 -mcld  -mcx16  -msahf  -mmovbe  -mcrc32 -mmwait
 -mrecip  -mrecip=@var{opt}
 -mvzeroupper  -mprefer-avx128  -mprefer-vector-width=@var{opt}
+-mmmxfp-with-sse
 -mmove-max=@var{bits} -mstore-max=@var{bits}
 -mmmx  -msse  -msse2  -msse3  -mssse3  -msse4.1  -msse4.2  -msse4  -mavx
 -mavx2  -mavx512f  -mavx512pf  -mavx512er  -mavx512cd  -mavx512vl
@@ -33708,6 +33709,22 @@ This option instructs GCC to use 128-bit AVX instructions instead of
 This option instructs GCC to use @var{opt}-bit vector width in instructions
 instead of default on the selected platform.
 
+@opindex mmmxfp-with-sse
+@item -mmmxfp-with-sse
+This option enables GCC to generate trapping floating-point operations on
+partial vectors, where vector elements reside in the low part of the 128-bit
+SSE register.  Unless @option{-fno-trapping-math} is specified, the compiler
+guarantees correct trapping behavior by sanitizing all input operands to
+have zeroes in the upper part of the vector register.  Note that by using
+built-in functions or inline assembly with partial vector arguments, NaNs,
+denormal or invalid values can leak into the upper part of the vector,
+causing possible performance issues when @option{-fno-trapping-math} is in
+effect.  These issues can be mitigated by manually sanitizing the upper part
+of the partial vector argument register or by using @option{-mdaz-ftz} to set
+the denormals-are-zero (DAZ) flag in the MXCSR register.
+
+This option is enabled by default.
+
 @opindex mmove-max
 @item -mmove-max=@var{bits}
 This option instructs GCC to set the maximum number of bits can be

--0000000000004961fd060257575c--
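
For readers following the thread, here is a rough source-level sketch of what "sanitizing the upper part" of a partial vector means. It uses the GCC/Clang generic vector extensions. The names v2sf, v4sf, add_v2sf, widen_zero_upper and upper_half_is_zero are hypothetical and chosen only for illustration. The compiler performs the real sanitization internally when it expands V2SF operations, not through source code like this, and -mmmxfp-with-sse is the option name proposed in this RFC, so it may not exist under that name in a released GCC.

```c
/* Illustrative sketch only: models a two-float partial vector that
   occupies the low half of a four-float (128-bit) vector.  */

/* 64-bit vector of two floats (V2SF) and 128-bit vector of four (V4SF). */
typedef float v2sf __attribute__((vector_size(8)));
typedef float v4sf __attribute__((vector_size(16)));

/* Element-wise add on the partial vector.  On x86-64 a compiler may
   implement this with a full-width SSE add, which also operates on
   whatever happens to sit in the upper two lanes of the register.  */
static inline v2sf add_v2sf(v2sf a, v2sf b)
{
    return a + b;
}

/* Manual "sanitization": place the two live elements in the low half
   and force the upper half to +0.0f, so no stale NaN or denormal can
   take part in a later full-width operation.  */
static inline v4sf widen_zero_upper(v2sf lo)
{
    v4sf r = { lo[0], lo[1], 0.0f, 0.0f };
    return r;
}

/* Returns 1 when both upper lanes compare equal to zero.  */
static inline int upper_half_is_zero(v4sf v)
{
    return v[2] == 0.0f && v[3] == 0.0f;
}
```

One way to see the effect discussed in the thread is to compile a caller of add_v2sf with -O2 -S, once with and once without -fno-trapping-math, and compare the instructions emitted around the packed add. The exact output depends on the compiler version and target.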