From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 10028 invoked by alias); 29 Dec 2007 06:32:29 -0000
Received: (qmail 10012 invoked by uid 22791); 29 Dec 2007 06:32:27 -0000
X-Spam-Check-By: sourceware.org
Received: from hs-out-0708.google.com (HELO hs-out-2122.google.com) (64.233.178.240) by sourceware.org (qpsmtpd/0.31) with ESMTP; Sat, 29 Dec 2007 06:32:23 +0000
Received: by hs-out-2122.google.com with SMTP id 4so2651985hsl.8 for ; Fri, 28 Dec 2007 22:32:20 -0800 (PST)
Received: by 10.150.177.20 with SMTP id z20mr2816921ybe.137.1198909940700; Fri, 28 Dec 2007 22:32:20 -0800 (PST)
Received: by 10.150.206.16 with HTTP; Fri, 28 Dec 2007 22:32:20 -0800 (PST)
Message-ID: <4fc48eb10712282232j7ec01502qdca0c64dd2710532@mail.gmail.com>
Date: Sat, 29 Dec 2007 15:35:00 -0000
From: tbp
To: GCC
Subject: censored naked SSE reciprocals, -mrecip
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_5907_16148143.1198909940707"
X-IsSubscribed: yes
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
X-SW-Source: 2007-12/txt/msg00700.txt.bz2

------=_Part_5907_16148143.1198909940707
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Content-length: 1481

Merry Xmas,

I recently had some use for -mrecip, but it turned out to come with all
sorts of strings attached and apparently no opt-out. Briefly, barring
inline asm, I can't get gcc to emit those ops (the SSE reciprocal and
reciprocal square root estimates) without a Newton-Raphson (NR) fixup.
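For readers who have not met the term, an "NR fixup" is one Newton-Raphson
refinement applied to the coarse hardware estimate. The sketch below shows
the textbook arithmetic in scalar C, assuming x0 is an estimate obtained
elsewhere; nr_rsqrt_step and nr_rcp_step are hypothetical names used only
for illustration, and the exact instruction sequence gcc emits is the one
in the attached dump, not necessarily this one.

```c
/* One Newton-Raphson refinement of an estimate x0 of 1/sqrt(a):
 *     x1 = 0.5 * x0 * (3 - a * x0 * x0)
 * Hypothetical helper, textbook formula only. */
static float nr_rsqrt_step(float a, float x0)
{
    return 0.5f * x0 * (3.0f - a * x0 * x0);
}

/* One Newton-Raphson refinement of an estimate x0 of 1/a:
 *     x1 = x0 * (2 - a * x0)
 * Hypothetical helper, textbook formula only. */
static float nr_rcp_step(float a, float x0)
{
    return x0 * (2.0f - a * x0);
}
```

Each step costs a handful of extra multiplies and a subtract per estimate,
which is exactly the overhead one wants to avoid when the raw estimate is
already accurate enough.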

# cat src/pr-recip.c
#include <xmmintrin.h>
typedef float v4sf_t __attribute__ ((__vector_size__ (16)));
__m128 foo(__m128 a) { return _mm_sqrt_ps(a); }
__m128 bar(__m128 a) { return _mm_rsqrt_ps(a); }
__m128 baz(__m128 a) { return _mm_rcp_ps(a); }
v4sf_t nope1(v4sf_t a) { return __builtin_ia32_sqrtps(a); }
v4sf_t nope2(v4sf_t a) { return __builtin_ia32_rsqrtps(a); }
v4sf_t allright(v4sf_t a) { return __builtin_ia32_rcpps(a); }
int main() { return 0; }
# /usr/local/gcc-4.3-20071221/bin/gcc -march=native -ffast-math -mrecip -O2 src/pr-recip.c

... and, as can be seen in the attached asm dump, foo, bar, nope1 and
nope2 get mangled (at least on x86-64 Linux), while baz and allright come
out as a bare rcpps.

While I can somewhat understand the logic behind the automatic
transformation of _mm_sqrt_ps (it can be argued that's what the user has
asked for), there's no obvious way to opt out. But then I really don't
understand why gcc feels the urge to tinker when I specifically ask for
an rsqrt.

To add insult to injury, -mrecip, unlike -ffast-math, doesn't set any
macro, so kludging around it is a cat-and-mouse game.

Questions:
a) Is that really by design?
b) What's the official way to dodge fixups when -mrecip is active?
c) Any chance for -mrecip to set __FAST_MATH_NONE_SHALL_PASS__ or
   something?
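
The only opt-out mentioned above is inline asm. A minimal sketch of what
that kludge could look like, assuming GCC extended asm on x86 or x86-64
with the default AT&T operand order; naked_rsqrt_ps and naked_rcp_ps are
hypothetical names, not part of GCC or of the intrinsics headers. Because
the instruction is opaque to the compiler, no fixup can be attached to it,
at the price of hiding it from the optimizer as well.

```c
#include <xmmintrin.h>  /* __m128 and the load/store helpers */

/* Hypothetical wrapper: emits a bare RSQRTPS that gcc cannot rewrite.
 * "=x" asks for an SSE register output, "xm" allows the input to be an
 * SSE register or a memory operand. */
static inline __m128 naked_rsqrt_ps(__m128 a)
{
    __m128 r;
    __asm__ ("rsqrtps %1, %0" : "=x" (r) : "xm" (a));
    return r;
}

/* Hypothetical wrapper: same idea for a bare RCPPS. */
static inline __m128 naked_rcp_ps(__m128 a)
{
    __m128 r;
    __asm__ ("rcpps %1, %0" : "=x" (r) : "xm" (a));
    return r;
}
```

Since -mrecip defines no macro, such wrappers would have to be used
unconditionally or be switched on by a macro the build system passes
explicitly.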
------=_Part_5907_16148143.1198909940707 Content-Type: application/octet-stream; name=dump.asm Content-Transfer-Encoding: base64 X-Attachment-Id: f_farrx9sj0 Content-Disposition: attachment; filename=dump.asm Content-length: 5023 MDAwMDAwMDAwMDQwMDQ3MCA8YWxscmlnaHQ+OgogIDQwMDQ3MDoJMGYgNTMg YzAgICAgICAgICAgICAgCXJjcHBzICAleG1tMCwleG1tMAogIDQwMDQ3MzoJ YzMgICAgICAgICAgICAgICAgICAgCXJldHEgICAKICA0MDA0NzQ6CTY2IDY2 IDY2IDJlIDBmIDFmIDg0IAlub3B3ICAgJWNzOjB4MCglcmF4LCVyYXgsMSkK ICA0MDA0N2I6CTAwIDAwIDAwIDAwIDAwIAoKMDAwMDAwMDAwMDQwMDQ4MCA8 YmF6PjoKICA0MDA0ODA6CTBmIDUzIGMwICAgICAgICAgICAgIAlyY3BwcyAg JXhtbTAsJXhtbTAKICA0MDA0ODM6CWMzICAgICAgICAgICAgICAgICAgIAly ZXRxICAgCiAgNDAwNDg0Ogk2NiA2NiA2NiAyZSAwZiAxZiA4NCAJbm9wdyAg ICVjczoweDAoJXJheCwlcmF4LDEpCiAgNDAwNDhiOgkwMCAwMCAwMCAwMCAw MCAKCjAwMDAwMDAwMDA0MDA0OTAgPG5vcGUyPjoKICA0MDA0OTA6CTBmIDI4 IGQwICAgICAgICAgICAgIAltb3ZhcHMgJXhtbTAsJXhtbTIKICA0MDA0OTM6 CTBmIDU3IGM5ICAgICAgICAgICAgIAl4b3JwcyAgJXhtbTEsJXhtbTEKICA0 MDA0OTY6CTBmIDI4IDA1IGQzIDAxIDAwIDAwIAltb3ZhcHMgMHgxZDMoJXJp cCksJXhtbTAgICAgICAgICMgNDAwNjcwIDxfSU9fc3RkaW5fdXNlZCsweDEw PgogIDQwMDQ5ZDoJMGYgMjggZGEgICAgICAgICAgICAgCW1vdmFwcyAleG1t MiwleG1tMwogIDQwMDRhMDoJMGYgYzIgZDkgMDQgICAgICAgICAgCWNtcG5l cXBzICV4bW0xLCV4bW0zCiAgNDAwNGE0OgkwZiA1MiBjYSAgICAgICAgICAg ICAJcnNxcnRwcyAleG1tMiwleG1tMQogIDQwMDRhNzoJMGYgNTQgY2IgICAg ICAgICAgICAgCWFuZHBzICAleG1tMywleG1tMQogIDQwMDRhYToJMGYgNTkg ZDEgICAgICAgICAgICAgCW11bHBzICAleG1tMSwleG1tMgogIDQwMDRhZDoJ MGYgNTkgZDEgICAgICAgICAgICAgCW11bHBzICAleG1tMSwleG1tMgogIDQw MDRiMDoJMGYgNTkgMGQgYzkgMDEgMDAgMDAgCW11bHBzICAweDFjOSglcmlw KSwleG1tMSAgICAgICAgIyA0MDA2ODAgPF9JT19zdGRpbl91c2VkKzB4MjA+ CiAgNDAwNGI3OgkwZiA1YyBjMiAgICAgICAgICAgICAJc3VicHMgICV4bW0y LCV4bW0wCiAgNDAwNGJhOgkwZiA1OSBjMSAgICAgICAgICAgICAJbXVscHMg ICV4bW0xLCV4bW0wCiAgNDAwNGJkOgljMyAgICAgICAgICAgICAgICAgICAJ cmV0cSAgIAogIDQwMDRiZToJNjYgOTAgICAgICAgICAgICAgICAgCXhjaGcg ICAlYXgsJWF4CgowMDAwMDAwMDAwNDAwNGMwIDxiYXI+OgogIDQwMDRjMDoJ MGYgMjggZDAgICAgICAgICAgICAgCW1vdmFwcyAleG1tMCwleG1tMgogIDQw 
MDRjMzoJMGYgNTcgYzkgICAgICAgICAgICAgCXhvcnBzICAleG1tMSwleG1t MQogIDQwMDRjNjoJMGYgMjggMDUgYTMgMDEgMDAgMDAgCW1vdmFwcyAweDFh MyglcmlwKSwleG1tMCAgICAgICAgIyA0MDA2NzAgPF9JT19zdGRpbl91c2Vk KzB4MTA+CiAgNDAwNGNkOgkwZiAyOCBkYSAgICAgICAgICAgICAJbW92YXBz ICV4bW0yLCV4bW0zCiAgNDAwNGQwOgkwZiBjMiBkOSAwNCAgICAgICAgICAJ Y21wbmVxcHMgJXhtbTEsJXhtbTMKICA0MDA0ZDQ6CTBmIDUyIGNhICAgICAg ICAgICAgIAlyc3FydHBzICV4bW0yLCV4bW0xCiAgNDAwNGQ3OgkwZiA1NCBj YiAgICAgICAgICAgICAJYW5kcHMgICV4bW0zLCV4bW0xCiAgNDAwNGRhOgkw ZiA1OSBkMSAgICAgICAgICAgICAJbXVscHMgICV4bW0xLCV4bW0yCiAgNDAw NGRkOgkwZiA1OSBkMSAgICAgICAgICAgICAJbXVscHMgICV4bW0xLCV4bW0y CiAgNDAwNGUwOgkwZiA1OSAwZCA5OSAwMSAwMCAwMCAJbXVscHMgIDB4MTk5 KCVyaXApLCV4bW0xICAgICAgICAjIDQwMDY4MCA8X0lPX3N0ZGluX3VzZWQr MHgyMD4KICA0MDA0ZTc6CTBmIDVjIGMyICAgICAgICAgICAgIAlzdWJwcyAg JXhtbTIsJXhtbTAKICA0MDA0ZWE6CTBmIDU5IGMxICAgICAgICAgICAgIAlt dWxwcyAgJXhtbTEsJXhtbTAKICA0MDA0ZWQ6CWMzICAgICAgICAgICAgICAg ICAgIAlyZXRxICAgCiAgNDAwNGVlOgk2NiA5MCAgICAgICAgICAgICAgICAJ eGNoZyAgICVheCwlYXgKCjAwMDAwMDAwMDA0MDA0ZjAgPG5vcGUxPjoKICA0 MDA0ZjA6CTBmIDI4IGQwICAgICAgICAgICAgIAltb3ZhcHMgJXhtbTAsJXht bTIKICA0MDA0ZjM6CTBmIDU3IGM5ICAgICAgICAgICAgIAl4b3JwcyAgJXht bTEsJXhtbTEKICA0MDA0ZjY6CTBmIDI4IDA1IDczIDAxIDAwIDAwIAltb3Zh cHMgMHgxNzMoJXJpcCksJXhtbTAgICAgICAgICMgNDAwNjcwIDxfSU9fc3Rk aW5fdXNlZCsweDEwPgogIDQwMDRmZDoJMGYgMjggZGEgICAgICAgICAgICAg CW1vdmFwcyAleG1tMiwleG1tMwogIDQwMDUwMDoJMGYgYzIgZDkgMDQgICAg ICAgICAgCWNtcG5lcXBzICV4bW0xLCV4bW0zCiAgNDAwNTA0OgkwZiA1MiBj YSAgICAgICAgICAgICAJcnNxcnRwcyAleG1tMiwleG1tMQogIDQwMDUwNzoJ MGYgNTQgY2IgICAgICAgICAgICAgCWFuZHBzICAleG1tMywleG1tMQogIDQw MDUwYToJMGYgNTkgZDEgICAgICAgICAgICAgCW11bHBzICAleG1tMSwleG1t MgogIDQwMDUwZDoJMGYgNTkgY2EgICAgICAgICAgICAgCW11bHBzICAleG1t MiwleG1tMQogIDQwMDUxMDoJMGYgNTkgMTUgNjkgMDEgMDAgMDAgCW11bHBz ICAweDE2OSglcmlwKSwleG1tMiAgICAgICAgIyA0MDA2ODAgPF9JT19zdGRp bl91c2VkKzB4MjA+CiAgNDAwNTE3OgkwZiA1YyBjMSAgICAgICAgICAgICAJ c3VicHMgICV4bW0xLCV4bW0wCiAgNDAwNTFhOgkwZiA1OSBjMiAgICAgICAg 
ICAgICAJbXVscHMgICV4bW0yLCV4bW0wCiAgNDAwNTFkOgljMyAgICAgICAg ICAgICAgICAgICAJcmV0cSAgIAogIDQwMDUxZToJNjYgOTAgICAgICAgICAg ICAgICAgCXhjaGcgICAlYXgsJWF4CgowMDAwMDAwMDAwNDAwNTIwIDxmb28+ OgogIDQwMDUyMDoJMGYgMjggZDAgICAgICAgICAgICAgCW1vdmFwcyAleG1t MCwleG1tMgogIDQwMDUyMzoJMGYgNTcgYzkgICAgICAgICAgICAgCXhvcnBz ICAleG1tMSwleG1tMQogIDQwMDUyNjoJMGYgMjggMDUgNDMgMDEgMDAgMDAg CW1vdmFwcyAweDE0MyglcmlwKSwleG1tMCAgICAgICAgIyA0MDA2NzAgPF9J T19zdGRpbl91c2VkKzB4MTA+CiAgNDAwNTJkOgkwZiAyOCBkYSAgICAgICAg ICAgICAJbW92YXBzICV4bW0yLCV4bW0zCiAgNDAwNTMwOgkwZiBjMiBkOSAw NCAgICAgICAgICAJY21wbmVxcHMgJXhtbTEsJXhtbTMKICA0MDA1MzQ6CTBm IDUyIGNhICAgICAgICAgICAgIAlyc3FydHBzICV4bW0yLCV4bW0xCiAgNDAw NTM3OgkwZiA1NCBjYiAgICAgICAgICAgICAJYW5kcHMgICV4bW0zLCV4bW0x CiAgNDAwNTNhOgkwZiA1OSBkMSAgICAgICAgICAgICAJbXVscHMgICV4bW0x LCV4bW0yCiAgNDAwNTNkOgkwZiA1OSBjYSAgICAgICAgICAgICAJbXVscHMg ICV4bW0yLCV4bW0xCiAgNDAwNTQwOgkwZiA1OSAxNSAzOSAwMSAwMCAwMCAJ bXVscHMgIDB4MTM5KCVyaXApLCV4bW0yICAgICAgICAjIDQwMDY4MCA8X0lP X3N0ZGluX3VzZWQrMHgyMD4KICA0MDA1NDc6CTBmIDVjIGMxICAgICAgICAg ICAgIAlzdWJwcyAgJXhtbTEsJXhtbTAKICA0MDA1NGE6CTBmIDU5IGMyICAg ICAgICAgICAgIAltdWxwcyAgJXhtbTIsJXhtbTAKICA0MDA1NGQ6CWMzICAg ICAgICAgICAgICAgICAgIAlyZXRxICAgCiAgNDAwNTRlOgk5MCAgICAgICAg ICAgICAgICAgICAJbm9wICAgIAogIDQwMDU0ZjoJOTAgICAgICAgICAgICAg ICAgICAgCW5vcCAgICAK ------=_Part_5907_16148143.1198909940707--