From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <896EB515110646CEBAA84E98E273E4B8@H270>
From: "Stefan Kanthak"
To: "Jonathan Wakely"
Cc: , "Andrew Pinski"
References: <51071A92918346ABBC6B5703179F5174@H270>
In-Reply-To:
Subject: Re: Will GCC eventually support SSE2 or SSE4.1?
Date: Fri, 26 May 2023 09:58:43 +0200
Organization: Me, myself & IT
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

"Jonathan Wakely" wrote:

> On Fri, 26 May 2023, 08:01 Andrew Pinski via Gcc, wrote:
>
>> On Thu, May 25, 2023 at 11:56 PM Stefan Kanthak
>> wrote:
>>>
>>> Hi,
>>>
>>> compile the following function on a system with Core2 processor
>>> (released January 2008) for the 32-bit execution environment:
>>>
>>> --- demo.c ---
>>> int ispowerof2(unsigned long long argument)
>>> {
>>>     return (argument & argument - 1) == 0;
>>> }
>>> --- EOF ---
>>>
>>> GCC 13.3: gcc -m32 -O3 demo.c
>>>
>>> NOTE: -mtune=native is the default!
>>
>> You need to use -march=native and not -mtune=native .... to turn on
>> the architecture features.

(Un)fortunately this changes nothing! STOP: that's wrong, it makes it
even WORSE!

# Compilation provided by Compiler Explorer at https://godbolt.org/
ispowerof2(unsigned long long):
        vmovq   xmm1, QWORD PTR [esp+4]
        vpcmpeqd        xmm0, xmm0, xmm0
        xor     eax, eax
        vpaddq  xmm0, xmm1, xmm0
        vpand   xmm0, xmm0, xmm1
        vpunpcklqdq     xmm0, xmm0, xmm0
        vptest  xmm0, xmm0
        sete    al
        ret

That's what I call a REALLY EPIC FAILURE!
Compare this inefficient BLOAT to the SSE4.1 code from my original post!

> Yes this is just user error. You didn't use the right options to say you
> want SSE2.

ARGH: please read CAREFULLY what I wrote!

1) I didn't tell GCC to use SSE at all (I DON'T want any compiler to use
   SSE per default, especially when the generated code is SLOWER and
   BIGGER than conventional code using the general purpose registers)!

2) GCC uses SSE2 on its own, but doesn't support it well: it FAILS to use
   PMOVMSKB here, despite -O3!

3) -march=core2 doesn't help either, GCC fails to use SSE4.1 at all!

> GCC supports it fine already.

DREAM ON!
Again: view the 2 counter examples from my original post CAREFULLY!

not amused
Stefan
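
For reference, a minimal C sketch of what "conventional code using the general purpose registers" has to compute on a 32-bit target. Only ispowerof2() comes from the quoted demo.c; ispowerof2_halves() and check_examples() are hypothetical helpers added for illustration, and nothing here claims to match the code GCC or any other compiler actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the quoted demo.c, with explicit parentheses.
   Note that it also returns 1 for 0, because 0 & (0 - 1) == 0. */
int ispowerof2(unsigned long long argument)
{
    return (argument & (argument - 1)) == 0;
}

/* Hypothetical helper (not from the thread): the same test written on
   two 32-bit halves, i.e. the arithmetic a 32-bit target performs when
   it stays in general purpose registers. */
int ispowerof2_halves(unsigned long long argument)
{
    uint32_t lo  = (uint32_t)argument;
    uint32_t hi  = (uint32_t)(argument >> 32);
    uint32_t lo1 = lo - 1u;            /* low half of argument - 1  */
    uint32_t hi1 = hi - (lo == 0u);    /* borrow into the high half */

    /* The 64-bit AND is zero exactly when both 32-bit ANDs are zero. */
    return ((lo & lo1) | (hi & hi1)) == 0u;
}

/* Hypothetical self-check: both variants agree on a few edge cases. */
void check_examples(void)
{
    const unsigned long long samples[] = {
        0ULL, 1ULL, 2ULL, 3ULL, 1ULL << 32, (1ULL << 32) | 1ULL,
        1ULL << 63, ~0ULL
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(ispowerof2(samples[i]) == ispowerof2_halves(samples[i]));
}
```

Calling check_examples() from a small main() and compiling with gcc -m32 -O3 -S makes it possible to compare the output for both variants on a given GCC version.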