From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 8105A3858C52; Sun, 12 Feb 2023 11:42:07 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8105A3858C52
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1676202127;
	bh=1j9hvOq+N1GWmJLdK1aDj86iieGf/wPWoD6vHjs7E0Q=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=JbaDDL1xgMSewK6LiAq9NbLldGB4rrMEfvbHKOMBs72Q+5Rj/F/KYudX3/lJRlZCx
	 KejQ2Yx57UsQC2ER43OFXoJ7wP1Z0Z9aLVYo7wVjlxW5SxNqeAJw4PTZb+tbQ4uOvX
	 +a9gKl0O9tBJwkRY+M+BETShRywZld6oEr00migQ=
From: "michael.crusoe at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/100927] [sse2] floating point to integer conversion
 functions incorrect results w/ NaN constants + optimization
Date: Sun, 12 Feb 2023 11:42:06 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 11.1.1
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: michael.crusoe at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-100927-4-ggsrcMb7QO@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-100927-4@http.gcc.gnu.org/bugzilla/>
References: <bug-100927-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100927

--- Comment #3 from Michael Crusoe <michael.crusoe at gmail dot com> ---
Good question, let's check the reference.

Summary: it is specified behavior that _mm_cvttpd_epi32 returns Integer
Indefinite (80000000H) for NaN inputs.

All references below are from the December 2022 edition (Order Number:
325462-078US) of "Intel® 64 and IA-32 Architectures Software Developer’s
Manual Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" from
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html

The formal signature of the _mm_cvttpd_epi32 intrinsic is in Table C-1 "Simple
Intrinsics" on page 2987, reminding us that the mnemonic is CVTTPD2DQ.

The formal definition of CVTTPD2DQ is given in section 5.6.1.6 "Intel® SSE2
Conversion Instructions" on page 133:

> Convert with truncation packed double precision floating-point values to
> packed doubleword integers.

On page 106 we learn more about what truncation means in the definition of
CVTTPD2DQ:

> 4.8.4.2 Truncation with Intel® SSE, SSE2, and AVX Conversion Instructions
> The following Intel SSE/SSE2 instructions automatically truncate the results of
> conversions from floating-point values to integers when the result is inexact: CVTTPD2DQ,
> CVTTPS2DQ, CVTTPD2PI, CVTTPS2PI, CVTTSD2SI, and CVTTSS2SI. Here, truncation means the
> round toward zero mode described in Table 4-8. There are also several Intel AVX2 and
> AVX-512 instructions which use truncation (VCVTT*).

Table 4-8 from section 4.8.4 states:

> Rounding Mode: Round toward zero (Truncate)
> Description: Rounded result is closest to but no greater in absolute value than the infinitely precise result.

Section 11.4.1.6 ("SSE2 Conversion Instructions") states that

> The CVTTPD2DQ (convert with truncation packed double precision floating-point values to
> packed doubleword integers) instruction is similar to the CVTPD2DQ instruction except
> that truncation is used to round a source value to an integer value.
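
To make the difference between the two instructions concrete, here is a
minimal sketch that runs the same inputs through both intrinsics. It assumes
an x86-64 target (or -msse2) and the default MXCSR rounding mode of round to
nearest; the helper name convert_pair is made up for this illustration.

```c
#include <emmintrin.h>
#include <stdint.h>

/* Hypothetical helper: convert two doubles with CVTTPD2DQ (truncate != 0)
   or CVTPD2DQ (truncate == 0) and store the two results in out[]. */
static void convert_pair(double lo, double hi, int truncate, int32_t out[2])
{
    /* volatile reads stop the compiler from folding the conversion at
       compile time, so the instruction itself is exercised */
    volatile double vlo = lo, vhi = hi;
    __m128d src = _mm_set_pd(vhi, vlo);   /* lane 0 = lo, lane 1 = hi */
    __m128i dst = truncate ? _mm_cvttpd_epi32(src) : _mm_cvtpd_epi32(src);

    int32_t lanes[4];                     /* upper two lanes are zeroed */
    _mm_storeu_si128((__m128i *)lanes, dst);
    out[0] = lanes[0];
    out[1] = lanes[1];
}
```

With the inputs 1.9 and -1.9, the truncating form should give 1 and -1, while
the non-truncating form gives 2 and -2 under round to nearest.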

Table 11-1. "Masked Responses of SSE/SSE2/SSE3 Instructions to Invalid
Arithmetic Operations" states that

> Condition: Conversion to integer when the value in the source register is a NaN, ∞, or
> exceeds the representable range for CVTPS2PI, CVTTPS2PI, CVTSS2SI, CVTTSS2SI, CVTPD2PI,
> CVTSD2SI, CVTPD2DQ, CVTTPD2PI, CVTTSD2SI, CVTTPD2DQ, CVTPS2DQ, or CVTTPS2DQ

> Masked Response: Return the integer Indefinite
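
The masked response can be observed at run time with a small sketch like the
one below. It assumes an x86-64 target (or -msse2) and that the invalid
operation exception is masked, which is the usual default; the helper name
cvtt_one is made up for this illustration. The input goes through a volatile
so that the compiler has to emit the instruction instead of evaluating the
conversion itself.

```c
#include <emmintrin.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: run CVTTPD2DQ on one double and return lane 0. */
static int32_t cvtt_one(double x)
{
    volatile double vx = x;               /* block compile-time folding */
    __m128d src = _mm_set_sd(vx);         /* lane 0 = x, lane 1 = 0.0 */
    __m128i dst = _mm_cvttpd_epi32(src);
    return _mm_cvtsi128_si32(dst);        /* extract the low 32 bits */
}
```

On hardware that follows the table above, cvtt_one(NAN), cvtt_one(INFINITY)
and cvtt_one(1e10) should all return INT32_MIN (80000000H), while an ordinary
value such as -3.7 truncates to -3.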

This is stated more explicitly in section D.4.2.2 "Results of Operations with NaN
Operands or a NaN Result for SSE/SSE2/SSE3 Numeric Instructions", where Table
D-8 (page 455) ("CVTPS2PI, CVTSS2SI, CVTTPS2PI, CVTTSS2SI, CVTPD2PI, CVTSD2SI,
CVTTPD2PI, CVTTSD2SI, CVTPS2DQ, CVTTPS2DQ, CVTPD2DQ, CVTTPD2DQ") states that
the masked result from any type of NaN (SNaN or QNaN) will be the Integer
Indefinite (80000000H for 32-bit values).
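
Putting the quoted rules together, the per-lane result could be modelled in
portable C roughly as follows. This is only my reading of the tables above,
not code from GCC or from the Intel manual, and the function name
cvttpd_lane_model is made up for this illustration. Any compile-time
evaluation of the intrinsic would presumably need to agree with a model like
this one.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical scalar model of one CVTTPD2DQ lane with the invalid
   operation exception masked. */
static int32_t cvttpd_lane_model(double x)
{
    /* NaN, infinities, and any value whose truncation does not fit in
       int32_t all produce the Integer Indefinite, 80000000H. */
    if (isnan(x) || x <= -2147483649.0 || x >= 2147483648.0)
        return INT32_MIN;

    return (int32_t)x;   /* in range: a C cast truncates toward zero */
}
```

Note that the range check is on the truncated value, so 2147483647.9 is still
in range and yields 2147483647.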