From mboxrd@z Thu Jan  1 00:00:00 1970
From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/102989] Implement C2x's n2763 (_BitInt)
Date: Mon, 26 Jun 2023 18:48:34 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c
X-Bugzilla-Version: 12.0
X-Bugzilla-Severity: enhancement
X-Bugzilla-Status: NEW
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #70 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
For right shifts, I wonder if we shouldn't emit inline (perhaps with the
exception of -Os) something like:

__attribute__((noipa)) void
ashiftrt575 (unsigned long *p, unsigned long *q, int n)
{
  int prec = 575;
  int n1 = n & 63;
  int n2 = n / 64;
  int n3 = n1 != 0;
  int n4 = (-n1) & 63;
  unsigned long ext;
  int i;
  for (i = n2; i < prec / 64 - n3; ++i)
    p[i - n2] = (q[i] >> n1) | (q[i + n3] << n4);
  ext = ((signed long) (q[prec / 64] << (64 - (prec & 63)))) >> (64 - (prec & 63));
  if (n1 && i == prec / 64 - n3)
    {
      p[i - n2] = (q[i] >> n1) | (ext << n4);
      ++i;
    }
  i -= n2;
  p[i] = ((signed long) ext) >> n1;
  ext = ((signed long) ext) >> 63;
  for (++i; i < prec / 64 + 1; ++i)
    p[i] = ext;
}

__attribute__((noipa)) void
lshiftrt575 (unsigned long *p, unsigned long *q, int n)
{
  int prec = 575;
  int n1 = n & 63;
  int n2 = n / 64;
  int n3 = n1 != 0;
  int n4 = (-n1) & 63;
  unsigned long ext;
  int i;
  for (i = n2; i < prec / 64 - n3; ++i)
    p[i - n2] = (q[i] >> n1) | (q[i + n3] << n4);
  ext = q[prec / 64] & ((1UL << (prec % 64)) - 1);
  if (n1 && i == prec / 64 - n3)
    {
      p[i - n2] = (q[i] >> n1) | (ext << n4);
      ++i;
    }
  i -= n2;
  p[i] = ext >> n1;
  ext = 0;
  for (++i; i < prec / 64 + 1; ++i)
    p[i] = 0;
}

(for _BitInt(575) and a 64-bit limb little-endian target).
If the shift count is constant, it will allow further optimizations, and
if e.g. get_nonzero_bits tells us that n is variable but a multiple of the
limb precision, we can optimize some more as well.
Looking at what LLVM does, they seem to sign extend in memory to twice as
many bits and then just use an unrolled loop without any conditionals, but
that doesn't look good for memory usage etc.