From: Keith Packard
To: newlib@sourceware.org
Subject: [PATCH 0/3] ARM with only 32-bit floats do not have fast 64-bit FMA
Date: Sat, 8 Aug 2020 15:34:10 -0700
Message-Id: <20200808223413.4015633-1-keithp@keithp.com>

I added some new test configurations to my CI system for picolibc and discovered that when the new math code was built for 32-bit ARM processors with only single-precision floating-point hardware, several math functions were returning imprecise results.
I got the expected results on processors with no FPU and on processors with both 32- and 64-bit FPUs. The affected functions were using the 'fma' function on this hardware even though, lacking 64-bit hardware support, that function was being emulated without the required precision. This all boiled down to math_config.h incorrectly detecting 64-bit FMA support on ARM processors.

This patch series contains three changes:

 1. Fix the fast FMA detection so that 32-bit ARM processors without 64-bit FMA support don't use 'fma' for the new math functions.

 2. Add detection of fast FMAF, which 32-bit ARM processors with only 32-bit FPUs *do* support.

 3. Add ARM versions of fma and fmaf, which are used when those instructions are available.
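
As a rough illustration of the distinction drawn in changes 1 and 2, here is a minimal sketch of how separate double and float FMA detection could look. It is not the code from the patches. The SKETCH_* macros and sketch_* functions are hypothetical names invented for this example. The sketch relies on the ACLE predefined macros __ARM_FEATURE_FMA and __ARM_FP, where bit 0x4 of __ARM_FP indicates single-precision and bit 0x8 double-precision hardware support. The plain multiply-add fallback only stands in for whatever non-FMA path the math code actually takes.

```c
#include <math.h>

/* Hypothetical macro names, used only in this sketch. */
#if defined(__aarch64__)
/* AArch64 has both single- and double-precision FMA instructions. */
# define SKETCH_HAVE_FAST_FMA  1
# define SKETCH_HAVE_FAST_FMAF 1
#elif defined(__ARM_FEATURE_FMA) && defined(__ARM_FP)
/* 32-bit ARM: FMA exists, but only for the precisions the FPU has.
   __ARM_FP bit 0x8 = double precision, bit 0x4 = single precision. */
# define SKETCH_HAVE_FAST_FMA  ((__ARM_FP & 0x8) != 0)
# define SKETCH_HAVE_FAST_FMAF ((__ARM_FP & 0x4) != 0)
#else
/* Other targets are outside the scope of this sketch. */
# define SKETCH_HAVE_FAST_FMA  0
# define SKETCH_HAVE_FAST_FMAF 0
#endif

/* Double-precision x*y + z: call fma() only when it is a hardware
   instruction, never when it would be emulated in software. */
static inline double sketch_muladd(double x, double y, double z)
{
#if SKETCH_HAVE_FAST_FMA
    return fma(x, y, z);
#else
    return x * y + z;   /* placeholder for the non-FMA code path */
#endif
}

/* Single-precision counterpart, gated on the float-only check. */
static inline float sketch_muladdf(float x, float y, float z)
{
#if SKETCH_HAVE_FAST_FMAF
    return fmaf(x, y, z);
#else
    return x * y + z;   /* placeholder for the non-FMA code path */
#endif
}
```

On a single-precision-only FPU with FMA support, this sketch would select fmaf() in sketch_muladdf() while keeping sketch_muladd() off the emulated fma() path, which is the behavior the series is aiming for.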