From: "clyon at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/110381] [11/12/13 Regression] double counting for sum of structs of floating point types
Date: Fri, 30 Jun 2023 14:05:20 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.1.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 11.5
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110381

--- Comment #15 from Christophe Lyon ---
(In reply to Richard Biener from comment #14)
> (In reply to Christophe Lyon from comment #12)
> > The new testcase (gcc.dg/vect/pr110381.c) fails:
> > FAIL: gcc.dg/vect/pr110381.c -flto -ffat-lto-objects execution test
> > FAIL: gcc.dg/vect/pr110381.c execution test
> >
> > on arm-linux-gnueabihf configured with --with-float=hard
> > --with-fpu=neon-fp-armv8 --with-mode=thumb --with-arch=armv8-a
>
> Can you check if it works now? I've added a missing check_vect () call in
> case the harness passes in command-line options that your HW doesn't
> support. Otherwise I'd appreciate command-line options to reproduce.

It still fails (check_vect() passes on my config, so there's no change).

Here is what sum_8_foos looks like:
sum_8_foos:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vmov.i64        d0, #0  @ float
        add     r3, r0, #192
.L10:
        vldr.64 d16, [r0, #8]
        adds    r0, r0, #24
        vldr.64 d18, [r0, #-24]
        vldr.64 d17, [r0, #-8]
        cmp     r3, r0
        vadd.f64        d16, d16, d18
        vadd.f64        d16, d16, d17
        vadd.f64        d0, d0, d16
        bne     .L10
        bx      lr

so we load:
d16=5
d17=-__DBL_MAX__
d18=__DBL_MAX__

The first addition (d16 + d18) makes d16=__DBL_MAX__,
and the second one (d16 + d17) makes d16=0, so the 5 is lost.

> I cannot get anything to vectorize with a cc1 cross using
>
> > ./cc1 -quiet t.c -O2 -ftree-vectorize -fno-vect-cost-model -fopt-info-vec -I include tri
>
> but I have a cross configured with --with-float=hard --with-cpu=cortex-a9
> --with-fpu=neon-fp16

Not sure what happens. I tried my native compiler with the above flags, and I
get the same code.
I also tried building my native compiler with the same configure flags: same
code too.

>
> I hope the FPU is compliant enough to compute __DBL_MAX__ + -__DBL_MAX__ +
> 5. to 5.
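
For reference, the arithmetic behind the two results can be sketched in plain C. This is only an illustration, assuming IEEE 754 binary64 doubles and the default round-to-nearest mode. The helper names are made up here and are not taken from pr110381.c; the parameter names simply mirror the registers in the listing above.

```c
#include <float.h>

/* Hypothetical helper: adds in the order the quoted assembly uses,
   (d16 + d18) + d17.  The volatile temporary keeps each step a real
   double-precision addition instead of a folded constant.  */
double sum_as_in_asm(double d16, double d17, double d18)
{
    volatile double t = d16 + d18;  /* 5 + DBL_MAX rounds back to DBL_MAX */
    t = t + d17;                    /* DBL_MAX + -DBL_MAX is exactly 0 */
    return t;
}

/* Hypothetical helper: adds in the order of the quoted expression,
   (__DBL_MAX__ + -__DBL_MAX__) + 5.  */
double sum_cancel_first(double d16, double d17, double d18)
{
    volatile double t = d18 + d17;  /* exact cancellation gives 0 */
    t = t + d16;                    /* 0 + 5 is 5 */
    return t;
}
```

Called with d16=5, d17=-__DBL_MAX__ and d18=__DBL_MAX__, sum_as_in_asm returns 0 and sum_cancel_first returns 5. Both are correctly rounded results, which suggests the difference comes from the order of the additions in the generated code and not from a non-compliant FPU.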