public inbox for gcc-bugs@sourceware.org
* [Bug fortran/99092] New: Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
@ 2021-02-13 23:51 jeff.science at gmail dot com
  2021-02-15  8:30 ` [Bug target/99092] " rguenth at gcc dot gnu.org
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: jeff.science at gmail dot com @ 2021-02-13 23:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

            Bug ID: 99092
           Summary: Using -O3 and -fprefetch-loop-arrays to compile BLAS
                    on Apple M1 fails
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jeff.science at gmail dot com
  Target Milestone: ---

Created attachment 50179
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50179&action=edit
source code that triggers bug

I am using GCC 10.2.1 installed via Homebrew on an Apple M1 system.

I attempted to compile one of the functions from the BLAS and it fails as
follows.  The failure is triggered by the combined use of -O3 and
-fprefetch-loop-arrays.  Reducing to -O2 or removing the latter eliminates the
issue, so there is a simple workaround.  Nonetheless, I encountered it the
first time I tried to build NWChem on this platform, so it will be seen by
others until I can fix the NWChem build system to avoid it.

I do not see this issue on x86, although I am not sure if I have tested GCC
10.2.1 specifically.

% wget https://netlib.sandia.gov/blas/ctrsm.f # also attached

% gfortran -O3 -fprefetch-loop-arrays -c ctrsm.f && echo OKAY

/var/folders/8n/llwp7zmd4jx697g8sw5w46p00000gn/T//ccj3jW77.s:362:23: error: index must be a multiple of 8 in range [0, 32760].
        prfm    PLDL1KEEP, [x0, -8]
                                ^
% gfortran -O2 -fprefetch-loop-arrays -c ctrsm.f && echo OKAY

OKAY

% gfortran -O3 -c ctrsm.f && echo OKAY

OKAY

% gfortran --version
GNU Fortran (Homebrew GCC 10.2.0_3) 10.2.1 20201220
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: rguenth at gcc dot gnu.org @ 2021-02-15  8:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|fortran                     |target
             Target|                            |aarch64
           Keywords|                            |wrong-code

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Sounds familiar; maybe you can try a more recent GCC 10 snapshot.  I guess this
is either a target bug in printing the asm or an assembler bug in rejecting the
constant.


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: marxin at gcc dot gnu.org @ 2021-02-15  8:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #2 from Martin Liška <marxin at gcc dot gnu.org> ---
The problem is very likely in the LLVM assembler; GAS works fine.
Please take a look here:
https://reviews.llvm.org/D40011

Can you please paste the output of the GCC invocation with the --verbose argument?


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: marxin at gcc dot gnu.org @ 2021-02-15  8:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #3 from Martin Liška <marxin at gcc dot gnu.org> ---
$ aarch64-suse-linux-as --version
GNU assembler (GNU Binutils; openSUSE Tumbleweed) 2.35.1.20201112-1

$ grep 'prfm.*-8' ctrsm.s && aarch64-suse-linux-as ctrsm.s  && echo OK
        prfm    PLDL1KEEP, [x0, -8]
OK


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: iains at gcc dot gnu.org @ 2021-02-15  9:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

Iain Sandoe <iains at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fxcoudert at gcc dot gnu.org,
                   |                            |iains at gcc dot gnu.org

--- Comment #4 from Iain Sandoe <iains at gcc dot gnu.org> ---
Please note:

The Apple M1 compiler is 'experimental' on master, and the back port to 10.2 is
'even more experimental' (and local to Homebrew).  The sources are not yet part
of GCC "upstream", so it is hard for folks here to fix.

The bug could well be genuine, but please report it either to Homebrew or at
https://github.com/iains/gcc-darwin-arm64/issues so that we can try to fix it
there (or propose a fix for it here if it's a generic issue).

Thanks.

<aside>
It's not practical (with the resources available) to do a GAS port for
aarch64/mach-o, so we will have to fix either the LLVM back end (and then wait
for that to be included in Xcode) or fix the asm emitted for the Darwin/Mach-O
back end.
</aside>


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: ktkachov at gcc dot gnu.org @ 2021-02-15  9:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktkachov at gcc dot gnu.org

--- Comment #5 from ktkachov at gcc dot gnu.org ---
I do think it's one of those LLVM assembler issues.
Maybe it's due to the fact that "prfm    PLDL1KEEP, [x0, -8]"
is just an alias for the architectural instruction:
prfum   pldl1keep, [x0, #-8]

Or it could be that the lack of '#' confuses the assembler.


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: iains at gcc dot gnu.org @ 2021-02-15  9:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #6 from Iain Sandoe <iains at gcc dot gnu.org> ---
(In reply to ktkachov from comment #5)
> I do think it's one of those LLVM assembler issues.
> Maybe it's due to the fact that "prfm    PLDL1KEEP, [x0, -8]"
> is just the alias to the:
> prfum   pldl1keep, [x0, #-8]
> 
> architectural instruction.
> Or it could be that the lack of '#' confuses the assembler

Likely the latter.  I have one fix for that already approved for master (but
not applied), but that only affected parenthesised expressions, e.g. #(a - b).


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: jeff.science at gmail dot com @ 2021-02-18  0:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #7 from Jeff Hammond <jeff.science at gmail dot com> ---
@Martin

% gfortran -O3 -fprefetch-loop-arrays --verbose -c ctrsm.f && echo OKAY

Using built-in specs.
COLLECT_GCC=gfortran
Target: aarch64-apple-darwin20
Configured with: ../configure --build=aarch64-apple-darwin20
--prefix=/opt/homebrew/Cellar/gcc/10.2.0_3
--libdir=/opt/homebrew/Cellar/gcc/10.2.0_3/lib/gcc/10 --disable-nls
--enable-checking=release --enable-languages=c,c++,objc,obj-c++,fortran
--program-suffix=-10 --with-gmp=/opt/homebrew/opt/gmp
--with-mpfr=/opt/homebrew/opt/mpfr --with-mpc=/opt/homebrew/opt/libmpc
--with-isl=/opt/homebrew/opt/isl --with-system-zlib --with-pkgversion='Homebrew
GCC 10.2.0_3' --with-bugurl=https://github.com/Homebrew/homebrew-core/issues
--disable-multilib --with-native-system-header-dir=/usr/include
--with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
SED=/usr/bin/sed
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.1 20201220 (Homebrew GCC 10.2.0_3) 
COLLECT_GCC_OPTIONS='-O3' '-fprefetch-loop-arrays' '-v' '-c'
'-mmacosx-version-min=11.2.0' '-asm_macosx_version_min=11.2' '-mlittle-endian'
'-mabi=lp64'

/opt/homebrew/Cellar/gcc/10.2.0_3/libexec/gcc/aarch64-apple-darwin20/10.2.1/f951
ctrsm.f -ffixed-form -fPIC -quiet -dumpbase ctrsm.f -mmacosx-version-min=11.2.0
-mlittle-endian -mabi=lp64 -auxbase ctrsm -O3 -version -fprefetch-loop-arrays
-fintrinsic-modules-path
/opt/homebrew/Cellar/gcc/10.2.0_3/lib/gcc/10/gcc/aarch64-apple-darwin20/10.2.1/finclude
-o /var/folders/8n/llwp7zmd4jx697g8sw5w46p00000gn/T//ccR79V1w.s
GNU Fortran (Homebrew GCC 10.2.0_3) version 10.2.1 20201220
(aarch64-apple-darwin20)
        compiled by GNU C version 10.2.1 20201220, GMP version 6.2.1, MPFR
version 4.1.0, MPC version 1.2.1, isl version isl-0.23-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU Fortran2008 (Homebrew GCC 10.2.0_3) version 10.2.1 20201220
(aarch64-apple-darwin20)
        compiled by GNU C version 10.2.1 20201220, GMP version 6.2.1, MPFR
version 4.1.0, MPC version 1.2.1, isl version isl-0.23-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
COLLECT_GCC_OPTIONS='-O3' '-fprefetch-loop-arrays' '-v' '-c'
'-mmacosx-version-min=11.2.0'  '-mlittle-endian' '-mabi=lp64'
 as -arch arm64 -v -mmacosx-version-min=11.2 -o ctrsm.o
/var/folders/8n/llwp7zmd4jx697g8sw5w46p00000gn/T//ccR79V1w.s
Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: aarch64-apple-darwin20.3.0
Thread model: posix
InstalledDir:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

"/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang"
-cc1as -triple arm64-apple-macosx11.2.0 -filetype obj -main-file-name
ccR79V1w.s -target-cpu vortex -target-feature +v8.3a -target-feature +fp-armv8
-target-feature +neon -target-feature +crc -target-feature +crypto
-target-feature +fullfp16 -target-feature +ras -target-feature +lse
-target-feature +rdm -target-feature +rcpc -target-feature +zcm -target-feature
+zcz -target-feature +sha2 -target-feature +aes -fdebug-compilation-dir /tmp
-dwarf-debug-producer "Apple clang version 12.0.0 (clang-1200.0.32.29)"
-dwarf-version=4 -mrelocation-model pic -o ctrsm.o
/var/folders/8n/llwp7zmd4jx697g8sw5w46p00000gn/T//ccR79V1w.s
/var/folders/8n/llwp7zmd4jx697g8sw5w46p00000gn/T//ccR79V1w.s:362:23: error: index must be a multiple of 8 in range [0, 32760].
        prfm    PLDL1KEEP, [x0, -8]
                                ^


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: iains at gcc dot gnu.org @ 2021-02-18  1:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #8 from Iain Sandoe <iains at gcc dot gnu.org> ---
It seems that GAS is accepting an encoding that is not specified, at least in
version DDI0487Fc_armv8_arm.

That document says that
C6.2.212 PRFM (immediate) takes

"<pimm> Is the optional positive immediate byte offset, a multiple of 8 in the
range 0 to 32760, defaulting to 0 and encoded in the "imm12" field as
<pimm>/8."

and

C6.2.215 PRFUM takes

"<simm> Is the optional signed immediate byte offset, in the range -256 to 255,
defaulting to 0 and encoded in the "imm9" field."

=======

So the bug is probably present for all targets, not just Darwin; it just
happens to show there.  FWIW, the encoding is shown thus:

PRFM (<prfop>|#<imm5>), [<Xn|SP>{, #<pimm>}]

So LLVM might well also reject it without the '#' (I have encountered at least
one case before where that happened).
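
As a rough illustration of the two immediate ranges quoted in this comment, the
Python sketch below computes the field value each form would hold for a given
byte offset.  The function names prfm_imm12 and prfum_imm9 are made up for this
example.  Only the ranges and the <pimm>/8 scaling come from the quoted text;
packing the signed offset into imm9 as two's complement is an assumption based
on how signed immediates are normally stored.

```python
from typing import Optional


def prfm_imm12(offset: int) -> Optional[int]:
    """imm12 field for PRFM (immediate), or None if the offset does not fit.

    Quoted rule: a multiple of 8 in the range 0 to 32760, encoded as <pimm>/8.
    """
    if 0 <= offset <= 32760 and offset % 8 == 0:
        return offset // 8
    return None


def prfum_imm9(offset: int) -> Optional[int]:
    """imm9 field for PRFUM, or None if the offset does not fit.

    Quoted rule: a signed byte offset in the range -256 to 255.
    Assumption: the field stores the low 9 bits (two's complement).
    """
    if -256 <= offset <= 255:
        return offset & 0x1FF
    return None


if __name__ == "__main__":
    # Offsets taken from the examples discussed in this thread.
    for off in (-8, 8, 200, 4096):
        print(off, prfm_imm12(off), prfum_imm9(off))
```

Under these assumptions the offset -8 from the failing instruction has no
valid imm12 value but does have an imm9 value, which matches the error message
in the original report.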


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: pinskia at gcc dot gnu.org @ 2021-02-18  1:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
hmmm, see https://gcc.gnu.org/legacy-ml/gcc-patches/2014-07/msg00612.html :
"When it comes to emitting the pattern, always use "prfm" -- the prfum
form can be generated from the prfm mnemonic when the offset implies
this is necessary."

From reading the ARM ARM, it does look like the prfm mnemonic should accept the
unscaled 9-bit signed value, just as ldr does in place of ldur.
So I think the bug is in the LLVM assembler.


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: pinskia at gcc dot gnu.org @ 2021-02-18  1:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
From the ARM ARM:
An assembler program translating a Load/Store instruction, for example LDR, is
required to encode an unambiguous offset using the unscaled 9-bit offset
form, and to encode an ambiguous offset using the scaled 12-bit offset form. A
programmer might force the generation of the unscaled 9-bit form by using one
of the mnemonics in Table C3-17. Arm recommends that a disassembler outputs all
unscaled 9-bit offset forms using one of these mnemonics, but unambiguous
offsets can be output using a Load/Store single register mnemonic, for example,
LDR.
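
One possible reading of the quoted rule, applied to the prefetch case, can be
sketched in Python as follows.  The function name select_form and the returned
labels are hypothetical, and the sketch assumes that "ambiguous" means the
offset fits both forms while "unambiguous" means it fits only the unscaled
9-bit form.  The scale of 8 and the limits are the PRFM/PRFUM ranges quoted
earlier in this thread.

```python
def select_form(offset: int, scale: int = 8) -> str:
    """Pick an encoding form for a byte offset (illustrative only).

    scaled-imm12:  non-negative multiple of `scale`, with offset/scale <= 4095
    unscaled-imm9: any offset in the range -256..255
    """
    fits_scaled = offset >= 0 and offset % scale == 0 and offset // scale <= 4095
    fits_unscaled = -256 <= offset <= 255

    if fits_scaled:
        # Covers offsets that fit only the scaled form and, per the assumed
        # reading of the rule, ambiguous offsets that fit both forms.
        return "scaled-imm12"
    if fits_unscaled:
        # Fits only the unscaled form, e.g. negative or unaligned offsets.
        return "unscaled-imm9"
    raise ValueError(f"offset {offset} fits neither form")


if __name__ == "__main__":
    for off in (-8, 3, 8, 200, 4096):
        print(off, select_form(off))
```

With this reading, an assembler given "prfm" and an offset of -8 would fall
through to the unscaled form rather than report an error.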


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: iains at gcc dot gnu.org @ 2021-02-20  9:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #11 from Iain Sandoe <iains at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #10)
> From the ARM ARM:
> An assembler program translating a Load/Store instruction, for example LDR,
> is required to encode an unambiguous offset using the unscaled 9-bit offset
> form, and to encode an ambiguous offset using the scaled 12-bit offset form.
> A programmer might force the generation of the unscaled 9-bit form by using
> one of the mnemonics in Table C3-17. Arm recommends that a disassembler
> outputs all unscaled 9-bit offset forms using one of these mnemonics, but
> unambiguous offsets can be output using a Load/Store single register
> mnemonic, for example, LDR.

It would be nice if that applied to a 'generic' version of the insn (one might
read the advice that way):

prf     PLDL1KEEP, [x0, 200]  ===> assembler chooses prfm/prfum as it likes

prfm  PLDL1KEEP, [x0, 200] --> use the insn I wrote! 
prfm  PLDL1KEEP, [x0, -8] --> .. or error if I'm dumb

prfum PLDL1KEEP, [x0, 200] --> use the insn I wrote! 
prfum PLDL1KEEP, [x0, 4096] --> .. or error if I'm dumb

.... but I guess we have to live with the status quo.
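
The difference between the two behaviours discussed here can be modelled with a
small Python sketch.  This is a toy model, not the implementation of GAS or of
the LLVM assembler: the function name assemble and the relax flag are invented
for illustration.  relax=True stands for an assembler that silently switches a
"prfm" to the unscaled form when needed, and relax=False stands for one that
takes the written mnemonic literally.

```python
def assemble(mnemonic: str, offset: int, relax: bool) -> str:
    """Return the mnemonic a toy assembler would encode, or raise ValueError."""
    fits_prfm = 0 <= offset <= 32760 and offset % 8 == 0
    fits_prfum = -256 <= offset <= 255

    if mnemonic == "prfm":
        if fits_prfm:
            return "prfm"
        if relax and fits_prfum:
            # Relaxed model: fall back to the unscaled form.
            return "prfum"
        raise ValueError("index must be a multiple of 8 in range [0, 32760]")
    if mnemonic == "prfum":
        if fits_prfum:
            return "prfum"
        raise ValueError("index must be in range [-256, 255]")
    raise ValueError(f"unknown mnemonic {mnemonic!r}")


if __name__ == "__main__":
    print(assemble("prfm", -8, relax=True))
    try:
        assemble("prfm", -8, relax=False)
    except ValueError as err:
        print("strict model rejects it:", err)
```

In this model the strict variant rejects "prfm" with an offset of -8, as in
the original report, while the relaxed variant accepts it.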


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: iains at gcc dot gnu.org @ 2021-02-20 19:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

Iain Sandoe <iains at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |SUSPENDED
   Last reconfirmed|                            |2021-02-20

--- Comment #12 from Iain Sandoe <iains at gcc dot gnu.org> ---
I added an issue on the experimental branch:

https://github.com/iains/gcc-darwin-arm64/issues/43

I also produced two patches to work around the issue (although the first should
tighten up the constraint on prf*m for all targets).

--

The first patch is a conservative fix: it just prevents the generation of prfm
insns when the offset is out of range (and when it would require prfum for
Darwin).

https://github.com/iains/gcc-darwin-arm64/commit/2fbd9a7f9cddc7e243c0025713841e0bc1465c41

The second patch adds predicate, constraint and patterns for the prfum insn,
which means that Darwin now generates:

prfum [X0, -8]

which is accepted by the LLVM backend.

https://github.com/iains/gcc-darwin-arm64/commit/881a59f2258a5a7a9c2c862420c4e93e9df17f2c

====

Given some more time, I expect that the two could be combined in some way; at
least unless/until LLVM gets a fix and that percolates through to Xcode.

So the bug is "fixed on the experimental branch".

Given that it cannot be fixed on GCC 'upstream' until we have a chance to
submit the port (which isn't ready yet!), I suggest that "SUSPENDED" is a
reasonable state for this bug.


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: pinskia at gcc dot gnu.org @ 2021-11-09  0:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #13 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Did the LLVM assembler get fixed?


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: iains at gcc dot gnu.org @ 2021-11-09  0:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #14 from Iain Sandoe <iains at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #13)
> Did the LLVM assembler get fixed?

Not as of Xcode 13.0 (I don't know if anyone filed a radar, though).  Since the
problem was fixed on the branch, I guess no one was motivated.


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: pinskia at gcc dot gnu.org @ 2024-02-28  5:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

--- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Iain Sandoe from comment #14)
> (In reply to Andrew Pinski from comment #13)
> > Did the LLVM assembler get fixed?
> 
> not as of xcode 13.0 (I don't know if anyone filed a radar tho) - since the
> problem was fixed on the branch, I guess no-one was motivated.

It is still a bug in upstream LLVM too; I just checked.  I will file a bug
there soon.


* [Bug target/99092] Using -O3 and -fprefetch-loop-arrays to compile BLAS on Apple M1 fails
From: pinskia at gcc dot gnu.org @ 2024-02-28  6:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99092

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://github.com/llvm/llv
                   |                            |m-project/issues/83226

--- Comment #16 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Filed it as https://github.com/llvm/llvm-project/issues/83226 .
