public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
@ 2023-11-06 21:00 sjames at gcc dot gnu.org
2023-11-06 21:00 ` [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA " sjames at gcc dot gnu.org
` (55 more replies)
0 siblings, 56 replies; 57+ messages in thread
From: sjames at gcc dot gnu.org @ 2023-11-06 21:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
Bug ID: 112415
Summary: [14 regression] Python 3.11 miscompiled with new RTL
fold mem offset pass, since r14-4664-g04c9cf5c786b94
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: sjames at gcc dot gnu.org
CC: danglin at gcc dot gnu.org, manolis.tsamis at vrull dot eu
Target Milestone: ---
I've bisected this twice and come to r14-4664-g04c9cf5c786b94 ('Implement new
RTL optimizations pass: fold-mem-offsets'). -fno-fold-mem-offsets makes things
work.
Python 3.11.6 has failed to build on HPPA since that commit: the freshly built
bootstrap interpreter (_bootstrap_python) segfaults during the build.
```
hppa2.0-unknown-linux-gnu-gcc -c -Wsign-compare -DNDEBUG -O2 -pipe
-march=2.0 -fdiagnostics-color=always -frecord-gcc-switches -ggdb3 -fwrapv
-std=c11 -Wextra -Wno-unused-parameter -Wno-missing-field-initializers
-Wstrict-prototypes -Werror=implicit-function-declaration
-fvisibility=hidden -I./Include/internal -I. -I./Include
-I/usr/include/ncursesw -fPIC -DPy_BUILD_CORE -o Python/frozen.o
Python/frozen.c
./_bootstrap_python ./Tools/scripts/deepfreeze.py \
Python/frozen_modules/importlib._bootstrap.h:importlib._bootstrap \
Python/frozen_modules/importlib._bootstrap_external.h:importlib._bootstrap_external \
Python/frozen_modules/zipimport.h:zipimport \
Python/frozen_modules/abc.h:abc \
Python/frozen_modules/codecs.h:codecs \
Python/frozen_modules/io.h:io \
Python/frozen_modules/_collections_abc.h:_collections_abc \
Python/frozen_modules/_sitebuiltins.h:_sitebuiltins \
Python/frozen_modules/genericpath.h:genericpath \
Python/frozen_modules/ntpath.h:ntpath \
Python/frozen_modules/posixpath.h:posixpath \
Python/frozen_modules/os.h:os \
Python/frozen_modules/site.h:site \
Python/frozen_modules/stat.h:stat \
Python/frozen_modules/importlib.util.h:importlib.util \
Python/frozen_modules/importlib.machinery.h:importlib.machinery \
Python/frozen_modules/runpy.h:runpy \
Python/frozen_modules/__hello__.h:__hello__ \
Python/frozen_modules/__phello__.h:__phello__ \
Python/frozen_modules/__phello__.ham.h:__phello__.ham \
Python/frozen_modules/__phello__.ham.eggs.h:__phello__.ham.eggs \
Python/frozen_modules/__phello__.spam.h:__phello__.spam \
Python/frozen_modules/frozen_only.h:frozen_only \
-o Python/deepfreeze/deepfreeze.c
make: *** [Makefile:1298: Python/deepfreeze/deepfreeze.c] Segmentation fault (core dumped)
make: *** Waiting for unfinished jobs....
hppa2.0-unknown-linux-gnu-gcc -c -I./Modules/_decimal/libmpdec -DCONFIG_32=1
-DANSI=1 -Wsign-compare -DNDEBUG -O2 -pipe -march=2.0
-fdiagnostics-color=always -frecord-gcc-switches -ggdb3 -fwrapv -std=c11
-Wextra -Wno-unused-parameter -Wno-missing-field-initializers
-Wstrict-prototypes -Werror=implicit-function-declaration -fvisibility=hidden
-I./Include/internal -I. -I./Include -I/usr/include/ncursesw -fPIC -fPIC -o
Modules/_decimal/libmpdec/mpdecimal.o ./Modules/_decimal/libmpdec/mpdecimal.c
* ERROR: dev-lang/python-3.11.6::gentoo failed (compile phase):
* emake failed
```
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
From: sjames at gcc dot gnu.org @ 2023-11-06 21:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
Sam James <sjames at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[14 regression] Python 3.11 |[14 regression] Python 3.11
|miscompiled with new RTL |miscompiled on HPPA with
|fold mem offset pass, since |new RTL fold mem offset
|r14-4664-g04c9cf5c786b94 |pass, since
| |r14-4664-g04c9cf5c786b94
--- Comment #1 from Sam James <sjames at gcc dot gnu.org> ---
Backtrace from the crashing Python:
```
(gdb) r
Starting program:
/var/tmp/portage/dev-lang/python-3.11.6/work/Python-3.11.6/_bootstrap_python
./Tools/scripts/deepfreeze.py
Python/frozen_modules/importlib._bootstrap.h:importlib._bootstrap
Python/frozen_modules/importlib._bootstrap_external.h:importlib._bootstrap_external
Python/frozen_modules/zipimport.h:zipimport Python/frozen_modules/abc.h:abc
Python/frozen_modules/codecs.h:codecs Python/frozen_modules/io.h:io
Python/frozen_modules/_collections_abc.h:_collections_abc
Python/frozen_modules/_sitebuiltins.h:_sitebuiltins
Python/frozen_modules/genericpath.h:genericpath
Python/frozen_modules/ntpath.h:ntpath
Python/frozen_modules/posixpath.h:posixpath Python/frozen_modules/os.h:os
Python/frozen_modules/site.h:site Python/frozen_modules/stat.h:stat
Python/frozen_modules/importlib.util.h:importlib.util
Python/frozen_modules/importlib.machinery.h:importlib.machinery
Python/frozen_modules/runpy.h:runpy Python/frozen_modules/__hello__.h:__hello__
Python/frozen_modules/__phello__.h:__phello__
Python/frozen_modules/__phello__.ham.h:__phello__.ham
Python/frozen_modules/__phello__.ham.eggs.h:__phello__.ham.eggs
Python/frozen_modules/__phello__.spam.h:__phello__.spam
Python/frozen_modules/frozen_only.h:frozen_only -o
Python/deepfreeze/deepfreeze.c
warning: File "/usr/lib/libthread_db.so.1" auto-loading has been declined by
your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /usr/lib/libthread_db.so.1
line to your configuration file "/root/.config/gdb/gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/root/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
warning: Unable to find libthread_db matching inferior's thread library, thread
debugging will not be available.
Program received signal SIGSEGV, Segmentation fault.
0x412083fc in _PyST_GetSymbol (name=0xf9a33a60, ste=<optimized out>) at
Python/symtable.c:396
396 PyObject *v = PyDict_GetItemWithError(ste->ste_symbols, name);
(gdb) bt
#0 0x412083fc in _PyST_GetSymbol (name=0xf9a33a60, ste=<optimized out>) at
Python/symtable.c:396
#1 _PyST_GetScope (ste=<optimized out>, name=0xf9a33a60) at
Python/symtable.c:406
#2 0x411bb8f8 in compiler_nameop (c=0xf7b03b88, name=<optimized out>,
ctx=Load) at Python/compile.c:4274
#3 0x411be074 in compiler_visit_expr (c=0x1, e=<optimized out>) at
Python/compile.c:5969
#4 0x411bcc88 in compiler_visit_expr1 (c=0xf7b03b88, e=0x1) at
Python/compile.c:5915
#5 0x411be074 in compiler_visit_expr (c=0x1, e=<optimized out>) at
Python/compile.c:5969
#6 0x411bceac in compiler_call (e=0x1, c=0xf7b03b88) at Python/compile.c:4952
#7 compiler_visit_expr1 (c=0xf7b03b88, e=0x1) at Python/compile.c:5905
#8 0x411c1f34 in compiler_visit_expr (e=<optimized out>, c=0xf9a33a60) at
Python/compile.c:5969
#9 compiler_decorators (decos=0x8d, c=0xf9a33a60) at Python/compile.c:2327
#10 compiler_class (c=0xf9a33a60, s=0x414e4490) at Python/compile.c:2702
#11 0x411c566c in compiler_body (c=0xf7b03b88, stmts=0xf9a33a60) at
Python/compile.c:2180
#12 0x411c7e98 in compiler_mod (mod=0xf7b03b88, c=0x0) at Python/compile.c:2197
#13 _PyAST_Compile (mod=0xf7b03b88, filename=0x8d, flags=<optimized out>,
optimize=<optimized out>, arena=<optimized out>) at Python/compile.c:581
#14 0x411fe7b8 in Py_CompileStringObject (str=0xf7b03b88
"\371\240\277\220\371\236\353`\371\257\221\260\367\260:t", filename=0x8d,
start=-139445336, flags=0xf9a33a60, optimize=<optimized out>)
at Python/pythonrun.c:1799
#15 0x4119c334 in builtin_compile_impl (module=<optimized out>,
feature_version=<optimized out>, optimize=<optimized out>,
dont_inherit=<optimized out>, flags=<optimized out>, mode=<optimized out>,
filename=0xf998db68, source=0x8d) at Python/bltinmodule.c:831
#16 builtin_compile (module=<optimized out>, args=<optimized out>,
nargs=<optimized out>, kwnames=<optimized out>) at
Python/clinic/bltinmodule.c.h:328
#17 0x410f3ae4 in cfunction_vectorcall_FASTCALL_KEYWORDS (func=0xf9a33a60,
args=0x8d, nargsf=<optimized out>, kwnames=<optimized out>) at
./Include/cpython/methodobject.h:52
#18 0x4109fa88 in _PyVectorcall_Call (tstate=0xf7b03b88, func=<optimized out>,
callable=0xf9a33a60, tuple=<optimized out>, kwargs=<optimized out>) at
Objects/call.c:257
#19 0x4109fd28 in _PyObject_Call (tstate=0xf9a33a60, callable=0x1,
args=0xf7b03ba8, kwargs=0x8d) at Objects/call.c:328
#20 0x4109fdb8 in PyObject_Call () at Objects/call.c:352
#21 0x411a47c8 in do_call_core (tstate=0x8d, func=0x1, callargs=0xf9a33a60,
kwdict=0xf7b03b88, use_tracing=<optimized out>) at Python/ceval.c:7315
#22 0x411ab5dc in _PyEval_EvalFrameDefault (tstate=0xf7b03ba8,
frame=0xf9a33a60, throwflag=1) at Python/ceval.c:5367
#23 0x411af42c in _PyEval_EvalFrame (throwflag=0, frame=0xf9a33a60, tstate=0x1)
at ./Include/internal/pycore_ceval.h:73
#24 _PyEval_Vector (tstate=0x1, func=<optimized out>, locals=<optimized out>,
args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at
Python/ceval.c:6425
#25 0x4109fe48 in _PyFunction_Vectorcall (func=<optimized out>,
stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at
Objects/call.c:396
#26 0x410a0a0c in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized
out>, args=0xf998db68, callable=0xf7b03b88, tstate=0xf7b03ba8) at
./Include/internal/pycore_call.h:92
#27 object_vacall (tstate=0xf7b03ba8, base=<optimized out>,
callable=0xf7b03b88, vargs=<optimized out>) at Objects/call.c:819
#28 0x410a0be0 in PyObject_CallMethodObjArgs (obj=<optimized out>,
name=<optimized out>) at Objects/call.c:879
#29 0x411dd9e8 in import_find_and_load (abs_name=0xf7b03ba8, tstate=0xf9a33a60)
at Python/import.c:1737
#30 PyImport_ImportModuleLevelObject (name=0x1, globals=<optimized out>,
locals=<optimized out>, fromlist=0xf7b03b88, level=<optimized out>) at
Python/import.c:1836
#31 0x411aefbc in import_name (level=<optimized out>, fromlist=<optimized out>,
name=<optimized out>, frame=<optimized out>, tstate=<optimized out>) at
Python/ceval.c:7415
#32 _PyEval_EvalFrameDefault (tstate=0xf7b03ba8, frame=0xf9a33a60, throwflag=1)
at Python/ceval.c:3937
#33 0x411af42c in _PyEval_EvalFrame (throwflag=0, frame=0xf9a33a60, tstate=0x1)
at ./Include/internal/pycore_ceval.h:73
#34 _PyEval_Vector (tstate=0x1, func=<optimized out>, locals=<optimized out>,
args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at
Python/ceval.c:6425
#35 0x411af4e4 in PyEval_EvalCode (co=0xf9a33a60, globals=<optimized out>,
locals=0xf7b03b88) at Python/ceval.c:1140
#36 0x4119b6d4 in builtin_exec_impl (module=<optimized out>, closure=<optimized
out>, locals=0xf7b03ba8, globals=0x8d, source=0xf998db68) at
Python/bltinmodule.c:1077
#37 builtin_exec (module=<optimized out>, args=<optimized out>,
nargs=<optimized out>, kwnames=<optimized out>) at
Python/clinic/bltinmodule.c.h:465
#38 0x410f3ae4 in cfunction_vectorcall_FASTCALL_KEYWORDS (func=0xf9a33a60,
args=0x8d, nargsf=<optimized out>, kwnames=<optimized out>) at
./Include/cpython/methodobject.h:52
#39 0x4109fa14 in _PyVectorcall_Call (tstate=0xf7b03b88, func=<optimized out>,
callable=0xf9a33a60, tuple=<optimized out>, kwargs=<optimized out>) at
Objects/call.c:245
#40 0x4109fd28 in _PyObject_Call (tstate=0xf9a33a60, callable=0x1,
args=0xf7b03ba8, kwargs=0x8d) at Objects/call.c:328
#41 0x4109fdb8 in PyObject_Call () at Objects/call.c:352
#42 0x411a47c8 in do_call_core (tstate=0x8d, func=0x1, callargs=0xf9a33a60,
kwdict=0xf7b03b88, use_tracing=<optimized out>) at Python/ceval.c:7315
#43 0x411ab5dc in _PyEval_EvalFrameDefault (tstate=0xf7b03ba8,
frame=0xf9a33a60, throwflag=1) at Python/ceval.c:5367
#44 0x411af42c in _PyEval_EvalFrame (throwflag=0, frame=0xf9a33a60, tstate=0x1)
at ./Include/internal/pycore_ceval.h:73
#45 _PyEval_Vector (tstate=0x1, func=<optimized out>, locals=<optimized out>,
args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at
Python/ceval.c:6425
#46 0x4109fe48 in _PyFunction_Vectorcall (func=<optimized out>,
stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at
Objects/call.c:396
#47 0x410a0a0c in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized
out>, args=0xf998db68, callable=0xf7b03b88, tstate=0xf7b03ba8) at
./Include/internal/pycore_call.h:92
#48 object_vacall (tstate=0xf7b03ba8, base=<optimized out>,
callable=0xf7b03b88, vargs=<optimized out>) at Objects/call.c:819
#49 0x410a0be0 in PyObject_CallMethodObjArgs (obj=<optimized out>,
name=<optimized out>) at Objects/call.c:879
#50 0x411dd9e8 in import_find_and_load (abs_name=0xf7b03ba8, tstate=0xf9a33a60)
at Python/import.c:1737
#51 PyImport_ImportModuleLevelObject (name=0x1, globals=<optimized out>,
locals=<optimized out>, fromlist=0xf7b03b88, level=<optimized out>) at
Python/import.c:1836
#52 0x411aefbc in import_name (level=<optimized out>, fromlist=<optimized out>,
name=<optimized out>, frame=<optimized out>, tstate=<optimized out>) at
Python/ceval.c:7415
#53 _PyEval_EvalFrameDefault (tstate=0xf7b03ba8, frame=0xf9a33a60, throwflag=1)
at Python/ceval.c:3937
#54 0x411af42c in _PyEval_EvalFrame (throwflag=0, frame=0xf9a33a60, tstate=0x1)
at ./Include/internal/pycore_ceval.h:73
#55 _PyEval_Vector (tstate=0x1, func=<optimized out>, locals=<optimized out>,
args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at
Python/ceval.c:6425
#56 0x411af4e4 in PyEval_EvalCode (co=0xf9a33a60, globals=<optimized out>,
locals=0xf7b03b88) at Python/ceval.c:1140
#57 0x411fa628 in run_eval_code_obj (tstate=0xf7b03b88, co=0xf9a33a60,
globals=0x1, locals=0x8d) at Python/pythonrun.c:1710
#58 0x411fa8d8 in run_mod (mod=<optimized out>, filename=<optimized out>,
globals=0xf9a33a60, locals=0x8d, flags=<optimized out>, arena=<optimized out>)
at Python/pythonrun.c:1731
#59 0x411faa50 in pyrun_file (fp=0x0, filename=0x8d, start=<optimized out>,
globals=0xf7b03b88, locals=<optimized out>, closeit=<optimized out>,
flags=<optimized out>) at Python/pythonrun.c:1626
#60 0x411fdc38 in _PyRun_SimpleFileObject (fp=0xf998db68, filename=0x8d,
closeit=-139445336, flags=0x0) at Python/pythonrun.c:440
#61 0x411fe30c in _PyRun_AnyFileObject (fp=0xf9a33a60, filename=0x1,
closeit=141, flags=0xf7b03b88) at Python/pythonrun.c:79
#62 0x41222278 in pymain_run_file_obj (skip_source_first_line=1095637024,
filename=0xf7b03ba8, program_name=0x8e) at Modules/main.c:360
#63 pymain_run_file (config=0x1) at Modules/main.c:379
#64 pymain_run_python (exitcode=0x8d) at Modules/main.c:601
#65 Py_RunMain () at Modules/main.c:680
#66 0x4104c4c8 in main (argc=<optimized out>, argv=<optimized out>) at
Programs/_bootstrap_python.c:109
(gdb)
```
From: sjames at gcc dot gnu.org @ 2023-11-06 21:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #2 from Sam James <sjames at gcc dot gnu.org> ---
I'll grab a bad vs. good build directory next and upload both, and then try to
see which objects differ.
Dave, can you reproduce?
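One way to find the differing objects, sketched under assumptions: the two
directory arguments and both helper names below are hypothetical, and a
byte-for-byte comparison may over-report, since embedded debug information or
recorded compiler switches can differ without any change to the generated code.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def differing_objects(good_root: Path, bad_root: Path) -> list[str]:
    """List relative paths of .o files present in both trees whose bytes differ."""
    differing = []
    for good_obj in sorted(good_root.rglob("*.o")):
        rel = good_obj.relative_to(good_root)
        bad_obj = bad_root / rel
        # Objects missing from the bad tree are skipped rather than reported.
        if bad_obj.is_file() and file_digest(good_obj) != file_digest(bad_obj):
            differing.append(rel.as_posix())
    return differing
```

Calling differing_objects() on the two unpacked build trees returns a sorted
list of relative object paths that can then be inspected one by one.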
From: pinskia at gcc dot gnu.org @ 2023-11-06 21:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |wrong-code
Target Milestone|--- |14.0
Target| |hppa2.0-unknown-linux-gnu
From: dave.anglin at bell dot net @ 2023-11-06 21:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #3 from dave.anglin at bell dot net ---
On 2023-11-06 4:00 p.m., sjames at gcc dot gnu.org wrote:
> Program received signal SIGSEGV, Segmentation fault.
> 0x412083fc in _PyST_GetSymbol (name=0xf9a33a60, ste=<optimized out>) at
> Python/symtable.c:396
> 396 PyObject *v = PyDict_GetItemWithError(ste->ste_symbols, name);
> (gdb) bt
> #0 0x412083fc in _PyST_GetSymbol (name=0xf9a33a60, ste=<optimized out>) at
> Python/symtable.c:396
> #1 _PyST_GetScope (ste=<optimized out>, name=0xf9a33a60) at
> Python/symtable.c:406
Probably, ste is NULL or in page 0, and it's symtable.c that's miscompiled.
There's not a lot of testing of gcc-14 on hppa yet.
From: sjames at gcc dot gnu.org @ 2023-11-06 22:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #4 from Sam James <sjames at gcc dot gnu.org> ---
Created attachment 56520
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56520&action=edit
list_of_differing_files.txt
From: sjames at gcc dot gnu.org @ 2023-11-06 22:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
Sam James <sjames at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |law at gcc dot gnu.org
--- Comment #5 from Sam James <sjames at gcc dot gnu.org> ---
Built with 14.0.0 20231029.
* https://dev.gentoo.org/~sam/bugs/gcc/gcc-python-hppa/cpython-3.11.6-good.tar.xz
* https://dev.gentoo.org/~sam/bugs/gcc/gcc-python-hppa/cpython-3.11.6-bad.tar.xz
From: law at gcc dot gnu.org @ 2023-11-06 22:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #6 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Do we have assembly code around the faulting point (x/20i $pc) and a register
dump (i r)? The biggest concern I'd have with f-m-o on the PA would be the
implicit segment selection that happens on the base register -- but it would
only be an issue if we are faulting on an unscaled indexed addressing mode and
only if the linux-gnu port was actually putting different values into the space
registers.
WRT testing -- we did test this on hppa1.1-linux-gnu. Just a bootstrap and
regression test of the compiler itself.
From: dave.anglin at bell dot net @ 2023-11-06 22:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #7 from dave.anglin at bell dot net ---
On 2023-11-06 5:20 p.m., law at gcc dot gnu.org wrote:
> The biggest concern I'd have with f-m-o on the PA would be the
> implicit segment selection that happens on the base register -- but it would
> only be an issue if we are faulting on an unscaled indexed addressing mode and
> only if the linux-gnu port was actually putting different values into the space
> registers.
The linux-gnu port does not put different values into the space registers.
From: sjames at gcc dot gnu.org @ 2023-11-06 22:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #8 from Sam James <sjames at gcc dot gnu.org> ---
(In reply to Jeffrey A. Law from comment #6)
Program received signal SIGSEGV, Segmentation fault.
0x412083f0 in _PyST_GetSymbol (name=0xf9a34a00, ste=<optimized out>) at
Python/symtable.c:396
396 PyObject *v = PyDict_GetItemWithError(ste->ste_symbols, name);
(gdb) x/20i $pc
=> 0x412083f0 <_PyST_GetScope+20>: ldw c(r26),r26
0x412083f4 <_PyST_GetScope+24>: movb,= ret0,r26,0x41208414 <_PyST_GetScope+56>
0x412083f8 <_PyST_GetScope+28>: copy r4,r19
0x412083fc <_PyST_GetScope+32>: b,l 0x410d6900 <PyLong_AsLong>,rp
0x41208400 <_PyST_GetScope+36>: nop
0x41208404 <_PyST_GetScope+40>: ldw -54(sp),rp
0x41208408 <_PyST_GetScope+44>: extrw,u ret0,20,4,ret0
0x4120840c <_PyST_GetScope+48>: bve (rp)
0x41208410 <_PyST_GetScope+52>: ldw,mb -40(sp),r4
0x41208414 <_PyST_GetScope+56>: copy r26,ret0
0x41208418 <_PyST_GetScope+60>: ldw -54(sp),rp
0x4120841c <_PyST_GetScope+64>: bve (rp)
0x41208420 <_PyST_GetScope+68>: ldw,mb -40(sp),r4
0x41208424 <_Py_SymtableStringObjectFlags>: stw rp,-14(sp)
0x41208428 <_Py_SymtableStringObjectFlags+4>: stw,ma r8,80(sp)
0x4120842c <_Py_SymtableStringObjectFlags+8>: copy r23,r8
0x41208430 <_Py_SymtableStringObjectFlags+12>: stw r7,-7c(sp)
0x41208434 <_Py_SymtableStringObjectFlags+16>: copy r24,r7
0x41208438 <_Py_SymtableStringObjectFlags+20>: stw r6,-78(sp)
0x4120843c <_Py_SymtableStringObjectFlags+24>: copy r25,r6
(gdb)
(gdb) i r
flags <unavailable>
r1 0x411bc688 1092339336
rp 0x412083f7 1092649975
r3 0x1 1
r4 0x4136c000 1094107136
r5 0xf9a34a00 4188228096
r6 0x8d 141
r7 0xf7b03b88 4155521928
r8 0xf7b03ba8 4155521960
r9 0xf9953b68 4187306856
r10 0x0 0
r11 0x8e 142
r12 0x414e1820 1095637024
r13 0x414e4490 1095648400
r14 0xf9a76498 4188497048
r15 0x1 1
r16 0xf99bb5e8 4187731432
r17 0xf9ae11b4 4188934580
r18 0xf99e3b68 4187896680
r19 0x4136c000 1094107136
r20 0x411bc7f0 1092339696
r21 0x41450268 1095041640
r22 0x8d 141
r23 0x1 1
r24 0x1 1
r25 0xf9a34a00 4188228096
r26 0x34 52
dp 0x4136c000 1094107136
ret0 0xf9964020 4187373600
ret1 0x8d 141
sp 0xf7b04080 4155523200
r31 0x1 1
sar 0x3d 61
pcoqh 0x412083f3 1092649971
pcsqh <unavailable>
pcoqt 0x410e4c0f 1091456015
pcsqt <unavailable>
eiem <unavailable>
iir <unavailable>
isr <unavailable>
ior <unavailable>
ipsw 0xeff0f 982799
goto <unavailable>
sr4 <unavailable>
sr0 <unavailable>
sr1 <unavailable>
sr2 <unavailable>
sr3 <unavailable>
sr5 <unavailable>
sr6 <unavailable>
sr7 <unavailable>
cr0 <unavailable>
cr8 <unavailable>
cr9 <unavailable>
ccr <unavailable>
cr12 <unavailable>
cr13 <unavailable>
cr24 <unavailable>
cr25 <unavailable>
cr26 0xeff0f 982799
mpsfu_high 0xf7afa500 4155483392
mpsfu_low <unavailable>
mpsfu_ovflo <unavailable>
pad <unavailable>
fpsr <unavailable>
fpe1 <unavailable>
fpe2 <unavailable>
fpe3 <unavailable>
fpe4 <unavailable>
fpe5 <unavailable>
fpe6 <unavailable>
fpe7 <unavailable>
(gdb)
From: sjames at gcc dot gnu.org @ 2023-11-06 23:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #9 from Sam James <sjames at gcc dot gnu.org> ---
I think the key object is Python/compile.o, but not certain yet.
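If exactly one object is responsible, it can be narrowed down by bisecting over
the list of differing objects: link the bad version of half of them against good
versions of the rest and check whether the crash persists. A minimal sketch, in
which find_culprit and the crashes_with_bad callback are hypothetical names and
the relink-and-run step is left to the caller:

```python
from typing import Callable, Sequence


def find_culprit(objects: Sequence[str],
                 crashes_with_bad: Callable[[Sequence[str]], bool]) -> str:
    """Bisect to the one object whose bad version triggers the crash.

    Assumes exactly one culprit; crashes_with_bad(subset) must report whether
    the build crashes when only `subset` is taken from the bad tree.
    """
    if not objects:
        raise ValueError("need at least one candidate object")
    candidates = list(objects)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if crashes_with_bad(half):
            candidates = half                    # culprit is in the first half
        else:
            candidates = candidates[len(half):]  # culprit is in the remainder
    return candidates[0]
```

With several interacting miscompiled objects this simple halving is not
sufficient, so its result is only a starting point.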
From: dave.anglin at bell dot net @ 2023-11-06 23:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #10 from dave.anglin at bell dot net ---
On 2023-11-06 5:49 p.m., sjames at gcc dot gnu.org wrote:
> Program received signal SIGSEGV, Segmentation fault.
> 0x412083f0 in _PyST_GetSymbol (name=0xf9a34a00, ste=<optimized out>) at
> Python/symtable.c:396
> 396 PyObject *v = PyDict_GetItemWithError(ste->ste_symbols, name);
> (gdb) x/20i $pc
> => 0x412083f0 <_PyST_GetScope+20>: ldw c(r26),r26
r26=0x34, so the ldw will fault. It appears r26 and r25 have been exchanged in
the code prior to <_PyST_GetScope+20>. In any case, the problem is with the ste
argument passed to _PyST_GetSymbol.
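The arithmetic behind this can be checked with a short sketch. It assumes that
the displacement "c" in the disassembly is hexadecimal, like the other offsets
in the listing, and that pages are 4 KiB; the function names are illustrative
only.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def effective_address(base: int, displacement: int) -> int:
    """Base-plus-displacement address, wrapped to 32 bits."""
    return (base + displacement) & 0xFFFFFFFF


def in_page_zero(address: int) -> bool:
    """True if the address falls within the first page of the address space."""
    return address < PAGE_SIZE


r26 = 0x34          # value from the register dump
displacement = 0xC  # from "ldw c(r26),r26"
addr = effective_address(r26, displacement)
print(hex(addr), in_page_zero(addr))  # prints: 0x40 True
```

The load therefore targets address 0x40, which under these assumptions lies in
page 0 and is consistent with the observed SIGSEGV.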
From: manolis.tsamis at vrull dot eu @ 2023-11-07 14:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #11 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
Hi all,
I will also go ahead and try to reproduce that, although it may take me some
time due to my limited experience with HPPA. Once I manage to reproduce, most
f-m-o issues are straightforward to locate by bisecting the transformed
instructions.
> I think the key object is Python/compile.o, but not certain yet.
In this case the dump file of fold-mem-offsets
(-fdump-rtl-fold_mem_offsets-all) could also be useful, as it contains all the
information needed to see whether a transformation is valid. If it would be
easy for anyone to provide the dump file, I could look at it and see if
anything stands out (until I manage to reproduce this).
Thanks,
Manolis
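A sketch of how such a dump could be requested for a single file. The compiler
name and the flags are taken from the build log earlier in this thread; the
helper name, the reduced flag set, and the output path are assumptions, and the
command is only assembled here, not executed.

```python
import shlex

# Flags copied from the build log in this thread; this reduced set is assumed
# (not verified) to be enough to recompile a single core file.
BASE_FLAGS = [
    "-O2", "-march=2.0", "-fwrapv", "-std=c11", "-fvisibility=hidden",
    "-I./Include/internal", "-I.", "-I./Include", "-fPIC", "-DPy_BUILD_CORE",
]


def fmo_dump_command(source, compiler="hppa2.0-unknown-linux-gnu-gcc"):
    """Build an argv list that compiles `source` and dumps the f-m-o pass."""
    output = source.rsplit(".", 1)[0] + ".o"
    return [compiler, "-c", *BASE_FLAGS,
            "-fdump-rtl-fold_mem_offsets-all", source, "-o", output]


# Print the command in a form that can be pasted into a shell.
print(shlex.join(fmo_dump_command("Python/compile.c")))
```

The resulting dump file has a name of the form compile.c.<N>r.fold_mem_offsets,
where the pass number <N> depends on the compiler build.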
From: sjames at gcc dot gnu.org @ 2023-11-07 21:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #12 from Sam James <sjames at gcc dot gnu.org> ---
(In reply to Manolis Tsamis from comment #11)
> Hi all,
>
> I will also go ahead and try to reproduce that, although it may take me some
> time due to my limited experience with HPPA. Once I manage to reproduce,
> most f-m-o issues are straightforward to locate by bisecting the transformed
> instructions.
Thanks! You are very welcome to have access to some HPPA machines for this kind
of work. Please email me an SSH public key + desired username if that sounds
helpful.
>
> > I think the key object is Python/compile.o, but not certain yet.
>
> In this case the dump file of fold-mem-offsets
> (-fdump-rtl-fold_mem_offsets-all) could also be useful, as it contains all
> the information needed to see whether a transformation is valid. If it would
> be easy for anyone to provide the dump file, I could look at it and see if
> anything stands out (until I manage to reproduce this).
I'll get the dumps in a moment, thanks.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (12 preceding siblings ...)
2023-11-07 21:12 ` sjames at gcc dot gnu.org
@ 2023-11-08 1:36 ` sjames at gcc dot gnu.org
2023-11-08 2:24 ` dave.anglin at bell dot net
` (41 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: sjames at gcc dot gnu.org @ 2023-11-08 1:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #13 from Sam James <sjames at gcc dot gnu.org> ---
Created attachment 56527
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56527&action=edit
compile.c.323r.fold_mem_offsets.bad.xz
Output from
```
hppa2.0-unknown-linux-gnu-gcc -c -DNDEBUG -g -fwrapv -O3 -Wall -O2 -std=c11
-Werror=implicit-function-declaration -fvisibility=hidden
-I/home/sam/git/cpython/Include/internal -IObjects -IInclude -IPython -I.
-I/home/sam/git/cpython/Include -DPy_BUILD_CORE -o Python/compile.o
/home/sam/git/cpython/Python/compile.c -fdump-rtl-fold_mem_offsets-all
```
If I instrument certain functions in compile.c with a no-optimisation attribute
or build the file with -fno-fold-mem-offsets, Python works, so I'm reasonably
sure this is the relevant object.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (13 preceding siblings ...)
2023-11-08 1:36 ` sjames at gcc dot gnu.org
@ 2023-11-08 2:24 ` dave.anglin at bell dot net
2023-11-08 10:09 ` manolis.tsamis at vrull dot eu
` (40 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: dave.anglin at bell dot net @ 2023-11-08 2:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #14 from dave.anglin at bell dot net ---
On 2023-11-07 8:36 p.m., sjames at gcc dot gnu.org wrote:
> If I instrument certain functions in compile.c with a no-optimisation attribute
> or build the file with -fno-fold-mem-offsets, Python works, so I'm reasonably
> sure this is the relevant object.
I believe this bug is related to https://gcc.gnu.org/PR97431
I see the same fault when building with debian/rules and the
-finline-small-functions option.
Debian has been building with -fno-inline-small-functions on sh and hppa. This
hides the problem.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (14 preceding siblings ...)
2023-11-08 2:24 ` dave.anglin at bell dot net
@ 2023-11-08 10:09 ` manolis.tsamis at vrull dot eu
2023-11-08 14:42 ` jeffreyalaw at gmail dot com
` (39 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: manolis.tsamis at vrull dot eu @ 2023-11-08 10:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #15 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
(In reply to Sam James from comment #13)
> Created attachment 56527 [details]
> compile.c.323r.fold_mem_offsets.bad.xz
>
> Output from
> ```
> hppa2.0-unknown-linux-gnu-gcc -c -DNDEBUG -g -fwrapv -O3 -Wall -O2
> -std=c11 -Werror=implicit-function-declaration -fvisibility=hidden
> -I/home/sam/git/cpython/Include/internal -IObjects -IInclude -IPython -I.
> -I/home/sam/git/cpython/Include -DPy_BUILD_CORE -o Python/compile.o
> /home/sam/git/cpython/Python/compile.c -fdump-rtl-fold_mem_offsets-all
> ```
>
> If I instrument certain functions in compile.c with a no-optimisation
> attribute or build the file with -fno-fold-mem-offsets, Python works, so I'm
> reasonably sure this is the relevant object.
Thanks for the dump file! There are 66 folded/eliminated instructions in this
object file; I did look at each case and there doesn't seem to be anything
strange. In fact most of the transformations are straightforward:
- All except a couple of cases don't involve any arithmetic, so it's just
moving a constant around.
- The majority of the transformations are 'trivial' and consist of a single
add and then a memory operation: a sequence like X = Y + Const, R = MEM[X + 0]
is folded to X = Y, R = MEM[X + Const]. I wonder why so many of these exist and
are not optimized elsewhere.
- There are some cases with negative offsets, but the calculations look
correct.
- There are a few more complicated cases, but I've worked through these on
paper and they also look correct.
Of course I could be missing some more complicated effect, but what I want to
say is that everything looks sensible in this particular file.
> Thanks! You are very welcome to have access to some HPPA machines for
> this kind of work. Please email me an SSH public key + desired username
> if that sounds helpful.
Yes, since I couldn't find anything interesting in the dump, that would
definitely be helpful. Thanks!
Manolis
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (15 preceding siblings ...)
2023-11-08 10:09 ` manolis.tsamis at vrull dot eu
@ 2023-11-08 14:42 ` jeffreyalaw at gmail dot com
2023-11-08 18:59 ` dave.anglin at bell dot net
` (38 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: jeffreyalaw at gmail dot com @ 2023-11-08 14:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #16 from Jeffrey A. Law <jeffreyalaw at gmail dot com> ---
On 11/8/23 03:09, manolis.tsamis at vrull dot eu wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>
> --- Comment #15 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
> (In reply to Sam James from comment #13)
>> Created attachment 56527 [details]
>> compile.c.323r.fold_mem_offsets.bad.xz
>>
>> Output from
>> ```
>> hppa2.0-unknown-linux-gnu-gcc -c -DNDEBUG -g -fwrapv -O3 -Wall -O2
>> -std=c11 -Werror=implicit-function-declaration -fvisibility=hidden
>> -I/home/sam/git/cpython/Include/internal -IObjects -IInclude -IPython -I.
>> -I/home/sam/git/cpython/Include -DPy_BUILD_CORE -o Python/compile.o
>> /home/sam/git/cpython/Python/compile.c -fdump-rtl-fold_mem_offsets-all
>> ```
>>
>> If I instrument certain functions in compile.c with a no-optimisation
>> attribute or build the file with -fno-fold-mem-offsets, Python works, so I'm
>> reasonably sure this is the relevant object.
>
> Thanks for the dump file! There are 66 folded/eliminated instructions in this
> object file; I did look at each case and there doesn't seem to be anything
> strange. In fact most of the transformations are straightforward:
>
> - All except a couple of cases don't involve any arithmetic, so it's just
> moving a constant around.
> - The majority of the transformations are 'trivial' and consist of a single
> add and then a memory operation: a sequence like X = Y + Const, R = MEM[X + 0]
> is folded to X = Y, R = MEM[X + Const]. I wonder why so many of these exist and
> are not optimized elsewhere.
> - There are some cases with negative offsets, but the calculations look
> correct.
> - There are few more complicated cases, but I've done these on paper and also
> look correct.
The PA port is "weird". Its addressing modes aren't a good match for
GCC (they're not symmetrical across loads vs stores and across fp vs
integer) and they have the implicit space register problem. But I don't
immediately recall needing to avoid propagation of constants into memory
references or anything like that.
I'd probably continue with the process of narrowing down what code is
affected using the attributes. We already know the file, narrowing it
down to a function might help considerably with the evaluation effort.
Note that QEMU has a functional PA port. So you might be able to just
take a root filesystem, add the tarball referenced earlier and play
around to narrow things down further.
I haven't done work on the PA in about 20 years at this point, but I can
probably still grok its code. Between David and myself I'm sure we can
help interpret what's going on.
Jeff
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (16 preceding siblings ...)
2023-11-08 14:42 ` jeffreyalaw at gmail dot com
@ 2023-11-08 18:59 ` dave.anglin at bell dot net
2023-11-08 19:07 ` pinskia at gcc dot gnu.org
` (37 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: dave.anglin at bell dot net @ 2023-11-08 18:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #17 from dave.anglin at bell dot net ---
On 2023-11-08 9:42 a.m., jeffreyalaw at gmail dot com wrote:
> I'd probably continue with the process of narrowing down what code is
> affected using the attributes. We already know the file, narrowing it
> down to a function might help considerably with the evaluation effort.
The problem seems to be in compiler_visit_expr().
-static int compiler_visit_expr(struct compiler *, expr_ty);
+static int compiler_visit_expr(struct compiler *, expr_ty)
__attribute__((optimize("no-inline-small-functions")));
Python builds okay if this function is not inlined, if it is compiled at -O1,
or if -fno-inline-small-functions is specified as above. I can't specify
-fno-fold-mem-offsets as a function attribute.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (17 preceding siblings ...)
2023-11-08 18:59 ` dave.anglin at bell dot net
@ 2023-11-08 19:07 ` pinskia at gcc dot gnu.org
2023-11-08 19:16 ` law at gcc dot gnu.org
` (36 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-11-08 19:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #18 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I wonder if -fno-strict-aliasing works around the issue too?
I get the feeling that the `fold mem offset` pass allows the aliasing code to
have a better time with the offset, and that might expose more aliasing issues.
The other thing to try is add `-fno-schedule-insns2 -fno-schedule-insns`
instead of `-fno-strict-aliasing` as the scheduler is normally where the
aliasing issues are exposed on the RTL level ...
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (18 preceding siblings ...)
2023-11-08 19:07 ` pinskia at gcc dot gnu.org
@ 2023-11-08 19:16 ` law at gcc dot gnu.org
2023-11-08 19:40 ` dave.anglin at bell dot net
` (35 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: law at gcc dot gnu.org @ 2023-11-08 19:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #19 from Jeffrey A. Law <law at gcc dot gnu.org> ---
f-m-o runs post-allocation, so the scope of where its behavior can change
things is narrower. So testing with -fno-schedule-insns isn't going to be
useful, but -fno-schedule-insns2 might.
I'm a bit concerned that we can't turn off f-m-o with an attribute. That would
indicate something isn't wired up right in the options handling.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (19 preceding siblings ...)
2023-11-08 19:16 ` law at gcc dot gnu.org
@ 2023-11-08 19:40 ` dave.anglin at bell dot net
2023-11-08 23:33 ` pinskia at gcc dot gnu.org
` (34 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: dave.anglin at bell dot net @ 2023-11-08 19:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #20 from dave.anglin at bell dot net ---
On 2023-11-08 2:07 p.m., pinskia at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>
> --- Comment #18 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> I wonder if -fno-strict-aliasing works around the issue too?
> I get the feeling that `fold mem offset pass` allows the aliasing code to have
> a better time with the offset and that might be expose more aliasing issues.
>
> The other thing to try is add `-fno-schedule-insns2 -fno-schedule-insns`
> instead of `-fno-strict-aliasing` as the scheduler is normally where the
> aliasing issues are exposed on the RTL level ...
Both -fno-strict-aliasing and -fno-schedule-insns2 applied to
compiler_visit_expr() work around the issue.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (20 preceding siblings ...)
2023-11-08 19:40 ` dave.anglin at bell dot net
@ 2023-11-08 23:33 ` pinskia at gcc dot gnu.org
2023-11-08 23:40 ` danglin at gcc dot gnu.org
` (33 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-11-08 23:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #21 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to dave.anglin from comment #20)
> Both -fno-strict-aliasing and -fno-schedule-insns2 applied to
> compiler_visit_expr()
> work around issue.
The other option to try is -fstack-reuse=none. There are definitely known
issues with the code that coalesces stack variables together too (see PR 111843
for examples).
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (21 preceding siblings ...)
2023-11-08 23:33 ` pinskia at gcc dot gnu.org
@ 2023-11-08 23:40 ` danglin at gcc dot gnu.org
2023-11-08 23:51 ` sjames at gcc dot gnu.org
` (32 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: danglin at gcc dot gnu.org @ 2023-11-08 23:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #22 from John David Anglin <danglin at gcc dot gnu.org> ---
Created attachment 56542
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56542&action=edit
Preprocessed source and assembly files for Python/compile.c
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (22 preceding siblings ...)
2023-11-08 23:40 ` danglin at gcc dot gnu.org
@ 2023-11-08 23:51 ` sjames at gcc dot gnu.org
2023-11-09 0:00 ` dave.anglin at bell dot net
` (31 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: sjames at gcc dot gnu.org @ 2023-11-08 23:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #23 from Sam James <sjames at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #21)
> The other option to try is -fstack-reuse=none. There is definitely known
> issues with the code that coalesces stack variables together too (see PR
> 111843 for examples).
I had a good feeling about this but no, didn't help when applied to compile.o.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (23 preceding siblings ...)
2023-11-08 23:51 ` sjames at gcc dot gnu.org
@ 2023-11-09 0:00 ` dave.anglin at bell dot net
2023-11-09 0:02 ` sjames at gcc dot gnu.org
` (30 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: dave.anglin at bell dot net @ 2023-11-09 0:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #24 from dave.anglin at bell dot net ---
On 2023-11-08 6:51 p.m., sjames at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>
> --- Comment #23 from Sam James <sjames at gcc dot gnu.org> ---
> (In reply to Andrew Pinski from comment #21)
>> The other option to try is -fstack-reuse=none. There is definitely known
>> issues with the code that coalesces stack variables together too (see PR
>> 111843 for examples).
> I had a good feeling about this but no, didn't help when applied to compile.o.
At this point, I don't know whether this is a Python or GCC bug. I scanned for
unions in compile.i that might be problematic, but I didn't find anything
obvious.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (24 preceding siblings ...)
2023-11-09 0:00 ` dave.anglin at bell dot net
@ 2023-11-09 0:02 ` sjames at gcc dot gnu.org
2023-11-09 0:07 ` law at gcc dot gnu.org
` (29 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: sjames at gcc dot gnu.org @ 2023-11-09 0:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #25 from Sam James <sjames at gcc dot gnu.org> ---
I am having the same thoughts. It would not be the first time Python had
something dubious, like...
* https://wiki.gentoo.org/wiki/Project:Python/Strict_aliasing ->
https://www.python.org/dev/peps/pep-3123/
* https://github.com/python/cpython/issues/111178
So far, I did not see this failure on any other target (-> makes me think it's
a gcc bug). But also, I didn't yet see any other software break on hppa (->
makes me think it might be a Python bug).
I tried ubsan on amd64 with Python 3.12 at least and got a lot of different
errors, although ubsan does not diagnose aliasing issues...
I am undecided myself still.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (25 preceding siblings ...)
2023-11-09 0:02 ` sjames at gcc dot gnu.org
@ 2023-11-09 0:07 ` law at gcc dot gnu.org
2023-11-09 0:08 ` dave.anglin at bell dot net
` (28 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: law at gcc dot gnu.org @ 2023-11-09 0:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #26 from Jeffrey A. Law <law at gcc dot gnu.org> ---
As a compiler junkie, I tend to think compiler first until I can prove it
otherwise. I wouldn't get too hung up on aliasing issues and such at this
point.
Do we already have a dump for the key function? Presumably f-m-o doesn't
trigger *that* much. And if this is triggering w/o LTO we can probably move to
cross debugging and analysis of those dump files and assembly code with and
without f-m-o enabled, narrowing our focus on the key function.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (26 preceding siblings ...)
2023-11-09 0:07 ` law at gcc dot gnu.org
@ 2023-11-09 0:08 ` dave.anglin at bell dot net
2023-11-09 0:23 ` dave.anglin at bell dot net
` (27 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: dave.anglin at bell dot net @ 2023-11-09 0:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #27 from dave.anglin at bell dot net ---
On 2023-11-08 7:00 p.m., John David Anglin wrote:
> On 2023-11-08 6:51 p.m., sjames at gcc dot gnu.org wrote:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>>
>> --- Comment #23 from Sam James <sjames at gcc dot gnu.org> ---
>> (In reply to Andrew Pinski from comment #21)
>>> The other option to try is -fstack-reuse=none. There is definitely known
>>> issues with the code that coalesces stack variables together too (see PR
>>> 111843 for examples).
>> I had a good feeling about this but no, didn't help when applied to compile.o.
> At this point, I don't know whether this is a python or gcc bug. I scanned for unions in compile.i
> that might be problematic but I didn't find anything obvious.
Note that -fno-strict-aliasing affects the inlining of compiler_visit_expr. It
is not inlined with -fno-strict-aliasing.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (27 preceding siblings ...)
2023-11-09 0:08 ` dave.anglin at bell dot net
@ 2023-11-09 0:23 ` dave.anglin at bell dot net
2023-11-09 18:04 ` danglin at gcc dot gnu.org
` (26 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: dave.anglin at bell dot net @ 2023-11-09 0:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #28 from dave.anglin at bell dot net ---
On 2023-11-08 7:07 p.m., law at gcc dot gnu.org wrote:
> Do we already have a dump for the key function? Presumably f-m-o doesn't
> trigger *that* much. And if this is triggering w/o LTO we can probably move to
> cross debugging and analysis of those dump files and assembly code with and
> without f-m-o enabled, narrowing our focus on the key function.
I tried looking at the difference with and without f-m-o and it was quite
large. The difference with and without strict aliasing is much smaller. The
main differences that I saw relate to the inlining of compiler_visit_expr and
compiler_visit_expr1.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (28 preceding siblings ...)
2023-11-09 0:23 ` dave.anglin at bell dot net
@ 2023-11-09 18:04 ` danglin at gcc dot gnu.org
2023-11-09 19:17 ` danglin at gcc dot gnu.org
` (25 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: danglin at gcc dot gnu.org @ 2023-11-09 18:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #29 from John David Anglin <danglin at gcc dot gnu.org> ---
The miscompilation is in compiler_visit_expr:
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program:
/home/dave/debian/python3.11/python3.11-3.11.6/build-static/Programs/_freeze_module
importlib._bootstrap ../Lib/importlib/_bootstrap.py
Python/frozen_modules/importlib._bootstrap.h
warning: Unable to find libthread_db matching inferior's thread library, thread
debugging will not be available.
Breakpoint 2, compiler_jump_if (c=0xf8f02508, e=0x5763f8, next=0xfaeaa908,
cond=0) at ../Python/compile.c:2898
2898 {
(gdb) watch *0xfaea51b8
Watchpoint 3: *0xfaea51b8
(gdb) c
Continuing.
Watchpoint 3: *0xfaea51b8
Old value = -85046408
New value = 43
0x0019c688 in compiler_visit_expr (e=0x576308, c=0xf8f02508) at
../Python/compile.c:5968
5968 SET_LOC(c, e);
(gdb) bt
#0 0x0019c688 in compiler_visit_expr (e=0x576308, c=0xf8f02508)
at ../Python/compile.c:5968
#1 compiler_call_helper (c=0xf8f02508, n=0, args=<optimized out>,
keywords=0x0) at ../Python/compile.c:5138
#2 0x0019ec70 in compiler_visit_expr (e=<optimized out>, c=0xf8f02508)
at ../Python/compile.c:5969
#3 compiler_jump_if (c=0xf8f02508, e=<optimized out>, next=0x0,
cond=<optimized out>) at ../Python/compile.c:2988
#4 0x001a0770 in compiler_if (s=0x0, c=0x5763c0) at ../Python/compile.c:3090
#5 compiler_visit_stmt (c=0x5763c0, s=0x0) at ../Python/compile.c:4118
#6 0x001a1378 in compiler_for (s=0x0, c=0x5763c0) at ../Python/compile.c:3124
#7 compiler_visit_stmt (c=0x5763c0, s=0x0) at ../Python/compile.c:4114
#8 0x001a3170 in compiler_function (c=0x2, s=<optimized out>,
is_async=<optimized out>) at ../Python/compile.c:2670
#9 0x001a3438 in compiler_body (c=0x0, stmts=0x5763c0)
at ../Python/compile.c:2180
#10 0x001a5cdc in compiler_mod (mod=0x0, c=0xf8f02528)
at ../Python/compile.c:2197
#11 _PyAST_Compile (mod=0x0, filename=0xf8f02528, flags=<optimized out>,
optimize=<optimized out>, arena=<optimized out>) at ../Python/compile.c:581
#12 0x001dea00 in Py_CompileStringObject (optimize=0, flags=0x5763c0, start=0,
filename=0x2, str=0x0) at ../Python/pythonrun.c:1799
#13 Py_CompileStringExFlags (str=0x0, filename_str=<optimized out>, start=0,
flags=0x5763c0, optimize=<optimized out>) at ../Python/pythonrun.c:1812
#14 0x000167a4 in compile_and_marshal (text=0x0,
name=0x2 <error: Cannot access memory at address 0x2>)
at ../Programs/_freeze_module.c:125
#15 main (argc=0, argv=<optimized out>) at ../Programs/_freeze_module.c:230
(gdb) diass $pc-16,$pc+16
Undefined command: "diass". Try "help".
(gdb) disass $pc-16,$pc+16
Dump of assembler code from 0x19c678 to 0x19c698:
0x0019c678 <compiler_call_helper+576>: ldw 14(r25),ret1
0x0019c67c <compiler_call_helper+580>: ldw 18(r25),r31
0x0019c680 <compiler_call_helper+584>: ldw 1c(r25),ret0
0x0019c684 <compiler_call_helper+588>: stw r23,0(r22)
=> 0x0019c688 <compiler_call_helper+592>: stw ret1,0(r21)
0x0019c68c <compiler_call_helper+596>: stw r31,0(r20)
0x0019c690 <compiler_call_helper+600>: b,l 0x198d58
<compiler_visit_expr1>,rp
0x0019c694 <compiler_call_helper+604>: stw ret0,0(r19)
End of assembler dump.
The code at 0x0019c688 clobbers the value at c->u->u_ste:
(gdb) p/x $r21
$35 = 0xfaea51b8
(gdb) p/x *c
$36 = {c_filename = 0xfaed9480, c_st = 0xfaeafd10, c_future = 0xfaef7030,
c_flags = 0xf8f02544, c_optimize = 0x0, c_interactive = 0x0,
c_nestlevel = 0x2, c_const_cache = 0xfae81280, u = 0xfaea51b8,
c_stack = 0xfae57a88, c_arena = 0xfaec0c90}
(gdb) p/x *c->u
$37 = {u_ste = 0x2b, u_name = 0xfae7ff80, u_qualname = 0xfae7ff80,
u_scope_type = 0x2, u_consts = 0xfaeaa7f8, u_names = 0xfaeaa7d0,
u_varnames = 0xfaeaa780, u_cellvars = 0xfaeaa7a8, u_freevars = 0xfaeaa758,
u_private = 0x0, u_argcount = 0x2, u_posonlyargcount = 0x0,
u_kwonlyargcount = 0x0, u_blocks = 0xfaeaa908, u_curblock = 0xfaeaa868,
u_nfblocks = 0x1, u_fblock = {{fb_type = 0x1, fb_block = 0xfaeaa840,
fb_exit = 0xfaeaa8b8, fb_datum = 0x0}, {fb_type = 0x0, fb_block = 0x0,
fb_exit = 0x0, fb_datum = 0x0} <repeats 19 times>},
u_firstlineno = 0x28, u_lineno = 0x2b, u_col_offset = 0xb,
u_end_lineno = 0x2b, u_end_col_offset = 0x20,
u_need_new_implicit_block = 0x0}
(gdb) p/x $r23
$38 = 0x2b
#define SET_LOC(c, x) \
(c)->u->u_lineno = (x)->lineno; \
(c)->u->u_col_offset = (x)->col_offset; \
(c)->u->u_end_lineno = (x)->end_lineno; \
(c)->u->u_end_col_offset = (x)->end_col_offset;
(gdb) p/x *e
$40 = {kind = 0x18, v = {BoolOp = {op = 0xfaeb8b60, values = 0x1},
NamedExpr = {target = 0xfaeb8b60, value = 0x1}, BinOp = {
left = 0xfaeb8b60, op = 0x1, right = 0x0}, UnaryOp = {op = 0xfaeb8b60,
operand = 0x1}, Lambda = {args = 0xfaeb8b60, body = 0x1}, IfExp = {
test = 0xfaeb8b60, body = 0x1, orelse = 0x0}, Dict = {keys = 0xfaeb8b60,
values = 0x1}, Set = {elts = 0xfaeb8b60}, ListComp = {elt = 0xfaeb8b60,
generators = 0x1}, SetComp = {elt = 0xfaeb8b60, generators = 0x1},
DictComp = {key = 0xfaeb8b60, value = 0x1, generators = 0x0},
GeneratorExp = {elt = 0xfaeb8b60, generators = 0x1}, Await = {
value = 0xfaeb8b60}, Yield = {value = 0xfaeb8b60}, YieldFrom = {
value = 0xfaeb8b60}, Compare = {left = 0xfaeb8b60, ops = 0x1,
comparators = 0x0}, Call = {func = 0xfaeb8b60, args = 0x1,
keywords = 0x0}, FormattedValue = {value = 0xfaeb8b60, conversion = 0x1,
format_spec = 0x0}, JoinedStr = {values = 0xfaeb8b60}, Constant = {
value = 0xfaeb8b60, kind = 0x1}, Attribute = {value = 0xfaeb8b60,
attr = 0x1, ctx = 0x0}, Subscript = {value = 0xfaeb8b60, slice = 0x1,
ctx = 0x0}, Starred = {value = 0xfaeb8b60, ctx = 0x1}, Name = {
id = 0xfaeb8b60, ctx = 0x1}, List = {elts = 0xfaeb8b60, ctx = 0x1},
Tuple = {elts = 0xfaeb8b60, ctx = 0x1}, Slice = {lower = 0xfaeb8b60,
upper = 0x1, step = 0x0}}, lineno = 0x2b, col_offset = 0x18,
end_lineno = 0x2b, end_col_offset = 0x1f}
Seems like an offset issue.
From: danglin at gcc dot gnu.org @ 2023-11-09 19:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #30 from John David Anglin <danglin at gcc dot gnu.org> ---
0x0019c684 <+588>: stw r23,0(r22)
=> 0x0019c688 <+592>: stw ret1,0(r21)
0x0019c68c <+596>: stw r31,0(r20)
0x0019c690 <+600>: b,l 0x198d58 <compiler_visit_expr1>,rp
0x0019c694 <+604>: stw ret0,0(r19)
These instructions are in a loop:
/* No * or ** args, so can use faster calling sequence */
for (i = 0; i < nelts; i++) {
expr_ty elt = asdl_seq_GET(args, i);
assert(elt->kind != Starred_kind);
VISIT(c, expr, elt);
}
r21 is clobbered by the VISIT call. The value is okay in the first iteration.
The initialization instructions are outside the loop:
0x0019c638 <+512>: ldo 184(r19),r22
0x0019c63c <+516>: ldw 184(r19),r14
0x0019c640 <+520>: ldo 188(r19),r21
0x0019c644 <+524>: ldw 188(r19),r13
0x0019c648 <+528>: ldo 18c(r19),r20
0x0019c64c <+532>: ldw 18c(r19),r12
0x0019c650 <+536>: ldw 190(r19),r11
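To illustrate the failure mode described above, here is a minimal C sketch of a toy register file. It is not derived from GCC or from the PA ABI tables: the choice of r4 as a preserved base register, the set of registers that `toy_call` overwrites (r20-r22), and the names `toy_call` and `run_loop` are all assumptions made for the example. It only shows that an address computed once before the loop into a call-clobbered register is valid for the first iteration and stale afterwards.

```c
#include <string.h>

enum { NREGS = 32 };

/* Toy register file; purely illustrative. */
struct regs { unsigned long r[NREGS]; };

/* Model a call: the callee is free to overwrite call-clobbered registers.
   Treating exactly r20-r22 as clobbered is an assumption of this sketch. */
void toy_call(struct regs *rf)
{
    for (int i = 20; i <= 22; i++)
        rf->r[i] = 0xdeadbeefUL;
}

/* Run the loop and return how many iterations saw the expected store
   address in r21. If reinit_each_iteration is zero, the address is only
   computed once before the loop (the hoisted form seen in the dump). */
int run_loop(int iterations, int reinit_each_iteration)
{
    struct regs rf;
    const unsigned long base = 0x1000;
    int good = 0;

    memset(&rf, 0, sizeof rf);
    rf.r[4] = base;               /* hypothetical preserved base register */
    rf.r[21] = rf.r[4] + 0x188;   /* address computed outside the loop */

    for (int i = 0; i < iterations; i++) {
        if (reinit_each_iteration)
            rf.r[21] = rf.r[4] + 0x188;   /* recompute inside the loop */
        if (rf.r[21] == base + 0x188)     /* would the store hit the right slot? */
            good++;
        toy_call(&rf);                    /* clobbers r21 */
    }
    return good;
}
```

Under these assumptions, `run_loop(3, 0)` counts one good store (the first iteration), while `run_loop(3, 1)`, which recomputes the address inside the loop, counts three.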
From: law at gcc dot gnu.org @ 2023-11-09 20:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #31 from Jeffrey A. Law <law at gcc dot gnu.org> ---
IIRC r21 is call-clobbered. So I guess the question turns into what was the
sequence before f-m-o got involved -- was it assuming r21 would be preserved,
or did f-m-o make r21 live across the call?
From: dave.anglin at bell dot net @ 2023-11-09 20:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #32 from dave.anglin at bell dot net ---
At this point, I don't have gcc-14 builds that bracket the f-m-o change. Maybe
Sam can check.
I'm trying to determine the RTL pass where things go bad.
From: danglin at gcc dot gnu.org @ 2023-11-09 23:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #33 from John David Anglin <danglin at gcc dot gnu.org> ---
Created attachment 56549
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56549&action=edit
ira and reload dumps for compiler_call_helper
The incorrect code for insn 246 in compiler_call_helper appears in the reload
pass.
From: danglin at gcc dot gnu.org @ 2023-11-11 19:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #34 from John David Anglin <danglin at gcc dot gnu.org> ---
The same wrong code is generated with an x86-64 cross to hppa-linux-gnu. Thus it
seems this bug is not due to gcc being miscompiled.
From: sjames at gcc dot gnu.org @ 2023-11-11 19:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #35 from Sam James <sjames at gcc dot gnu.org> ---
If you still need dumps from me, please let me know which. I've attached those
with f-m-o on for the fold-mem-offsets pass. If you need others, just say.
From: danglin at gcc dot gnu.org @ 2023-11-11 20:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #36 from John David Anglin <danglin at gcc dot gnu.org> ---
Created attachment 56562
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56562&action=edit
fold_mem_offsets, prop_hardreg, rtl_dce and bbro dumps
Comment #33 is wrong. The issue is not reload. It's okay to pick a
call-clobbered register as the code stands.
The initialization of the register used for the store at
offset 392B ends up outside the loop. It ends up in a call-clobbered
register and is clobbered by the call to compiler_visit_expr1 in the loop.
This occurs around the second call to compiler_visit_expr1 in
compiler_call_helper.
Various initializations get moved out of the loop between the f-m-o and bbro
passes. I think it's the bbro pass that's at fault, but it could be something
that happens before that causes the initialization to get moved outside the
loop.
From: danglin at gcc dot gnu.org @ 2023-11-11 20:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #37 from John David Anglin <danglin at gcc dot gnu.org> ---
(In reply to Sam James from comment #35)
> If you still need dumps from me, please let me know which. I've attached
> those with f-m-o on for the fold-mem-offsets pass. If you need others, just
> say.
I have a set of dumps. The problem is determining where the wrong RTL
occurs in compiler_call_helper. It changes a lot from pass to pass.
Many of the changes in f-m-o seem to get destroyed by later transformations.
From: sjames at gcc dot gnu.org @ 2023-11-11 20:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
Sam James <sjames at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
Last reconfirmed| |2023-11-11
--- Comment #38 from Sam James <sjames at gcc dot gnu.org> ---
Confirming since Dave repro'd too.
From: danglin at gcc dot gnu.org @ 2023-11-11 21:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #39 from John David Anglin <danglin at gcc dot gnu.org> ---
In the f-m-o pass, the following three insns that set call-clobbered
registers r20-r22 are pulled from the loop:
(insn 186 183 190 29 (set (reg/f:SI 22 %r22 [478])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 388 [0x184]))) "../Python/compile.c":5964:9 120 {addsi3}
(nil))
(insn 190 186 187 29 (set (reg/f:SI 21 %r21 [479])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 392 [0x188]))) "../Python/compile.c":5964:9 120 {addsi3}
(nil))
(insn 194 191 195 29 (set (reg/f:SI 20 %r20 [480])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 396 [0x18c]))) "../Python/compile.c":5964:9 120 {addsi3}
(nil))
They are used in the following insns before the call to compiler_visit_expr1:
(insn 242 238 258 32 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(int
*)prephitmp_37 + 388B]+0 S4 A32])
(reg:SI 23 %r23 [orig:173 vect__102.2442 ] [173]))
"../Python/compile.c":5968:22 42 {*pa.md:2193}
(expr_list:REG_DEAD (reg:SI 23 %r23 [orig:173 vect__102.2442 ] [173])
(expr_list:REG_DEAD (reg/f:SI 22 %r22 [478])
(nil))))
(insn 258 242 246 32 (set (reg:SI 26 %r26)
(reg/v/f:SI 5 %r5 [orig:198 c ] [198])) "../Python/compile.c":5969:15
42 {*pa.md:2193}
(nil))
(insn 246 258 250 32 (set (mem:SI (reg/f:SI 21 %r21 [479]) [4 MEM[(int
*)prephitmp_37 + 392B]+0 S4 A32])
(reg:SI 29 %r29 [orig:169 vect__102.2443 ] [169]))
"../Python/compile.c":5968:22 42 {*pa.md:2193}
(expr_list:REG_DEAD (reg:SI 29 %r29 [orig:169 vect__102.2443 ] [169])
(expr_list:REG_DEAD (reg/f:SI 21 %r21 [479])
(nil))))
(insn 250 246 254 32 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int
*)prephitmp_37 + 396B]+0 S4 A32])
(reg:SI 31 %r31 [orig:145 vect__102.2444 ] [145]))
"../Python/compile.c":5968:22 42 {*pa.md:2193}
(expr_list:REG_DEAD (reg:SI 31 %r31 [orig:145 vect__102.2444 ] [145])
(expr_list:REG_DEAD (reg/f:SI 20 %r20 [480])
(nil))))
After the call, we have:
(insn 1241 269 273 30 (set (reg/f:SI 22 %r22 [478])
(reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127]))
"../Python/compile.c":5970:20 -1
(nil))
(insn 273 1241 1242 30 (set (mem:SI (plus:SI (reg/f:SI 22 %r22 [478])
(const_int 388 [0x184])) [4 MEM[(int *)_107 + 388B]+0 S4 A32])
(reg:SI 14 %r14 [orig:167 vect_pretmp_36.2450 ] [167]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
(insn 1242 273 277 30 (set (reg/f:SI 21 %r21 [479])
(reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127]))
"../Python/compile.c":5970:20 -1
(nil))
(insn 277 1242 1243 30 (set (mem:SI (plus:SI (reg/f:SI 21 %r21 [479])
(const_int 392 [0x188])) [4 MEM[(int *)_107 + 392B]+0 S4 A32])
(reg:SI 13 %r13 [orig:156 vect_pretmp_36.2451 ] [156]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
(insn 1243 277 281 30 (set (reg/f:SI 20 %r20 [480])
(reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127]))
"../Python/compile.c":5970:20 -1
(nil))
(insn 281 1243 299 30 (set (mem:SI (plus:SI (reg/f:SI 20 %r20 [480])
(const_int 396 [0x18c])) [4 MEM[(int *)_107 + 396B]+0 S4 A32])
(reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
We have lost the offsets that were added initially to r20, r21 and r22.
Previous ce3 pass had:
(insn 272 269 273 30 (set (reg/f:SI 22 %r22 [478])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 388 [0x184]))) "../Python/compile.c":5970:20 120
{addsi3}
(nil))
(insn 273 272 276 30 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(int *)_107 +
388B]+0 S4 A32])
(reg:SI 14 %r14 [orig:167 vect_pretmp_36.2450 ] [167]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
(insn 276 273 277 30 (set (reg/f:SI 21 %r21 [479])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 392 [0x188]))) "../Python/compile.c":5970:20 120
{addsi3}
(nil))
(insn 277 276 280 30 (set (mem:SI (reg/f:SI 21 %r21 [479]) [4 MEM[(int *)_107 +
392B]+0 S4 A32])
(reg:SI 13 %r13 [orig:156 vect_pretmp_36.2451 ] [156]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
(insn 280 277 281 30 (set (reg/f:SI 20 %r20 [480])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 396 [0x18c]))) "../Python/compile.c":5970:20 120
{addsi3}
(nil))
(insn 281 280 284 30 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int *)_107 +
396B]+0 S4 A32])
(reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
So, this is a f-m-o bug.
From: danglin at gcc dot gnu.org @ 2023-11-12 15:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #40 from John David Anglin <danglin at gcc dot gnu.org> ---
Jeff,
I don't think these split instructions make a lot of sense on PA-RISC.
(insn 280 277 281 30 (set (reg/f:SI 20 %r20 [480])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 396 [0x18c]))) "../Python/compile.c":5970:20 120
{addsi3}
(nil))
(insn 281 280 284 30 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int *)_107 +
396B]+0 S4 A32])
(reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
They increase code size and register pressure. That may lead to unnecessary
spills and longer branches. They also increase the probability of problems like
the one in this PR.
I suspect the two instructions generated are actually slower than one with a
nonzero memory offset. It's not clear that memory accesses with a zero offset
are faster than ones with nonzero offsets.
Integer loads and stores on pa support fairly large offsets.
I think we need to look at why this happens frequently.
Thoughts?
From: law at gcc dot gnu.org @ 2023-11-12 15:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #41 from Jeffrey A. Law <law at gcc dot gnu.org> ---
I would agree. In fact, the whole point of the f-m-o pass is to bring those
immediates into the memory reference. It'd be really useful to know why that
isn't happening.
The only thing I can think of would be if multiple instructions needed the %r20
in the RTL you attached. Which might point to a refinement we should make in
f-m-o, specifically the transformation isn't likely profitable if we aren't
able to fold away a term or fold a constant term into the actual memory
reference.
From: danglin at gcc dot gnu.org @ 2023-11-12 23:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #42 from John David Anglin <danglin at gcc dot gnu.org> ---
The problem is we are limiting displacements to five bits in
pa_legitimate_address_p. The comment is somewhat confusing but
we may have reload issues if we allow 14-bit displacements before
reload completes. Testing a patch to see if we can allow 14-bit
displacements before reload.
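As a rough illustration of the two displacement ranges under discussion, the following C sketch checks whether an offset fits in a signed 5-bit or a signed 14-bit field. The helper names `fits_signed` and `disp_ok` are hypothetical, and this is not the code of pa_legitimate_address_p; it only assumes two's-complement signed fields of the stated widths.

```c
#include <stdbool.h>

/* Return true if disp fits in a signed two's-complement field of the
   given width. Only meaningful for 1 <= bits < width of long. */
bool fits_signed(long disp, unsigned bits)
{
    long lo = -(1L << (bits - 1));
    long hi = (1L << (bits - 1)) - 1;
    return disp >= lo && disp <= hi;
}

/* Hypothetical policy switch: accept 14-bit displacements when allowed,
   otherwise restrict to the short 5-bit form. */
bool disp_ok(long disp, bool allow_14bit)
{
    return fits_signed(disp, allow_14bit ? 14 : 5);
}
```

With these assumptions, the offsets seen in this PR (388, 392, 396) fit in 14 bits but not in 5, which is consistent with the address being computed into a separate register first.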
From: law at gcc dot gnu.org @ 2023-11-13 0:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #43 from Jeffrey A. Law <law at gcc dot gnu.org> ---
I would expect allowing larger offsets before reload to be a significant
problem.
The core issue is integer memory operations allow 14 bits while FP only allows
5. During reloading we don't know if any given memory reference is FP or
integer. xmpyu plays a role here too since it's going to require FP registers
in integer modes.
But what I don't understand is why f-m-o fails to push the offset into the
memory reference -- it should be conditional on the insn being recognized. And
since it's after reload we know if we're doing an FP or integer load.
From: manolis.tsamis at vrull dot eu @ 2023-11-13 9:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #44 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
(In reply to John David Anglin from comment #39)
> In the f-m-o pass, the following three insns that set call-clobbered
> registers r20-r22 are pulled from the loop:
>
> (insn 186 183 190 29 (set (reg/f:SI 22 %r22 [478])
> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> (const_int 388 [0x184]))) "../Python/compile.c":5964:9 120
> {addsi3}
> (nil))
> (insn 190 186 187 29 (set (reg/f:SI 21 %r21 [479])
> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> (const_int 392 [0x188]))) "../Python/compile.c":5964:9 120
> {addsi3}
> (nil))
> (insn 194 191 195 29 (set (reg/f:SI 20 %r20 [480])
> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> (const_int 396 [0x18c]))) "../Python/compile.c":5964:9 120
> {addsi3}
> (nil))
>
> They are used in the following insns before the call to compiler_visit_expr1:
>
> (insn 242 238 258 32 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(int
> *)prephitmp_37 + 388B]+0 S4 A32])
> (reg:SI 23 %r23 [orig:173 vect__102.2442 ] [173]))
> "../Python/compile.c":5968:22 42 {*pa.md:2193}
> (expr_list:REG_DEAD (reg:SI 23 %r23 [orig:173 vect__102.2442 ] [173])
> (expr_list:REG_DEAD (reg/f:SI 22 %r22 [478])
> (nil))))
> (insn 258 242 246 32 (set (reg:SI 26 %r26)
> (reg/v/f:SI 5 %r5 [orig:198 c ] [198]))
> "../Python/compile.c":5969:15 42 {*pa.md:2193}
> (nil))
> (insn 246 258 250 32 (set (mem:SI (reg/f:SI 21 %r21 [479]) [4 MEM[(int
> *)prephitmp_37 + 392B]+0 S4 A32])
> (reg:SI 29 %r29 [orig:169 vect__102.2443 ] [169]))
> "../Python/compile.c":5968:22 42 {*pa.md:2193}
> (expr_list:REG_DEAD (reg:SI 29 %r29 [orig:169 vect__102.2443 ] [169])
> (expr_list:REG_DEAD (reg/f:SI 21 %r21 [479])
> (nil))))
> (insn 250 246 254 32 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int
> *)prephitmp_37 + 396B]+0 S4 A32])
> (reg:SI 31 %r31 [orig:145 vect__102.2444 ] [145]))
> "../Python/compile.c":5968:22 42 {*pa.md:2193}
> (expr_list:REG_DEAD (reg:SI 31 %r31 [orig:145 vect__102.2444 ] [145])
> (expr_list:REG_DEAD (reg/f:SI 20 %r20 [480])
> (nil))))
>
> After the call, we have:
>
> (insn 1241 269 273 30 (set (reg/f:SI 22 %r22 [478])
> (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127]))
> "../Python/compile.c":5970:20 -1
> (nil))
> (insn 273 1241 1242 30 (set (mem:SI (plus:SI (reg/f:SI 22 %r22 [478])
> (const_int 388 [0x184])) [4 MEM[(int *)_107 + 388B]+0 S4
> A32])
> (reg:SI 14 %r14 [orig:167 vect_pretmp_36.2450 ] [167]))
> "../Python/compile.c":5970:20 42 {*pa.md:2193}
> (nil))
> (insn 1242 273 277 30 (set (reg/f:SI 21 %r21 [479])
> (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127]))
> "../Python/compile.c":5970:20 -1
> (nil))
> (insn 277 1242 1243 30 (set (mem:SI (plus:SI (reg/f:SI 21 %r21 [479])
> (const_int 392 [0x188])) [4 MEM[(int *)_107 + 392B]+0 S4
> A32])
> (reg:SI 13 %r13 [orig:156 vect_pretmp_36.2451 ] [156]))
> "../Python/compile.c":5970:20 42 {*pa.md:2193}
> (nil))
> (insn 1243 277 281 30 (set (reg/f:SI 20 %r20 [480])
> (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127]))
> "../Python/compile.c":5970:20 -1
> (nil))
> (insn 281 1243 299 30 (set (mem:SI (plus:SI (reg/f:SI 20 %r20 [480])
> (const_int 396 [0x18c])) [4 MEM[(int *)_107 + 396B]+0 S4
> A32])
> (reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134]))
> "../Python/compile.c":5970:20 42 {*pa.md:2193}
> (nil))
>
> We have lost the offsets that were added initially to r20, r21 and r22.
>
> Previous ce3 pass had:
>
> (insn 272 269 273 30 (set (reg/f:SI 22 %r22 [478])
> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> (const_int 388 [0x184]))) "../Python/compile.c":5970:20 120
> {addsi3}
> (nil))
> (insn 273 272 276 30 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(int
> *)_107 + 388B]+0 S4 A32])
> (reg:SI 14 %r14 [orig:167 vect_pretmp_36.2450 ] [167]))
> "../Python/compile.c":5970:20 42 {*pa.md:2193}
> (nil))
> (insn 276 273 277 30 (set (reg/f:SI 21 %r21 [479])
> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> (const_int 392 [0x188]))) "../Python/compile.c":5970:20 120
> {addsi3}
> (nil))
> (insn 277 276 280 30 (set (mem:SI (reg/f:SI 21 %r21 [479]) [4 MEM[(int
> *)_107 + 392B]+0 S4 A32])
> (reg:SI 13 %r13 [orig:156 vect_pretmp_36.2451 ] [156]))
> "../Python/compile.c":5970:20 42 {*pa.md:2193}
> (nil))
> (insn 280 277 281 30 (set (reg/f:SI 20 %r20 [480])
> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> (const_int 396 [0x18c]))) "../Python/compile.c":5970:20 120
> {addsi3}
> (nil))
> (insn 281 280 284 30 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int
> *)_107 + 396B]+0 S4 A32])
> (reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134]))
> "../Python/compile.c":5970:20 42 {*pa.md:2193}
> (nil))
>
> So, this is a f-m-o bug.
Hi Dave,
I don't see an f-m-o bug here. The offsets aren't lost, they're just moved into
the corresponding memory loads/stores. If you look at the stores in ce3, they
don't have offsets, whereas after f-m-o they do. E.g. in ce3: (insn 273 272
276 30 (set (mem:SI (reg/f:SI 22 %r22 [478]) ...) but in f-m-o it is (insn 273
1241 1242 30 (set (mem:SI (plus:SI (reg/f:SI 22 %r22 [478]) (const_int 388
[0x184]) ...).
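The equivalence described above can be sketched in C as a toy model, assuming an address is simply a base register value plus a constant displacement. The struct and function names below are hypothetical and do not correspond to anything in the f-m-o implementation.

```c
#include <stdint.h>

/* Toy address: value held in the address register plus the displacement
   encoded in the memory operand. */
struct addr {
    uintptr_t base;
    long disp;
};

/* Before folding: tmp = base + off; the store uses 0(tmp). */
struct addr before_fold(uintptr_t base, long off)
{
    struct addr a = { base + (uintptr_t)off, 0 };
    return a;
}

/* After folding: tmp = base; the store uses off(tmp). */
struct addr after_fold(uintptr_t base, long off)
{
    struct addr a = { base, off };
    return a;
}

/* Effective address actually accessed by the store. */
uintptr_t effective(struct addr a)
{
    return a.base + (uintptr_t)a.disp;
}
```

In this model both forms access the same effective address; only the split between the register value and the displacement differs, which is why the add can be reduced to a plain register copy.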
This is the way that f-m-o works. It can also be seen in the f-m-o dumps, where
offset changes to memory ops are reported as 'Memory offset changed' and
instructions which got their offset propagated (like insns 272, 276, 280) are
reported as 'Instruction folded':
Memory offset changed from 0 to 388 for instruction:
(insn 273 272 276 30 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(int *)_107 +
388B]+0 S4 A32])
(reg:SI 14 %r14 [orig:167 vect_pretmp_36.2450 ] [167]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
deferring rescan insn with uid = 273.
Memory offset changed from 0 to 392 for instruction:
(insn 277 276 280 30 (set (mem:SI (reg/f:SI 21 %r21 [479]) [4 MEM[(int *)_107 +
392B]+0 S4 A32])
(reg:SI 13 %r13 [orig:156 vect_pretmp_36.2451 ] [156]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
deferring rescan insn with uid = 277.
Memory offset changed from 0 to 396 for instruction:
(insn 281 280 284 30 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int *)_107 +
396B]+0 S4 A32])
(reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
deferring rescan insn with uid = 281.
Memory offset changed from 0 to 400 for instruction:
(insn 285 301 286 30 (set (mem:SI (reg/f:SI 19 %r19 [481]) [4 MEM[(int *)_107 +
400B]+0 S4 A32])
(reg:SI 11 %r11 [orig:133 vect_pretmp_36.2453 ] [133]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
deferring rescan insn with uid = 285.
Instruction folded:(insn 272 269 273 30 (set (reg/f:SI 22 %r22 [478])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 388 [0x184]))) "../Python/compile.c":5970:20 120
{addsi3}
(nil))
deferring rescan insn with uid = 1241.
deferring rescan insn with uid = 1241.
deferring deletion of insn with uid = 272.
Instruction folded:(insn 276 273 277 30 (set (reg/f:SI 21 %r21 [479])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 392 [0x188]))) "../Python/compile.c":5970:20 120
{addsi3}
(nil))
deferring rescan insn with uid = 1242.
deferring rescan insn with uid = 1242.
deferring deletion of insn with uid = 276.
Instruction folded:(insn 280 277 281 30 (set (reg/f:SI 20 %r20 [480])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 396 [0x18c]))) "../Python/compile.c":5970:20 120
{addsi3}
(nil))
deferring rescan insn with uid = 1243.
deferring rescan insn with uid = 1243.
deferring deletion of insn with uid = 280.
Instruction folded:(insn 284 281 299 30 (set (reg/f:SI 19 %r19 [481])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 400 [0x190]))) "../Python/compile.c":5970:20 120
{addsi3}
(nil))
If I'm missing something that makes this illegal please explain it to me.
Thanks,
Manolis
From: manolis.tsamis at vrull dot eu @ 2023-11-13 9:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #45 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
(In reply to Jeffrey A. Law from comment #41)
> I would agree. In fact, the whole point of the f-m-o pass is to bring those
> immediates into the memory reference. It'd be really useful to know why
> that isn't happening.
>
> The only thing I can think of would be if multiple instructions needed the
> %r20 in the RTL you attached. Which might point to a refinement we should
> make in f-m-o, specifically the transformation isn't likely profitable if we
> aren't able to fold away a term or fold a constant term into the actual
> memory reference.
Jeff,
I'm confused about "It'd be really useful to know why that isn't happening.".
It can be seen in Dave's dumps that it *is* happening, e.g.:
Memory offset changed from 0 to 396 for instruction:
(insn 281 280 284 30 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int *)_107 +
396B]+0 S4 A32])
(reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134]))
"../Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
Instruction folded:(insn 280 277 281 30 (set (reg/f:SI 20 %r20 [480])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 396 [0x18c]))) "../Python/compile.c":5970:20 120
{addsi3}
(nil))
If you look at the RTL after f-m-o, all these offsets are indeed moved into the
respective load/store. I don't know if cprop afterwards manages to eliminate
the unwanted move, but f-m-o does what it's supposed to do in this case.
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (45 preceding siblings ...)
2023-11-13 9:37 ` manolis.tsamis at vrull dot eu
@ 2023-11-13 13:20 ` manolis.tsamis at vrull dot eu
2023-11-13 15:06 ` dave.anglin at bell dot net
` (8 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: manolis.tsamis at vrull dot eu @ 2023-11-13 13:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #46 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
I have reproduced the segfault with f-m-o limited to only fold insn 272 from
compiler_call_helper. The exact transformation is:
Memory offset changed from 0 to 388 for instruction:
(insn 273 272 276 30 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(intD.1
*)_107 + 388B]+0 S4 A32])
(reg:SI 14 %r14 [orig:167 vect_pretmp_36.2448D.32932 ] [167]))
"Python/compile.c":5970:20 42 {*pa.md:2193}
(nil))
deferring rescan insn with uid = 273.
Instruction folded:(insn 272 269 273 30 (set (reg/f:SI 22 %r22 [478])
(plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
(const_int 388 [0x184]))) "Python/compile.c":5970:20 120 {addsi3}
(nil))
This instruction is also among the ones that Dave mentioned. Again, if
I'm missing something as to why this transformation is illegal please tell me.
Given these are also consecutive instructions, I'm just seeing here that
%r22 = %r19 + 388
[%r22] = %r14
is transformed to
%r22 = %r19
[%r22 + 388] = %r14
I haven't tracked all other uses of %r22 yet, but in theory if there was any
non-foldable use of that register then the transformation wouldn't be made.
Manolis
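As a minimal sketch of the two-insn rewrite described above (a toy model, not
the pass itself): the register names mirror the dump, while the base address
and the stored value are made-up assumptions. It shows that the store lands at
the same address either way, but %r22 is left holding a different value.

```python
# Toy model of insns 272/273 before and after folding.
# The base address (0x1000) and the value in r14 (7) are hypothetical.

def store_before_fold(regs, mem):
    regs["r22"] = regs["r19"] + 388        # insn 272: %r22 = %r19 + 388
    mem[regs["r22"]] = regs["r14"]         # insn 273: [%r22] = %r14

def store_after_fold(regs, mem):
    regs["r22"] = regs["r19"]              # insn 1241: %r22 = %r19
    mem[regs["r22"] + 388] = regs["r14"]   # insn 273: [%r22 + 388] = %r14

regs_a = {"r19": 0x1000, "r14": 7, "r22": 0}
regs_b = dict(regs_a)
mem_a, mem_b = {}, {}
store_before_fold(regs_a, mem_a)
store_after_fold(regs_b, mem_b)

same_store = mem_a == mem_b                  # the store itself is unchanged
same_r22 = regs_a["r22"] == regs_b["r22"]    # but %r22 now differs by 388
```

So the fold is only safe if every remaining use of %r22 is either rewritten
as well or does not depend on the added offset.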
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (46 preceding siblings ...)
2023-11-13 13:20 ` manolis.tsamis at vrull dot eu
@ 2023-11-13 15:06 ` dave.anglin at bell dot net
2023-11-13 15:26 ` manolis.tsamis at vrull dot eu
` (7 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: dave.anglin at bell dot net @ 2023-11-13 15:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #47 from dave.anglin at bell dot net ---
On 2023-11-13 4:33 a.m., manolis.tsamis at vrull dot eu wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>
> --- Comment #44 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
> (In reply to John David Anglin from comment #39)
>> In the f-m-o pass, the following three insns that set call clobbered
>> registers r20-r22 are pulled from loop:
>>
>> (insn 186 183 190 29 (set (reg/f:SI 22 %r22 [478])
>> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
>> (const_int 388 [0x184]))) "../Python/compile.c":5964:9 120
>> {addsi3}
>> (nil))
>> (insn 190 186 187 29 (set (reg/f:SI 21 %r21 [479])
>> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
>> (const_int 392 [0x188]))) "../Python/compile.c":5964:9 120
>> {addsi3}
>> (nil))
>> (insn 194 191 195 29 (set (reg/f:SI 20 %r20 [480])
>> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
>> (const_int 396 [0x18c]))) "../Python/compile.c":5964:9 120
>> {addsi3}
>> (nil))
>>
>> They are used in the following insns before call to compiler_visit_expr1:
>>
>> (insn 242 238 258 32 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(int
>> *)prephitmp_37 + 388B]+0 S4 A32])
>> (reg:SI 23 %r23 [orig:173 vect__102.2442 ] [173]))
>> "../Python/compile.c":5968:22 42 {*pa.md:2193}
>> (expr_list:REG_DEAD (reg:SI 23 %r23 [orig:173 vect__102.2442 ] [173])
>> (expr_list:REG_DEAD (reg/f:SI 22 %r22 [478])
>> (nil))))
>> (insn 258 242 246 32 (set (reg:SI 26 %r26)
>> (reg/v/f:SI 5 %r5 [orig:198 c ] [198]))
>> "../Python/compile.c":5969:15 42 {*pa.md:2193}
>> (nil))
>> (insn 246 258 250 32 (set (mem:SI (reg/f:SI 21 %r21 [479]) [4 MEM[(int
>> *)prephitmp_37 + 392B]+0 S4 A32])
>> (reg:SI 29 %r29 [orig:169 vect__102.2443 ] [169]))
>> "../Python/compile.c":5968:22 42 {*pa.md:2193}
>> (expr_list:REG_DEAD (reg:SI 29 %r29 [orig:169 vect__102.2443 ] [169])
>> (expr_list:REG_DEAD (reg/f:SI 21 %r21 [479])
>> (nil))))
>> (insn 250 246 254 32 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int
>> *)prephitmp_37 + 396B]+0 S4 A32])
>> (reg:SI 31 %r31 [orig:145 vect__102.2444 ] [145]))
>> "../Python/compile.c":5968:22 42 {*pa.md:2193}
>> (expr_list:REG_DEAD (reg:SI 31 %r31 [orig:145 vect__102.2444 ] [145])
>> (expr_list:REG_DEAD (reg/f:SI 20 %r20 [480])
>> (nil))))
>>
>> After the call, we have:
>>
>> (insn 1241 269 273 30 (set (reg/f:SI 22 %r22 [478])
>> (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127]))
>> "../Python/compile.c":5970:20 -1
>> (nil))
>> (insn 273 1241 1242 30 (set (mem:SI (plus:SI (reg/f:SI 22 %r22 [478])
>> (const_int 388 [0x184])) [4 MEM[(int *)_107 + 388B]+0 S4
>> A32])
>> (reg:SI 14 %r14 [orig:167 vect_pretmp_36.2450 ] [167]))
>> "../Python/compile.c":5970:20 42 {*pa.md:2193}
>> (nil))
>> (insn 1242 273 277 30 (set (reg/f:SI 21 %r21 [479])
>> (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127]))
>> "../Python/compile.c":5970:20 -1
>> (nil))
>> (insn 277 1242 1243 30 (set (mem:SI (plus:SI (reg/f:SI 21 %r21 [479])
>> (const_int 392 [0x188])) [4 MEM[(int *)_107 + 392B]+0 S4
>> A32])
>> (reg:SI 13 %r13 [orig:156 vect_pretmp_36.2451 ] [156]))
>> "../Python/compile.c":5970:20 42 {*pa.md:2193}
>> (nil))
>> (insn 1243 277 281 30 (set (reg/f:SI 20 %r20 [480])
>> (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127]))
>> "../Python/compile.c":5970:20 -1
>> (nil))
>> (insn 281 1243 299 30 (set (mem:SI (plus:SI (reg/f:SI 20 %r20 [480])
>> (const_int 396 [0x18c])) [4 MEM[(int *)_107 + 396B]+0 S4
>> A32])
>> (reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134]))
>> "../Python/compile.c":5970:20 42 {*pa.md:2193}
>> (nil))
>>
>> We have lost the offsets that were added initially to r20, r21 and r22.
>>
>> Previous ce3 pass had:
>>
>> (insn 272 269 273 30 (set (reg/f:SI 22 %r22 [478])
>> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
>> (const_int 388 [0x184]))) "../Python/compile.c":5970:20 120
>> {addsi3}
>> (nil))
>> (insn 273 272 276 30 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(int
>> *)_107 + 388B]+0 S4 A32])
>> (reg:SI 14 %r14 [orig:167 vect_pretmp_36.2450 ] [167]))
>> "../Python/compile.c":5970:20 42 {*pa.md:2193}
>> (nil))
>> (insn 276 273 277 30 (set (reg/f:SI 21 %r21 [479])
>> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
>> (const_int 392 [0x188]))) "../Python/compile.c":5970:20 120
>> {addsi3}
>> (nil))
>> (insn 277 276 280 30 (set (mem:SI (reg/f:SI 21 %r21 [479]) [4 MEM[(int
>> *)_107 + 392B]+0 S4 A32])
>> (reg:SI 13 %r13 [orig:156 vect_pretmp_36.2451 ] [156]))
>> "../Python/compile.c":5970:20 42 {*pa.md:2193}
>> (nil))
>> (insn 280 277 281 30 (set (reg/f:SI 20 %r20 [480])
>> (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
>> (const_int 396 [0x18c]))) "../Python/compile.c":5970:20 120
>> {addsi3}
>> (nil))
>> (insn 281 280 284 30 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int
>> *)_107 + 396B]+0 S4 A32])
>> (reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134]))
>> "../Python/compile.c":5970:20 42 {*pa.md:2193}
>> (nil))
>>
>> So, this is a f-m-o bug.
> Hi Dave,
>
> I don't see an f-m-o bug here. The offsets aren't lost, they're just moved in
> the corresponding memory loads/stores. If you look the stores in ce3 they
> don't have offsets whereas after f-m-o they have. E.g. in ce3: (insn 273 272
> 276 30 (set (mem:SI (reg/f:SI 22 %r22 [478]) ...) but in f-m-o it is (insn 273
> 1241 1242 30 (set (mem:SI (plus:SI (reg/f:SI 22 %r22 [478]) (const_int 388
> [0x184]) ...).
>
> This is the way that f-m-o works. It can also be seen in the f-m-o dumps, where
> offsets changes to memory ops are reported as 'Memory offset changed' and
> instructions which got their offset propagated (like insns 272, 276, 280) are
> reported as 'Instruction folded':
Hi Manolis,
If you look at the f-m-o transformation applied to insn 272 and insn 273, you
will see that "reg/f:SI 22 %r22 [478]" is not dead after these insns. The
transformation changes the value of r22, which is wrong without changing all
uses of the register and adjusting the other sets for the register. It only
changed the use in insn 273 and not the uses earlier in the loop.
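The problem can be illustrated with a simplified model. It assumes, as a
reading of the dumps rather than something taken from the pass, that the block
after the call branches back to the store in insn 242; the loop shape, the
base address and the iteration count are all hypothetical. Offsets are
recorded relative to %r19.

```python
# Simplified model: insn 186 sets %r22 before the loop, insn 242 stores
# through [%r22] in the loop, the call clobbers %r22, and insn 272 (or its
# folded replacement, insn 1241) sets %r22 again before insn 273.

def run(folded, iterations=2):
    base = 0x1000                                # hypothetical value of %r19
    regs = {"r19": base, "r22": base + 388}      # insn 186, before the loop
    store_offsets = []
    for _ in range(iterations):
        store_offsets.append(regs["r22"] - base)      # insn 242: [%r22]
        regs["r22"] = None                            # call clobbers %r22
        if folded:
            regs["r22"] = regs["r19"]                 # insn 1241
            store_offsets.append(regs["r22"] + 388 - base)  # insn 273
        else:
            regs["r22"] = regs["r19"] + 388           # insn 272
            store_offsets.append(regs["r22"] - base)  # insn 273
    return store_offsets

unfolded = run(folded=False)
folded = run(folded=True)
```

In this model the unfolded sequence always stores at offset 388, while the
folded one stores at offset 0 when insn 242 runs on the second iteration,
because that use of %r22 was not adjusted.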
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (47 preceding siblings ...)
2023-11-13 15:06 ` dave.anglin at bell dot net
@ 2023-11-13 15:26 ` manolis.tsamis at vrull dot eu
2023-11-13 21:46 ` danglin at gcc dot gnu.org
` (6 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: manolis.tsamis at vrull dot eu @ 2023-11-13 15:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #48 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
(In reply to dave.anglin from comment #47)
> Hi Manolis,
>
> If you look at the f-m-o transformation applied to insn 272 and insn 273,
> you will see that
> "reg/f:SI 22 %r22 [478]" is not dead after these insns. The transformation
> changes the value
> of r22 which is wrong without changing all uses of the register and
> adjusting the other sets
> for the register. It only changed the use in insn 273 and not the uses
> earlier in the loop.
I see, thanks for pointing that out! I'll debug this further and see why
f-m-o's use detection code misses it.
Manolis
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (48 preceding siblings ...)
2023-11-13 15:26 ` manolis.tsamis at vrull dot eu
@ 2023-11-13 21:46 ` danglin at gcc dot gnu.org
2023-11-16 17:43 ` cvs-commit at gcc dot gnu.org
` (5 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: danglin at gcc dot gnu.org @ 2023-11-13 21:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #49 from John David Anglin <danglin at gcc dot gnu.org> ---
Created attachment 56576
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56576&action=edit
Patch to improve reg+d address handling
This patch revises pa_legitimate_address_p to allow 14-bit displacements
for all memory accesses before reload. Comments and flow in this routine
are improved.
So far, I haven't seen any issues related to reloading out-of-range
floating-point accesses.
This significantly improves code generation and saves more than two
thousand instructions in compile.s. I was able to successfully build
python with the patched compiler.
This is version two of the change and it still needs more testing.
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (49 preceding siblings ...)
2023-11-13 21:46 ` danglin at gcc dot gnu.org
@ 2023-11-16 17:43 ` cvs-commit at gcc dot gnu.org
2023-11-27 20:55 ` sjames at gcc dot gnu.org
` (4 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-16 17:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #50 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by John David Anglin <danglin@gcc.gnu.org>:
https://gcc.gnu.org/g:d2934eb6ae92471484469d8ddd039eb34ef400b1
commit r14-5538-gd2934eb6ae92471484469d8ddd039eb34ef400b1
Author: John David Anglin <danglin@gcc.gnu.org>
Date: Thu Nov 16 17:42:26 2023 +0000
hppa: Revise REG+D address support to allow long displacements before
reload
In analyzing PR rtl-optimization/112415, I realized that restricting
REG+D offsets to 5-bits before reload results in very poor code and
complexities in optimizing these instructions after reload. The
general problem is long displacements are not allowed for floating
point accesses when generating PA 1.1 code. Even with PA 2.0, there
is an ELF linker bug that prevents using long displacements for
floating point loads and stores.
In the past, enabling long displacements before reload caused issues
in reload. However, there have been fixes in the handling of reloads
for floating-point accesses. This change allows long displacements
before reload and corrects a couple of issues in the constraint
handling for integer and floating-point accesses.
2023-11-16 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR rtl-optimization/112415
* config/pa/pa.cc (pa_legitimate_address_p): Allow 14-bit
displacements before reload. Simplify logic flow. Revise
comments.
* config/pa/pa.h (TARGET_ELF64): New define.
(INT14_OK_STRICT): Update define and comment.
* config/pa/pa64-linux.h (TARGET_ELF64): Define.
* config/pa/predicates.md (base14_operand): Don't check
alignment of short displacements.
(integer_store_memory_operand): Don't return true when
reload_in_progress is true. Remove INT_5_BITS check.
(floating_point_store_memory_operand): Don't return true when
reload_in_progress is true. Use INT14_OK_STRICT to check
whether long displacements are always okay.
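For a sense of scale, here is a small sketch of the range check behind the
5-bit versus 14-bit distinction. It assumes the displacements are signed
two's-complement fields; the helper name is hypothetical and is not the
predicate used in pa.cc.

```python
# Hypothetical helper: does a displacement fit in a signed field of
# the given width?
def fits_signed(value, bits):
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi

# Offsets taken from the RTL dumps in this PR.
offsets = [388, 392, 396]
fit_5 = [fits_signed(d, 5) for d in offsets]    # range -16..15
fit_14 = [fits_signed(d, 14) for d in offsets]  # range -8192..8191
```

None of the offsets from the dumps fit in 5 bits, but all of them fit in 14,
which is consistent with the separate address adds seen before reload.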
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (50 preceding siblings ...)
2023-11-16 17:43 ` cvs-commit at gcc dot gnu.org
@ 2023-11-27 20:55 ` sjames at gcc dot gnu.org
2023-11-28 12:39 ` manolis.tsamis at vrull dot eu
` (3 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: sjames at gcc dot gnu.org @ 2023-11-27 20:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #51 from Sam James <sjames at gcc dot gnu.org> ---
manolis, did you have a chance to look at the remaining pass issue? You'll need
to revert Dave's commit locally which made the issue latent for building
Python.
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (51 preceding siblings ...)
2023-11-27 20:55 ` sjames at gcc dot gnu.org
@ 2023-11-28 12:39 ` manolis.tsamis at vrull dot eu
2024-03-18 0:22 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
55 siblings, 0 replies; 57+ messages in thread
From: manolis.tsamis at vrull dot eu @ 2023-11-28 12:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #52 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
(In reply to Sam James from comment #51)
> manolis, did you have a chance to look at the remaining pass issue? You'll
> need to revert Dave's commit locally which made the issue latent for
> building Python.
Hi Sam, I had to work on some other things so I didn't get to find a fix yet,
but I'll be working on that again now (in light of the new info from PR111601
too).
Thanks for the ping,
Manolis
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (52 preceding siblings ...)
2023-11-28 12:39 ` manolis.tsamis at vrull dot eu
@ 2024-03-18 0:22 ` cvs-commit at gcc dot gnu.org
2024-03-18 0:39 ` danglin at gcc dot gnu.org
2024-03-22 13:34 ` law at gcc dot gnu.org
55 siblings, 0 replies; 57+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-03-18 0:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #53 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by John David Anglin <danglin@gcc.gnu.org>:
https://gcc.gnu.org/g:f0fda1aff0b752e4182c009c5526b9306bd35f7c
commit r14-9511-gf0fda1aff0b752e4182c009c5526b9306bd35f7c
Author: John David Anglin <danglin@gcc.gnu.org>
Date: Mon Mar 18 00:19:36 2024 +0000
hppa: Improve handling of REG+D addresses when generating PA 2.0 code
In looking at PR 112415, it became clear that improvements could be
made in the handling of loads and stores using REG+D addresses. A
change in 2002 conflated two issues:
1) We can't generate insns with 14-bit displacements before reload
completes when generating PA 1.x code since floating-point loads and
stores only support 5-bit offsets in PA 1.x.
2) The GNU ELF 32-bit linker lacks relocation support for PA 2.0
floating point instructions with 14-bit displacements. These
relocations affect instructions with symbolic references.
The result of the change was to block creation of PA 2.0 instructions
with 14-bit REG+D displacements for SImode, DImode, SFmode and DFmode
on the GNU linux target before reload. This was unnecessary as these
instructions don't need relocation.
This change revises the INT14_OK_STRICT define to allow creation
of instructions with 14-bit REG+D addresses before reload when
generating PA 2.0 code.
2024-03-17 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR rtl-optimization/112415
* config/pa/pa.cc (pa_emit_move_sequence): Revise condition
for symbolic memory operands.
(pa_legitimate_address_p): Revise LO_SUM condition.
* config/pa/pa.h (INT14_OK_STRICT): Revise define. Move
comment about GNU linker to predicates.md.
* config/pa/predicates.md (floating_point_store_memory_operand):
Revise condition for symbolic memory operands. Update
comment.
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (53 preceding siblings ...)
2024-03-18 0:22 ` cvs-commit at gcc dot gnu.org
@ 2024-03-18 0:39 ` danglin at gcc dot gnu.org
2024-03-22 13:34 ` law at gcc dot gnu.org
55 siblings, 0 replies; 57+ messages in thread
From: danglin at gcc dot gnu.org @ 2024-03-18 0:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #54 from John David Anglin <danglin at gcc dot gnu.org> ---
The f-m-o issue is probably fixed.
* [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
` (54 preceding siblings ...)
2024-03-18 0:39 ` danglin at gcc dot gnu.org
@ 2024-03-22 13:34 ` law at gcc dot gnu.org
55 siblings, 0 replies; 57+ messages in thread
From: law at gcc dot gnu.org @ 2024-03-22 13:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
Jeffrey A. Law <law at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|NEW |RESOLVED
--- Comment #55 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Per c#54. If it turns out we're wrong, we can always reopen or file a new
report.
Thread overview: 57+ messages
2023-11-06 21:00 [Bug rtl-optimization/112415] New: [14 regression] Python 3.11 miscompiled with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94 sjames at gcc dot gnu.org
2023-11-06 21:00 ` [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA " sjames at gcc dot gnu.org
2023-11-06 21:01 ` sjames at gcc dot gnu.org
2023-11-06 21:03 ` pinskia at gcc dot gnu.org
2023-11-06 21:31 ` dave.anglin at bell dot net
2023-11-06 22:09 ` sjames at gcc dot gnu.org
2023-11-06 22:11 ` sjames at gcc dot gnu.org
2023-11-06 22:20 ` law at gcc dot gnu.org
2023-11-06 22:33 ` dave.anglin at bell dot net
2023-11-06 22:49 ` sjames at gcc dot gnu.org
2023-11-06 23:11 ` sjames at gcc dot gnu.org
2023-11-06 23:18 ` dave.anglin at bell dot net
2023-11-07 14:08 ` manolis.tsamis at vrull dot eu
2023-11-07 21:12 ` sjames at gcc dot gnu.org
2023-11-08 1:36 ` sjames at gcc dot gnu.org
2023-11-08 2:24 ` dave.anglin at bell dot net
2023-11-08 10:09 ` manolis.tsamis at vrull dot eu
2023-11-08 14:42 ` jeffreyalaw at gmail dot com
2023-11-08 18:59 ` dave.anglin at bell dot net
2023-11-08 19:07 ` pinskia at gcc dot gnu.org
2023-11-08 19:16 ` law at gcc dot gnu.org
2023-11-08 19:40 ` dave.anglin at bell dot net
2023-11-08 23:33 ` pinskia at gcc dot gnu.org
2023-11-08 23:40 ` danglin at gcc dot gnu.org
2023-11-08 23:51 ` sjames at gcc dot gnu.org
2023-11-09 0:00 ` dave.anglin at bell dot net
2023-11-09 0:02 ` sjames at gcc dot gnu.org
2023-11-09 0:07 ` law at gcc dot gnu.org
2023-11-09 0:08 ` dave.anglin at bell dot net
2023-11-09 0:23 ` dave.anglin at bell dot net
2023-11-09 18:04 ` danglin at gcc dot gnu.org
2023-11-09 19:17 ` danglin at gcc dot gnu.org
2023-11-09 20:28 ` law at gcc dot gnu.org
2023-11-09 20:41 ` dave.anglin at bell dot net
2023-11-09 23:41 ` danglin at gcc dot gnu.org
2023-11-11 19:40 ` danglin at gcc dot gnu.org
2023-11-11 19:51 ` sjames at gcc dot gnu.org
2023-11-11 20:00 ` danglin at gcc dot gnu.org
2023-11-11 20:06 ` danglin at gcc dot gnu.org
2023-11-11 20:19 ` sjames at gcc dot gnu.org
2023-11-11 21:54 ` danglin at gcc dot gnu.org
2023-11-12 15:05 ` danglin at gcc dot gnu.org
2023-11-12 15:54 ` law at gcc dot gnu.org
2023-11-12 23:59 ` danglin at gcc dot gnu.org
2023-11-13 0:24 ` law at gcc dot gnu.org
2023-11-13 9:33 ` manolis.tsamis at vrull dot eu
2023-11-13 9:37 ` manolis.tsamis at vrull dot eu
2023-11-13 13:20 ` manolis.tsamis at vrull dot eu
2023-11-13 15:06 ` dave.anglin at bell dot net
2023-11-13 15:26 ` manolis.tsamis at vrull dot eu
2023-11-13 21:46 ` danglin at gcc dot gnu.org
2023-11-16 17:43 ` cvs-commit at gcc dot gnu.org
2023-11-27 20:55 ` sjames at gcc dot gnu.org
2023-11-28 12:39 ` manolis.tsamis at vrull dot eu
2024-03-18 0:22 ` cvs-commit at gcc dot gnu.org
2024-03-18 0:39 ` danglin at gcc dot gnu.org
2024-03-22 13:34 ` law at gcc dot gnu.org