public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/50856] New: ARM: suboptimal code for absolute difference calculation
@ 2011-10-24 14:53 siarhei.siamashka at gmail dot com
2011-10-24 17:13 ` [Bug target/50856] " rearnsha at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: siarhei.siamashka at gmail dot com @ 2011-10-24 14:53 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50856
Bug #: 50856
Summary: ARM: suboptimal code for absolute difference
calculation
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: siarhei.siamashka@gmail.com
gcc generates suboptimal code on ARM for "abs(a - b)" type of operation, which
is used for example in paeth png filter: http://www.w3.org/TR/PNG-Filters.html
Given the following test code:
int absolute_difference1(unsigned char a, unsigned char b)
{
return a > b ? a - b : b - a;
}
int absolute_difference2(unsigned char a, unsigned char b)
{
int tmp = a;
if ((tmp -= b) < 0)
tmp = -tmp;
return tmp;
}
The current gcc svn trunk (r180383) generates the following code for -O2 and
-Os optimizations:
.cpu arm10tdmi
.eabi_attribute 27, 3
.eabi_attribute 28, 1
.fpu vfp
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 2
.eabi_attribute 30, 4
.eabi_attribute 34, 0
.eabi_attribute 18, 4
.file "test.c"
.text
.align 2
.global absolute_difference1
.type absolute_difference1, %function
absolute_difference1:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
cmp r0, r1
rsbhi r0, r1, r0
rsbls r0, r0, r1
bx lr
.size absolute_difference1, .-absolute_difference1
.align 2
.global absolute_difference2
.type absolute_difference2, %function
absolute_difference2:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
rsb r0, r1, r0
cmp r0, #0
rsblt r0, r0, #0
bx lr
.size absolute_difference2, .-absolute_difference2
.ident "GCC: (GNU) 4.7.0 20111024 (experimental)"
.section .note.GNU-stack,"",%progbits
Even in the quite explicit second code variant ('absolute_difference2'
function), gcc does not generate the expected SUBS + NEGLT pair of
instructions. Also for ARMv6 capable processors even a single USAD8 instruction
could be used here if both operands are known to have values in [0-255] range
and if high latency of this instruction can be hidden.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug target/50856] ARM: suboptimal code for absolute difference calculation
2011-10-24 14:53 [Bug target/50856] New: ARM: suboptimal code for absolute difference calculation siarhei.siamashka at gmail dot com
@ 2011-10-24 17:13 ` rearnsha at gcc dot gnu.org
2022-02-03 4:44 ` [Bug tree-optimization/50856] " pinskia at gcc dot gnu.org
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2011-10-24 17:13 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50856
Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |arm
Priority|P3 |P4
Status|UNCONFIRMED |NEW
Last reconfirmed| |2011-10-24
Ever Confirmed|0 |1
--- Comment #1 from Richard Earnshaw <rearnsha at gcc dot gnu.org> 2011-10-24 17:13:14 UTC ---
Confirmed. With the second case we get quite close to getting it right (having
detected that there is an ABS expression, but we fail to merge it with the
original subtract.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/50856] ARM: suboptimal code for absolute difference calculation
2011-10-24 14:53 [Bug target/50856] New: ARM: suboptimal code for absolute difference calculation siarhei.siamashka at gmail dot com
2011-10-24 17:13 ` [Bug target/50856] " rearnsha at gcc dot gnu.org
@ 2022-02-03 4:44 ` pinskia at gcc dot gnu.org
2023-10-19 23:42 ` pinskia at gcc dot gnu.org
2023-10-20 5:15 ` pinskia at gcc dot gnu.org
3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-02-03 4:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50856
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|target |tree-optimization
Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is a generic issue and not really arm related.
In phiopt4 for absolute_difference1 we have:
_14 = (int) b_7(D);
_12 = (int) a_6(D);
if (a_6(D) > b_7(D))
goto <bb 3>; [50.00%]
else
goto <bb 4>; [50.00%]
<bb 3> [local count: 536870913]:
iftmp.0_9 = _12 - _14;
goto <bb 5>; [100.00%]
<bb 4> [local count: 536870913]:
iftmp.0_8 = _14 - _12;
<bb 5> [local count: 1073741824]:
# iftmp.0_5 = PHI <iftmp.0_9(3), iftmp.0_8(4)>
While for absolute_difference2 we have in phiopt1
tmp_4 = (int) a_3(D);
_1 = (int) b_5(D);
tmp_6 = tmp_4 - _1;
if (tmp_6 < 0)
goto <bb 3>; [INV]
else
goto <bb 4>; [INV]
<bb 3> :
tmp_8 = _1 - tmp_4;
<bb 4> :
# tmp_2 = PHI <tmp_6(2), tmp_8(3)>
absolute_difference2 is easier to handle, let me try to handle both cases/
The second case we have:
(a - b) < 0 ? (b - a) : (a - b)
Or:
(cond (lt (minus@0 @1 @2) zero_p) (minus @2 @1) @0)
The first case we have is more complex.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/50856] ARM: suboptimal code for absolute difference calculation
2011-10-24 14:53 [Bug target/50856] New: ARM: suboptimal code for absolute difference calculation siarhei.siamashka at gmail dot com
2011-10-24 17:13 ` [Bug target/50856] " rearnsha at gcc dot gnu.org
2022-02-03 4:44 ` [Bug tree-optimization/50856] " pinskia at gcc dot gnu.org
@ 2023-10-19 23:42 ` pinskia at gcc dot gnu.org
2023-10-20 5:15 ` pinskia at gcc dot gnu.org
3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-10-19 23:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50856
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The second case will be solved by updating the patch at:
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574892.html
For the review at
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574948.html
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/50856] ARM: suboptimal code for absolute difference calculation
2011-10-24 14:53 [Bug target/50856] New: ARM: suboptimal code for absolute difference calculation siarhei.siamashka at gmail dot com
` (2 preceding siblings ...)
2023-10-19 23:42 ` pinskia at gcc dot gnu.org
@ 2023-10-20 5:15 ` pinskia at gcc dot gnu.org
3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-10-20 5:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50856
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is a full testcase (f3 is caught via fold_cond_expr_with_comparison):
```
int f(int a, int b)
{
int t = a - b;
if (t > 0) return t;
return b - a;
}
int f1(int a, int b)
{
if (a > b) return a - b;
return b - a;
}
int f2(int a, int b)
{
return (a > b) ? a - b : b - a;
}
int f3(int a, int b)
{
return (a - b) > 0 ? a - b : b - a;
}
```
I should note we currently have the following note in match.pd about not
folding `(a - b) > 0` because of catching f3:
/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
??? The transformation is valid for the other operators if overflow
is undefined for the type, but performing it here badly interacts
with the transformation in fold_cond_expr_with_comparison which
attempts to synthetize ABS_EXPR. */
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-10-20 5:15 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-24 14:53 [Bug target/50856] New: ARM: suboptimal code for absolute difference calculation siarhei.siamashka at gmail dot com
2011-10-24 17:13 ` [Bug target/50856] " rearnsha at gcc dot gnu.org
2022-02-03 4:44 ` [Bug tree-optimization/50856] " pinskia at gcc dot gnu.org
2023-10-19 23:42 ` pinskia at gcc dot gnu.org
2023-10-20 5:15 ` pinskia at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).