public inbox for gcc-bugs@sourceware.org
* [Bug target/106952] New: Missed optimization: x < y ? x : y not lowered to minss
@ 2022-09-15 15:30 tavianator at gmail dot com
  2022-09-15 15:42 ` [Bug target/106952] " amonakov at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: tavianator at gmail dot com @ 2022-09-15 15:30 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106952

            Bug ID: 106952
           Summary: Missed optimization: x < y ? x : y not lowered to
                    minss
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tavianator at gmail dot com
  Target Milestone: ---

Created attachment 53580
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53580&action=edit
Assembly from gcc -O3 -S bug.c

The following is an implementation of a ray/axis-aligned box intersection test:

struct ray {
    float origin[3];
    float dir_inv[3];
};

struct box {
    float min[3];
    float max[3];
};

static inline float min(float x, float y) {
    return x < y ? x : y;
}

static inline float max(float x, float y) {
    return x > y ? x : y;
}

_Bool intersection(const struct ray *ray, const struct box *box) {
    float tmin = 0.0, tmax = 1.0 / 0.0;

    for (int i = 0; i < 3; ++i) {
        float t1 = (box->min[i] - ray->origin[i]) * ray->dir_inv[i];
        float t2 = (box->max[i] - ray->origin[i]) * ray->dir_inv[i];

        tmin = min(max(t1, tmin), max(t2, tmin));
        tmax = max(min(t1, tmax), min(t2, tmax));
    }

    return tmin < tmax;
}

However, gcc -O3 doesn't use minss/maxss for every min()/max().  Instead, some
of them are lowered to conditional jumps, which regresses performance
significantly since the branches are unpredictable.
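
For reference, a minimal sketch of why this particular idiom is expected to
map onto minss/maxss even without -ffast-math: the ternary yields its second
operand whenever the comparison is false, including when either input is NaN,
and that is also what the SSE instructions do for unordered inputs.  (fminf(),
by contrast, returns the non-NaN operand.)  The names min_ternary, max_ternary
and check_min_max_semantics are hypothetical and not part of the test case.

```c
#include <assert.h>
#include <math.h>

/* Same idiom as min() in the report: the result is y whenever x < y is
 * false, which covers every comparison involving a NaN. */
static inline float min_ternary(float x, float y) {
    return x < y ? x : y;
}

/* Same idiom as max() in the report. */
static inline float max_ternary(float x, float y) {
    return x > y ? x : y;
}

/* Hypothetical self-check: the result depends on operand order once a
 * NaN is involved, because the second operand wins on unordered input. */
void check_min_max_semantics(void) {
    assert(min_ternary(1.0f, 2.0f) == 1.0f);
    assert(max_ternary(1.0f, 2.0f) == 2.0f);
    assert(min_ternary(NAN, 1.0f) == 1.0f);  /* NaN first: returns y */
    assert(isnan(min_ternary(1.0f, NAN)));   /* NaN second: returns y */
    assert(max_ternary(NAN, 1.0f) == 1.0f);
    assert(isnan(max_ternary(1.0f, NAN)));
}
```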

Simpler variants like

        tmin = max(tmin, min(t1, t2));
        tmax = min(tmax, max(t1, t2));

get the desired codegen, but they behave differently if t1 or t2 is NaN.
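
To make the NaN difference concrete, here is a small sketch that applies one
loop iteration of each formulation to the same inputs.  struct interval,
step_report() and step_simple() are hypothetical helpers introduced only for
this illustration; min() and max() are the definitions from the test case.

```c
#include <assert.h>
#include <math.h>

static inline float min(float x, float y) { return x < y ? x : y; }
static inline float max(float x, float y) { return x > y ? x : y; }

/* Hypothetical container for the running [tmin, tmax] interval. */
struct interval {
    float tmin, tmax;
};

/* One iteration of the formulation used in intersection(). */
struct interval step_report(struct interval s, float t1, float t2) {
    s.tmin = min(max(t1, s.tmin), max(t2, s.tmin));
    s.tmax = max(min(t1, s.tmax), min(t2, s.tmax));
    return s;
}

/* One iteration of the simpler variant. */
struct interval step_simple(struct interval s, float t1, float t2) {
    s.tmin = max(s.tmin, min(t1, t2));
    s.tmax = min(s.tmax, max(t1, t2));
    return s;
}
```

Starting from tmin = 0 and tmax = INFINITY with t1 = NAN and t2 = 5,
step_report() leaves the interval at [0, INFINITY], while step_simple()
collapses it to [5, 5], so the final tmin < tmax test gives different answers.
With t1 = 5 and t2 = NAN, step_simple() produces tmin = NaN.  For ordinary
inputs such as t1 = 2, t2 = 5, both produce [2, 5].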

"Bisecting" with godbolt.org, it seems this is an old regression: 4.8.5 was
good, but 4.9.0 was bad.


Thread overview: 7+ messages
2022-09-15 15:30 [Bug target/106952] New: Missed optimization: x < y ? x : y not lowered to minss tavianator at gmail dot com
2022-09-15 15:42 ` [Bug target/106952] " amonakov at gcc dot gnu.org
2022-09-15 15:54 ` tavianator at gmail dot com
2022-09-17  4:47 ` rguenth at gcc dot gnu.org
2023-07-18 10:39 ` rguenth at gcc dot gnu.org
2023-07-20  8:39 ` rguenth at gcc dot gnu.org
2023-07-21  8:18 ` rguenth at gcc dot gnu.org
