public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/114032] New: ifcvt may introduce UB calls to __builtin_clz(0)
@ 2024-02-21 14:03 kristerw at gcc dot gnu.org
  2024-02-21 14:37 ` [Bug tree-optimization/114032] " jakub at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: kristerw at gcc dot gnu.org @ 2024-02-21 14:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114032

            Bug ID: 114032
           Summary: ifcvt may introduce UB calls to __builtin_clz(0)
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kristerw at gcc dot gnu.org
  Target Milestone: ---

The ifcvt pass may make the code more UB, which can be seen by compiling the
following function with -O3 for X86_64:


int a, b, i;
int scaleValueSaturate(int value) {
  if (value) {
    int result = __builtin_clz(value);
    if (-result <= a)
      return 0;
  }
  return b;
}
short dst;
short *src;
void scaleValuesSaturate() {
  for (; i; i++)
    dst = scaleValueSaturate(src[i]);
}


What is happening here is that the code for .LOOP_VECTORIZED (1, 2) != 0 always
calls __builtin_clz, even when value is 0:

  <bb 5> [local count: 955630224]:
  # i.5_21 = PHI <_7(9), i.5_20(24)>
  _2 = (long unsigned int) i.5_21;
  _3 = _2 * 2;
  _4 = src.2_1 + _3;
  _5 = *_4;
  value.0_11 = (unsigned int) _5;
  result_14 = __builtin_clz (value.0_11);
  _47 = (unsigned int) result_14;
  _48 = -_47;
  _15 = (int) _48;
  _23 = _5 != 0;
  _28 = _15 <= a.1_16;
  _46 = _23 & _28;
  prephitmp_31 = _46 ? 0 : _30;
  dst = prephitmp_31;
  _7 = i.5_21 + 1;
  i = _7;
  if (_7 != 0)
    goto <bb 9>; [89.00%]
  else
    goto <bb 14>; [11.00%]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug tree-optimization/114032] ifcvt may introduce UB calls to __builtin_clz(0)
  2024-02-21 14:03 [Bug tree-optimization/114032] New: ifcvt may introduce UB calls to __builtin_clz(0) kristerw at gcc dot gnu.org
@ 2024-02-21 14:37 ` jakub at gcc dot gnu.org
  2024-02-22 10:50 ` rguenth at gcc dot gnu.org
  2024-02-22 11:03 ` jakub at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-21 14:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114032

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Indeed, this feels like a bug, though unlikely to actually trigger anything
wrong.
I wouldn't worry that much about ifcvt IL, but about what makes it out of the
vectorizer.

void
foo (unsigned *p)
{
  for (int i = 0; i < 1024; ++i)
    p[i] = p[i] ? __builtin_clz (p[i]) : 42;
}

void
bar (unsigned *p)
{
  for (int i = 0; i < 1024; ++i)
    p[i] = p[i] ? __builtin_clz (p[i]) : 32;
}

with -O3 -mavx512{cd,bw,vl,dq} -mlzcnt -mbmi -mbmi2 -fvect-cost-model=unlimited
Before ifcvt we have conditional __builtin_clz in foo and .CLZ (_4, 32); in
bar,
and out of the vectorizer get the same .CLZ (vect__4.10_37, 32); in both cases,
that looks ok.  But without the -mlzcnt -mbmi -mbmi2 options, we have
conditional __builtin_clz in both cases, and vectorizer results in .CLZ
(vect__4.10_37) in both cases.  We don't have value ranges on vectors though,
so not really sure what could misbehave during the optimizations later on and I
bet all the targets which support vector .CLZ/.CTZ actually have some well
defined value for zero.
Maybe just the backends even for cases like -mlzcnt -mbmi -mbmi2 should
announce C?Z_VALUE_DEFINED_AT_ZERO for vector modes if it supports vector c?z
optabs? even
if it isn't defined for scalar?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug tree-optimization/114032] ifcvt may introduce UB calls to __builtin_clz(0)
  2024-02-21 14:03 [Bug tree-optimization/114032] New: ifcvt may introduce UB calls to __builtin_clz(0) kristerw at gcc dot gnu.org
  2024-02-21 14:37 ` [Bug tree-optimization/114032] " jakub at gcc dot gnu.org
@ 2024-02-22 10:50 ` rguenth at gcc dot gnu.org
  2024-02-22 11:03 ` jakub at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-22 10:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114032

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-02-22
     Ever confirmed|0                           |1
                 CC|                            |rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
With other cases the fear is we'd completely unroll and decompose to scalars
again and then VRP eventually deriving wrong ranges.

So yeah, ifcvt would have to use a .COND_CALL here or some other tricks to
avoid the situation - like a .CLZ with the out-of-range behavior specified
according to what the target does.  OTOH that target behavior might differ
between scalar and vector modes (maybe even different vector modes ...).

Do we actually treat CLZ (0) as invoking undefined behavior?  IIRC we have
match patterns that rely on that but conditional on that target hook?
IL semantics based on hooks is ... bad.

Anyway - did we create a meta-bug to link all these issues together?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug tree-optimization/114032] ifcvt may introduce UB calls to __builtin_clz(0)
  2024-02-21 14:03 [Bug tree-optimization/114032] New: ifcvt may introduce UB calls to __builtin_clz(0) kristerw at gcc dot gnu.org
  2024-02-21 14:37 ` [Bug tree-optimization/114032] " jakub at gcc dot gnu.org
  2024-02-22 10:50 ` rguenth at gcc dot gnu.org
@ 2024-02-22 11:03 ` jakub at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-22 11:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114032

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
In GCC 13 and earlier, .CLZ(0) was UB depending on CLZ_VALUE_DEFINED_AT_ZERO
(...) != 2.
In GCC 14, it is always UB, and well defined is only .CLZ(0, 32) (but with the
current
requirement that the second argument, value at zero, matches
CLZ_VALUE_DEFINED_AT_ZERO (...) == 2 filled in value for scalars (exception for
bitints).
If we wanted, we could (but probably stage1 material) allow any constant second
argument and just expand to conditional if the value doesn't match
CLZ_VALUE_DEFINED_AT_ZERO (...) != 0 filled in value.
But, still, for the ifcvt and vectorization it would be useful to know what is
the well defined value that will be cheapest to compute.
The backends could do that e.g. by supplying details for
CLZ_VALUE_DEFINED_AT_ZERO on vector modes even if scalar is undefined.
Or have some third way which would say this is not UB at 0, but we don't really
care what value at zero it has (but then e.g. VRP would need to give up on
that).

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-02-22 11:03 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-21 14:03 [Bug tree-optimization/114032] New: ifcvt may introduce UB calls to __builtin_clz(0) kristerw at gcc dot gnu.org
2024-02-21 14:37 ` [Bug tree-optimization/114032] " jakub at gcc dot gnu.org
2024-02-22 10:50 ` rguenth at gcc dot gnu.org
2024-02-22 11:03 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).