public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors
@ 2024-04-05  3:02 sandra at gcc dot gnu.org
  2024-04-05  3:03 ` [Bug middle-end/114596] " sandra at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: sandra at gcc dot gnu.org @ 2024-04-05  3:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

            Bug ID: 114596
           Summary: [OpenMP] "declare variant" scoring seems incorrect for
                    construct selectors
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sandra at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57882
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57882&action=edit
reduced test case

Background: I've been working on integrating the dynamic selector support added
for "metadirective" (patches previously submitted but not yet approved) with
the existing support for "declare variant", rewriting a bunch of code so that
they both share the same representations and functions for matching, scoring,
and sorting.  Those patches aren't yet ready to submit and are stage 1
material, but in comparing differences in test results between my new
implementation and the previous behavior, it looks to me like the scoring for
construct selectors in "declare variant" is pretty borked on mainline.

The attached testcase is a reduced version of
c-c++-common/gomp/declare-variant-12.c.  The "16+8+4" comment on f14 agrees
with what the code is doing, but it seems incorrect according to the spec.  By
my reading, the OpenMP context at the point of call to f17 is {teams, parallel,
for, parallel, for} with associated scores {1, 2, 4, 8, 16}, per sections 7.1
and 7.3 of the 5.2 spec, respectively.  My understanding of the latter part of
item (1) in 7.3

"if the traits that correspond to the construct selector set appear multiple
times in the OpenMP context, the highest valued subset of context traits that
contains all selectors in the same order are used"

is that the matching selectors don't have to appear contiguously, so the score
would be 1+8+16.  In discussion with Tobias, he thinks there is still some
ambiguity about which of the duplicates is preferred for the match, though.

I added some instrumentation to the code to try to figure out how it was coming
up with "16+8+4"; the patch is attached, and produced output 

$ install/bin/x86_64-linux-gnu-gcc declare-variant-12.c -fopenmp
-foffload=disable -S
<snip>
codes[0] = omp_for
codes[1] = omp_parallel
codes[2] = omp_for
codes[3] = omp_parallel
codes[4] = omp_teams
constructs[0] = omp_for
constructs[1] = omp_parallel
constructs[2] = omp_teams
omp_teams, i = 2, j = 4, position = 4
omp_parallel, i = 1, j = 3, position = 3
omp_for, i = 0, j = 2, position = 2
constructs[0] = omp_for, scores[0] = 4, score = 16
constructs[1] = omp_parallel, scores[1] = 3, score = 8
constructs[2] = omp_teams, scores[2] = 2, score = 4

For selector construct = {teams, parallel, for}
  score1 = 29   score2 = 29

The "codes" array is the OpenMP context, "constructs" are the traits in the
construct selector, and the "scores" array does not contain scores, but rather
the positions of the elements of the "constructs" array in the "codes" array.

I believe there are at least three bugs here:

(1) The constructs array is being processed backwards from n-1 to 0, but the
scores array is indexed from 0 to n-1, and later it's assumed both arrays are
in the same order.

(2) The codes array is backwards and so are the position values, and there's no
adjustment to subtract the position from the length of the "codes" array in
computing the corresponding score.

(3) It's choosing the wrong subset from the duplicate traits, preferring the
should-be-lower-valued outer ones over the higher-valued inner ones.

I don't know if anybody wants to tackle a bug fix for this code for GCC 14.  As
I've said, I've got a complete rewrite I'm planning to submit early in stage 1,
which I hope will be an improvement from an engineering perspective in terms of
readability/maintainability, with cleaner and better-documented interfaces and
references to the spec to explain what it is doing.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
  2024-04-05  3:02 [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors sandra at gcc dot gnu.org
@ 2024-04-05  3:03 ` sandra at gcc dot gnu.org
  2024-04-05  8:06 ` burnus at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: sandra at gcc dot gnu.org @ 2024-04-05  3:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #1 from sandra at gcc dot gnu.org ---
Created attachment 57883
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57883&action=edit
patch to add instrumentation as diagnostic aid

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
  2024-04-05  3:02 [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors sandra at gcc dot gnu.org
  2024-04-05  3:03 ` [Bug middle-end/114596] " sandra at gcc dot gnu.org
@ 2024-04-05  8:06 ` burnus at gcc dot gnu.org
  2024-04-05  9:10 ` burnus at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: burnus at gcc dot gnu.org @ 2024-04-05  8:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #2 from Tobias Burnus <burnus at gcc dot gnu.org> ---
Crossref to the OpenMP spec discussions [not publicly available]
related to the scoring:

* Jakub asked about this testcase in an omp-lang in the email 2019/016447.html
(w/o getting a reply.
* He then opened Issue #2028 for three dozen issues/questions
  (issue lifetime: Oct 2019 to Aug 2020; it has 52 comments)
→ The scoring issue of this PR seems to run under '30' – but while I do see the
question, I have trouble finding/understanding Alex' answer in that thread
→ 'Nov 12, 2019' seems to be the entry that has all quotes and seems to finally
settle the issue.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
  2024-04-05  3:02 [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors sandra at gcc dot gnu.org
  2024-04-05  3:03 ` [Bug middle-end/114596] " sandra at gcc dot gnu.org
  2024-04-05  8:06 ` burnus at gcc dot gnu.org
@ 2024-04-05  9:10 ` burnus at gcc dot gnu.org
  2024-04-05  9:46 ` burnus at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: burnus at gcc dot gnu.org @ 2024-04-05  9:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #3 from Tobias Burnus <burnus at gcc dot gnu.org> ---
The scoring is according to TR12:

> Each trait selector for which the corresponding trait appears
> in the construct trait set in the OpenMP context is given the
> value 2^(p−1) where p is the position of the corresponding trait,
> c_p, in the construct trait set

And:

> Specifically, the ordering of the set of constructs is
> c1, . . . , cN , where c1 is the construct at the outermost
> nesting level and cN is the construct at the innermost nesting level.

and

> construct trait set — The trait set that consists of all enclosing
> constructs at a given point in an OpenMP program up to a target construct.
>
> enclosing context — For C/C++, the innermost scope enclosing a directive.


At the call site:
  {teams, parallel, for, parallel, for}

Selector in declare variant:
  {teams, parallel, for}

Looking at 'context traits that contains all selectors in the same order are
used', I see:

(1)  *teams* *parallel* *for* parallel for
(2)  *teams* *parallel* for parallel *for*
(3)  *teams* parallel for *parallel* *for*

Sandra wrote:
> By my reading, the OpenMP context at the point of call to f17 is
> {teams, parallel, for, parallel, for} with associated scores
> {1, 2, 4, 8, 16}

In my reading (see second quote above), the order of enclosing contexts is
inside out and the scores are:
p = {5,4,3,2,1} → score = {16,8,4,2,1}

Thus:
(1) 16 + 8 + 4 = 28
(2) 16 + 8 + 1 = 25
(3) 16 + 2 + 1 = 19

where (1) wins because of:

> if the traits that correspond to the construct selector set appear
> multiple times in the OpenMP context, the highest valued subset of
> context traits that contains all trait selectors in the same order
> are used;

* * *

This matches what the testcase has:

> match (construct={teams,parallel,for}) /* 16+8+4 */  (= 28)

* * *

-----<quote>----------
$ install/bin/x86_64-linux-gnu-gcc declare-variant-12.c -fopenmp
-foffload=disable -S
...
constructs[0] = omp_for, scores[0] = 4, score = 16
constructs[1] = omp_parallel, scores[1] = 3, score = 8
constructs[2] = omp_teams, scores[2] = 2, score = 4

For selector construct = {teams, parallel, for}
  score1 = 29   score2 = 29
-----<quote>----------

The constructs[i]'s score look fine.

But I wonder why score == 29 and not 28.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
  2024-04-05  3:02 [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors sandra at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2024-04-05  9:10 ` burnus at gcc dot gnu.org
@ 2024-04-05  9:46 ` burnus at gcc dot gnu.org
  2024-04-05 14:55 ` sandra at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: burnus at gcc dot gnu.org @ 2024-04-05  9:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #4 from Tobias Burnus <burnus at gcc dot gnu.org> ---
> For selector construct = {teams, parallel, for}
>   score1 = 29   score2 = 29
> -----<quote>----------
>
> The constructs[i]'s score look fine.
>
> But I wonder why score == 29 and not 28.

Answer: omp_context_compute_score starts with:

  *score = 1;

which is set (only) in the error case to "-1" and otherwise, scores only get
added. — I think a constant offset to all scores does not harm (as it is only
used internally) but I still find it confusing.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
  2024-04-05  3:02 [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors sandra at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2024-04-05  9:46 ` burnus at gcc dot gnu.org
@ 2024-04-05 14:55 ` sandra at gcc dot gnu.org
  2024-04-05 15:26 ` burnus at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: sandra at gcc dot gnu.org @ 2024-04-05 14:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #5 from sandra at gcc dot gnu.org ---
Tobias, it looks to me like you missed the connection between the first half of
item (1) in 7.3 (I'm still looking at the 5.2 spec):

"Each trait selector for which the corresponding trait appears in the construct
trait set in the OpenMP context is given the value 2**(p−1) where p is the
position of the corresponding trait, cp, in the context construct trait set..."

And this part of section 7.1:

"Specifically, the ordering of the set of constructs is c1 , ... , cN , where
c1 is the construct at the outermost nesting level and cN is the construct at
the innermost nesting level."

So, the outermost construct has the *lowest* position and *lowest* score.  In
this case, the "teams" construct is outermost, has p=1, and contributes 2**0 to
the score.  And to get the highest score for the duplicated construct traits,
it should prefer the innermost ones.

Re the adding one to the final score, that is also explicit in the spec, item 6
at the very end of section 7.3.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
  2024-04-05  3:02 [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors sandra at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2024-04-05 14:55 ` sandra at gcc dot gnu.org
@ 2024-04-05 15:26 ` burnus at gcc dot gnu.org
  2024-04-05 16:06 ` sandra at gcc dot gnu.org
  2024-05-13 17:48 ` sandra at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: burnus at gcc dot gnu.org @ 2024-04-05 15:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #6 from Tobias Burnus <burnus at gcc dot gnu.org> ---
(In reply to sandra from comment #5)
> Tobias, it looks to me like you missed the connection ...

No, I didn't - I did link it (cf. top of comment 3) — but I just cannot read
:-/

Hence, for
(1)  *teams* *parallel* *for* parallel for
(2)  *teams* *parallel* for parallel *for*
(3)  *teams* parallel for *parallel* *for*

we now have
p = {1,2,3,4,5} → score = {1,2,4,8,16}

Thus:
(1) 1 + 2 + 4 = 9
(2) 1 + 2 + 16 = 19
(3) 1 + 8 + 16 = 25

such that (3) wins → and, hence, either of

 (f15) match (construct={parallel},user={condition(score(19):1)}) /* 8+19 */
 (f16) match (implementation={atomic_default_mem_order(score(27):seq_cst)})

as 27 > 25 - and OpenMP states that it is implementation defined which of the
score = 27 variant is used.

* * *

Thus, I concur that there is an ordering and, hence, scoring bug.

* * * 

> I don't know if anybody wants to tackle a bug fix for this code for GCC 14.

I think we don't - we are too close to the release. This has been implemented
since 2019, 'score()' is not implemented in Clang 18, and code like that is
unlikely to be often used. And the code is non-trivial. - All speaks against
rushing to do an implementation for GCC 14.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
  2024-04-05  3:02 [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors sandra at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2024-04-05 15:26 ` burnus at gcc dot gnu.org
@ 2024-04-05 16:06 ` sandra at gcc dot gnu.org
  2024-05-13 17:48 ` sandra at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: sandra at gcc dot gnu.org @ 2024-04-05 16:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #7 from sandra at gcc dot gnu.org ---
OK, I will do no more work on the old implementation, adjust the broken
testcases, and proceed with getting the my new implementation ready for stage 1
submission.  I don't know if I'll be able to easily disentangle the bug fixes
from the new functionality, though....  everything is kind of mixed up together
with changes in representation and interfaces to get metadirectives and
"declare variant" using the same functions to score and sort the candidate
variants.  (My previously-posted set of metadirective patches also had a
scoring bug that I'm addressing with this set of changes.)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug middle-end/114596] [OpenMP] "declare variant" scoring seems incorrect for construct selectors
  2024-04-05  3:02 [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors sandra at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2024-04-05 16:06 ` sandra at gcc dot gnu.org
@ 2024-05-13 17:48 ` sandra at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: sandra at gcc dot gnu.org @ 2024-05-13 17:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114596

--- Comment #8 from sandra at gcc dot gnu.org ---
This bug is addressed in the metadirective/dynamic selector patch set I posted
here:

https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650725.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-05-13 17:48 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-05  3:02 [Bug middle-end/114596] New: [OpenMP] "declare variant" scoring seems incorrect for construct selectors sandra at gcc dot gnu.org
2024-04-05  3:03 ` [Bug middle-end/114596] " sandra at gcc dot gnu.org
2024-04-05  8:06 ` burnus at gcc dot gnu.org
2024-04-05  9:10 ` burnus at gcc dot gnu.org
2024-04-05  9:46 ` burnus at gcc dot gnu.org
2024-04-05 14:55 ` sandra at gcc dot gnu.org
2024-04-05 15:26 ` burnus at gcc dot gnu.org
2024-04-05 16:06 ` sandra at gcc dot gnu.org
2024-05-13 17:48 ` sandra at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).