public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
@ 2016-05-16 17:25 Pekka Jääskeläinen
  2016-05-18  0:21 ` Joseph Myers
  2016-08-01 16:37 ` Martin Jambor
  0 siblings, 2 replies; 12+ messages in thread
From: Pekka Jääskeläinen @ 2016-05-16 17:25 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 68 bytes --]

The BRIG frontend itself.

-- 
Pekka Jääskeläinen
Parmance

[-- Attachment #2: 002-brig-fe-new-files.patch.gz --]
[-- Type: application/gzip, Size: 67725 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-05-16 17:25 [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself Pekka Jääskeläinen
@ 2016-05-18  0:21 ` Joseph Myers
  2016-05-18 17:00   ` Pekka Jääskeläinen
  2016-08-01 16:37 ` Martin Jambor
  1 sibling, 1 reply; 12+ messages in thread
From: Joseph Myers @ 2016-05-18  0:21 UTC (permalink / raw)
  To: Pekka Jääskeläinen; +Cc: gcc-patches

This patch has many improperly formatted diagnostic messages (e.g. 
starting with capital letters, ending with '.' or failing to use %q for 
quoting).  I also note cases where you use %lu as a format for size_t, 
which is not correct (you'd need to add pretty-print.c support for %zu 
before you could use that, however).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-05-18  0:21 ` Joseph Myers
@ 2016-05-18 17:00   ` Pekka Jääskeläinen
  2016-05-18 18:36     ` David Malcolm
  0 siblings, 1 reply; 12+ messages in thread
From: Pekka Jääskeläinen @ 2016-05-18 17:00 UTC (permalink / raw)
  To: Joseph Myers; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 564 bytes --]

Hi Joseph,

Thanks for the comments.  Updated&rebased patch attached. Hopefully I
didn't miss any diags.

On Wed, May 18, 2016 at 3:20 AM, Joseph Myers <joseph@codesourcery.com> wrote:
> This patch has many improperly formatted diagnostic messages (e.g.
> starting with capital letters, ending with '.' or failing to use %q for
> quoting).  I also note cases where you use %lu as a format for size_t,
> which is not correct (you'd need to add pretty-print.c support for %zu
> before you could use that, however).
>
> --
> Joseph S. Myers
> joseph@codesourcery.com

[-- Attachment #2: 002-brig-fe-new-files.patch.gz --]
[-- Type: application/x-gzip, Size: 75 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-05-18 17:00   ` Pekka Jääskeläinen
@ 2016-05-18 18:36     ` David Malcolm
  2016-05-19  8:48       ` Pekka Jääskeläinen
  0 siblings, 1 reply; 12+ messages in thread
From: David Malcolm @ 2016-05-18 18:36 UTC (permalink / raw)
  To: Pekka Jääskeläinen, Joseph Myers; +Cc: gcc-patches

On Wed, 2016-05-18 at 19:59 +0300, Pekka Jääskeläinen wrote:
> Hi Joseph,
> 
> Thanks for the comments.  Updated&rebased patch attached. Hopefully I
> didn't miss any diags.

It looks like the attachment doesn't contain the patch; on unzipping I
just see a 27 byte file reading "The BRIG frontend itself."

A similar problem seems to have happened with the update to 1/4:
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01385.html
where the "patch" just has the descriptive text, but not the actual
changes.

Looks like you need to resend both.

Hope this is helpful; good luck
Dave

> On Wed, May 18, 2016 at 3:20 AM, Joseph Myers <
> joseph@codesourcery.com> wrote:
> > This patch has many improperly formatted diagnostic messages (e.g.
> > starting with capital letters, ending with '.' or failing to use %q
> > for
> > quoting).  I also note cases where you use %lu as a format for
> > size_t,
> > which is not correct (you'd need to add pretty-print.c support for
> > %zu
> > before you could use that, however).
> > 
> > --
> > Joseph S. Myers
> > joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-05-18 18:36     ` David Malcolm
@ 2016-05-19  8:48       ` Pekka Jääskeläinen
  0 siblings, 0 replies; 12+ messages in thread
From: Pekka Jääskeläinen @ 2016-05-19  8:48 UTC (permalink / raw)
  To: David Malcolm; +Cc: Joseph Myers, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 738 bytes --]

On Wed, May 18, 2016 at 9:36 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> On Wed, 2016-05-18 at 19:59 +0300, Pekka Jääskeläinen wrote:
>> Hi Joseph,
>>
>> Thanks for the comments.  Updated&rebased patch attached. Hopefully I
>> didn't miss any diags.
>
> It looks like the attachment doesn't contain the patch; on unzipping I
> just see a 27 byte file reading "The BRIG frontend itself."
>
> A similar problem seems to have happened with the update to 1/4:
> https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01385.html
> where the "patch" just has the descriptive text, but not the actual
> changes.
>
> Looks like you need to resend both.

Interesting, I wonder what happened there. Trying again.

Thanks,
Pekka

[-- Attachment #2: 002-brig-fe-new-files.patch.gz --]
[-- Type: application/x-gzip, Size: 67632 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-05-16 17:25 [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself Pekka Jääskeläinen
  2016-05-18  0:21 ` Joseph Myers
@ 2016-08-01 16:37 ` Martin Jambor
  2016-08-15 13:22   ` Pekka Jääskeläinen
  1 sibling, 1 reply; 12+ messages in thread
From: Martin Jambor @ 2016-08-01 16:37 UTC (permalink / raw)
  To: Pekka Jääskeläinen; +Cc: gcc-patches

Hi,

On Mon, May 16, 2016 at 08:25:57PM +0300, Pekka Jääskeläinen wrote:
> The BRIG frontend itself.

thanks making the effort to submit patches against trunk.  I would be
very glad to see this included in the upcoming gcc version, not least
because it would make it easy to do basic testing of the HSA back-end
on any computer.

Even though I have spent considerable time reading the patches, I have
not yet read it all, but I think I have accumulated enough notes to
post them back.  Also please note that I have actually never written a
front-end myself so it can easily happen that on many accounts you
will be right and my feedback wrong.  I hope it will still be useful,
though.

Generally speaking, I think you have made quite a good effort trying
to adhere to the GNU coding standard (you know, blanks before
parenthesis and stuff), but:

  - I would definitely appreciate more comments.  All but the most
    trivial functions should have one that describes what the function
    does, what are its arguments and what it returns.  (And there
    should be one blank line between the comment and the function
    itself).

  - We very much prefer c-style comments to c++ ones.  I hope they can
    be converted easily in some automated way.

So far I have the following specific comments:

- brig-c.h
  + please remove commented out extern GTY (()) tree brig_non_zero_struct

- brig-lang.c:
  + In the long run I would like to get rid of
      opts->x_flag_whole_program = 0 in
    brig_langhook_init_options_struct, when did it cause issues, when
    you tried LTO?  Since there obviously is no linker-plugin support,
    I think we can leave it this way for now, though.
  + brig_langhook_handle_option has argument scode marked as unused 
    but it is used.
  + brig_langhook_type_for_size uses both supposedly unused arguments
    and I am surprised that handling just 64 bits is sufficient.
  + brig_langhook_type_for_mode: the "FIXME: This static_cast should
    be in machmode.h" can be removed, the cast is in machmode.h
  + brig_langhook_eh_personality comment refers to file
    libbrig/runtime/brig-unwind.c which does not seem to exist?
  + convert has attributes marked as unused but they are used
  + The "FIXME: This is a hack to preserve trees that we create from
    the garbage collector." IMHO does not describe any real issue,
    using GTY roots for that is common practice.

- brigspec.c:
  + lang_specific_driver: if (len > 3 && strcmp (arg + len - 3,
    ".brig") == 0) can never be true, even if str end with brig.
    Consequently, the if branch marked by FIXME later on in the
    function never gets executed.  So perhaps it is not necessary?

- brigfrontend/*.cc in general:

  + A lot of functions should have a comment.  I prefer to have them
    above the function body but if it is a method, a comment in the
    class definition is fin e too (I think).  Often you have helpful
    comments inside the function but it really helps if you know what
    to expect before you start reading the function.  For example,
    brig_to_generic::add_global_variable needs a comment that it adds
    VAR_DECL to the list of global variables and if there is a "host
    def var" (I guess I know what that means but an explanation
    somewhere would not hurt either), it makes sure it points to the
    new VAR_DECL.  Another example: call_builtin is very difficult to
    review without a comment explaining what arguments it expects.
    Please make sure that all functions have a comment somewhere,
    perhaps with the exception of only the most trivial and
    self-evident.

  + Is there any reason why you use internal_error instead of
    more common gcc_assert and/or gcc_unreachable?


- brigfrontend/brig_to_generic.cc: 

  + Why does this filename have underscores while all the others have
    dashes? :-)
  + What should the sanity check of data secion in parse accomplish?
  + In build_reinterpret_cast, it would be best if you could avoid
    constructing VIEW_CONVERT_EXPRs that have type with a different
    size from the type of its operand (even though I think that Ada
    also does this, it is considered unfortunate).  What are the cases
    when this happens?

    OK, adding another note later: For converting scalars (anything
    !AGGREGATE_TYPE_P), I think you pretty much always want to use
    NOP_EXPR rather than generating a V_C_E.  V_C_E is mainly used to
    change how we interpret aggregates (or type-cast between
    aggregates and scalars).  In particular, NOP_EXPR will also
    sign-extend properly when you are extending intergers according to
    the type, whereas what will be in the "extended" part of a V_C_E
    is basically undefined.

  + in brig_to_generic::append_group_variable, I think you should not
    add align_padding to m_next_group_offset twice.  You do not do
    that in append_private_variable.

  + Remove the commented out debug_function in
    brig_to_generic::finish_function.

  + call_builtin - difficult to review without a comment explaining
    the parameters.  Apart from that, I really wonder why you go
    through the hoops of identifying builtins by names and caching
    them.  I believe you should be able to use builtin_decl_explicit
    to get the decl for the builtin code. Or am I wrong?
  
- brigfrontend/brig-code-entry-handler.cc:

  + I don't know, maybe it is standard C++ practice, but putting
    whole-front-end initialization (built-in creation) to a
    constructor of an abstract ancestor of streaming class seems a bit
    unfortunate to me.

  + Speaking about built-ins.  I know that GO FE creates them like you
    do, but I wonder whether following how c-family/c-common.c creates
    them would not be better, at least for those builtins that are in
    builtins.def and files included from it.  That way you also get
    proper function attributes without copying them over, for example.
    However, is is entirely possible there is a reason to do it
    your/Go way, I was just surprised.

    You should probably also go forward with the TODO and make the
    add_custom_builtin calls into normal gcc builtins, otherwise LTO
    will not work with BRIG.

  + brig_code_entry_handler::build_tree_operand: Please make it just
    one switch, rather than two if's followed by a switch, all of them
    based on the same field.

  + brig_code_entry_handler::build_address_operand: I think that

        gcc_assert (arg_symbol->base.kind == BRIG_KIND_DIRECTIVE_VARIABLE);

    checks for input error, rather than internal compiler error, and
    so if the symbol is not a symbol, you should call error() and bail
    out instead.  I think that given the nature of brig - there is
    little point in trying to continue parsing after an error -
    terminating the executable would be fine too.

    Also, in the calculation of the symbol_base for private segment
    symbols, I bet the third call to expand_or_call_builtin:

	  local_size
	    = build2 (MULT_EXPR, uint32_type_node,
		      expand_or_call_builtin (BRIG_OPCODE_WORKGROUPSIZE,
					      BRIG_TYPE_U32, uint32_type_node,
					      uint32_1),
		      local_size);

    should have uint32_2 as its last argument to get the 3rd dimension
    of the grid.

    How come that the "segment == BRIG_SEGMENT_ARG" is the only case
    in which you have to handle arrays?

    It also seems that if the address is from a KERNARG segment but
    also has a register component, you will overwrite the kernel
    base stored to ptr_base when dealing with the register.

    The code processing the register value should be done differently,
    I'm afraid.  First and foremost, you should not be using
    VIEW_CONVERT_EXPR (generated by build_reinterpret_cast) to
    sign-extend offsets, you should use NOP_EXPR for that.  I also do
    not think there is any need to be constructing build_pointer_type
    (char_type_node) all the time.  GIMPLE/GENERIC is not C, you can
    use any pointer, for example ptr_type_node.

    When you add pointer and offsets, you should be using
    POINTER_PLUS_EXPR rather than just PLUS_EXPR.

    Finally, if you want to type-cast the final address to the type
    specified by the instruction, again use NOP_EXPR.  However, I
    don't think it is really necessary, in the middle-end, it is the
    type of the loads and stores that matter, not really the used
    pointer.

  + brig_code_entry_handler::get_tree_expr_type_for_hsa_type: Needs a
    comment.  Why do we need it, in what way is it different than the
    previous function?  It special-cases F16 vectors, but differently,
    so which function should be used in which situation?

  + brig_code_entry_handler::add_temp_var: Seems like he coment is
    wrong, the assignment is appended to the current function but not
    returned.

  + brig_code_entry_handler::unpack: Please use BITS_PER_UNIT to refer
    to the number of bits in a byte instead of using the number 8
    directly.

  + brig_code_entry_handler::vbuild_builtin: A comment would help
    greatly.  Moreover, it seems weird that you are building an
    arbitrary(?)  type for a builtin, why don't you get it from the
    decl?

  + brig_code_entry_handler::build_operands: Please use
    BRIG_TYPE_BASE_MASK instead of the 0x01f constant when masking
    types.

    The need to always encapsulate constant(?) operands in a V_C_E is
    very strange and I really think we need to figure out what is
    going on.

  + brig_code_entry_handler::build_output_assignment: Again, Please
    use BRIG_TYPE_BASE_MASK instead of the 0x01f and BITS_PER_UNIT
    instead of 8.  The most worisome, however, is:

	  if (CONVERT_EXPR_P (inst_expr) && POINTER_TYPE_P (inst_expr))
	    {
	      // convert_to_integer crashes when converting a view convert
	      // expr to a pointer.  First cast it to a large enough int
	      // and let the next integer conversion do the clipping.
	      inst_expr = convert_to_integer (size_type_node, inst_expr);
	    }

    First of all, the above should not help, because V_C_E is not a
    CONVERT_EXPR_P.  Mainly, however, this should either be reported
    as a bug and fixed or probably avoided by not creating so many
    V_C_Es in the first place.

- brigfrontend/brig-atomic-inst-handler.cc: generate_tree - I have not
  read this file in much detail mainly because I'm still not convinced
  that organizing builtins based on their name strings is a good idea.

  However, I'd like alto to point out that you should try to avoid
  building trees if you are not going to use them, especially if it is
  really simple like for example mem_ref in this function, because it
  strains the garbage collector for no reason.

- brig-basic-inst-handler.cc, operator():
  
  if (brig_inst->opcode == BRIG_OPCODE_NOP)
    {
      return base->byteCount;
    }

  tere should be no braces when there is only one statement in a
  branch (and there is no potential confusion about elses).

  Again, please do not use 0x01F to mask brug type base, use
  BRIG_TYPE_BASE_MASK.  Uou can also use BRIG_TYPE_PACK_MASK instead
  of or'ing all the pack values.

I'll continue reading the rest,

Martin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-08-01 16:37 ` Martin Jambor
@ 2016-08-15 13:22   ` Pekka Jääskeläinen
  2016-08-30 16:50     ` Pekka Jääskeläinen
  0 siblings, 1 reply; 12+ messages in thread
From: Pekka Jääskeläinen @ 2016-08-15 13:22 UTC (permalink / raw)
  To: Pekka Jääskeläinen, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 14285 bytes --]

Thanks a lot for the review Martin,

Responses inline:

On Mon, Aug 1, 2016 at 7:37 PM, Martin Jambor <mjambor@suse.cz> wrote:
>
>   - I would definitely appreciate more comments.  All but the most
>     trivial functions should have one that describes what the function
>     does, what are its arguments and what it returns.  (And there
>     should be one blank line between the comment and the function
>     itself).


I added more comments. Please let me know if I missed a non-trivial one.

>
>
>   - We very much prefer c-style comments to c++ ones.  I hope they can
>     be converted easily in some automated way.


Converted.

>
>
> So far I have the following specific comments:
>
> - brig-c.h
>   + please remove commented out extern GTY (()) tree brig_non_zero_struct


Done.

>
>
> - brig-lang.c:
>   + In the long run I would like to get rid of
>       opts->x_flag_whole_program = 0 in
>     brig_langhook_init_options_struct, when did it cause issues, when
>     you tried LTO?  Since there obviously is no linker-plugin support,
>     I think we can leave it this way for now, though.


Agreed. I have not tried the current status of LTO, but it can be implemented
later.


>
>   + brig_langhook_handle_option has argument scode marked as unused
>     but it is used.


Fixed.

>
>   + brig_langhook_type_for_size uses both supposedly unused arguments
>     and I am surprised that handling just 64 bits is sufficient.


It is now being called for 32b too due to introducing the builtins like
they should.

>
>   + brig_langhook_type_for_mode: the "FIXME: This static_cast should
>     be in machmode.h" can be removed, the cast is in machmode.h
>
>   + brig_langhook_eh_personality comment refers to file
>     libbrig/runtime/brig-unwind.c which does not seem to exist?
>   + convert has attributes marked as unused but they are used


Fixed these.

>
>   + The "FIXME: This is a hack to preserve trees that we create from
>     the garbage collector." IMHO does not describe any real issue,
>     using GTY roots for that is common practice.


Removed the FIXME.

> - brigspec.c:
>   + lang_specific_driver: if (len > 3 && strcmp (arg + len - 3,
>     ".brig") == 0) can never be true, even if str end with brig.
>     Consequently, the if branch marked by FIXME later on in the
>     function never gets executed.  So perhaps it is not necessary?


True. A copy-paste error.
>
>
> - brigfrontend/*.cc in general:
>
>   + A lot of functions should have a comment.  I prefer to have them
>     above the function body but if it is a method, a comment in the
>     class definition is fin e too (I think).  Often you have helpful
>     comments inside the function but it really helps if you know what
>     to expect before you start reading the function.  For example,
>     brig_to_generic::add_global_variable needs a comment that it adds
>     VAR_DECL to the list of global variables and if there is a "host
>     def var" (I guess I know what that means but an explanation
>     somewhere would not hurt either), it makes sure it points to the
>     new VAR_DECL.  Another example: call_builtin is very difficult to
>     review without a comment explaining what arguments it expects.
>     Please make sure that all functions have a comment somewhere,
>     perhaps with the exception of only the most trivial and
>     self-evident.


Fixed these. I also scanned the sources and added new comments
where e.g. the function name didn't describe the purpose in a self clear
way.


>
>   + Is there any reason why you use internal_error instead of
>     more common gcc_assert and/or gcc_unreachable?


I think it was because of the possibility to add a format str & a message
more easily while developing. I now converted them to asserts/unreachable
as the frontend can be considered feature complete.


>
> - brigfrontend/brig_to_generic.cc:
>
>   + Why does this filename have underscores while all the others have
>     dashes? :-)


Renamed.

>
>   + What should the sanity check of data secion in parse accomplish?


Nothing anymore. A leftover from debugging sessions, I believe.
Removed.

>   + In build_reinterpret_cast, it would be best if you could avoid
>     constructing VIEW_CONVERT_EXPRs that have type with a different
>     size from the type of its operand (even though I think that Ada
>     also does this, it is considered unfortunate).  What are the cases
>     when this happens?


This was needed due to the special treatment of f16 which is stored in
a 32b reg (variable). I added a comment about this.

>
>     OK, adding another note later: For converting scalars (anything
>     !AGGREGATE_TYPE_P), I think you pretty much always want to use
>     NOP_EXPR rather than generating a V_C_E.  V_C_E is mainly used to
>     change how we interpret aggregates (or type-cast between
>     aggregates and scalars).  In particular, NOP_EXPR will also
>     sign-extend properly when you are extending intergers according to
>     the type, whereas what will be in the "extended" part of a V_C_E
>     is basically undefined.


I use VCE heavily on purpose in many places because of the untyped HSAIL
register variables (represented as unsigned integer local variables) of which
contents are treated as different types depending on the HSAIL opcode.
The integer (extending) casts are done with separate HSAIL instructions.
I don't want to sign extend unless explicitly stated so.

>
>   + in brig_to_generic::append_group_variable, I think you should not
>     add align_padding to m_next_group_offset twice.  You do not do
>     that in append_private_variable.


Good catch. Fixed.

>
>
>   + Remove the commented out debug_function in
>     brig_to_generic::finish_function.


I put it inside #if 0 ... #endif instead as it's a very useful snippet
to have handy when
debugging the output.

>   + call_builtin - difficult to review without a comment explaining
>     the parameters.  Apart from that, I really wonder why you go
>     through the hoops of identifying builtins by names and caching
>     them.  I believe you should be able to use builtin_decl_explicit
>     to get the decl for the builtin code. Or am I wrong?


Converted to use the standard way of importing builtins.def.
I mimicked lto-lang.c.

>
> - brigfrontend/brig-code-entry-handler.cc:
>
>   + I don't know, maybe it is standard C++ practice, but putting
>     whole-front-end initialization (built-in creation) to a
>     constructor of an abstract ancestor of streaming class seems a bit
>     unfortunate to me.


This is now reduced to initialization of the builtin index map from the
hsail-builtin.defs and the actual builtins are initialized in brig-lang.c.

>
>   + Speaking about built-ins.  I know that GO FE creates them like you
>     do, but I wonder whether following how c-family/c-common.c creates
>     them would not be better, at least for those builtins that are in
>     builtins.def and files included from it.  That way you also get
>     proper function attributes without copying them over, for example.
>     However, is is entirely possible there is a reason to do it
>     your/Go way, I was just surprised.


Moved all builtins HSAIL use to hsail-builtins.def.

>     You should probably also go forward with the TODO and make the
>     add_custom_builtin calls into normal gcc builtins, otherwise LTO
>     will not work with BRIG.


Done.

>   + brig_code_entry_handler::build_tree_operand: Please make it just
>     one switch, rather than two if's followed by a switch, all of them
>     based on the same field.


Done.

>
>   + brig_code_entry_handler::build_address_operand: I think that
>
>         gcc_assert (arg_symbol->base.kind == BRIG_KIND_DIRECTIVE_VARIABLE);
>
>     checks for input error, rather than internal compiler error, and
>     so if the symbol is not a symbol, you should call error() and bail
>     out instead.  I think that given the nature of brig - there is
>     little point in trying to continue parsing after an error -
>     terminating the executable would be fine too.


Makes sense. Removed the assert.

>
>     Also, in the calculation of the symbol_base for private segment
>     symbols, I bet the third call to expand_or_call_builtin:
>
>           local_size
>             = build2 (MULT_EXPR, uint32_type_node,
>                       expand_or_call_builtin (BRIG_OPCODE_WORKGROUPSIZE,
>                                               BRIG_TYPE_U32, uint32_type_node,
>                                               uint32_1),
>                       local_size);
>
>     should have uint32_2 as its last argument to get the 3rd dimension
>     of the grid.


Yes, this was already fixed recently.

>     How come that the "segment == BRIG_SEGMENT_ARG" is the only case
>     in which you have to handle arrays?


Added this comment there to clarify it:

       /* Two different type of array references in case of arguments
depending where they are referred at.  In the caller (argument
segment), the reference is to an array object and
in the callee, the array object has been passed as a pointer
to the array object.  */

For the other segments, we handle array refs via addressing the first
element in which case it doesn't differ from "scalar" variables.

>     It also seems that if the address is from a KERNARG segment but
>     also has a register component, you will overwrite the kernel
>     base stored to ptr_base when dealing with the register.


 Good catch. Fixed and added a test case.
>
>
>     The code processing the register value should be done differently,
>     I'm afraid.  First and foremost, you should not be using
>     VIEW_CONVERT_EXPR (generated by build_reinterpret_cast) to
>     sign-extend offsets, you should use NOP_EXPR for that.  I also do
>
>     not think there is any need to be constructing build_pointer_type
>     (char_type_node) all the time.  GIMPLE/GENERIC is not C, you can
>     use any pointer, for example ptr_type_node.


Thanks for the simplification tips. I converted this part to pretty much a call
to convert_to_pointer().

>
>     When you add pointer and offsets, you should be using
>     POINTER_PLUS_EXPR rather than just PLUS_EXPR.


Fixed. For some reason I thought POINTER_PLUS_EXPR requires a constant
integer offset.

>
>     Finally, if you want to type-cast the final address to the type
>     specified by the instruction, again use NOP_EXPR.  However, I
>     don't think it is really necessary, in the middle-end, it is the
>     type of the loads and stores that matter, not really the used
>     pointer.


Not sure of this. Isn't the mem access' size&alignment dictated by the
pointer's type? At least that's the impression I get from MEM_REF's
comment.

>
>   + brig_code_entry_handler::get_tree_expr_type_for_hsa_type: Needs a
>     comment.  Why do we need it, in what way is it different than the
>     previous function?  It special-cases F16 vectors, but differently,
>     so which function should be used in which situation?


Improved the comment.

>
>   + brig_code_entry_handler::add_temp_var: Seems like he coment is
>     wrong, the assignment is appended to the current function but not
>     returned.


Fixed. It should have returned it, and gcc's default behavior of
returning the last expression came into play, I think.

>   + brig_code_entry_handler::unpack: Please use BITS_PER_UNIT to refer
>     to the number of bits in a byte instead of using the number 8
>     directly.


Fixed.

>   + brig_code_entry_handler::vbuild_builtin: A comment would help
>     greatly.  Moreover, it seems weird that you are building an
>     arbitrary(?)  type for a builtin, why don't you get it from the
>     decl?


This method is not needed anymore, thanks to the builtin rework.

>
>   + brig_code_entry_handler::build_operands: Please use
>     BRIG_TYPE_BASE_MASK instead of the 0x01f constant when masking
>     types.


Fixed, also elsewhere.

>
>     The need to always encapsulate constant(?) operands in a V_C_E is
>     very strange and I really think we need to figure out what is
>     going on.


I implemented the FE originally on an earlier gcc version and added
some workarounds
such as this, hoping that the issues were fixed in the later versions.

Seems this workaround is not needed anymore with gcc trunk, removed...

>
>
>   + brig_code_entry_handler::build_output_assignment: Again, Please
>     use BRIG_TYPE_BASE_MASK instead of the 0x01f and BITS_PER_UNIT
>     instead of 8.  The most worisome, however, is:
>
>
>           if (CONVERT_EXPR_P (inst_expr) && POINTER_TYPE_P (inst_expr))
>             {
>               // convert_to_integer crashes when converting a view convert
>               // expr to a pointer.  First cast it to a large enough int
>               // and let the next integer conversion do the clipping.
>               inst_expr = convert_to_integer (size_type_node, inst_expr);
>             }
>
>     First of all, the above should not help, because V_C_E is not a
>     CONVERT_EXPR_P.  Mainly, however, this should either be reported
>     as a bug and fixed or probably avoided by not creating so many
>     V_C_Es in the first place.


...ditto.

> - brigfrontend/brig-atomic-inst-handler.cc: generate_tree - I have not
>   read this file in much detail mainly because I'm still not convinced
>   that organizing builtins based on their name strings is a good idea.


Reworked.

>   However, I'd like alto to point out that you should try to avoid
>   building trees if you are not going to use them, especially if it is
>   really simple like for example mem_ref in this function, because it
>   strains the garbage collector for no reason.


Fixed.

>
> - brig-basic-inst-handler.cc, operator():
>
>   if (brig_inst->opcode == BRIG_OPCODE_NOP)
>     {
>       return base->byteCount;
>     }
>
>   tere should be no braces when there is only one statement in a
>   branch (and there is no potential confusion about elses).


Fixed.

>
>   Again, please do not use 0x01F to mask brug type base, use
>   BRIG_TYPE_BASE_MASK.  Uou can also use BRIG_TYPE_PACK_MASK instead
>   of or'ing all the pack values.


Fixed.

Attached an updated patch. Also the other patches got touched a bit, I attach
them to the respective threads.

Thanks again,
Pekka

[-- Attachment #2: 002-brig-fe-gcc.patch.gz --]
[-- Type: application/x-gzip, Size: 70433 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-08-15 13:22   ` Pekka Jääskeläinen
@ 2016-08-30 16:50     ` Pekka Jääskeläinen
  2016-10-20 15:10       ` Martin Jambor
  0 siblings, 1 reply; 12+ messages in thread
From: Pekka Jääskeläinen @ 2016-08-30 16:50 UTC (permalink / raw)
  To: Pekka Jääskeläinen, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 15537 bytes --]

Hi,

Updated patch attached. Cleaned up issues found by contrib/check-GNU-style and
other misc. updates.

All changes to the whole patch set in comparison to the previous
submission are in the 005 diff.


On Mon, Aug 15, 2016 at 4:22 PM, Pekka Jääskeläinen <pekka@parmance.com> wrote:
> Thanks a lot for the review Martin,
>
> Responses inline:
>
> On Mon, Aug 1, 2016 at 7:37 PM, Martin Jambor <mjambor@suse.cz> wrote:
>>
>>   - I would definitely appreciate more comments.  All but the most
>>     trivial functions should have one that describes what the function
>>     does, what are its arguments and what it returns.  (And there
>>     should be one blank line between the comment and the function
>>     itself).
>
>
> I added more comments. Please let me know if I missed a non-trivial one.
>
>>
>>
>>   - We very much prefer c-style comments to c++ ones.  I hope they can
>>     be converted easily in some automated way.
>
>
> Converted.
>
>>
>>
>> So far I have the following specific comments:
>>
>> - brig-c.h
>>   + please remove commented out extern GTY (()) tree brig_non_zero_struct
>
>
> Done.
>
>>
>>
>> - brig-lang.c:
>>   + In the long run I would like to get rid of
>>       opts->x_flag_whole_program = 0 in
>>     brig_langhook_init_options_struct, when did it cause issues, when
>>     you tried LTO?  Since there obviously is no linker-plugin support,
>>     I think we can leave it this way for now, though.
>
>
> Agreed. I have not tried the current status of LTO, but it can be implemented
> later.
>
>
>>
>>   + brig_langhook_handle_option has argument scode marked as unused
>>     but it is used.
>
>
> Fixed.
>
>>
>>   + brig_langhook_type_for_size uses both supposedly unused arguments
>>     and I am surprised that handling just 64 bits is sufficient.
>
>
> It is now being called for 32b too due to introducing the builtins like
> they should.
>
>>
>>   + brig_langhook_type_for_mode: the "FIXME: This static_cast should
>>     be in machmode.h" can be removed, the cast is in machmode.h
>>
>>   + brig_langhook_eh_personality comment refers to file
>>     libbrig/runtime/brig-unwind.c which does not seem to exist?
>>   + convert has attributes marked as unused but they are used
>
>
> Fixed these.
>
>>
>>   + The "FIXME: This is a hack to preserve trees that we create from
>>     the garbage collector." IMHO does not describe any real issue,
>>     using GTY roots for that is common practice.
>
>
> Removed the FIXME.
>
>> - brigspec.c:
>>   + lang_specific_driver: if (len > 3 && strcmp (arg + len - 3,
>>     ".brig") == 0) can never be true, even if str end with brig.
>>     Consequently, the if branch marked by FIXME later on in the
>>     function never gets executed.  So perhaps it is not necessary?
>
>
> True. A copy-paste error.
>>
>>
>> - brigfrontend/*.cc in general:
>>
>>   + A lot of functions should have a comment.  I prefer to have them
>>     above the function body but if it is a method, a comment in the
>>     class definition is fin e too (I think).  Often you have helpful
>>     comments inside the function but it really helps if you know what
>>     to expect before you start reading the function.  For example,
>>     brig_to_generic::add_global_variable needs a comment that it adds
>>     VAR_DECL to the list of global variables and if there is a "host
>>     def var" (I guess I know what that means but an explanation
>>     somewhere would not hurt either), it makes sure it points to the
>>     new VAR_DECL.  Another example: call_builtin is very difficult to
>>     review without a comment explaining what arguments it expects.
>>     Please make sure that all functions have a comment somewhere,
>>     perhaps with the exception of only the most trivial and
>>     self-evident.
>
>
> Fixed these. I also scanned the sources and added new comments
> where e.g. the function name didn't describe the purpose in a self clear
> way.
>
>
>>
>>   + Is there any reason why you use internal_error instead of
>>     more common gcc_assert and/or gcc_unreachable?
>
>
> I think it was because of the possibility to add a format str & a message
> more easily while developing. I now converted them to asserts/unreachable
> as the frontend can be considered feature complete.
>
>
>>
>> - brigfrontend/brig_to_generic.cc:
>>
>>   + Why does this filename have underscores while all the others have
>>     dashes? :-)
>
>
> Renamed.
>
>>
>>   + What should the sanity check of data secion in parse accomplish?
>
>
> Nothing anymore. A leftover from debugging sessions, I believe.
> Removed.
>
>>   + In build_reinterpret_cast, it would be best if you could avoid
>>     constructing VIEW_CONVERT_EXPRs that have type with a different
>>     size from the type of its operand (even though I think that Ada
>>     also does this, it is considered unfortunate).  What are the cases
>>     when this happens?
>
>
> This was needed due to the special treatment of f16 which is stored in
> a 32b reg (variable). I added a comment about this.
>
>>
>>     OK, adding another note later: For converting scalars (anything
>>     !AGGREGATE_TYPE_P), I think you pretty much always want to use
>>     NOP_EXPR rather than generating a V_C_E.  V_C_E is mainly used to
>>     change how we interpret aggregates (or type-cast between
>>     aggregates and scalars).  In particular, NOP_EXPR will also
>>     sign-extend properly when you are extending intergers according to
>>     the type, whereas what will be in the "extended" part of a V_C_E
>>     is basically undefined.
>
>
> I use VCE heavily on purpose in many places because of the untyped HSAIL
> register variables (represented as unsigned integer local variables) of which
> contents are treated as different types depending on the HSAIL opcode.
> The integer (extending) casts are done with separate HSAIL instructions.
> I don't want to sign extend unless explicitly stated so.
>
>>
>>   + in brig_to_generic::append_group_variable, I think you should not
>>     add align_padding to m_next_group_offset twice.  You do not do
>>     that in append_private_variable.
>
>
> Good catch. Fixed.
>
>>
>>
>>   + Remove the commented out debug_function in
>>     brig_to_generic::finish_function.
>
>
> I put it inside #if 0 ... #endif instead as it's a very useful snippet
> to have handy when
> debugging the output.
>
>>   + call_builtin - difficult to review without a comment explaining
>>     the parameters.  Apart from that, I really wonder why you go
>>     through the hoops of identifying builtins by names and caching
>>     them.  I believe you should be able to use builtin_decl_explicit
>>     to get the decl for the builtin code. Or am I wrong?
>
>
> Converted to use the standard way of importing builtins.def.
> I mimicked lto-lang.c.
>
>>
>> - brigfrontend/brig-code-entry-handler.cc:
>>
>>   + I don't know, maybe it is standard C++ practice, but putting
>>     whole-front-end initialization (built-in creation) to a
>>     constructor of an abstract ancestor of streaming class seems a bit
>>     unfortunate to me.
>
>
> This is now reduced to initialization of the builtin index map from the
> hsail-builtin.defs and the actual builtins are initialized in brig-lang.c.
>
>>
>>   + Speaking about built-ins.  I know that GO FE creates them like you
>>     do, but I wonder whether following how c-family/c-common.c creates
>>     them would not be better, at least for those builtins that are in
>>     builtins.def and files included from it.  That way you also get
>>     proper function attributes without copying them over, for example.
>>     However, is is entirely possible there is a reason to do it
>>     your/Go way, I was just surprised.
>
>
> Moved all builtins HSAIL use to hsail-builtins.def.
>
>>     You should probably also go forward with the TODO and make the
>>     add_custom_builtin calls into normal gcc builtins, otherwise LTO
>>     will not work with BRIG.
>
>
> Done.
>
>>   + brig_code_entry_handler::build_tree_operand: Please make it just
>>     one switch, rather than two if's followed by a switch, all of them
>>     based on the same field.
>
>
> Done.
>
>>
>>   + brig_code_entry_handler::build_address_operand: I think that
>>
>>         gcc_assert (arg_symbol->base.kind == BRIG_KIND_DIRECTIVE_VARIABLE);
>>
>>     checks for input error, rather than internal compiler error, and
>>     so if the symbol is not a symbol, you should call error() and bail
>>     out instead.  I think that given the nature of brig - there is
>>     little point in trying to continue parsing after an error -
>>     terminating the executable would be fine too.
>
>
> Makes sense. Removed the assert.
>
>>
>>     Also, in the calculation of the symbol_base for private segment
>>     symbols, I bet the third call to expand_or_call_builtin:
>>
>>           local_size
>>             = build2 (MULT_EXPR, uint32_type_node,
>>                       expand_or_call_builtin (BRIG_OPCODE_WORKGROUPSIZE,
>>                                               BRIG_TYPE_U32, uint32_type_node,
>>                                               uint32_1),
>>                       local_size);
>>
>>     should have uint32_2 as its last argument to get the 3rd dimension
>>     of the grid.
>
>
> Yes, this was already fixed recently.
>
>>     How come that the "segment == BRIG_SEGMENT_ARG" is the only case
>>     in which you have to handle arrays?
>
>
> Added this comment there to clarify it:
>
>        /* Two different type of array references in case of arguments
> depending where they are referred at.  In the caller (argument
> segment), the reference is to an array object and
> in the callee, the array object has been passed as a pointer
> to the array object.  */
>
> For the other segments, we handle array refs via addressing the first
> element in which case it doesn't differ from "scalar" variables.
>
>>     It also seems that if the address is from a KERNARG segment but
>>     also has a register component, you will overwrite the kernel
>>     base stored to ptr_base when dealing with the register.
>
>
>  Good catch. Fixed and added a test case.
>>
>>
>>     The code processing the register value should be done differently,
>>     I'm afraid.  First and foremost, you should not be using
>>     VIEW_CONVERT_EXPR (generated by build_reinterpret_cast) to
>>     sign-extend offsets, you should use NOP_EXPR for that.  I also do
>>
>>     not think there is any need to be constructing build_pointer_type
>>     (char_type_node) all the time.  GIMPLE/GENERIC is not C, you can
>>     use any pointer, for example ptr_type_node.
>
>
> Thanks for the simplification tips. I converted this part to pretty much a call
> to convert_to_pointer().
>
>>
>>     When you add pointer and offsets, you should be using
>>     POINTER_PLUS_EXPR rather than just PLUS_EXPR.
>
>
> Fixed. For some reason I thought POINTER_PLUS_EXPR requires a constant
> integer offset.
>
>>
>>     Finally, if you want to type-cast the final address to the type
>>     specified by the instruction, again use NOP_EXPR.  However, I
>>     don't think it is really necessary, in the middle-end, it is the
>>     type of the loads and stores that matter, not really the used
>>     pointer.
>
>
> Not sure of this. Isn't the mem access' size&alignment dictated by the
> pointer's type? At least that's the impression I get from MEM_REF's
> comment.
>
>>
>>   + brig_code_entry_handler::get_tree_expr_type_for_hsa_type: Needs a
>>     comment.  Why do we need it, in what way is it different than the
>>     previous function?  It special-cases F16 vectors, but differently,
>>     so which function should be used in which situation?
>
>
> Improved the comment.
>
>>
>>   + brig_code_entry_handler::add_temp_var: Seems like he coment is
>>     wrong, the assignment is appended to the current function but not
>>     returned.
>
>
> Fixed. It should have returned it, and gcc's default behavior of
> returning the last expression came into play, I think.
>
>>   + brig_code_entry_handler::unpack: Please use BITS_PER_UNIT to refer
>>     to the number of bits in a byte instead of using the number 8
>>     directly.
>
>
> Fixed.
>
>>   + brig_code_entry_handler::vbuild_builtin: A comment would help
>>     greatly.  Moreover, it seems weird that you are building an
>>     arbitrary(?)  type for a builtin, why don't you get it from the
>>     decl?
>
>
> This method is not needed anymore, thanks to the builtin rework.
>
>>
>>   + brig_code_entry_handler::build_operands: Please use
>>     BRIG_TYPE_BASE_MASK instead of the 0x01f constant when masking
>>     types.
>
>
> Fixed, also elsewhere.
>
>>
>>     The need to always encapsulate constant(?) operands in a V_C_E is
>>     very strange and I really think we need to figure out what is
>>     going on.
>
>
> I implemented the FE originally on an earlier gcc version and added
> some workarounds
> such as this, hoping that the issues were fixed in the later versions.
>
> Seems this workaround is not needed anymore with gcc trunk, removed...
>
>>
>>
>>   + brig_code_entry_handler::build_output_assignment: Again, Please
>>     use BRIG_TYPE_BASE_MASK instead of the 0x01f and BITS_PER_UNIT
>>     instead of 8.  The most worisome, however, is:
>>
>>
>>           if (CONVERT_EXPR_P (inst_expr) && POINTER_TYPE_P (inst_expr))
>>             {
>>               // convert_to_integer crashes when converting a view convert
>>               // expr to a pointer.  First cast it to a large enough int
>>               // and let the next integer conversion do the clipping.
>>               inst_expr = convert_to_integer (size_type_node, inst_expr);
>>             }
>>
>>     First of all, the above should not help, because V_C_E is not a
>>     CONVERT_EXPR_P.  Mainly, however, this should either be reported
>>     as a bug and fixed or probably avoided by not creating so many
>>     V_C_Es in the first place.
>
>
> ...ditto.
>
>> - brigfrontend/brig-atomic-inst-handler.cc: generate_tree - I have not
>>   read this file in much detail mainly because I'm still not convinced
>>   that organizing builtins based on their name strings is a good idea.
>
>
> Reworked.
>
>>   However, I'd like alto to point out that you should try to avoid
>>   building trees if you are not going to use them, especially if it is
>>   really simple like for example mem_ref in this function, because it
>>   strains the garbage collector for no reason.
>
>
> Fixed.
>
>>
>> - brig-basic-inst-handler.cc, operator():
>>
>>   if (brig_inst->opcode == BRIG_OPCODE_NOP)
>>     {
>>       return base->byteCount;
>>     }
>>
>>   tere should be no braces when there is only one statement in a
>>   branch (and there is no potential confusion about elses).
>
>
> Fixed.
>
>>
>>   Again, please do not use 0x01F to mask brug type base, use
>>   BRIG_TYPE_BASE_MASK.  Uou can also use BRIG_TYPE_PACK_MASK instead
>>   of or'ing all the pack values.
>
>
> Fixed.
>
> Attached an updated patch. Also the other patches got touched a bit, I attach
> them to the respective threads.
>
> Thanks again,
> Pekka

[-- Attachment #2: 002-brig-fe-gcc.patch.gz --]
[-- Type: application/x-gzip, Size: 70373 bytes --]

[-- Attachment #3: 005-diff-to-previous.patch.gz --]
[-- Type: application/x-gzip, Size: 41275 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-08-30 16:50     ` Pekka Jääskeläinen
@ 2016-10-20 15:10       ` Martin Jambor
  2016-10-29 11:59         ` Pekka Jääskeläinen
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Jambor @ 2016-10-20 15:10 UTC (permalink / raw)
  To: Pekka Jääskeläinen; +Cc: gcc-patches

Hi,

sorry for the delay.  Couple of weeks ago, I have cloned the github
repo, lookaed at the diff against the trunk and reviewed that, which
resulted in the following pile of notes.

I hope they will be at least mostly understandable, if a little bit
disorganized and perhaps sometimes inconsistent.  I have been
interrupted by a surprisingly high number of issues which required my
quick attention and a disk failure during that period.

- Still quite few things need to be documented better, e.g.:
  + brig_to_generic::get_mangled_name_tmpl and to a lesser extent
    brig_to_generic::get_mangled_name.  It should be clear what is the
    intended difference in usage of the two (specially since the
    former is a template and so the parameter does not give that much
    of a hint to the reader)
  + the visitor classes need some description so that the first time
    readers see them, they understand what they are for and what they
    visit (i.e. what "visiting" even means).

- I know it was me who told you to use gcc_assert and gcc_unreachable
  instead of internal_error.  The thing is, often the error is clearly
  not an internal error but an error in input.  I think that we should
  plan to handle these cases differently and report the issues better
  to the user, give a meaningful error message together width the
  section and the offset there when it was encountered.  I am not
  asking you to audit all asserts now and convert those in this
  category but it would be nice to have a mechanism to easily do so
  (and convert a few obvious places), so that we can convert these as
  we bump into them.

  For asserts that do check for compiler errors in the sense that no
  matter what the input, a condition should never occur, please note
  that gcc_assert (condition) is preferred over

  if (!condition)
     gcc_unreachable ();

  and that gcc_unreachable is a noreturn call and thus there is no
  need to put any return statements after them.  I'm not pointing
  occurrences of this in the review because many of them are actually
  handling input issues and those should probably eventually be calls
  to error() or some encapsulation of it and will probably need a
  bail-out followup.

- A very minor suggestion: In GCC it is customary to write TODO as one
  word.  We generally do not use "TO OPTIMIZE", that is just a TODO
  (as opposed to a FIXME, which hints that something is at least a bit
  wrong here).  I think you can keep your way if you want but for
  example I do have emacs highlighting set up for the traditional
  formats.

- You do not seem to handle BRIG_OPCODE_ICALL, or have I missed it?
  That is fine, I don't think anybody else does that now anyway, I
  just got curious reading the ipa analysis.

- brig-lang.c:
  + x_flag_whole_program = 0; - talk about this with Honza
  + brig_langhook_type_for_size: has several issues.  Either always
    return a type like go or return error_mark_node instead of NULL.
    Also, do not count on long being 64-bit.  I would just copy what
    go does.  Or lto.
  + brig_langhook_type_for_mode: Also please do not depend on knowing
    that long is 8 bytes, or int being 4 bytes long.  For complex
    modes, the correct thing seems to be to return NULL instead of
    void_type_node.  In any case, it would be better to return
    error_mark_node rather than void_type_node.
  + convert: did you avoid using convert_to_vector deliberately?  The
    size check seems genuinely useful.

- brigfrontend/brig-to-generic.cc:
  + brig_to_generic::parse: You seem to be handling LDA instruction
    (or more generally, BRIG_KIND_INST_ADDR instructions, but there is
    just the one) with copy_move_inst_handler, which seems just wrong.
    Its operator() blindly casts the instruction to
    BrigInstSourceType, interpreting the segment as if it was a source
    type... am I missing something?

  + build_reinterpret_cast: gcc_unreachable followed by return
    NULL_TREE does not make sense, please convert the whole thing to
    an assert.  More importantly, I think that you really need to
    solve the case of mismatching sizes differently, and not create a
    (single) V_C_E for it.  For register types, you can V_C_E them to
    an unsigned int of the same size, then do a widening NOP_EXPR to
    unsigned type with the same size as the destination and then do a
    V_C_E to whatever you need.  But perhaps this can be solved better
    at the caller side somewhere?

  + brig_to_generic::finish_function: the ifdef'ed debug_functions
    should not be a part of the final submission.  Checking
    m_cf->m_is_kernel twice looks ugly.

  + brig_to_generic::append_group_variable: Don't put a statement
    guarded with an if statement on the same line as the condition.  I
    also believe align_padding, when the offset is not already
    aligned, should be:
       alignment - m_next_group_offset % alignment

  + brig_to_generic::append_private_variable: The same as above.

  + I would suggest that you change brig_to_generic::dump_function to
    a standalone function taking a file parameter.  It is then much
    easier to call it for example from debugger.

- brig-code-entry-handler.cc
  + brig_code_entry_handler::build_tree_operand: The unreachable
    return at the end looks misleading, just removing it should be
    fine.  Ditto for the breaks after returns, the upcoming
    fall-through warning will notify us if we ever get a potentially
    wrong fall-through.

  + brig_code_entry_handler::get_tree_cst_for_hsa_operand: GCC coding
    standard mandates that if a branch has only one statement in it,
    it should not be encapsulated in braces.

    If, in the condition

      if (type != NULL && TREE_CODE (type) == ARRAY_TYPE)

    type can ever actually be NULL, then the function will segfault
    just before ending, when again checking whether type is an array.
    If it cannot be NULL, then please gcc_checking_assert it instead.

  + brig_code_entry_handler::build_address_operand: I must say I
    really dislike how the end of the function is structured, it is
    terrible difficult to read given that it is not doing anything that
    complicated and I think it does not handle correctly an (LDA of a)
    NULL address, which unfortunately I believe is valid for private
    and group segment addresses.  Dealing with the most complex case
    by converting symbol to size_type looks exactly backwards,
    especially given that you have converted the base of the
    POINTER_PLUS_EXPR only few lines before.  I think the code would
    be a lot nicer and easier to comprehend if you clearly
    distinguished the various cases (symbol_base != NULL (and sub
    cases when ptr_base is or is not NULL), ptrbase != NULL and simple
    constant, even NULL constant, which you do not handle but fail an
    assert, I think) and handled them separately, including all type
    conversions.  ptr_base is an unfortunate name, IMHO, in many cases
    it has the role of a variable offset rather than a base.
    Similarly, ptr_ofset is really a constant_offset.

  + brig_code_entry_handler::get_tree_type_for_hsa_type: I believe it
    is better to and the type with BRIG_TYPE_PACK_MASK if you want to
    determine whether you are looking at a vector (packed) HSAIL
    instruction.  I also think that putting that test into a separate
    function and calling it from all places when you do this would be
    more future-proof (something like hsa_type_packed_p in gcc/hsa.c).

    For the (inner_brig_type == BRIG_TYPE_F16) vector case, you do not
    end up calling build_type_variant (why is it necessary in any
    case?) but for other vector types you do.  Is that intentional?

  + brig_code_entry_handler::expand_or_call_builtin: If I am correct
    that the operands parameter contains only input operands at this
    point, please state so in the function comment.  Since we only
    allow ourselves 80 characters wide code, it is customary not to
    put code into else branches if the if branch returns (or
    breaks/continues) anyway.

    Given that this function builds vectors only to pile them
    element-wise to arguments of variable-argument-length function
    call_builtin, which then builds vectors out if its
    elements... have you considered having an overloaded
    implementation of call_builtin that would not do this?  It seems
    particularly wasteful.

  + brig_code_entry_handler::build_operands: Please make the return
    type consistent with the one in class definition (tree_stl_vec
    instead of explicit std::vector<tree>, assuming you prefer the
    former).  

    GCC coding standard mandates that if a branch has only one
    statement in it, it should not be encapsulated in braces.

    Please replace
      if (operand == NULL_TREE)
   	    gcc_unreachable ();
    with gcc_assert (operand);
   
    Please rewrite conditions like !(TREE_CODE (operand) == TREE_VEC)
    as (TREE_CODE (operand) != TREE_VEC)

    Again you are creating VIEW_CONVERT_EXPR for an operand that is of
    a different size than the result type.  For scalar types, this is
    bound to cause trouble sooner or later and I really think you need
    to avoid it.

- brig-basic-inst-handler.cc:
  + scalarized_sat_arithmetics::scalarized_sat_arithmetics: Do not
    undef macros preemptively.  Instead, undef them right after using
    them, after the include of a .def file.
  + brig_basic_inst_handler::must_be_scalarized: I am intrigued by the
    (elements < 16) part of the condition.  This function would also
    benefit from a comment.
  + scalarized_sat_arithmetics::builtin - please prefix with m_, ditto
    for brig_inst_ (why the trailing underscore?)
  + brig_basic_inst_handler::must_be_scalarized needs a comment
    explaining what it is for.
  + brig_basic_inst_handler::get_raw_type: Unless "raw" is some HSA
    term I have missed, I would strongly suggest that you rename the
    function to something more immediately obvious, like uint_for_type
    or something like that.  Also, don't use literal 8 but
    BITS_PER_UNIT.  Also, do we really need both this function and
    brig_code_entry_handler::get_raw_tree_type ?

  + build_shuffle, build_unpack, build_pack, build_unpack_lo_or_hi and
    build_lower_element_broadcast: I admit that so far I have only
    very briefly skimmed through these functions.  In any way, use
    BITS_PER_UNIT instead of 8.

  + brig_basic_inst_handler::build_instr_expr: Please remove the 'r'
    and make it build_inst_expr for the sake of consistency.  If I
    understand the code correctly, the operands parameter contains
    only input operands.  In that case, please state so in the
    function comment and remove the local variable first_input, it has
    no purpose but to confuse.  Also, please move the definition and
    assignments to local variable input_count (and possibly also
    output_count) down to where it is used.

  + brig_basic_inst_handler::operator (): It seems that the opcode
    local variable is only used to identify the return brig
    instructions which seems wasteful.  Generally, it would be nice to
    clean this function up a little by moving assignments to some of
    the very many local variables down, as close to their first use as
    reasonable.  Surprisingly often, you'd remove the need to compute
    them in many cases at all, e.g. look at element_count and
    element_size_bits.

    Extra points for a function comment explaining how work is divided
    in between operator() itself and its main helpers such as
    build_instr_expr.
  
  + brig_basic_inst_handler::get_tree_code_for_hsa_opcode: The comment
    says the special value returned when it is necessary to use a
    chain of tree expressions or an builtin is NULL_TREE, but the
    function itself returns TREE_LIST or CALL_EXPR.

- brig-cmp-inst-handler.cc:
  + brig_cmp_inst_handler::operator (): the neg_expr seems to be
    something left from earlier times?  Use BITS_PER_UNIT instead of
    8, having both result_width and element_width seems unnecessary
    (and speaking of elements, is that actually even a vector case?),
    and should be initialized only in the case when it is used.  In
    case of vector results, please build either all_ones or all_zeros,
    it is wasteful to allocate both.

- brig-mem-inst-handler.cc: I believe that using the alignment
  modifier is something that we should try to get done as soon as
  possible.
  
- brig-inst-mod-handler.cc: This seems like something that we should
  at least warn about (in case when effectively an unsupported
  operation is requested).

- brig-seg-inst-handler.cc: At this point I'm trying to read quickly
  but it seems to me you do not support conversion between flat and
  global segment... how come?

- brig-copy-move-inst-handler.cc:
  + brig_copy_move_inst_handler::operator (): The function definitely
    should not cast LDA instructions to BrigInstSourceType*!

- brig-atomic-inst-handler.cc: I only skimmed through this, looks
  good, although I should probably come back to take a closer look
  later.

- brig-branch-inst-handler.cc: I believe that as long as the builtins
  representing barriers are not pure, they will not be hoisted out of
  a loop.  Nonduplication might indeed be a problem, although short of
  whole function cloning, I could not think of a transformation gcc
  performs that might pause a problem.  Nevertheless, we probably
  should introduce an attribute for it and look for it in
  gimple_can_duplicate_bb_p (and in cfg_layout_can_duplicate_bb_p?).
  An important issue, but hopefully for later.

- brig-variable-handler.cc:
  + brig_directive_variable_handler::operator (): Please use
    BITS_PER_UNIT instead of 8.
  + build_variable: Likewise.  I am a bit concerned that unlike in
    operator(), you do not make the alignment at least as big as
    natural one, which means that in theory (and probably only on
    malformed BRIG, I suppose), the two functions might disagree about
    alignment?  I think it would be nice to outline the extraction of
    alignment to an independent function and use that from both
    places.

- brig-function-handler.cc:
  + brig_directive_function_handler::operator: Please use gcc_assert
    instead of assert.  (Well, in this case it is clearly input error
    which, eventually, we will want to give nice errors about.  But at
    least do not use assert.)

- brig-function.cc:
  + brig_function::analyze_calls: The first if condition should be
    terminated by a newline

  + brig_function::add_wi_loop:  Is the second TODO now obsolete?

  + brig_function::build_launcher_and_metadata: The ASM directive is
    really an ugly hack.  It is isolated so I am not that much
    concerned, but building a structure and filling it with data (like
    we do for example in hsa_output_libgomp_mapping) seems cleaner.

- brig-util.cc:
  + gccbrig_is_raw_operation: I think that calling the operations
    "bit" operations instead of "raw" would make life of readers of
    the code slightly but noticeably easier.

  + gccbrig_hsa_type_bit_size: If possible, please make the default
    case be gcc_unreachable().  (If zero is expected in some cases,
    then all callers should check for it, so that we for example do
    not divide by zero in
    brig_code_entry_handler::get_tree_type_for_hsa_type.)

  + might_be_host_defined_var: there is no need for the returned
    expression to start on a new line.

- gcc/builtins.def: In the added DEF_HSAIL_*_BUILTIN macros, please
  arrange it so that they only pass true in the last argument of
  DEF_BUILTIN when gcc is configured with BRIG FE.  Builtins are not
  free and should not be added needlessly.

Overall, the code has improved significantly.  As far as I am
concerned, the only real issue I see are the VIEW_CONVERT_EXPRs with
mismatched operands.  They are asking for trouble, only Ada produces
those (although it is acknowledged it should not) and Ada only does it
for aggregates.

If I understood you correctly, both you and your sponsor have already
signed the Copyright assignment, right?  If that is so, I'll ask the
steering committee to approve the intention and then ask a global
reviewer to also peek at it.

Meanwhile, I'll look very quickly at the run-time library and see if
there is any integration with my work to be done.

Thanks for your patience,

Martin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-10-20 15:10       ` Martin Jambor
@ 2016-10-29 11:59         ` Pekka Jääskeläinen
  2016-11-04 14:58           ` Martin Jambor
  0 siblings, 1 reply; 12+ messages in thread
From: Pekka Jääskeläinen @ 2016-10-29 11:59 UTC (permalink / raw)
  To: Pekka Jääskeläinen, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 22601 bytes --]

Hi Martin,

Thanks for the comments and suggestions. Replies inline:

On Thu, Oct 20, 2016 at 6:10 PM, Martin Jambor <mjambor@suse.cz> wrote:
> - Still quite few things need to be documented better, e.g.:
>   + brig_to_generic::get_mangled_name_tmpl and to a lesser extent
>     brig_to_generic::get_mangled_name.  It should be clear what is the
>     intended difference in usage of the two (specially since the
>     former is a template and so the parameter does not give that much
>     of a hint to the reader)

Added more comments.

>   + the visitor classes need some description so that the first time
>     readers see them, they understand what they are for and what they
>     visit (i.e. what "visiting" even means).

This is an adaptation of the classic gang of four Visitor design pattern.
I added a reference to it in a comment.

> - I know it was me who told you to use gcc_assert and gcc_unreachable
>   instead of internal_error.  The thing is, often the error is clearly
>   not an internal error but an error in input.  I think that we should
>   plan to handle these cases differently and report the issues better
>   to the user, give a meaningful error message together width the
>   section and the offset there when it was encountered.  I am not
>   asking you to audit all asserts now and convert those in this
>   category but it would be nice to have a mechanism to easily do so
>   (and convert a few obvious places), so that we can convert these as
>   we bump into them.

I'm not sure about this. BRIG FE is a rather special case as we
assume HSAILasm has been used to parse and error check the original
HSAIL text to the binary BRIG format which it consumes.

Of course HSAILasm can have bugs, but how much we should produce human
readable error messages to help debugging HSAILasm is another thing.
In case the BRIG FE
fails to consume the input, it means either the BRIG is corrupted for
a reason or another,
but typically is not a human error (those should be caught be HSAILasm).

"File format not recognized" error is one that might be useful though.
I added a check for the BRIG magic number and the supported version (1.0).

Perhaps we should add error printouts later on case by case basis when we
see which error cases can be useful and worth reporting in a human readable
graceful manner? It can be as easy as converting the internal_error
to fatal_error or similar in that case.

> - A very minor suggestion: In GCC it is customary to write TODO as one
>   word.  We generally do not use "TO OPTIMIZE", that is just a TODO
>   (as opposed to a FIXME, which hints that something is at least a bit
>   wrong here).  I think you can keep your way if you want but for
>   example I do have emacs highlighting set up for the traditional
>   formats.

Converted all to TODO.

> - You do not seem to handle BRIG_OPCODE_ICALL, or have I missed it?
>   That is fine, I don't think anybody else does that now anyway, I
>   just got curious reading the ipa analysis.

Right.

> - brig-lang.c:
>   + x_flag_whole_program = 0; - talk about this with Honza

I guess this is note to yourself? As we agreed I didn't try whole program
optimizations yet.  I might later when optimizing for a target. It will
be really useful, I agree. Now that it used the proper builtin way, it
might work more easily.

>   + brig_langhook_type_for_size: has several issues.  Either always
>     return a type like go or return error_mark_node instead of NULL.
>     Also, do not count on long being 64-bit.  I would just copy what
>     go does.  Or lto.

Copied the go's version, it should work with BRIG too.

>   + brig_langhook_type_for_mode: Also please do not depend on knowing
>     that long is 8 bytes, or int being 4 bytes long.  For complex
>     modes, the correct thing seems to be to return NULL instead of
>     void_type_node.  In any case, it would be better to return
>     error_mark_node rather than void_type_node.

Fixed.

>   + convert: did you avoid using convert_to_vector deliberately?  The
>     size check seems genuinely useful.

BRIG/HSAIL is a bit special case due to its "untyped" variables (registers),
I use bit level casts in a lot of places to avoid accidental type conversions
or sign extensions.

> - brigfrontend/brig-to-generic.cc:
>   + brig_to_generic::parse: You seem to be handling LDA instruction
>     (or more generally, BRIG_KIND_INST_ADDR instructions, but there is
>     just the one) with copy_move_inst_handler, which seems just wrong.

What do you mean by 'wrong'? The handlers do not map to instruction types
directly, but often more towards the specification chapters.
It comes straight from:
http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/18_BRIG/BRIG_syntax_copy_move.htm

>     Its operator() blindly casts the instruction to
>     BrigInstSourceType, interpreting the segment as if it was a source
>     type... am I missing something?

I believe you are confused by the BRIG struct name: BrigInstSourceType
is a struct
for "instructions that have different types for their destination and
source operands"
http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/18_BRIG/BrigInstSourceType.htm

That is, it's not a "source type" object, but an instruction type object.

>   + build_reinterpret_cast: gcc_unreachable followed by return
>     NULL_TREE does not make sense, please convert the whole thing to
>     an assert.  More importantly, I think that you really need to

Converted it to an assert.

>     solve the case of mismatching sizes differently, and not create a
>     (single) V_C_E for it.  For register types, you can V_C_E them to
>     an unsigned int of the same size, then do a widening NOP_EXPR to
>     unsigned type with the same size as the destination and then do a
>     V_C_E to whatever you need.  But perhaps this can be solved better
>     at the caller side somewhere?

I added the extra conversion step through unsigned ints.
I hadn't noticed that V_C_E is not guaranteed to work for this case.

>   + brig_to_generic::finish_function: the ifdef'ed debug_functions
>     should not be a part of the final submission.  Checking

Removed.

>     m_cf->m_is_kernel twice looks ugly.

Removed the extra check.

>   + brig_to_generic::append_group_variable: Don't put a statement
>     guarded with an if statement on the same line as the condition.  I
>     also believe align_padding, when the offset is not already
>     aligned, should be:
>        alignment - m_next_group_offset % alignment
>   + brig_to_generic::append_private_variable: The same as above.

Good catch!

>   + I would suggest that you change brig_to_generic::dump_function to
>     a standalone function taking a file parameter.  It is then much
>     easier to call it for example from debugger.

Done.

> - brig-code-entry-handler.cc
>   + brig_code_entry_handler::build_tree_operand: The unreachable
>     return at the end looks misleading, just removing it should be
>     fine.  Ditto for the breaks after returns, the upcoming
>     fall-through warning will notify us if we ever get a potentially
>     wrong fall-through.

Done.

>   + brig_code_entry_handler::get_tree_cst_for_hsa_operand: GCC coding
>     standard mandates that if a branch has only one statement in it,
>     it should not be encapsulated in braces.

Done.

>     If, in the condition
>
>       if (type != NULL && TREE_CODE (type) == ARRAY_TYPE)
>
>     type can ever actually be NULL, then the function will segfault
>     just before ending, when again checking whether type is an array.
>     If it cannot be NULL, then please gcc_checking_assert it instead.

It cannot be NULL anymore at that point. Removed the misleading check.

>   + brig_code_entry_handler::build_address_operand: I must say I
>     really dislike how the end of the function is structured, it is
>     terrible difficult to read given that it is not doing anything that
>     complicated and I think it does not handle correctly an (LDA of a)
>     NULL address, which unfortunately I believe is valid for private
>     and group segment addresses.  Dealing with the most complex case
>     by converting symbol to size_type looks exactly backwards,
>     especially given that you have converted the base of the
>     POINTER_PLUS_EXPR only few lines before.  I think the code would
>     be a lot nicer and easier to comprehend if you clearly
>     distinguished the various cases (symbol_base != NULL (and sub
>     cases when ptr_base is or is not NULL), ptrbase != NULL and simple
>     constant, even NULL constant, which you do not handle but fail an
>     assert, I think) and handled them separately, including all type
>     conversions.  ptr_base is an unfortunate name, IMHO, in many cases
>     it has the role of a variable offset rather than a base.
>     Similarly, ptr_ofset is really a constant_offset.

Renamed ptr_base to var_offset and ptr_offset to const_offset which
indeed are more descriptive.

I also cleaned the if..else mess at the end of the function and made the
different cases very explicit, instead of the spaghetti mess previous
version. It looks much better now to me at least.

>   + brig_code_entry_handler::get_tree_type_for_hsa_type: I believe it
>     is better to and the type with BRIG_TYPE_PACK_MASK if you want to
>     determine whether you are looking at a vector (packed) HSAIL

Done.

>     instruction.  I also think that putting that test into a separate
>     function and calling it from all places when you do this would be
>     more future-proof (something like hsa_type_packed_p in gcc/hsa.c).

I just used that function instead and called if from everywhere.

>     For the (inner_brig_type == BRIG_TYPE_F16) vector case, you do not
>     end up calling build_type_variant (why is it necessary in any
>     case?) but for other vector types you do.  Is that intentional?

I don't remember why the const stripping code was there.
Maybe a leftover of some earlier hack/workaround in a previous
gcc version. I removed it and it seems to work fine.

>   + brig_code_entry_handler::expand_or_call_builtin: If I am correct
>     that the operands parameter contains only input operands at this
>     point, please state so in the function comment.  Since we only
>     allow ourselves 80 characters wide code, it is customary not to
>     put code into else branches if the if branch returns (or
>     breaks/continues) anyway.

Done.

>     Given that this function builds vectors only to pile them
>     element-wise to arguments of variable-argument-length function
>     call_builtin, which then builds vectors out if its
>     elements... have you considered having an overloaded
>     implementation of call_builtin that would not do this?  It seems
>     particularly wasteful.

Optimized by inlining the essential builtin build code to the call site.

>   + brig_code_entry_handler::build_operands: Please make the return
>     type consistent with the one in class definition (tree_stl_vec
>     instead of explicit std::vector<tree>, assuming you prefer the
>     former).

Done.

>     GCC coding standard mandates that if a branch has only one
>     statement in it, it should not be encapsulated in braces.

Cleaned up.

>     Please replace
>       if (operand == NULL_TREE)
>             gcc_unreachable ();
>     with gcc_assert (operand);

Done.

>     Please rewrite conditions like !(TREE_CODE (operand) == TREE_VEC)
>     as (TREE_CODE (operand) != TREE_VEC)

Did.

>     Again you are creating VIEW_CONVERT_EXPR for an operand that is of
>     a different size than the result type.  For scalar types, this is
>     bound to cause trouble sooner or later and I really think you need
>     to avoid it.

Fixed by delegating the conversion to the (now updated)
build_reinterpret_cast().

> - brig-basic-inst-handler.cc:
>   + scalarized_sat_arithmetics::scalarized_sat_arithmetics: Do not
>     undef macros preemptively.  Instead, undef them right after using
>     them, after the include of a .def file.

This seems to be an idiom with the builtin import mechanism elsewhere
also.  builtins.def defines a default macro which one must undef if not
wanting to do anything with that builtin type in that particular import
location.

>   + brig_basic_inst_handler::must_be_scalarized: I am intrigued by the
>     (elements < 16) part of the condition.  This function would also
>     benefit from a comment.

This function black listed the known cases of MULHI with vectors
that have been tested to break with AMD64/x86-64. It seems the
support for vector MULT_HIGHPART_EXPR is flaky and undertested.

The robust thing to do here is to force scalarization always with MULHI for now,
until these issues are  debugged further. I added an exception for 2x64b
MULT_HIGHPART_EXPR to avoid the need for 128b scalar arithmetics,
and as it seems to work for the CPUs I've tested.

The decision should not IMO belong to the frontend, but there'd be
better a step where the vector operations are optionally scalarized if the
target prefers scalarized operations which should be caught by this
step also. Something to fix during optimization work.

>   + scalarized_sat_arithmetics::builtin - please prefix with m_, ditto
>     for brig_inst_ (why the trailing underscore?)

Done.


>   + brig_basic_inst_handler::must_be_scalarized needs a comment
>     explaining what it is for.

Removed the method as unneeded for now (see the comment above about
MULHI).

>   + brig_basic_inst_handler::get_raw_type: Unless "raw" is some HSA
>     term I have missed, I would strongly suggest that you rename the
>     function to something more immediately obvious, like uint_for_type
>     or something like that.  Also, don't use literal 8 but

Renamed to get_unsigned_int_type().

>     BITS_PER_UNIT.  Also, do we really need both this function and

Done.

>     brig_code_entry_handler::get_raw_tree_type ?

Nope.

>   + build_shuffle, build_unpack, build_pack, build_unpack_lo_or_hi and
>     build_lower_element_broadcast: I admit that so far I have only
>     very briefly skimmed through these functions.  In any way, use
>     BITS_PER_UNIT instead of 8.

Done.

>   + brig_basic_inst_handler::build_instr_expr: Please remove the 'r'
>     and make it build_inst_expr for the sake of consistency.  If I

Done.

>     understand the code correctly, the operands parameter contains
>     only input operands.  In that case, please state so in the
>     function comment and remove the local variable first_input, it has
>     no purpose but to confuse.  Also, please move the definition and

Done.

>     assignments to local variable input_count (and possibly also
>     output_count) down to where it is used.

Done.

>   + brig_basic_inst_handler::operator (): It seems that the opcode
>     local variable is only used to identify the return brig
>     instructions which seems wasteful.  Generally, it would be nice to

It's also used to catch MULHI which is generated from multiple brig
opcodes.

>     clean this function up a little by moving assignments to some of
>     the very many local variables down, as close to their first use as
>     reasonable.  Surprisingly often, you'd remove the need to compute
>     them in many cases at all, e.g. look at element_count and
>     element_size_bits.

Moved element_count and is_fp16_operation. element_size_bits is now
used for catching mulhis for 64b elements.

>     Extra points for a function comment explaining how work is divided
>     in between operator() itself and its main helpers such as
>     build_instr_expr.

I added a method comment, but the truth is that the division of work
is a bit artificial, mostly the build_inst_expr() call is there to split
a complex if..else structure to two functions to improve readability.

>   + brig_basic_inst_handler::get_tree_code_for_hsa_opcode: The comment
>     says the special value returned when it is necessary to use a
>     chain of tree expressions or an builtin is NULL_TREE, but the
>     function itself returns TREE_LIST or CALL_EXPR.

Corrected.

> - brig-cmp-inst-handler.cc:
>   + brig_cmp_inst_handler::operator (): the neg_expr seems to be
>     something left from earlier times?  Use BITS_PER_UNIT instead of

Correct. Removed.

>     8, having both result_width and element_width seems unnecessary

Removed element_width and moved result_width definition
closer to its use.

>     (and speaking of elements, is that actually even a vector case?),
>     and should be initialized only in the case when it is used.  In
>     case of vector results, please build either all_ones or all_zeros,
>     it is wasteful to allocate both.

They both are used to produce the HSA required all_ones/all_zeros
output.

> - brig-mem-inst-handler.cc: I believe that using the alignment
>   modifier is something that we should try to get done as soon as
>   possible.

I agree. Probably one of the first things that will pop up during optimizing
the performance.

> - brig-inst-mod-handler.cc: This seems like something that we should
>   at least warn about (in case when effectively an unsupported
>   operation is requested).

If there will be an upgrade for the frontend to support the 'full' profile (it's
only supporting 'base' now) with all the rounding modifiers, a better way might
be found than injecting fesetround() calls around all float
expressions.  Probably
in that case all float ops must be converted to builtin calls that
ensure the wanted
rounding (ungh!).

> - brig-seg-inst-handler.cc: At this point I'm trying to read quickly
>   but it seems to me you do not support conversion between flat and
>   global segment... how come?

Global address is already a flat address. Check the description tab in
http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/05_Arithmetic/segment_conversion.htm?Highlight=global

> - brig-copy-move-inst-handler.cc:
>   + brig_copy_move_inst_handler::operator (): The function definitely
>     should not cast LDA instructions to BrigInstSourceType*!

Yes it should. Like I explained above, the struct name is misleading,
but it's actually an instruction calls not a source type.

> - brig-branch-inst-handler.cc: I believe that as long as the builtins
>   representing barriers are not pure, they will not be hoisted out of
>   a loop.  Nonduplication might indeed be a problem, although short of
>   whole function cloning, I could not think of a transformation gcc
>   performs that might pause a problem.  Nevertheless, we probably
>   should introduce an attribute for it and look for it in
>   gimple_can_duplicate_bb_p (and in cfg_layout_can_duplicate_bb_p?).
>   An important issue, but hopefully for later.

Agreed.

> - brig-variable-handler.cc:
>   + brig_directive_variable_handler::operator (): Please use
>     BITS_PER_UNIT instead of 8.

Done.

>   + build_variable: Likewise.  I am a bit concerned that unlike in
>     operator(), you do not make the alignment at least as big as
>     natural one, which means that in theory (and probably only on
>     malformed BRIG, I suppose), the two functions might disagree about
>     alignment?  I think it would be nice to outline the extraction of
>     alignment to an independent function and use that from both
>     places.

Done.

> - brig-function-handler.cc:
>   + brig_directive_function_handler::operator: Please use gcc_assert
>     instead of assert.  (Well, in this case it is clearly input error
>     which, eventually, we will want to give nice errors about.  But at
>     least do not use assert.)

Converted to gcc_assert() for now.

> - brig-function.cc:
>   + brig_function::analyze_calls: The first if condition should be
>     terminated by a newline

Done.

>   + brig_function::add_wi_loop:  Is the second TODO now obsolete?

Yes, removed.

>   + brig_function::build_launcher_and_metadata: The ASM directive is
>     really an ugly hack.  It is isolated so I am not that much
>     concerned, but building a structure and filling it with data (like
>     we do for example in hsa_output_libgomp_mapping) seems cleaner.

Hmm. I don't generate any metadata structs to data section, but a separate
custom ELF section per kernel. I agree the ASM directive is not ideal
in general,
but I don't think there's a generic way to add custom ELF sections.

I'm not sure how much building the structure field by field would be a better
approach in comparison to a raw dump as the point is to just transfer the
metadata to the HSA runtime with the finalized binary. The runtime should use
exactly the same struct layout, otherwise it won't work anyways. If I
do a raw dump,
at least I ensure that if the struct is updated I won't forget to update the
serialization code.

> - brig-util.cc:
>   + gccbrig_is_raw_operation: I think that calling the operations
>     "bit" operations instead of "raw" would make life of readers of
>     the code slightly but noticeably easier.

Renamed.

>   + gccbrig_hsa_type_bit_size: If possible, please make the default
>     case be gcc_unreachable().  (If zero is expected in some cases,
>     then all callers should check for it, so that we for example do
>     not divide by zero in
>     brig_code_entry_handler::get_tree_type_for_hsa_type.)

Made it call gcc_unreachable ().

>   + might_be_host_defined_var: there is no need for the returned
>     expression to start on a new line.

Fixed.

> - gcc/builtins.def: In the added DEF_HSAIL_*_BUILTIN macros, please
>   arrange it so that they only pass true in the last argument of
>   DEF_BUILTIN when gcc is configured with BRIG FE.  Builtins are not
>   free and should not be added needlessly.

It doesn't even include the hsail-builtins.def in case BRIG FE not
enabled now. I suppose that's even better than executing those macros
for nothing.

> Overall, the code has improved significantly.  As far as I am
> concerned, the only real issue I see are the VIEW_CONVERT_EXPRs with
> mismatched operands.  They are asking for trouble, only Ada produces
> those (although it is acknowledged it should not) and Ada only does it
> for aggregates.

These should be now fixed.

> If I understood you correctly, both you and your sponsor have already
> signed the Copyright assignment, right?  If that is so, I'll ask the
> steering committee to approve the intention and then ask a global
> reviewer to also peek at it.

Correct, and I see you already did. Thanks!

> Thanks for your patience,

Thank a lot for the comments. I know how much patience it requires to wade
through a big bunch of someone else's boring code.

New BRIG FE patch set attached.

BR,
Pekka

[-- Attachment #2: 000-brig-fe.patch --]
[-- Type: text/x-patch, Size: 6998 bytes --]

This patch set adds a BRIG (HSAIL) frontend. It can be used as a core
for an HSAIL finalizer implementation for processors with gcc backends.

It is a bit unusual frontend as the consumed format is a binary
representation.  The textual HSAIL can be compiled to it with a 
separate assembler tool.

The frontend has been mostly tested with the HSA PRM conformance suite which
it now passes. The accompanied GENERIC-scanning test suite is supposed to be
only a smoke test. 

libhsail-rt implements HSAIL specific builtins and includes a simple runtime
that implements SPMD execution via either Pth-based fibers or loops to 
execute multiple work-item work groups without SPMD/SIMD-default hardware.

I've split it to 4 patches:

001 - the configuration file changes and misc.
002 - the frontend itself
003 - libhsail-rt
004 - the smoke test suite

The diffstat is as follows:

 .gitignore                                        |     2 +-
 Makefile.def                                      |     3 +
 Makefile.in                                       |   489 +
 configure                                         |     1 +
 configure.ac                                      |     1 +
 gcc/brig/Make-lang.in                             |   247 +
 gcc/brig/brig-builtins.h                          |    99 +
 gcc/brig/brig-c.h                                 |    66 +
 gcc/brig/brig-lang.c                              |   770 +
 gcc/brig/brigfrontend/brig-arg-block-handler.cc   |    66 +
 gcc/brig/brigfrontend/brig-atomic-inst-handler.cc |   265 +
 gcc/brig/brigfrontend/brig-basic-inst-handler.cc  |   865 +
 gcc/brig/brigfrontend/brig-branch-inst-handler.cc |   221 +
 gcc/brig/brigfrontend/brig-cmp-inst-handler.cc    |   198 +
 gcc/brig/brigfrontend/brig-code-entry-handler.cc  |  1712 ++
 gcc/brig/brigfrontend/brig-code-entry-handler.h   |   422 +
 gcc/brig/brigfrontend/brig-comment-handler.cc     |    39 +
 gcc/brig/brigfrontend/brig-control-handler.cc     |   108 +
 .../brigfrontend/brig-copy-move-inst-handler.cc   |    58 +
 gcc/brig/brigfrontend/brig-cvt-inst-handler.cc    |   260 +
 gcc/brig/brigfrontend/brig-fbarrier-handler.cc    |    44 +
 gcc/brig/brigfrontend/brig-function-handler.cc    |   373 +
 gcc/brig/brigfrontend/brig-function.cc            |   719 +
 gcc/brig/brigfrontend/brig-function.h             |   213 +
 gcc/brig/brigfrontend/brig-inst-mod-handler.cc    |    58 +
 gcc/brig/brigfrontend/brig-label-handler.cc       |    37 +
 gcc/brig/brigfrontend/brig-lane-inst-handler.cc   |    84 +
 gcc/brig/brigfrontend/brig-machine.c              |    44 +
 gcc/brig/brigfrontend/brig-machine.h              |    33 +
 gcc/brig/brigfrontend/brig-mem-inst-handler.cc    |   180 +
 gcc/brig/brigfrontend/brig-module-handler.cc      |    41 +
 gcc/brig/brigfrontend/brig-queue-inst-handler.cc  |    93 +
 gcc/brig/brigfrontend/brig-seg-inst-handler.cc    |   146 +
 gcc/brig/brigfrontend/brig-signal-inst-handler.cc |    42 +
 gcc/brig/brigfrontend/brig-to-generic.cc          |   812 +
 gcc/brig/brigfrontend/brig-to-generic.h           |   226 +
 gcc/brig/brigfrontend/brig-util.cc                |   446 +
 gcc/brig/brigfrontend/brig-util.h                 |    53 +
 gcc/brig/brigfrontend/brig-variable-handler.cc    |   263 +
 gcc/brig/brigfrontend/phsa.h                      |    69 +
 gcc/brig/brigspec.c                               |   135 +
 gcc/brig/config-lang.in                           |    41 +
 gcc/brig/lang-specs.h                             |    28 +
 gcc/brig/lang.opt                                 |    41 +
 gcc/builtin-types.def                             |    80 +-
 gcc/builtins.def                                  |    41 +
 gcc/config.in                                     |     6 +
 gcc/configure                                     |    10 +-
 gcc/configure.ac                                  |     5 +
 gcc/doc/frontends.texi                            |     2 +-
 gcc/doc/invoke.texi                               |     4 +
 gcc/doc/standards.texi                            |     8 +
 gcc/hsail-builtins.def                            |   659 +
 gcc/testsuite/brig.dg/README                      |    10 +
 gcc/testsuite/brig.dg/dg.exp                      |    27 +
 gcc/testsuite/brig.dg/test/gimple/alloca.hsail    |    37 +
 gcc/testsuite/brig.dg/test/gimple/atomics.hsail   |    33 +
 gcc/testsuite/brig.dg/test/gimple/branches.hsail  |    58 +
 gcc/testsuite/brig.dg/test/gimple/fbarrier.hsail  |    74 +
 .../brig.dg/test/gimple/function_calls.hsail      |    59 +
 gcc/testsuite/brig.dg/test/gimple/kernarg.hsail   |    25 +
 gcc/testsuite/brig.dg/test/gimple/mem.hsail       |    39 +
 gcc/testsuite/brig.dg/test/gimple/mulhi.hsail     |    33 +
 gcc/testsuite/brig.dg/test/gimple/packed.hsail    |    78 +
 .../brig.dg/test/gimple/smoke_test.hsail          |    91 +
 gcc/testsuite/brig.dg/test/gimple/variables.hsail |   124 +
 gcc/testsuite/brig.dg/test/gimple/vector.hsail    |    57 +
 gcc/testsuite/lib/brig-dg.exp                     |    29 +
 gcc/testsuite/lib/brig.exp                        |    40 +
 include/hsa-interface.h                           |   630 +
 libhsail-rt/Makefile.am                           |   124 +
 libhsail-rt/Makefile.in                           |   740 +
 libhsail-rt/README                                |     4 +
 libhsail-rt/aclocal.m4                            |   978 +
 libhsail-rt/config.h.in                           |   217 +
 libhsail-rt/configure                             | 17016 ++++++++++++++++++
 libhsail-rt/configure.ac                          |   151 +
 libhsail-rt/include/internal/fibers.h             |    95 +
 .../include/internal/phsa-queue-interface.h       |    60 +
 libhsail-rt/include/internal/phsa-rt.h            |    94 +
 libhsail-rt/include/internal/workitems.h          |   107 +
 libhsail-rt/m4/libtool.m4                         |  7997 ++++++++
 libhsail-rt/m4/ltoptions.m4                       |   384 +
 libhsail-rt/m4/ltsugar.m4                         |   123 +
 libhsail-rt/m4/ltversion.m4                       |    23 +
 libhsail-rt/m4/lt~obsolete.m4                     |    98 +
 libhsail-rt/rt/arithmetic.c                       |   475 +
 libhsail-rt/rt/atomics.c                          |   115 +
 libhsail-rt/rt/bitstring.c                        |   190 +
 libhsail-rt/rt/fbarrier.c                         |    87 +
 libhsail-rt/rt/fibers.c                           |   212 +
 libhsail-rt/rt/fp16.c                             |   135 +
 libhsail-rt/rt/misc.c                             |    89 +
 libhsail-rt/rt/multimedia.c                       |   135 +
 libhsail-rt/rt/queue.c                            |    71 +
 libhsail-rt/rt/sat_arithmetic.c                   |   299 +
 libhsail-rt/rt/segment.c                          |    57 +
 libhsail-rt/rt/workitems.c                        |   952 +
 libhsail-rt/target-config.h.in                    |    68 +
 99 files changed, 43463 insertions(+), 5 deletions(-)

[-- Attachment #3: 001-brig-fe-config-etc.patch.gz --]
[-- Type: application/x-gzip, Size: 144470 bytes --]

[-- Attachment #4: 002-brig-fe-gcc.patch.gz --]
[-- Type: application/x-gzip, Size: 72181 bytes --]

[-- Attachment #5: 003-brig-fe-libhsail-rt.patch.gz --]
[-- Type: application/x-gzip, Size: 17369 bytes --]

[-- Attachment #6: 004-brig-fe-testsuite.patch.gz --]
[-- Type: application/x-gzip, Size: 7157 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-10-29 11:59         ` Pekka Jääskeläinen
@ 2016-11-04 14:58           ` Martin Jambor
  2016-11-09 19:13             ` Pekka Jääskeläinen
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Jambor @ 2016-11-04 14:58 UTC (permalink / raw)
  To: Pekka Jääskeläinen; +Cc: gcc-patches

Hi Pekka,

On Sat, Oct 29, 2016 at 02:58:29PM +0300, Pekka Jääskeläinen wrote:
> Hi Martin,
> 
> Thanks for the comments and suggestions. Replies inline:
> 
> On Thu, Oct 20, 2016 at 6:10 PM, Martin Jambor <mjambor@suse.cz> wrote:
> > - Still quite few things need to be documented better, e.g.:
> >   + brig_to_generic::get_mangled_name_tmpl and to a lesser extent
> >     brig_to_generic::get_mangled_name.  It should be clear what is the
> >     intended difference in usage of the two (specially since the
> >     former is a template and so the parameter does not give that much
> >     of a hint to the reader)
> 
> Added more comments.

thanks.

> 
> >   + the visitor classes need some description so that the first time
> >     readers see them, they understand what they are for and what they
> >     visit (i.e. what "visiting" even means).
> 
> This is an adaptation of the classic gang of four Visitor design pattern.
> I added a reference to it in a comment.

Thanks.  As you can see, I am not a C++ person and do not know names
of common patterns.  For me, "Gang of Four" is a reference to history
of China and not something one would want to emulate.

> 
> > - I know it was me who told you to use gcc_assert and gcc_unreachable
> >   instead of internal_error.  The thing is, often the error is clearly
> >   not an internal error but an error in input.  I think that we should
> >   plan to handle these cases differently and report the issues better
> >   to the user, give a meaningful error message together width the
> >   section and the offset there when it was encountered.  I am not
> >   asking you to audit all asserts now and convert those in this
> >   category but it would be nice to have a mechanism to easily do so
> >   (and convert a few obvious places), so that we can convert these as
> >   we bump into them.
> 
> I'm not sure about this. BRIG FE is a rather special case as we
> assume HSAILasm has been used to parse and error check the original
> HSAIL text to the binary BRIG format which it consumes.
> 
> Of course HSAILasm can have bugs, but how much we should produce human
> readable error messages to help debugging HSAILasm is another thing.
> In case the BRIG FE
> fails to consume the input, it means either the BRIG is corrupted for
> a reason or another,
> but typically is not a human error (those should be caught be HSAILasm).

I personally primarily want this for debugging purposes, and we should
try to eventually report errors in a more comprehensible way than
HSAILasm.  But more generally, and more importantly, if the input,
whether human readable or not, is incorrect, the compiler should not
produce an "internal error."  If that happens, users will file bugs
against gcc when in fact it is the generating tool that is broken.
One you maintain the FE, you personally will not want this :-)

> 
> "File format not recognized" error is one that might be useful though.
> I added a check for the BRIG magic number and the supported version (1.0).
> 
> Perhaps we should add error printouts later on case by case basis when we
> see which error cases can be useful and worth reporting in a human readable
> graceful manner? It can be as easy as converting the internal_error
> to fatal_error or similar in that case.

Yeah, that is what I meant, together with printing some information
about offsets in different sections where the error is.

> 
> > - A very minor suggestion: In GCC it is customary to write TODO as one
> >   word.  We generally do not use "TO OPTIMIZE", that is just a TODO
> >   (as opposed to a FIXME, which hints that something is at least a bit
> >   wrong here).  I think you can keep your way if you want but for
> >   example I do have emacs highlighting set up for the traditional
> >   formats.
> 
> Converted all to TODO.
> 
> > - You do not seem to handle BRIG_OPCODE_ICALL, or have I missed it?
> >   That is fine, I don't think anybody else does that now anyway, I
> >   just got curious reading the ipa analysis.
> 
> Right.
> 
> > - brig-lang.c:
> >   + x_flag_whole_program = 0; - talk about this with Honza
> 
> I guess this is note to yourself? As we agreed I didn't try whole program
> optimizations yet.  I might later when optimizing for a target. It will
> be really useful, I agree. Now that it used the proper builtin way, it
> might work more easily.

Right, it was a note for myself.

> 
> >   + brig_langhook_type_for_size: has several issues.  Either always
> >     return a type like go or return error_mark_node instead of NULL.
> >     Also, do not count on long being 64-bit.  I would just copy what
> >     go does.  Or lto.
> 
> Copied the go's version, it should work with BRIG too.

Great.

> 
> >   + convert: did you avoid using convert_to_vector deliberately?  The
> >     size check seems genuinely useful.
> 
> BRIG/HSAIL is a bit special case due to its "untyped" variables (registers),
> I use bit level casts in a lot of places to avoid accidental type conversions
> or sign extensions.

I was wondering why in case of VECTOR_TYPE, instead of
    return build1 (VIEW_CONVERT_EXPR, type, expr);
you do not do
    return fold (convert_to_vector (type, expr));

it seemed to me natural given how you handle the other cases and that
the function internally also builds a V_C_E, after two checks.  So I
thought the checks (matching overall size and only converting from an
integer or another vector-type) were the problem and was wondering
which one.  Especially the size one is something I believe you should
never violate too.

> 
> > - brigfrontend/brig-to-generic.cc:
> >   + brig_to_generic::parse: You seem to be handling LDA instruction
> >     (or more generally, BRIG_KIND_INST_ADDR instructions, but there is
> >     just the one) with copy_move_inst_handler, which seems just wrong.
> 
> What do you mean by 'wrong'?

copy_move_inst_handler operator() starts by type-casting whatever it
gets to BrigInstSourceType* and immediately dereferencing it, loading
field sourceType, even though, in case of an LDA, what it is actually
looking at is not a BrigInstSourceType but a BrigInstAddr, which is an
entirely different structure that does not have that field.  So you
read the 8-bit segment and the first reserved field and happily pass
that to gccbrig_tree_type_for_hsa_type, hoping it does not explode on
random data.  That is wrong.

Later on in the operator() you actually figure out that you are
looking at BrigInstSourceType but that is too late.

> The handlers do not map to instruction types
> directly, but often more towards the specification chapters.
> It comes straight from:
> http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/18_BRIG/BRIG_syntax_copy_move.htm

Ah, so that chapter is why you decided to handle type conversions and
LDA together, it made little sense to me.  Still, I do not think it is
a very good idea implementation-wise, but if the operator() is written
correctly, it certainly can be done.

> 
> >     Its operator() blindly casts the instruction to
> >     BrigInstSourceType, interpreting the segment as if it was a source
> >     type... am I missing something?
> 
> I believe you are confused by the BRIG struct name: BrigInstSourceType
> is a struct
> for "instructions that have different types for their destination and
> source operands"
> http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/18_BRIG/BrigInstSourceType.htm
> 
> That is, it's not a "source type" object, but an instruction type object.

Ehm, no, I think I understand that, the invalid typecast/dereference
is what I am pointing at.

> 
> >   + build_reinterpret_cast: gcc_unreachable followed by return
> >     NULL_TREE does not make sense, please convert the whole thing to
> >     an assert.  More importantly, I think that you really need to
> 
> Converted it to an assert.
> 
> >     solve the case of mismatching sizes differently, and not create a
> >     (single) V_C_E for it.  For register types, you can V_C_E them to
> >     an unsigned int of the same size, then do a widening NOP_EXPR to
> >     unsigned type with the same size as the destination and then do a
> >     V_C_E to whatever you need.  But perhaps this can be solved better
> >     at the caller side somewhere?
> 
> I added the extra conversion step through unsigned ints.
> I hadn't noticed that V_C_E is not guaranteed to work for this case.

Well, Ada also creates mismatched V_C_Es, so we do not verify the
sizes match and it mostly works.  However, after hunting down
expansion bugs caused by a combination of that and SRA, I really
really do not want to avoid adding more such situations, it still
seems quite fragile to me.

So thanks for doing this.

> >   + brig_code_entry_handler::build_address_operand: I must say I
> >     really dislike how the end of the function is structured, it is
> >     terrible difficult to read given that it is not doing anything that
> >     complicated and I think it does not handle correctly an (LDA of a)
> >     NULL address, which unfortunately I believe is valid for private
> >     and group segment addresses.  Dealing with the most complex case
> >     by converting symbol to size_type looks exactly backwards,
> >     especially given that you have converted the base of the
> >     POINTER_PLUS_EXPR only few lines before.  I think the code would
> >     be a lot nicer and easier to comprehend if you clearly
> >     distinguished the various cases (symbol_base != NULL (and sub
> >     cases when ptr_base is or is not NULL), ptrbase != NULL and simple
> >     constant, even NULL constant, which you do not handle but fail an
> >     assert, I think) and handled them separately, including all type
> >     conversions.  ptr_base is an unfortunate name, IMHO, in many cases
> >     it has the role of a variable offset rather than a base.
> >     Similarly, ptr_ofset is really a constant_offset.
> 
> Renamed ptr_base to var_offset and ptr_offset to const_offset which
> indeed are more descriptive.
> 
> I also cleaned the if..else mess at the end of the function and made the
> different cases very explicit, instead of the spaghetti mess previous
> version. It looks much better now to me at least.

Indeed it does, thanks.  However, I think that at the point when you
do:

  if (offs > 0)
    const_offset = build_int_cst (size_type_node, offs);

const_offset can already be non-NULL you may lose whatever value it
had before.  You might want to keep the constant offset as integer
until the very moment when you may want to use it.

(Also, single statements should not have braces around them ;-)

> > - brig-basic-inst-handler.cc:
> >   + scalarized_sat_arithmetics::scalarized_sat_arithmetics: Do not
> >     undef macros preemptively.  Instead, undef them right after using
> >     them, after the include of a .def file.
> 
> This seems to be an idiom with the builtin import mechanism elsewhere
> also.  builtins.def defines a default macro which one must undef if not
> wanting to do anything with that builtin type in that particular import
> location.

I see, OK.

> 
> >   + brig_basic_inst_handler::must_be_scalarized: I am intrigued by the
> >     (elements < 16) part of the condition.  This function would also
> >     benefit from a comment.
> 
> This function black listed the known cases of MULHI with vectors
> that have been tested to break with AMD64/x86-64. It seems the
> support for vector MULT_HIGHPART_EXPR is flaky and undertested.
> 
> The robust thing to do here is to force scalarization always with MULHI for now,
> until these issues are  debugged further. I added an exception for 2x64b
> MULT_HIGHPART_EXPR to avoid the need for 128b scalar arithmetics,
> and as it seems to work for the CPUs I've tested.
> 
> The decision should not IMO belong to the frontend, but there'd be
> better a step where the vector operations are optionally scalarized if the
> target prefers scalarized operations which should be caught by this
> step also. Something to fix during optimization work.

I see.  If you think that MULT_HIGHPART_EXPR is broken, it may be
worth filing a bug report (probably during next stage1).

> >   + brig_basic_inst_handler::operator (): It seems that the opcode
> >     local variable is only used to identify the return brig
> >     instructions which seems wasteful.  Generally, it would be nice to
> 
> It's also used to catch MULHI which is generated from multiple brig
> opcodes.
> 
> >     clean this function up a little by moving assignments to some of
> >     the very many local variables down, as close to their first use as
> >     reasonable.  Surprisingly often, you'd remove the need to compute
> >     them in many cases at all, e.g. look at element_count and
> >     element_size_bits.
> 
> Moved element_count and is_fp16_operation. element_size_bits is now
> used for catching mulhis for 64b elements.
> 
> >     Extra points for a function comment explaining how work is divided
> >     in between operator() itself and its main helpers such as
> >     build_instr_expr.
> 
> I added a method comment, but the truth is that the division of work
> is a bit artificial, mostly the build_inst_expr() call is there to split
> a complex if..else structure to two functions to improve readability.

Yeah, I thought so.  No worries.

> > - brig-cmp-inst-handler.cc:
> >   + brig_cmp_inst_handler::operator (): the neg_expr seems to be
> >     something left from earlier times?  Use BITS_PER_UNIT instead of
> 
> Correct. Removed.
> 
> >     8, having both result_width and element_width seems unnecessary
> 
> Removed element_width and moved result_width definition
> closer to its use.
> 
> >     (and speaking of elements, is that actually even a vector case?),
> >     and should be initialized only in the case when it is used.  In
> >     case of vector results, please build either all_ones or all_zeros,
> >     it is wasteful to allocate both.
> 
> They both are used to produce the HSA required all_ones/all_zeros
> output.

We already discussed this on IRC, this was my oversight.

> 
> > - brig-mem-inst-handler.cc: I believe that using the alignment
> >   modifier is something that we should try to get done as soon as
> >   possible.
> 
> I agree. Probably one of the first things that will pop up during optimizing
> the performance.
> 
> > - brig-inst-mod-handler.cc: This seems like something that we should
> >   at least warn about (in case when effectively an unsupported
> >   operation is requested).
> 
> If there will be an upgrade for the frontend to support the 'full' profile (it's
> only supporting 'base' now) with all the rounding modifiers, a better way might
> be found than injecting fesetround() calls around all float
> expressions.  Probably
> in that case all float ops must be converted to builtin calls that
> ensure the wanted
> rounding (ungh!).

Yay.  OK.

> 
> > - brig-seg-inst-handler.cc: At this point I'm trying to read quickly
> >   but it seems to me you do not support conversion between flat and
> >   global segment... how come?
> 
> Global address is already a flat address. Check the description tab in
> http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/05_Arithmetic/segment_conversion.htm?Highlight=global

Oh, I did not realize that HSAIL does not allow stof_global and
ftos_global variants.  Interesting.  OK then.

> 
> > - brig-copy-move-inst-handler.cc:
> >   + brig_copy_move_inst_handler::operator (): The function definitely
> >     should not cast LDA instructions to BrigInstSourceType*!
> 
> Yes it should. Like I explained above, the struct name is misleading,
> but it's actually an instruction calls not a source type.

but surely LDA is encoded as struct BrigInstAddr, not
BrigInstSourceType ???

> 
> > - brig-branch-inst-handler.cc: I believe that as long as the builtins
> >   representing barriers are not pure, they will not be hoisted out of
> >   a loop.  Nonduplication might indeed be a problem, although short of
> >   whole function cloning, I could not think of a transformation gcc
> >   performs that might pause a problem.  Nevertheless, we probably
> >   should introduce an attribute for it and look for it in
> >   gimple_can_duplicate_bb_p (and in cfg_layout_can_duplicate_bb_p?).
> >   An important issue, but hopefully for later.
> 
> Agreed.
> 

> 
> > - brig-function-handler.cc:
> >   + brig_directive_function_handler::operator: Please use gcc_assert
> >     instead of assert.  (Well, in this case it is clearly input error
> >     which, eventually, we will want to give nice errors about.  But at
> >     least do not use assert.)
> 
> Converted to gcc_assert() for now.
> 

> >   + brig_function::build_launcher_and_metadata: The ASM directive is
> >     really an ugly hack.  It is isolated so I am not that much
> >     concerned, but building a structure and filling it with data (like
> >     we do for example in hsa_output_libgomp_mapping) seems cleaner.
> 
> Hmm. I don't generate any metadata structs to data section, but a separate
> custom ELF section per kernel. I agree the ASM directive is not ideal
> in general,
> but I don't think there's a generic way to add custom ELF sections.
> 
> I'm not sure how much building the structure field by field would be a better
> approach in comparison to a raw dump as the point is to just transfer the
> metadata to the HSA runtime with the finalized binary. The runtime should use
> exactly the same struct layout, otherwise it won't work anyways. If I
> do a raw dump,
> at least I ensure that if the struct is updated I won't forget to update the
> serialization code.

Well, hopefully nobody will object too strongly against this.  As I
wrote before, it is isolated enough for me.

> > - gcc/builtins.def: In the added DEF_HSAIL_*_BUILTIN macros, please
> >   arrange it so that they only pass true in the last argument of
> >   DEF_BUILTIN when gcc is configured with BRIG FE.  Builtins are not
> >   free and should not be added needlessly.
> 
> It doesn't even include the hsail-builtins.def in case BRIG FE not
> enabled now. I suppose that's even better than executing those macros
> for nothing.

Oh, I have missed that.  Sure.

> > If I understood you correctly, both you and your sponsor have already
> > signed the Copyright assignment, right?  If that is so, I'll ask the
> > steering committee to approve the intention and then ask a global
> > reviewer to also peek at it.
> 
> Correct, and I see you already did. Thanks!
> 
> > Thanks for your patience,
> 
> Thank a lot for the comments. I know how much patience it requires to wade
> through a big bunch of someone else's boring code.
> 
> New BRIG FE patch set attached.

This week I have checked out the updated tree from github and looked
at only a few changed functions there.  Hopefully the steering
committee will decide soon and we'll get attention of a global
reviewer during the next few weeks too.

Thanks for addressing so many of my comments,

Martin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
  2016-11-04 14:58           ` Martin Jambor
@ 2016-11-09 19:13             ` Pekka Jääskeläinen
  0 siblings, 0 replies; 12+ messages in thread
From: Pekka Jääskeläinen @ 2016-11-09 19:13 UTC (permalink / raw)
  To: Pekka Jääskeläinen, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3592 bytes --]

Hi Martin,

On Fri, Nov 4, 2016 at 4:58 PM, Martin Jambor <mjambor@suse.cz> wrote:
> I personally primarily want this for debugging purposes, and we should
> try to eventually report errors in a more comprehensible way than
> HSAILasm.  But more generally, and more importantly, if the input,
> whether human readable or not, is incorrect, the compiler should not
> produce an "internal error."  If that happens, users will file bugs
> against gcc when in fact it is the generating tool that is broken.
> One you maintain the FE, you personally will not want this :-)

Yes, I will add better error messages as soon asissues are reported.

> I was wondering why in case of VECTOR_TYPE, instead of
>     return build1 (VIEW_CONVERT_EXPR, type, expr);
> you do not do
>     return fold (convert_to_vector (type, expr));
>
> it seemed to me natural given how you handle the other cases and that
> the function internally also builds a V_C_E, after two checks.  So I
> thought the checks (matching overall size and only converting from an
> integer or another vector-type) were the problem and was wondering
> which one.  Especially the size one is something I believe you should
> never violate too.

I see. I cannot recall the origins of this, but my guess is that the code is
copied from an older Go frontend version. I tested with your
suggestion and it seems
to work fine.

> copy_move_inst_handler operator() starts by type-casting whatever it
> gets to BrigInstSourceType* and immediately dereferencing it, loading
> field sourceType, even though, in case of an LDA, what it is actually
> looking at is not a BrigInstSourceType but a BrigInstAddr, which is an
> entirely different structure that does not have that field.  So you
> read the 8-bit segment and the first reserved field and happily pass
> that to gccbrig_tree_type_for_hsa_type, hoping it does not explode on
> random data.  That is wrong.
>
> Later on in the operator() you actually figure out that you are
> looking at BrigInstSourceType but that is too late.

Ah, I see the issue now. It worked as with LDA the input type was
forced and handled explicitly. I now made the code more explicit
regarding this.

> Indeed it does, thanks.  However, I think that at the point when you
> do:
>
>   if (offs > 0)
>     const_offset = build_int_cst (size_type_node, offs);
>
> const_offset can already be non-NULL you may lose whatever value it
> had before.  You might want to keep the constant offset as integer
> until the very moment when you may want to use it.

Great catch! There indeed was a major memory corruption with
group and private variable which was not caught by the PRM conformance
suite I use for testing. Should be now fixed.

> (Also, single statements should not have braces around them ;-)

...also fixed these, once again ;)

> I see.  If you think that MULT_HIGHPART_EXPR is broken, it may be
> worth filing a bug report (probably during next stage1).

It can be considered broken or not just fully implemented for all targets
and various vector types (especially those not directly supported by the
target).  Yes, I rather look closer and report this after the patch has been
merged to be able to provide BRIG test cases that work with
upstream code base.

> This week I have checked out the updated tree from github and looked
> at only a few changed functions there.  Hopefully the steering
> committee will decide soon and we'll get attention of a global
> reviewer during the next few weeks too.
>
> Thanks for addressing so many of my comments,

Thanks again, an updated FE patch attached.

BR,
Pekka

[-- Attachment #2: 002-brig-fe-gcc.patch.gz --]
[-- Type: application/x-gzip, Size: 72342 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-11-09 19:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-16 17:25 [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself Pekka Jääskeläinen
2016-05-18  0:21 ` Joseph Myers
2016-05-18 17:00   ` Pekka Jääskeläinen
2016-05-18 18:36     ` David Malcolm
2016-05-19  8:48       ` Pekka Jääskeläinen
2016-08-01 16:37 ` Martin Jambor
2016-08-15 13:22   ` Pekka Jääskeläinen
2016-08-30 16:50     ` Pekka Jääskeläinen
2016-10-20 15:10       ` Martin Jambor
2016-10-29 11:59         ` Pekka Jääskeläinen
2016-11-04 14:58           ` Martin Jambor
2016-11-09 19:13             ` Pekka Jääskeläinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).