public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Nathan Sidwell <nathan@acm.org>
To: Alexander Monakov <amonakov@ispras.ru>
Cc: gcc-patches@gcc.gnu.org, Sandra Loosemore <sandra@codesourcery.com>
Subject: Re: [PATCH] nvptx per-warp compiler-defined stacks (-msoft-stack)
Date: Fri, 20 May 2016 19:42:00 -0000	[thread overview]
Message-ID: <e83ec5f1-c8bd-9daa-584c-512cc20c36e3@acm.org> (raw)
In-Reply-To: <alpine.LNX.2.20.1605201747400.7578@monopod.intra.ispras.ru>

On 05/20/16 11:09, Alexander Monakov wrote:

> This patch implements '-msoft-stack' code generation variant for NVPTX.  The
> goal is to avoid relying on '.local' memory space for placement of automatic
> data, and instead have an explicitely-maintained stack pointer (which can be
> set up to point to preallocated global memory space).  This allows to have
> stack data accessible from all threads and modifiable with atomic
> instructions.  This also allows to implement variable-length stack allocation
> (for 'alloca' and C99 VLAs).
>
> Each warp has its own 'soft stack' pointer.  It lives in shared memory array
> called __nvptx_stacks at index %tid.y (like OpenACC, OpenMP is offloading is
> going to use launch geometry such that %tid.y gives the warp index).  It is
> retrieved in function prologue (if the function needs a stack frame) and may
> also be written there (if the function is non-leaf, so that its callees see
> the updated stack pointer), and restored prior to returning.
>
> Startup code is responsible for setting up the initial soft-stack pointer. For
> -mmainkernel testing it is libgcc's __main, for OpenMP offloading it's the
> kernel region entry code.

ah,  that's much more understandable,  thanks.  Presumably this doesn't support 
worker-single mode (in OpenACC parlance, I don't know what the OpenMP version of 
that is?)  And neither would it support calls  from vector-partitioned code (I 
think that's SIMD in OpenMP-land?).   It seems like we should reject the 
combination of -msoft-stack -fopenacc?

> 2016-05-19  Alexander Monakov  <amonakov@ispras.ru>

> 2016-05-19  Alexander Monakov  <amonakov@ispras.ru>

> 2016-05-19  Alexander Monakov  <amonakov@ispras.ru>

> 2016-05-19  Alexander Monakov  <amonakov@ispras.ru>

> 2016-03-15  Alexander Monakov  <amonakov@ispras.ru>

> 2015-12-14  Alexander Monakov  <amonakov@ispras.ru>

> 2015-12-09  Alexander Monakov  <amonakov@ispras.ru>

why so many changelogs?  The on-branch development history is irrelvant for 
trunk -- the usual single changelog style should be followed.


> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
> index 2d4dad1..700c4b0 100644

> @@ -992,16 +1091,56 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)


> +  else if (need_frameptr || cfun->machine->has_varadic || cfun->calls_alloca)
> +    {
> +      /* Maintain 64-bit stack alignment.  */

This block needs a more descriptive comment -- it appears to be doing a great 
deal more than maintaining 64-bit stack alignment!

> +      int keep_align = BIGGEST_ALIGNMENT / BITS_PER_UNIT;
> +      sz = ROUND_UP (sz, keep_align);
> +      int bits = POINTER_SIZE;
> +      fprintf (file, "\t.reg.u%d %%frame;\n", bits);
> +      fprintf (file, "\t.reg.u32 %%fstmp0;\n");
> +      fprintf (file, "\t.reg.u%d %%fstmp1;\n", bits);
> +      fprintf (file, "\t.reg.u%d %%fstmp2;\n", bits);

Some of these register names appear to be long lived -- and referenced in other 
functions.  It would be better to give those more descriptive names, or even 
give them hard-regs.   You should  certainly  do so for those that are already 
hard regs (%frame & %stack) -- is it more feasible to augment init_frame to 
initialize them?

> @@ -1037,6 +1178,10 @@ nvptx_output_return (void)
>  {
>    machine_mode mode = (machine_mode)cfun->machine->return_mode;
>
> +  if (cfun->machine->using_softstack)
> +    fprintf (asm_out_file, "\tst.shared.u%d [%%fstmp2], %%fstmp1;\n",

See note above about obscure reg names.

   Since ptx is a virtual target, we just define a few
>     hard registers for special purposes and leave pseudos unallocated.
> @@ -200,6 +205,7 @@ struct GTY(()) machine_function
>    bool is_varadic;  /* This call is varadic  */
>    bool has_varadic;  /* Current function has a varadic call.  */
>    bool has_chain; /* Current function has outgoing static chain.  */
> +  bool using_softstack; /* Need to restore __nvptx_stacks[tid.y].  */

Comment should describe what the attribute is, not what it implies.  In this 
case I think it's /*  Current function has   a soft stack frame.  */


> diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
> index 33a4862..e5650b6 100644
> --- a/gcc/config/nvptx/nvptx.md
> +++ b/gcc/config/nvptx/nvptx.md


> +(define_insn "set_softstack_insn"
> +  [(unspec [(match_operand 0 "nvptx_register_operand" "R")] UNSPEC_ALLOCA)]
> +  "TARGET_SOFT_STACK"
> +{
> +  return (cfun->machine->using_softstack
> +	  ? "%.\\tst.shared%t0\\t[%%fstmp2], %0;"
> +	  : "");
> +})

Is this alloca related (UNSPEC_ALLOCA) or restore related (invoked in 
restore_stack_block), or stack setting (as insn name suggests).  Things seem 
inconsistently named.  Comments would be good.

>
>  (define_expand "restore_stack_block"
>    [(match_operand 0 "register_operand" "")
>     (match_operand 1 "register_operand" "")]
>    ""
>  {
> +  if (TARGET_SOFT_STACK)
> +    {
> +      emit_move_insn (operands[0], operands[1]);
> +      emit_insn (gen_set_softstack_insn (operands[0]));

Is it necessary to store the soft stack here?  Only restore_stack_block seems 
defined, and not save_stack_block.  Why the asymmetry?


> diff --git a/gcc/testsuite/gcc.target/nvptx/softstack.c b/gcc/testsuite/gcc.target/nvptx/softstack.c
> new file mode 100644
> index 0000000..73e60f2
> --- /dev/null


This test is ok, but probably wise to check varadic generates the expected code?

Are changes needed in nvptx's  crt0.s (or an auxilary file in libc or libgcc) to 
make this work for testing?  Have you tested an nvptx-elf target with this flag on?

nathan

  parent reply	other threads:[~2016-05-20 19:42 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-20 17:00 Alexander Monakov
2016-04-21 14:08 ` Nathan Sidwell
2016-05-20 15:09   ` Alexander Monakov
2016-05-20 16:36     ` Sandra Loosemore
2016-05-20 19:42     ` Nathan Sidwell [this message]
2016-05-25  2:32       ` Alexander Monakov
2016-05-25 13:33         ` Nathan Sidwell
2016-06-02 21:22           ` Alexander Monakov
2016-06-09 19:49             ` Nathan Sidwell
2016-06-09 20:11               ` Alexander Monakov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e83ec5f1-c8bd-9daa-584c-512cc20c36e3@acm.org \
    --to=nathan@acm.org \
    --cc=amonakov@ispras.ru \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=sandra@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).