From mboxrd@z Thu Jan 1 00:00:00 1970
To: GCC Patches
From: Nathan Sidwell
Subject: [gomp4] PTX launch dimensions
Message-ID: <55BF5774.50502@acm.org>
Date: Mon, 03 Aug 2015 11:58:00 -0000
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------070401000700030106010600"

This is a multi-part message in MIME format.
--------------070401000700030106010600
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-length: 194

I've committed this to gomp4.
The ptx backend can now examine the openacc attribute to determine
launch dimensions and figure out whether vector or worker single
neutering is needed.

nathan

--------------070401000700030106010600
Content-Type: text/x-patch; name="gomp4-ptx-dim.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="gomp4-ptx-dim.patch"
Content-length: 3266

2015-08-03  Nathan Sidwell

	* config/nvptx/nvptx.c (nvptx_reorg): Check get_oacc_fn_attrib for
	launch dimensions and only do parallel processing when present.
	Check dimensions to determine neutering requirements.
	(nvptx_record_offload_symbol): Launch dimension attribute must be
	present on offloaded functions.

Index: gcc/config/nvptx/nvptx.c
===================================================================
--- gcc/config/nvptx/nvptx.c	(revision 226485)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -2980,13 +2980,42 @@ nvptx_reorg (void)
     if (REG_N_SETS (i) == 0 && REG_N_REFS (i) == 0)
       regno_reg_rtx[i] = const0_rtx;
 
-  parallel *pars = nvptx_discover_pars (&bb_insn_map);
-
-  nvptx_process_pars (pars);
-  nvptx_neuter_pars (pars, (GOMP_DIM_MASK (GOMP_DIM_VECTOR)
-                            | GOMP_DIM_MASK (GOMP_DIM_WORKER)), 0);
-
-  delete pars;
+  /* Determine launch dimensions of the function.  If it is not an
+     offloaded function  (i.e. this is a regular compiler), the
+     function has no neutering.  */
+  tree attr = get_oacc_fn_attrib (current_function_decl);
+  if (attr)
+    {
+      unsigned mask = 0;
+      tree dims = TREE_VALUE (attr);
+      unsigned ix;
+
+      for (ix = 0; ix != GOMP_DIM_MAX; ix++)
+        {
+          unsigned HOST_WIDE_INT dim = 0;
+
+          if (dims)
+            {
+              tree cst = TREE_VALUE (dims);
+
+              dim = TREE_INT_CST_LOW (cst);
+              dims = TREE_CHAIN (dims);
+            }
+          if (dim != 1)
+            mask |= GOMP_DIM_MASK (ix);
+        }
+      /* If there is worker neutering, there must be vector
+         neutering.  Otherwise the hardware will fail.  This really
+         should be dealt with earlier because it indicates faulty
+         logic in determining launch dimensions.  */
+      if (mask & GOMP_DIM_MASK (GOMP_DIM_WORKER))
+        mask |= GOMP_DIM_MASK (GOMP_DIM_VECTOR);
+
+      parallel *pars = nvptx_discover_pars (&bb_insn_map);
+      nvptx_process_pars (pars);
+      nvptx_neuter_pars (pars, mask, 0);
+      delete pars;
+    }
 
   nvptx_reorg_subreg ();
 
@@ -3073,32 +3102,25 @@ nvptx_record_offload_symbol (tree decl)
     case FUNCTION_DECL:
       {
         tree attr = get_oacc_fn_attrib (decl);
-        tree dims = NULL_TREE;
+        tree dims = TREE_VALUE (attr);
         unsigned ix;
 
-        if (attr)
-          dims = TREE_VALUE (attr);
         fprintf (asm_out_file, "//:FUNC_MAP \"%s\"",
                  IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)));
 
-        for (ix = 0; ix != GOMP_DIM_MAX; ix++)
+        for (ix = 0; ix != GOMP_DIM_MAX; ix++, dims = TREE_CHAIN (dims))
           {
-            unsigned HOST_WIDE_INT dim = 0;
-            if (dims)
-              {
-                tree cst = TREE_VALUE (dims);
-
-                /* When device_type support is added an ealier pass
-                   should have massaged the attribute to be
-                   ptx-specific.  */
-                gcc_assert (TREE_CODE (cst) == INTEGER_CST);
-
-                dim = TREE_INT_CST_LOW (cst);
-                dims = TREE_CHAIN (dims);
-              }
+            tree cst = TREE_VALUE (dims);
+
+            /* When device_type support is added an earlier pass
+               should have massaged the attribute to be
+               ptx-specific.  */
+            gcc_assert (TREE_CODE (cst) == INTEGER_CST);
+
+            unsigned HOST_WIDE_INT dim = TREE_INT_CST_LOW (cst);
             fprintf (asm_out_file, ", " HOST_WIDE_INT_PRINT_HEX, dim);
           }
-
+
         fprintf (asm_out_file, "\n");
       }
       break;

--------------070401000700030106010600--
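
For anyone who wants to try the mask computation from the nvptx_reorg hunk outside the compiler, here is a minimal standalone sketch in plain C. It is not the committed code. The DIM_* names and neuter_mask are hypothetical stand-ins invented for this example, the gang/worker/vector numbering is assumed to mirror the GOMP_DIM_* constants, and a plain array replaces the TREE_LIST walk. A missing entry is treated as 0, as in the patch, so only a dimension known to be exactly 1 escapes neutering.

```c
/* Hypothetical stand-ins for GOMP_DIM_GANG/WORKER/VECTOR/MAX and
   GOMP_DIM_MASK; the values are assumed for this sketch.  */
enum { DIM_GANG = 0, DIM_WORKER = 1, DIM_VECTOR = 2, DIM_MAX = 3 };
#define DIM_MASK(X) (1u << (X))

/* Hypothetical helper: compute the neutering mask from up to DIM_MAX
   launch dimensions.  dims[] stands in for the attribute's list of
   integer constants; entries past ndims count as 0 (unknown).  */
unsigned
neuter_mask (const unsigned long *dims, int ndims)
{
  unsigned mask = 0;

  for (int ix = 0; ix != DIM_MAX; ix++)
    {
      unsigned long dim = ix < ndims ? dims[ix] : 0;

      /* Any dimension not known to be exactly 1 needs neutering.  */
      if (dim != 1)
        mask |= DIM_MASK (ix);
    }

  /* Worker neutering implies vector neutering, as in the patch.  */
  if (mask & DIM_MASK (DIM_WORKER))
    mask |= DIM_MASK (DIM_VECTOR);

  return mask;
}
```

Under these assumptions, dimensions {1, 8, 1} set the worker bit and then force the vector bit as well, while {1, 1, 1} produce an empty mask.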