From: Sylvain Noiry
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry
Subject: [PATCH v2 07/11] Native complex ops: Vectorization of native complex operations
Date: Tue, 12 Sep 2023 12:07:09 +0200
Message-ID: <20230912100713.1074-8-snoiry@kalrayinc.com>
In-Reply-To: <20230912100713.1074-1-snoiry@kalrayinc.com>
References: <20230717090250.4645-10-snoiry@kalrayinc.com> <20230912100713.1074-1-snoiry@kalrayinc.com>

Summary:
Add vectors of complex types to vectorize native operations. Because
the vectorizer was designed to work with scalar elements, several
functions and target hooks have to be adapted or duplicated to support
complex types. After that, the vectorization of native complex
operations follows exactly the same flow as scalar operations.
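For illustration only, the "adapted or duplicated" approach described above can be sketched as one common worker plus two thin typed wrappers, which is the shape the patch gives mode_for_vector. This is a minimal standalone sketch, not GCC code: toy_mode, toy_scalar_mode, toy_complex_mode and vector_mode_for are hypothetical stand-ins for machine_mode, scalar_mode, complex_mode and mode_for_vector, and the tiny lookup table is invented.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins: GCC's real machine_mode, scalar_mode and
// complex_mode classes are far richer than these toy types.
enum class toy_mode { SI, SF, SC, V4SI, V4SF, V2SC };

struct toy_scalar_mode { toy_mode m; };   // plays the role of scalar_mode
struct toy_complex_mode { toy_mode m; };  // plays the role of complex_mode

// Common worker, analogous to the static
// mode_for_vector (machine_mode, poly_uint64) in the patch.
// The table below is invented for the sketch.
static std::optional<toy_mode>
vector_mode_for (toy_mode inner, unsigned nunits)
{
  if (inner == toy_mode::SI && nunits == 4)
    return toy_mode::V4SI;
  if (inner == toy_mode::SF && nunits == 4)
    return toy_mode::V4SF;
  if (inner == toy_mode::SC && nunits == 2)
    return toy_mode::V2SC;
  return std::nullopt;  // no suitable vector mode
}

// Typed entry point for scalar element modes; forwards to the worker.
std::optional<toy_mode>
vector_mode_for (toy_scalar_mode inner, unsigned nunits)
{
  return vector_mode_for (inner.m, nunits);
}

// Typed entry point for complex element modes; forwards to the worker.
std::optional<toy_mode>
vector_mode_for (toy_complex_mode inner, unsigned nunits)
{
  return vector_mode_for (inner.m, nunits);
}
```

Callers keep passing a strongly typed element mode, so existing scalar call sites compile unchanged while complex call sites select the new overload.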

gcc/ChangeLog:

	* target.def: Add preferred_simd_mode_complex and
	related_mode_complex by duplicating their scalar counterparts
	* targhooks.h: Add default_preferred_simd_mode_complex and
	default_vectorize_related_mode_complex
	* targhooks.cc (default_preferred_simd_mode_complex): New:
	Default implementation of preferred_simd_mode_complex
	(default_vectorize_related_mode_complex): New: Default
	implementation of related_mode_complex
	* doc/tm.texi: Document
	TARGET_VECTORIZE_PREFERRED_SIMD_MODE_COMPLEX
	and TARGET_VECTORIZE_RELATED_MODE_COMPLEX
	* doc/tm.texi.in: Add TARGET_VECTORIZE_PREFERRED_SIMD_MODE_COMPLEX
	and TARGET_VECTORIZE_RELATED_MODE_COMPLEX
	* emit-rtl.cc (init_emit_once): Add the zero constant for vectors
	of complex modes
	* genmodes.cc (vector_class): Add case for vectors of complex
	(complete_mode): Likewise
	(make_complex_modes): Likewise
	* gensupport.cc (match_pattern): Likewise
	* machmode.h: Add vectors of complex in predicates and redefine
	mode_for_vector and related_vector_mode for complex types
	* mode-classes.def: Add MODE_VECTOR_COMPLEX_INT and
	MODE_VECTOR_COMPLEX_FLOAT classes
	* stor-layout.cc (mode_for_vector): Adapt for complex modes
	using sub-functions calling a common one
	(int_mode_for_mode): Add case for complex vectors
	(related_vector_mode): Implement the function for complex modes
	* tree-vect-generic.cc (type_for_widest_vector_mode): Add cases for
	complex modes
	* tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Adapt for
	complex modes
	* tree.cc (build_vector_type_for_mode): Add cases for complex modes
---
 gcc/doc/tm.texi          | 31 ++++++++++++++++++++++
 gcc/doc/tm.texi.in       |  4 +++
 gcc/emit-rtl.cc          | 10 +++++++
 gcc/genmodes.cc          |  8 ++++++
 gcc/gensupport.cc        |  3 +++
 gcc/machmode.h           | 19 +++++++++++---
 gcc/mode-classes.def     |  2 ++
 gcc/stor-layout.cc       | 45 ++++++++++++++++++++++++++++---
 gcc/target.def           | 39 +++++++++++++++++++++++++++
 gcc/targhooks.cc         | 29 ++++++++++++++++++++
 gcc/targhooks.h          |  4 +++
 gcc/tree-vect-generic.cc |  4 +++
 gcc/tree-vect-stmts.cc   | 57 ++++++++++++++++++++++++++++------------
 gcc/tree.cc              |  2 ++
 14 files changed, 232 insertions(+), 25 deletions(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 1e87f798449..f7a8a5351e2 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6251,6 +6251,13 @@ equal to @code{word_mode}, because the vectorizer can do some
 transformations even in absence of specialized @acronym{SIMD} hardware.
 @end deftypefn
 
+@deftypefn {Target Hook} machine_mode TARGET_VECTORIZE_PREFERRED_SIMD_MODE_COMPLEX (complex_mode @var{mode})
+This hook should return the preferred mode for vectorizing complex
+mode @var{mode}. The default is
+equal to @code{word_mode}, because the vectorizer can do some
+transformations even in absence of specialized @acronym{SIMD} hardware.
+@end deftypefn
+
 @deftypefn {Target Hook} machine_mode TARGET_VECTORIZE_SPLIT_REDUCTION (machine_mode)
 This hook should return the preferred mode to split the final reduction
 step on @var{mode} to. The reduction is then carried out reducing upper
@@ -6313,6 +6320,30 @@ requested mode, returning a mode with the same size as @var{vector_mode}
 when @var{nunits} is zero. This is the correct behavior for most targets.
 @end deftypefn
 
+@deftypefn {Target Hook} opt_machine_mode TARGET_VECTORIZE_RELATED_MODE_COMPLEX (machine_mode @var{vector_mode}, complex_mode @var{element_mode}, poly_uint64 @var{nunits})
+If a piece of code is using vector mode @var{vector_mode} and also wants
+to operate on elements of mode @var{element_mode}, return the vector mode
+it should use for those elements. If @var{nunits} is nonzero, ensure that
+the mode has exactly @var{nunits} elements, otherwise pick whichever vector
+size pairs the most naturally with @var{vector_mode}. Return an empty
+@code{opt_machine_mode} if there is no supported vector mode with the
+required properties.
+
+There is no prescribed way of handling the case in which @var{nunits}
+is zero. One common choice is to pick a vector mode with the same size
+as @var{vector_mode}; this is the natural choice if the target has a
+fixed vector size. Another option is to choose a vector mode with the
+same number of elements as @var{vector_mode}; this is the natural choice
+if the target has a fixed number of elements. Alternatively, the hook
+might choose a middle ground, such as trying to keep the number of
+elements as similar as possible while applying maximum and minimum
+vector sizes.
+
+The default implementation uses @code{mode_for_vector} to find the
+requested mode, returning a mode with the same size as @var{vector_mode}
+when @var{nunits} is zero. This is the correct behavior for most targets.
+@end deftypefn
+
 @deftypefn {Target Hook} opt_machine_mode TARGET_VECTORIZE_GET_MASK_MODE (machine_mode @var{mode})
 Return the mode to use for a vector mask that holds one boolean
 result for each element of vector mode @var{mode}. The returned mask mode
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 27a0b321fe0..68650d70e02 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4197,12 +4197,16 @@ address; but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_VECTORIZE_PREFERRED_SIMD_MODE
 
+@hook TARGET_VECTORIZE_PREFERRED_SIMD_MODE_COMPLEX
+
 @hook TARGET_VECTORIZE_SPLIT_REDUCTION
 
 @hook TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES
 
 @hook TARGET_VECTORIZE_RELATED_MODE
 
+@hook TARGET_VECTORIZE_RELATED_MODE_COMPLEX
+
 @hook TARGET_VECTORIZE_GET_MASK_MODE
 
 @hook TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE
diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index f7c33c4afb1..b0d556e45aa 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -6276,6 +6276,16 @@ init_emit_once (void)
         targetm.gen_rtx_complex (mode, inner, inner);
     }
 
+  FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_COMPLEX_INT)
+    {
+      const_tiny_rtx[0][(int) mode] = gen_const_vector (mode, 0);
+    }
+
+  FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_COMPLEX_FLOAT)
+    {
+      const_tiny_rtx[0][(int) mode] = gen_const_vector (mode, 0);
+    }
+
   FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_BOOL)
     {
       const_tiny_rtx[0][(int) mode] = gen_const_vector (mode, 0);
diff --git a/gcc/genmodes.cc b/gcc/genmodes.cc
index 55ac2adb559..775878b3f7e 100644
--- a/gcc/genmodes.cc
+++ b/gcc/genmodes.cc
@@ -142,6 +142,8 @@ vector_class (enum mode_class cl)
     case MODE_UFRACT: return MODE_VECTOR_UFRACT;
     case MODE_ACCUM: return MODE_VECTOR_ACCUM;
     case MODE_UACCUM: return MODE_VECTOR_UACCUM;
+    case MODE_COMPLEX_INT: return MODE_VECTOR_COMPLEX_INT;
+    case MODE_COMPLEX_FLOAT: return MODE_VECTOR_COMPLEX_FLOAT;
     default:
       error ("no vector class for class %s", mode_class_names[cl]);
       return MODE_RANDOM;
@@ -400,6 +402,8 @@ complete_mode (struct mode_data *m)
     case MODE_VECTOR_UFRACT:
     case MODE_VECTOR_ACCUM:
     case MODE_VECTOR_UACCUM:
+    case MODE_VECTOR_COMPLEX_INT:
+    case MODE_VECTOR_COMPLEX_FLOAT:
       /* Vector modes should have a component and a number of components. */
       validate_mode (m, UNSET, UNSET, SET, SET, UNSET);
       if (m->component->precision != (unsigned int)-1)
@@ -462,6 +466,10 @@ make_complex_modes (enum mode_class cl,
       if (m->boolean)
         continue;
 
+      /* Skip already created mode. */
+      if (m->complex)
+        continue;
+
       m_len = strlen (m->name);
       /* The leading "1 +" is in case we prepend a "C" below. */
       buf = (char *) xmalloc (1 + m_len + 1);
diff --git a/gcc/gensupport.cc b/gcc/gensupport.cc
index 54f7b3cfe81..6be704feee1 100644
--- a/gcc/gensupport.cc
+++ b/gcc/gensupport.cc
@@ -3747,16 +3747,19 @@ match_pattern (optab_pattern *p, const char *name, const char *pat)
                 if (*p == 0
                     && (! force_int || mode_class[i] == MODE_INT
                         || mode_class[i] == MODE_COMPLEX_INT
+                        || mode_class[i] == MODE_VECTOR_COMPLEX_INT
                         || mode_class[i] == MODE_VECTOR_INT)
                     && (! force_partial_int
                         || mode_class[i] == MODE_INT
                         || mode_class[i] == MODE_COMPLEX_INT
+                        || mode_class[i] == MODE_VECTOR_COMPLEX_INT
                         || mode_class[i] == MODE_PARTIAL_INT
                         || mode_class[i] == MODE_VECTOR_INT)
                     && (! force_float
                         || mode_class[i] == MODE_FLOAT
                         || mode_class[i] == MODE_DECIMAL_FLOAT
                         || mode_class[i] == MODE_COMPLEX_FLOAT
+                        || mode_class[i] == MODE_VECTOR_COMPLEX_FLOAT
                         || mode_class[i] == MODE_VECTOR_FLOAT)
                     && (! force_fixed
                         || mode_class[i] == MODE_FRACT
diff --git a/gcc/machmode.h b/gcc/machmode.h
index fd87af7c74a..a32624da863 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -110,6 +110,7 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || GET_MODE_CLASS (MODE) == MODE_PARTIAL_INT \
    || GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_BOOL \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_INT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_INT)
 
 /* Nonzero if MODE is a floating-point mode. */
@@ -117,19 +118,24 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
   (GET_MODE_CLASS (MODE) == MODE_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_DECIMAL_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT)
 
 /* Nonzero if MODE is a complex integer mode. */
-#define COMPLEX_INT_MODE_P(MODE) \
-  (GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT)
+#define COMPLEX_INT_MODE_P(MODE) \
+  (GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_INT \
+   || GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT)
 
 /* Nonzero if MODE is a complex floating-point mode. */
-#define COMPLEX_FLOAT_MODE_P(MODE) \
-  (GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT)
+#define COMPLEX_FLOAT_MODE_P(MODE) \
+  (GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_FLOAT \
+   || GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT)
 
 /* Nonzero if MODE is a complex mode. */
 #define COMPLEX_MODE_P(MODE) \
   (GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_INT \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT)
 
 /* Nonzero if MODE is a vector mode. */
@@ -140,6 +146,8 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_FRACT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_UFRACT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_ACCUM \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_INT \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_UACCUM)
 
 /* Nonzero if MODE is a scalar integral mode. */
@@ -929,6 +937,9 @@ extern opt_machine_mode bitwise_mode_for_mode (machine_mode);
 extern opt_machine_mode mode_for_vector (scalar_mode, poly_uint64);
 extern opt_machine_mode related_vector_mode (machine_mode, scalar_mode,
                                              poly_uint64 = 0);
+extern opt_machine_mode mode_for_vector (complex_mode, poly_uint64);
+extern opt_machine_mode related_vector_mode (machine_mode,
+                                             complex_mode, poly_uint64 = 0);
 extern opt_machine_mode related_int_vector_mode (machine_mode);
 
 /* A class for iterating through possible bitfield modes. */
diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
index de42d7ee6fb..cc6bcaeb026 100644
--- a/gcc/mode-classes.def
+++ b/gcc/mode-classes.def
@@ -32,9 +32,11 @@ along with GCC; see the file COPYING3. If not see
   DEF_MODE_CLASS (MODE_COMPLEX_FLOAT), \
   DEF_MODE_CLASS (MODE_VECTOR_BOOL), /* vectors of single bits */ \
   DEF_MODE_CLASS (MODE_VECTOR_INT), /* SIMD vectors */ \
+  DEF_MODE_CLASS (MODE_VECTOR_COMPLEX_INT), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_FRACT), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_UFRACT), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_ACCUM), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_UACCUM), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_FLOAT), \
+  DEF_MODE_CLASS (MODE_VECTOR_COMPLEX_FLOAT), \
   DEF_MODE_CLASS (MODE_OPAQUE) /* opaque modes */
diff --git a/gcc/stor-layout.cc b/gcc/stor-layout.cc
index ba375fa423c..450067d3562 100644
--- a/gcc/stor-layout.cc
+++ b/gcc/stor-layout.cc
@@ -378,6 +378,8 @@ int_mode_for_mode (machine_mode mode)
 
     case MODE_COMPLEX_INT:
     case MODE_COMPLEX_FLOAT:
+    case MODE_VECTOR_COMPLEX_INT:
+    case MODE_VECTOR_COMPLEX_FLOAT:
     case MODE_FLOAT:
     case MODE_DECIMAL_FLOAT:
     case MODE_FRACT:
@@ -480,8 +482,8 @@ bitwise_type_for_mode (machine_mode mode)
    elements of mode INNERMODE, if one exists. The returned mode can be
    either an integer mode or a vector mode. */
 
-opt_machine_mode
-mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
+static opt_machine_mode
+mode_for_vector (machine_mode innermode, poly_uint64 nunits)
 {
   machine_mode mode;
 
@@ -496,8 +498,14 @@ mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
     mode = MIN_MODE_VECTOR_ACCUM;
   else if (SCALAR_UACCUM_MODE_P (innermode))
     mode = MIN_MODE_VECTOR_UACCUM;
-  else
+  else if (SCALAR_INT_MODE_P (innermode))
     mode = MIN_MODE_VECTOR_INT;
+  else if (COMPLEX_FLOAT_MODE_P (innermode))
+    mode = MIN_MODE_VECTOR_COMPLEX_FLOAT;
+  else if (COMPLEX_INT_MODE_P (innermode))
+    mode = MIN_MODE_VECTOR_COMPLEX_INT;
+  else
+    gcc_unreachable ();
 
   /* Only check the broader vector_mode_supported_any_target_p here.
      We'll filter through target-specific availability and
@@ -511,7 +519,7 @@ mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
   /* For integers, try mapping it to a same-sized scalar mode. */
   if (GET_MODE_CLASS (innermode) == MODE_INT)
     {
-      poly_uint64 nbits = nunits * GET_MODE_BITSIZE (innermode);
+      poly_uint64 nbits = nunits * GET_MODE_BITSIZE (innermode).coeffs[0];
       if (int_mode_for_size (nbits, 0).exists (&mode)
           && have_regs_of_mode[mode])
         return mode;
@@ -520,6 +528,26 @@ mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
   return opt_machine_mode ();
 }
 
+/* Find a mode that is suitable for representing a vector with NUNITS
+   elements of scalar mode INNERMODE, if one exists. The returned mode
+   can be either an integer mode or a vector mode. */
+
+opt_machine_mode
+mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
+{
+  return mode_for_vector (machine_mode (innermode), nunits);
+}
+
+/* Find a mode that is suitable for representing a vector with NUNITS
+   elements of complex mode INNERMODE, if one exists. The returned mode
+   can be either an integer mode or a vector mode. */
+
+opt_machine_mode
+mode_for_vector (complex_mode innermode, poly_uint64 nunits)
+{
+  return mode_for_vector (machine_mode (innermode), nunits);
+}
+
 /* If a piece of code is using vector mode VECTOR_MODE and also wants
    to operate on elements of mode ELEMENT_MODE, return the vector mode
    it should use for those elements. If NUNITS is nonzero, ensure that
@@ -540,6 +568,15 @@ related_vector_mode (machine_mode vector_mode, scalar_mode element_mode,
   return targetm.vectorize.related_mode (vector_mode, element_mode, nunits);
 }
 
+opt_machine_mode
+related_vector_mode (machine_mode vector_mode,
+                     complex_mode element_mode, poly_uint64 nunits)
+{
+  gcc_assert (VECTOR_MODE_P (vector_mode));
+  return targetm.vectorize.related_mode_complex (vector_mode, element_mode,
+                                                  nunits);
+}
+
 /* If a piece of code is using vector mode VECTOR_MODE and also wants
    to operate on integer vectors with the same element size and number
    of elements, return the vector mode it should use. Return an empty
diff --git a/gcc/target.def b/gcc/target.def
index 4eafff1d21b..665e81e9ef1 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1943,6 +1943,18 @@ transformations even in absence of specialized @acronym{SIMD} hardware.",
  (scalar_mode mode),
  default_preferred_simd_mode)
 
+/* Returns the preferred mode for SIMD operations for the specified
+   complex mode. */
+DEFHOOK
+(preferred_simd_mode_complex,
+ "This hook should return the preferred mode for vectorizing complex\n\
+mode @var{mode}. The default is\n\
+equal to @code{word_mode}, because the vectorizer can do some\n\
+transformations even in absence of specialized @acronym{SIMD} hardware.",
+ machine_mode,
+ (complex_mode mode),
+ default_preferred_simd_mode_complex)
+
 /* Returns the preferred mode for splitting SIMD reductions to. */
 DEFHOOK
 (split_reduction,
@@ -2017,6 +2029,33 @@ when @var{nunits} is zero. This is the correct behavior for most targets.",
  (machine_mode vector_mode, scalar_mode element_mode, poly_uint64 nunits),
  default_vectorize_related_mode)
 
+DEFHOOK
+(related_mode_complex,
+ "If a piece of code is using vector mode @var{vector_mode} and also wants\n\
+to operate on elements of mode @var{element_mode}, return the vector mode\n\
+it should use for those elements. If @var{nunits} is nonzero, ensure that\n\
+the mode has exactly @var{nunits} elements, otherwise pick whichever vector\n\
+size pairs the most naturally with @var{vector_mode}. Return an empty\n\
+@code{opt_machine_mode} if there is no supported vector mode with the\n\
+required properties.\n\
+\n\
+There is no prescribed way of handling the case in which @var{nunits}\n\
+is zero. One common choice is to pick a vector mode with the same size\n\
+as @var{vector_mode}; this is the natural choice if the target has a\n\
+fixed vector size. Another option is to choose a vector mode with the\n\
+same number of elements as @var{vector_mode}; this is the natural choice\n\
+if the target has a fixed number of elements. Alternatively, the hook\n\
+might choose a middle ground, such as trying to keep the number of\n\
+elements as similar as possible while applying maximum and minimum\n\
+vector sizes.\n\
+\n\
+The default implementation uses @code{mode_for_vector} to find the\n\
+requested mode, returning a mode with the same size as @var{vector_mode}\n\
+when @var{nunits} is zero. This is the correct behavior for most targets.",
+ opt_machine_mode,
+ (machine_mode vector_mode, complex_mode element_mode, poly_uint64 nunits),
+ default_vectorize_related_mode_complex)
+
 /* Function to get a target mode for a vector mask. */
 DEFHOOK
 (get_mask_mode,
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index d89668cd1ab..463b53e0baa 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -1533,6 +1533,15 @@ default_preferred_simd_mode (scalar_mode)
   return word_mode;
 }
 
+/* By default, only attempt to parallelize bitwise operations, and
+   possibly adds/subtracts using bit-twiddling. */
+
+machine_mode
+default_preferred_simd_mode_complex (complex_mode)
+{
+  return word_mode;
+}
+
 /* By default, call gen_rtx_CONCAT. */
 
 rtx
@@ -1734,6 +1743,26 @@ default_vectorize_related_mode (machine_mode vector_mode,
   return opt_machine_mode ();
 }
 
+
+/* The default implementation of TARGET_VECTORIZE_RELATED_MODE_COMPLEX. */
+
+opt_machine_mode
+default_vectorize_related_mode_complex (machine_mode vector_mode,
+                                        complex_mode element_mode,
+                                        poly_uint64 nunits)
+{
+  machine_mode result_mode;
+  if ((maybe_ne (nunits, 0U)
+       || multiple_p (GET_MODE_SIZE (vector_mode),
+                      GET_MODE_SIZE (element_mode), &nunits))
+      && mode_for_vector (element_mode, nunits).exists (&result_mode)
+      && VECTOR_MODE_P (result_mode)
+      && targetm.vector_mode_supported_p (result_mode))
+    return result_mode;
+
+  return opt_machine_mode ();
+}
+
 /* By default a vector of integers is used as a mask. */
 
 opt_machine_mode
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index f3ae17998de..c7ca6b55e31 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -115,11 +115,15 @@ default_builtin_support_vector_misalignment (machine_mode mode,
                                              const_tree,
                                              int, bool);
 extern machine_mode default_preferred_simd_mode (scalar_mode mode);
+extern machine_mode default_preferred_simd_mode_complex (complex_mode mode);
 extern machine_mode default_split_reduction (machine_mode);
 extern unsigned int default_autovectorize_vector_modes (vector_modes *, bool);
 extern opt_machine_mode default_vectorize_related_mode (machine_mode,
                                                         scalar_mode,
                                                         poly_uint64);
+extern opt_machine_mode default_vectorize_related_mode_complex (machine_mode,
+                                                                complex_mode,
+                                                                poly_uint64);
 extern opt_machine_mode default_get_mask_mode (machine_mode);
 extern bool default_empty_mask_is_expensive (unsigned);
 extern vector_costs *default_vectorize_create_costs (vec_info *, bool);
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index a7e6cb87a5e..718b144ec23 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -1363,6 +1363,10 @@ type_for_widest_vector_mode (tree type, optab op)
     mode = MIN_MODE_VECTOR_ACCUM;
   else if (SCALAR_UACCUM_MODE_P (inner_mode))
     mode = MIN_MODE_VECTOR_UACCUM;
+  else if (COMPLEX_INT_MODE_P (inner_mode))
+    mode = MIN_MODE_VECTOR_COMPLEX_INT;
+  else if (COMPLEX_FLOAT_MODE_P (inner_mode))
+    mode = MIN_MODE_VECTOR_COMPLEX_FLOAT;
   else if (inner_mode == BImode)
     mode = MIN_MODE_VECTOR_BOOL;
   else
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index cd7c1090d88..507cfd04897 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12732,18 +12732,31 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
                                      tree scalar_type, poly_uint64 nunits)
 {
   tree orig_scalar_type = scalar_type;
-  scalar_mode inner_mode;
+  scalar_mode scal_mode;
+  complex_mode cplx_mode;
+  machine_mode inner_mode;
   machine_mode simd_mode;
   tree vectype;
+  bool cplx = false;
 
-  if ((!INTEGRAL_TYPE_P (scalar_type)
+  if (!INTEGRAL_TYPE_P (scalar_type)
        && !POINTER_TYPE_P (scalar_type)
-       && !SCALAR_FLOAT_TYPE_P (scalar_type))
-      || (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
-          && !is_float_mode (TYPE_MODE (scalar_type), &inner_mode)))
+       && !SCALAR_FLOAT_TYPE_P (scalar_type)
+       && !COMPLEX_INTEGER_TYPE_P (scalar_type)
+       && !COMPLEX_FLOAT_TYPE_P (scalar_type))
     return NULL_TREE;
 
-  unsigned int nbytes = GET_MODE_SIZE (inner_mode);
+  if (is_complex_int_mode (TYPE_MODE (scalar_type), &cplx_mode)
+      || is_complex_float_mode (TYPE_MODE (scalar_type), &cplx_mode))
+    cplx = true;
+
+  if (!cplx && !is_int_mode (TYPE_MODE (scalar_type), &scal_mode)
+      && !is_float_mode (TYPE_MODE (scalar_type), &scal_mode))
+    return NULL_TREE;
+
+  unsigned int nbytes =
+    (cplx) ? GET_MODE_SIZE (cplx_mode) : GET_MODE_SIZE (scal_mode);
+  inner_mode = (cplx) ? machine_mode (cplx_mode) : machine_mode (scal_mode);
 
   /* Interoperability between modes requires one to be a constant multiple
      of the other, so that the number of vectors required for each operation
@@ -12761,19 +12774,20 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
      they support the proper result truncation/extension.
      We also make sure to build vector types with INTEGER_TYPE
      component type only. */
-  if (INTEGRAL_TYPE_P (scalar_type)
-      && (GET_MODE_BITSIZE (inner_mode) != TYPE_PRECISION (scalar_type)
+  if (!cplx && INTEGRAL_TYPE_P (scalar_type)
+      && (GET_MODE_BITSIZE (scal_mode) != TYPE_PRECISION (scalar_type)
           || TREE_CODE (scalar_type) != INTEGER_TYPE))
-    scalar_type = build_nonstandard_integer_type (GET_MODE_BITSIZE (inner_mode),
-                                                  TYPE_UNSIGNED (scalar_type));
+    scalar_type =
+      build_nonstandard_integer_type (GET_MODE_BITSIZE (scal_mode),
+                                      TYPE_UNSIGNED (scalar_type));
 
   /* We shouldn't end up building VECTOR_TYPEs of non-scalar components.
      When the component mode passes the above test simply use a type
      corresponding to that mode. The theory is that any use that
      would cause problems with this will disable vectorization anyway. */
-  else if (!SCALAR_FLOAT_TYPE_P (scalar_type)
+  else if (!cplx && !SCALAR_FLOAT_TYPE_P (scalar_type)
            && !INTEGRAL_TYPE_P (scalar_type))
-    scalar_type = lang_hooks.types.type_for_mode (inner_mode, 1);
+    scalar_type = lang_hooks.types.type_for_mode (scal_mode, 1);
 
   /* We can't build a vector type of elements with alignment bigger than
      their size. */
@@ -12791,7 +12805,10 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
   if (prevailing_mode == VOIDmode)
     {
       gcc_assert (known_eq (nunits, 0U));
-      simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
+
+      simd_mode = (cplx)
+        ? targetm.vectorize.preferred_simd_mode_complex (cplx_mode)
+        : targetm.vectorize.preferred_simd_mode (scal_mode);
       if (SCALAR_INT_MODE_P (simd_mode))
         {
           /* Traditional behavior is not to take the integer mode
@@ -12802,13 +12819,18 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
              Note that nunits == 1 is allowed in order to support single
              element vector types. */
           if (!multiple_p (GET_MODE_SIZE (simd_mode), nbytes, &nunits)
-              || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
+              || !((cplx)
+                   ? mode_for_vector (cplx_mode, nunits).exists (&simd_mode)
+                   : mode_for_vector (scal_mode, nunits).exists (&simd_mode)))
             return NULL_TREE;
         }
     }
   else if (SCALAR_INT_MODE_P (prevailing_mode)
-           || !related_vector_mode (prevailing_mode,
-                                    inner_mode, nunits).exists (&simd_mode))
+           || !((cplx) ? related_vector_mode (prevailing_mode,
+                                              cplx_mode,
+                                              nunits).exists (&simd_mode) :
+                related_vector_mode (prevailing_mode, scal_mode,
+                                     nunits).exists (&simd_mode)))
     {
       /* Fall back to using mode_for_vector, mostly in the hope of being
          able to use an integer mode. */
@@ -12816,7 +12838,8 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
           && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
         return NULL_TREE;
 
-      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
+      if (!((cplx) ? mode_for_vector (cplx_mode, nunits).exists (&simd_mode)
+           : mode_for_vector (scal_mode, nunits).exists (&simd_mode)))
         return NULL_TREE;
     }
 
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 73b72f80d25..8dcad519164 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -10170,6 +10170,8 @@ build_vector_type_for_mode (tree innertype, machine_mode mode)
     case MODE_VECTOR_UFRACT:
     case MODE_VECTOR_ACCUM:
     case MODE_VECTOR_UACCUM:
+    case MODE_VECTOR_COMPLEX_INT:
+    case MODE_VECTOR_COMPLEX_FLOAT:
       nunits = GET_MODE_NUNITS (mode);
       break;
 
-- 
2.17.1