From: Richard Biener
To: gcc-patches@gcc.gnu.org
Date: Fri, 12 May 2023 15:04:14 +0200 (CEST)
Subject: [PATCH] tree-optimization/64731 - extend store-from CTOR lowering to TARGET_MEM_REF
Message-Id: <20230512130414.6D78B13466@imap2.suse-dmz.suse.de>

The following also covers TARGET_MEM_REF when decomposing stores from
CTORs to supported elementwise operations.  This avoids spilling and
cleans up after vector lowering, which doesn't touch loads or stores.
It also mimics what we already do for loads.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

	PR tree-optimization/64731
	* tree-ssa-forwprop.cc (pass_forwprop::execute): Also handle
	TARGET_MEM_REF destinations of stores from vector CTORs.
	* gcc.target/i386/pr64731.c: New testcase.
---
 gcc/testsuite/gcc.target/i386/pr64731.c | 14 ++++++++++
 gcc/tree-ssa-forwprop.cc                | 41 +++++++++++++++----------
 2 files changed, 38 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr64731.c

diff --git a/gcc/testsuite/gcc.target/i386/pr64731.c b/gcc/testsuite/gcc.target/i386/pr64731.c
new file mode 100644
index 00000000000..dea5141ad24
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr64731.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-avx" } */
+
+typedef double double4 __attribute__((vector_size(32)));
+
+void fun(double * a, double * b)
+{
+  for (int i = 0; i < 1024; i+=4)
+    *(double4*)&a[i] += *(double4 *)&b[i];
+}
+
+/* We don't want to spill but have both loads and stores lowered
+   to supported SSE operations.  */
+/* { dg-final { scan-assembler-not "movap\[sd\].*\[er\]sp" } } */
diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 9dc67b5309c..e63d2ab82c9 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -3236,6 +3236,26 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi)
   return true;
 }
 
+/* Prepare a TARGET_MEM_REF ref so that it can be subsetted as
+   lvalue.  This splits out an address computation stmt before *GSI
+   and returns a MEM_REF wrapping the address.  */
+
+static tree
+prepare_target_mem_ref_lvalue (tree ref, gimple_stmt_iterator *gsi)
+{
+  if (TREE_CODE (TREE_OPERAND (ref, 0)) == ADDR_EXPR)
+    mark_addressable (TREE_OPERAND (TREE_OPERAND (ref, 0), 0));
+  tree ptrtype = build_pointer_type (TREE_TYPE (ref));
+  tree tem = make_ssa_name (ptrtype);
+  gimple *new_stmt
+    = gimple_build_assign (tem, build1 (ADDR_EXPR, TREE_TYPE (tem),
+					unshare_expr (ref)));
+  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+  ref = build2_loc (EXPR_LOCATION (ref),
+		    MEM_REF, TREE_TYPE (ref), tem,
+		    build_int_cst (TREE_TYPE (TREE_OPERAND (ref, 1)), 0));
+  return ref;
+}
+
 /* Rewrite the vector load at *GSI to component-wise loads if the load
    is only used in BIT_FIELD_REF extractions with eventual intermediate
@@ -3317,20 +3337,7 @@ optimize_vector_load (gimple_stmt_iterator *gsi)
      For TARGET_MEM_REFs we have to separate the LEA from the reference.  */
   tree load_rhs = rhs;
   if (TREE_CODE (load_rhs) == TARGET_MEM_REF)
-    {
-      if (TREE_CODE (TREE_OPERAND (load_rhs, 0)) == ADDR_EXPR)
-	mark_addressable (TREE_OPERAND (TREE_OPERAND (load_rhs, 0), 0));
-      tree ptrtype = build_pointer_type (TREE_TYPE (load_rhs));
-      tree tem = make_ssa_name (ptrtype);
-      gimple *new_stmt
-	= gimple_build_assign (tem, build1 (ADDR_EXPR, TREE_TYPE (tem),
-					    unshare_expr (load_rhs)));
-      gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
-      load_rhs = build2_loc (EXPR_LOCATION (load_rhs),
-			     MEM_REF, TREE_TYPE (load_rhs), tem,
-			     build_int_cst
-			       (TREE_TYPE (TREE_OPERAND (load_rhs, 1)), 0));
-    }
+    load_rhs = prepare_target_mem_ref_lvalue (load_rhs, gsi);
 
   /* Rewrite the BIT_FIELD_REFs to be actual loads, re-emitting them at
      the place of the original load.  */
@@ -3823,9 +3830,7 @@ pass_forwprop::execute (function *fun)
 	      && gimple_store_p (use_stmt)
 	      && !gimple_has_volatile_ops (use_stmt)
 	      && !stmt_can_throw_internal (fun, use_stmt)
-	      && is_gimple_assign (use_stmt)
-	      && (TREE_CODE (gimple_assign_lhs (use_stmt))
-		  != TARGET_MEM_REF))
+	      && is_gimple_assign (use_stmt))
 	    {
 	      tree elt_t = TREE_TYPE (CONSTRUCTOR_ELT (rhs, 0)->value);
 	      unsigned HOST_WIDE_INT elt_w
@@ -3835,6 +3840,8 @@ pass_forwprop::execute (function *fun)
 	      tree use_lhs = gimple_assign_lhs (use_stmt);
 	      if (auto_var_p (use_lhs))
 		DECL_NOT_GIMPLE_REG_P (use_lhs) = 1;
+	      else if (TREE_CODE (use_lhs) == TARGET_MEM_REF)
+		use_lhs = prepare_target_mem_ref_lvalue (use_lhs, &gsi);
 	      for (unsigned HOST_WIDE_INT bi = 0; bi < n; bi += elt_w)
 		{
 		  unsigned HOST_WIDE_INT ci = bi / elt_w;
-- 
2.35.3