From: Jakub Jelinek
To: Richard Biener, Jeff Law, Eric Botcazou
Cc: Renlin Li, gcc-patches@gcc.gnu.org
Subject: [PATCH] Move PR84877 fix elsewhere (PR bootstrap/88450)
Date: Wed, 09 Jan 2019 23:06:00 -0000
Message-ID: <20190109230601.GR30353@tucnak>

Hi!

The r266345 change has been breaking mingw bootstraps for quite some time.
The problem is that the dynamic realignment is IMHO done at way too low a
level.  The biggest problem is that when assign_stack_local_1 decides to do
this dynamic_align_addr, it emits, wherever code is currently being expanded,
about 3 instructions to do something like
  new_base = (base + (align - 1)) / align * align
and returns a MEM with that new_base as the XEXP (mem, 0).  Unfortunately,
one of the common callers of assign_stack_local_1,
assign_stack_temp_from_type, does temporary stack slot management and reuses
already assigned slots somewhere else if they are suitable.  That of course
doesn't work at all if the instructions that set new_base don't dominate one
or more of those other uses.

Another problem is that in way too many cases we choose BIGGEST_ALIGNMENT
for stack slots even when it is not strictly needed.  E.g. any BLKmode stack
slot requests BIGGEST_ALIGNMENT even if TYPE_ALIGN is much smaller, and
assign_stack_local_1 even asserts that for BLKmode the alignment must be
BIGGEST_ALIGNMENT.  On mingw with -mavx or -mavx512f, BIGGEST_ALIGNMENT is
256 or 512 bits, but MAX_SUPPORTED_STACK_ALIGNMENT is 128 bits.  So the
PR84877 change, even if it didn't cause wrong-code issues, causes huge
numbers of stack slots to be unnecessarily overaligned with dynamic
realignment.
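To make that concrete, the emitted sequence just rounds the address up to
the next multiple of the alignment.  A minimal C sketch of the computation
(align_up is a hypothetical helper for illustration only; the real code is
emitted as RTL by align_dynamic_address):

  #include <stdint.h>

  /* Round BASE up to a multiple of the power-of-two ALIGN; for power-of-two
     ALIGN this equals (base + (align - 1)) / align * align.  */
  static uintptr_t
  align_up (uintptr_t base, uintptr_t align)
  {
    return (base + align - 1) & -align;
  }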
The following patch reverts to the previous behavior and moves this dynamic
stack realignment to the caller that needs it, which doesn't do any caching
and where we can do it solely for this overaligned DECL_ALIGN.  If there are
other spots that need this (I wonder about
	else
	  copy = assign_temp (type, 1, 0);
in calls.c), it can be done either by using the variable-sized object case
in the then block, or by using assign_stack_local plus this short
realignment too.

Bootstrapped/regtested on x86_64-linux, i686-linux and powerpc64le-linux,
bootstrapped on powerpc64-linux (regtest still pending there).
Ok for trunk?

2019-01-09  Jakub Jelinek

	PR middle-end/84877
	PR bootstrap/88450
	* function.c (assign_stack_local_1): Revert the 2018-11-21 changes.
	(assign_parm_setup_block): Do the argument slot realignment here
	instead.

--- gcc/function.c.jj	2019-01-09 15:39:26.261153591 +0100
+++ gcc/function.c	2019-01-09 16:17:59.757704567 +0100
@@ -377,7 +377,6 @@ assign_stack_local_1 (machine_mode mode,
   poly_int64 bigend_correction = 0;
   poly_int64 slot_offset = 0, old_frame_offset;
   unsigned int alignment, alignment_in_bits;
-  bool dynamic_align_addr = false;
 
   if (align == 0)
     {
@@ -396,22 +395,14 @@ assign_stack_local_1 (machine_mode mode,
 
   alignment_in_bits = alignment * BITS_PER_UNIT;
 
+  /* Ignore alignment if it exceeds MAX_SUPPORTED_STACK_ALIGNMENT.  */
   if (alignment_in_bits > MAX_SUPPORTED_STACK_ALIGNMENT)
     {
-      /* If the required alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT and
-	 it is not OK to reduce it.  Align the slot dynamically.  */
-      if (mode == BLKmode
-	  && (kind & ASLK_REDUCE_ALIGN) == 0
-	  && currently_expanding_to_rtl)
-	dynamic_align_addr = true;
-      else
-	{
-	  alignment_in_bits = MAX_SUPPORTED_STACK_ALIGNMENT;
-	  alignment = MAX_SUPPORTED_STACK_ALIGNMENT / BITS_PER_UNIT;
-	}
+      alignment_in_bits = MAX_SUPPORTED_STACK_ALIGNMENT;
+      alignment = MAX_SUPPORTED_STACK_ALIGNMENT / BITS_PER_UNIT;
     }
 
-  if (SUPPORTS_STACK_ALIGNMENT && !dynamic_align_addr)
+  if (SUPPORTS_STACK_ALIGNMENT)
     {
       if (crtl->stack_alignment_estimated < alignment_in_bits)
 	{
@@ -441,42 +432,10 @@ assign_stack_local_1 (machine_mode mode,
 	}
     }
 
-  /* Handle overalignment here for parameter copy on the stack.
-     Reserved enough space for it and dynamically align the address.
-     No free frame_space is added here.  */
-  if (dynamic_align_addr)
-    {
-      rtx allocsize = gen_int_mode (size, Pmode);
-      get_dynamic_stack_size (&allocsize, 0, alignment_in_bits, NULL);
-
-      /* This is the size of space needed to accommodate required size of data
-	 with given alignment.  */
-      poly_int64 len = rtx_to_poly_int64 (allocsize);
-      old_frame_offset = frame_offset;
-
-      if (FRAME_GROWS_DOWNWARD)
-	{
-	  frame_offset -= len;
-	  try_fit_stack_local (frame_offset, len, len,
-			       PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT,
-			       &slot_offset);
-	}
-      else
-	{
-	  frame_offset += len;
-	  try_fit_stack_local (old_frame_offset, len, len,
-			       PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT,
-			       &slot_offset);
-	}
-      goto found_space;
-    }
-  else
-    {
-      if (crtl->stack_alignment_needed < alignment_in_bits)
-	crtl->stack_alignment_needed = alignment_in_bits;
-      if (crtl->max_used_stack_slot_alignment < alignment_in_bits)
-	crtl->max_used_stack_slot_alignment = alignment_in_bits;
-    }
+  if (crtl->stack_alignment_needed < alignment_in_bits)
+    crtl->stack_alignment_needed = alignment_in_bits;
+  if (crtl->max_used_stack_slot_alignment < alignment_in_bits)
+    crtl->max_used_stack_slot_alignment = alignment_in_bits;
 
   if (mode != BLKmode || maybe_ne (size, 0))
     {
@@ -563,12 +522,6 @@ assign_stack_local_1 (machine_mode mode,
 			  (slot_offset + bigend_correction,
 			   Pmode));
 
-  if (dynamic_align_addr)
-    {
-      addr = align_dynamic_address (addr, alignment_in_bits);
-      mark_reg_pointer (addr, alignment_in_bits);
-    }
-
   x = gen_rtx_MEM (mode, addr);
   set_mem_align (x, alignment_in_bits);
   MEM_NOTRAP_P (x) = 1;
@@ -2960,8 +2913,21 @@ assign_parm_setup_block (struct assign_p
   if (stack_parm == 0)
     {
       SET_DECL_ALIGN (parm, MAX (DECL_ALIGN (parm), BITS_PER_WORD));
-      stack_parm = assign_stack_local (BLKmode, size_stored,
-				       DECL_ALIGN (parm));
+      if (DECL_ALIGN (parm) > MAX_SUPPORTED_STACK_ALIGNMENT)
+	{
+	  rtx allocsize = gen_int_mode (size, Pmode);
+	  get_dynamic_stack_size (&allocsize, 0, DECL_ALIGN (parm), NULL);
+	  stack_parm = assign_stack_local (BLKmode, UINTVAL (allocsize),
+					   MAX_SUPPORTED_STACK_ALIGNMENT);
+	  rtx addr = align_dynamic_address (XEXP (stack_parm, 0),
+					    DECL_ALIGN (parm));
+	  mark_reg_pointer (addr, DECL_ALIGN (parm));
+	  stack_parm = gen_rtx_MEM (GET_MODE (stack_parm), addr);
+	  MEM_NOTRAP_P (stack_parm) = 1;
+	}
+      else
+	stack_parm = assign_stack_local (BLKmode, size_stored,
+					 DECL_ALIGN (parm));
       if (known_eq (GET_MODE_SIZE (GET_MODE (entry_parm)), size))
 	PUT_MODE (stack_parm, GET_MODE (entry_parm));
       set_mem_attributes (stack_parm, parm, 1);

	Jakub
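For context (not part of the patch): the arguments that take the new
assign_parm_setup_block path are by-value BLKmode parameters whose
DECL_ALIGN exceeds MAX_SUPPORTED_STACK_ALIGNMENT, so the callee has to make
a dynamically realigned stack copy.  A hypothetical illustration (the type
and function names are made up), e.g. on a target where
MAX_SUPPORTED_STACK_ALIGNMENT is 128 bits:

  /* 64-byte (512-bit) alignment exceeds a 128-bit
     MAX_SUPPORTED_STACK_ALIGNMENT, so the copy of the incoming argument is
     allocated at 128-bit alignment and realigned dynamically.  */
  typedef struct { int x[16]; } __attribute__((aligned (64))) overaligned;

  int
  foo (overaligned a)
  {
    return a.x[0];
  }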