From: Richard Sandiford
To: Eric Botcazou
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [5/8] Tweak bitfield alignment handling
Date: Sun, 18 Nov 2012 17:36:00 -0000
Message-ID: <87ip92x17v.fsf@talisman.default>
In-Reply-To: <1588464.rhT82gxiiJ@polaris> (Eric Botcazou's message of
 "Tue, 13 Nov 2012 14:49:25 +0100")
References: <87k3u3eybu.fsf@talisman.home> <87y5ijdj00.fsf@talisman.home>
 <1588464.rhT82gxiiJ@polaris>

Eric Botcazou writes:
>> This patch replaces:
>>
>>       /* Stop if the mode is wider than the alignment of the containing
>>          object.
>>
>>          It is tempting to omit the following line unless STRICT_ALIGNMENT
>>          is true.  But that is incorrect, since if the bitfield uses part
>>          of 3 bytes and we use a 4-byte mode, we could get a spurious segv
>>          if the extra 4th byte is past the end of memory.
>>          (Though at least one Unix compiler ignores this problem:
>>          that on the Sequent 386 machine.  */
>>       if (unit > align_)
>>         break;
>>
>> with two checks: one for whether the final byte of the mode is known
>> to be mapped, and one for whether the bitfield is sufficiently aligned.
>> Later patches in the series rely on this in order not to pessimise
>> memory handling.
>>
>> However, as described in the patch, I found that extending this
>> behaviour to get_best_mode affected code generation for x86_64-linux-gnu
>> and powerpc64-linux-gnu, and led to a failure in bb-slp-5.c on both.
>> I therefore chickened out and added the original check back to
>> get_best_mode.
>>
>> I'm certainly interested in any feedback on the comment, but at the
>> same time I'd like this series to be a no-op on targets that keep
>> the traditional .md patterns.  Any change to get_best_mode is probably
>> best done as a separate patch.
>
> I agree with your conservative approach here.
>
>> gcc/
>>      * stor-layout.c (bit_field_mode_iterator::bit_field_mode_iterator):
>>      Set up a default value of bitregion_end_.
>>      (bit_field_mode_iterator::next_mode): Always apply bitregion_end_
>>      check.  Include SLOW_UNALIGNED_ACCESS in the alignment check.
>>      (get_best_mode): Ignore modes that are wider than the alignment.
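
For concreteness, the two checks described above boil down to something
like the following self-contained sketch; the function and parameter
names are illustrative stand-ins rather than the patch's actual
interface, with SLOW_UNALIGNED standing in for the target's
SLOW_UNALIGNED_ACCESS query:

  /* Return nonzero if a mode of UNIT bits starting at bit START is
     usable.  BITREGION_END is the last bit known to be mapped
     (inclusive) and ALIGN is the field's alignment in bits.  */
  static int
  usable_mode_p (unsigned long start, unsigned long unit,
                 unsigned long bitregion_end, unsigned int align,
                 int slow_unaligned)
  {
    /* First check: the final bit of the mode must be known to be
       mapped, so a full-width access cannot run off the end of the
       underlying object.  */
    if (start + unit > bitregion_end + 1)
      return 0;

    /* Second check: the field must be sufficiently aligned for the
       mode, unless the target handles the misalignment cheaply.  */
    if (unit > align && slow_unaligned)
      return 0;

    return 1;
  }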
> Fine with me, modulo:
>
>> @@ -2754,6 +2753,62 @@ get_best_mode (int bitsize, int bitpos,
>>    enum machine_mode widest_mode = VOIDmode;
>>    enum machine_mode mode;
>>    while (iter.next_mode (&mode)
>> +         /* ??? For compatibility, reject modes that are wider than the
>> +            alignment.  This has both advantages and disadvantages.
>
> "For compatibility" is a bit vague (compatibility with what?).  I'd write "For
> historical reasons" or something along these lines.

Yeah, that's better.  Here's what I committed after updating it to apply
on top of the patch for the inclusive bitregion_end_.

Richard


gcc/
        * stor-layout.c (bit_field_mode_iterator::bit_field_mode_iterator):
        Set up a default value of bitregion_end_.
        (bit_field_mode_iterator::next_mode): Always apply bitregion_end_
        check.  Include SLOW_UNALIGNED_ACCESS in the alignment check.
        (get_best_mode): Ignore modes that are wider than the alignment.

Index: gcc/stor-layout.c
===================================================================
--- gcc/stor-layout.c   2012-11-18 17:26:31.159273478 +0000
+++ gcc/stor-layout.c   2012-11-18 17:28:17.481177253 +0000
@@ -2646,6 +2646,13 @@ fixup_unsigned_type (tree type)
     bitregion_end_ (bitregion_end), align_ (MIN (align, BIGGEST_ALIGNMENT)),
     volatilep_ (volatilep), count_ (0)
 {
+  if (!bitregion_end_)
+    {
+      /* We can assume that any aligned chunk of ALIGN_ bits that overlaps
+         the bitfield is mapped and won't trap.  */
+      bitregion_end_ = bitpos + bitsize + align_ - 1;
+      bitregion_end_ -= bitregion_end_ % align_ + 1;
+    }
 }
 
 /* Calls to this function return successively larger modes that can be used
@@ -2676,23 +2683,15 @@ bit_field_mode_iterator::next_mode (enum
       if (count_ > 0 && unit > BITS_PER_WORD)
        break;
 
-      /* Stop if the mode is wider than the alignment of the containing
-        object.
-
-        It is tempting to omit the following line unless STRICT_ALIGNMENT
-        is true.  But that is incorrect, since if the bitfield uses part
-        of 3 bytes and we use a 4-byte mode, we could get a spurious segv
-        if the extra 4th byte is past the end of memory.
-        (Though at least one Unix compiler ignores this problem:
-        that on the Sequent 386 machine.  */
-      if (unit > align_)
-       break;
-
       /* Stop if the mode goes outside the bitregion.  */
       HOST_WIDE_INT start = bitpos_ - (bitpos_ % unit);
       if (bitregion_start_ && start < bitregion_start_)
        break;
-      if (bitregion_end_ && start + unit > bitregion_end_ + 1)
+      if (start + unit > bitregion_end_ + 1)
+       break;
+
+      /* Stop if the mode requires too much alignment.  */
+      if (unit > align_ && SLOW_UNALIGNED_ACCESS (mode_, align_))
        break;
 
       *out_mode = mode_;
@@ -2751,6 +2750,62 @@ get_best_mode (int bitsize, int bitpos,
   enum machine_mode widest_mode = VOIDmode;
   enum machine_mode mode;
   while (iter.next_mode (&mode)
+        /* ??? For historical reasons, reject modes that are wider than
+           the alignment.  This has both advantages and disadvantages.
+           Removing this check means that something like:
+
+             struct s { unsigned int x; unsigned int y; };
+             int f (struct s *s) { return s->x == 0 && s->y == 0; }
+
+           can be implemented using a single load and compare on
+           64-bit machines that have no alignment restrictions.
+           For example, on powerpc64-linux-gnu, we would generate:
+
+                   ld 3,0(3)
+                   cntlzd 3,3
+                   srdi 3,3,6
+                   blr
+
+           rather than:
+
+                   lwz 9,0(3)
+                   cmpwi 7,9,0
+                   bne 7,.L3
+                   lwz 3,4(3)
+                   cntlzw 3,3
+                   srwi 3,3,5
+                   extsw 3,3
+                   blr
+                   .p2align 4,,15
+           .L3:
+                   li 3,0
+                   blr
+
+           However, accessing more than one field can make life harder
+           for the gimple optimizers.  For example, gcc.dg/vect/bb-slp-5.c
+           has a series of unsigned short copies followed by a series of
+           unsigned short comparisons.  With this check, both the copies
+           and comparisons remain 16-bit accesses and FRE is able
+           to eliminate the latter.  Without the check, the comparisons
+           can be done using 2 64-bit operations, which FRE isn't able
+           to handle in the same way.
+
+           Either way, it would probably be worth disabling this check
+           during expand.  One particular example where removing the
+           check would help is the get_best_mode call in store_bit_field.
+           If we are given a memory bitregion of 128 bits that is aligned
+           to a 64-bit boundary, and the bitfield we want to modify is
+           in the second half of the bitregion, this check causes
+           store_bit_field to turn the memory into a 64-bit reference
+           to the _first_ half of the region.  We later use
+           adjust_bitfield_address to get a reference to the correct half,
+           but doing so looks to adjust_bitfield_address as though we are
+           moving past the end of the original object, so it drops the
+           associated MEM_EXPR and MEM_OFFSET.  Removing the check
+           causes store_bit_field to keep a 128-bit memory reference,
+           so that the final bitfield reference still has a MEM_EXPR
+           and MEM_OFFSET.  */
+        && GET_MODE_BITSIZE (mode) <= align
         && (largest_mode == VOIDmode
             || GET_MODE_SIZE (mode) <= GET_MODE_SIZE (largest_mode)))
     {
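
As a footnote, the default that the constructor now sets up for
bitregion_end_ can be seen in isolation with a small self-contained
example; the numbers below are arbitrary and only meant to show the
rounding:

  #include <stdio.h>

  /* Mirror the constructor's default: round bitpos + bitsize up to an
     ALIGN-bit boundary, then step back one bit so the result is an
     inclusive end position.  With bitpos 17, bitsize 10 and align 32
     this prints 31, i.e. the aligned 32-bit chunk that overlaps the
     bitfield is assumed to be mapped.  */
  int
  main (void)
  {
    unsigned long bitpos = 17, bitsize = 10, align = 32;
    unsigned long end = bitpos + bitsize + align - 1;
    end -= end % align + 1;
    printf ("inclusive bitregion_end = %lu\n", end);
    return 0;
  }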