From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 20489 invoked by alias); 6 Feb 2006 10:18:19 -0000
Received: (qmail 20479 invoked by uid 22791); 6 Feb 2006 10:18:18 -0000
X-Spam-Check-By: sourceware.org
Received: from Unknown (HELO mxout5.netvision.net.il) (194.90.9.29)
    by sourceware.org (qpsmtpd/0.31) with ESMTP;
    Mon, 06 Feb 2006 10:18:16 +0000
Received: from [192.168.0.202] ([62.0.88.5])
    by mxout5.netvision.net.il (Sun Java System Messaging Server 6.1
    HotFix 0.11 (built Jan 28 2005))
    with ESMTPA id <0IU9004ZOGM2UN10@mxout5.netvision.net.il>
    for gcc-help@gcc.gnu.org; Mon, 06 Feb 2006 12:18:03 +0200 (IST)
Date: Mon, 06 Feb 2006 10:18:00 -0000
From: Yaro Pollak
Subject: Re: Unaligned access to packed structs on ppc405
In-reply-to: <200602052102.k15L2BD28758@makai.watson.ibm.com>
To: David Edelsohn, gcc-help@gcc.gnu.org
Message-id: <43E72258.708@altair-semi.com>
MIME-version: 1.0
Content-type: text/plain; charset=ISO-8859-1; format=flowed
Content-transfer-encoding: 7BIT
References: <4D87F853B8020F4888896B1507DC0F09026798@mail2.netezza.com>
    <200602052102.k15L2BD28758@makai.watson.ibm.com>
User-Agent: Thunderbird 1.5 (Windows/20051201)
X-IsSubscribed: yes
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-help-owner@gcc.gnu.org
X-SW-Source: 2006-02/txt/msg00049.txt.bz2

What seems odd to me is that accesses to packed structures are
inherently less efficient than accesses to non-packed structures.

In my example, the three lbz instructions that replace a single lwz
require 3 memory accesses instead of 1. That is a penalty of 2 extra
memory accesses over the slow bus. On top of that, there is an extra
penalty when the bit field crosses a byte boundary (as in my example),
because GCC must generate extra code to "or" those bytes together.
That, BTW, in my opinion contradicts what you wrote earlier:

David Edelsohn wrote:
" The lbz has to do with the size and the packed alignment.
With the packed structure, GCC chooses the smallest memory access that
covers the bitfield. Once GCC has chosen bytes, it cannot merge the
loads together. If the structure were not declared packed, GCC would use
wider loads with masking, and then determine that the loads refer to the
same object."

In my case it shouldn't have chosen a byte access, because a single byte
doesn't cover a bitfield that spans a byte boundary.

I don't know whether what GCC does is "Right", and I guess that if it
was implemented in 4.1, somebody decided that it was "Right". But if the
generated code has 3 times the instruction count and 3 times the memory
accesses, for no apparent reason, then I can't see why anyone would want
this behavior. The code produced by 4.0.1 for the same structure, when
it is not accessed through a pointer, is just fine, so why break it like
that?

Something just doesn't seem right, I'm sorry. I think I can summarize it
by saying that if it's less efficient, then there is no justification
for it.

Yaro

David Edelsohn wrote:
>>>>>> John Yates writes:
>
> John> Do I read this correctly? Are you truly saying that two structs
> John> with identical layout will trigger different code sequences just
> John> because one was declared packed?
>
> 	Yes. Why is that strange? attribute packed assigns the smallest
> possible alignment so that the compiler composes the layout of the
> structure or bitfield in the more compact form possible. Even if the
> layout produced is the same, the smaller alignment is carried around with
> the fields and causes the compiler to use more conservative access
> operations.
>
> David
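
For anyone following the thread, the point about identical layout but
different alignment can be checked with a minimal sketch like the one
below. It assumes GCC with C11 enabled. The struct layout and all names
are hypothetical and are not the structure from the original report;
field b is simply chosen so that it crosses a byte boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, for illustration only: b starts at bit 4 and is
   12 bits wide, so it crosses a byte boundary. */
struct plain_fields {
    unsigned a : 4;
    unsigned b : 12;
    unsigned c : 16;
};

/* Same fields, declared packed (GCC extension). */
struct packed_fields {
    unsigned a : 4;
    unsigned b : 12;
    unsigned c : 16;
} __attribute__((packed));

/* Both structs hold 32 bits of bitfields, so their sizes match. */
static_assert(sizeof(struct plain_fields) == sizeof(struct packed_fields),
              "the two declarations are expected to have the same size");

/* The alignment is where the two declarations differ. */
size_t plain_align(void)  { return _Alignof(struct plain_fields); }
size_t packed_align(void) { return _Alignof(struct packed_fields); }

/* Accessors through a pointer, as in the case discussed in the thread.
   Both return the same value; only the assumed alignment of *p differs,
   which is what the compiler may take into account when picking loads. */
unsigned read_b_plain(const struct plain_fields *p)   { return p->b; }
unsigned read_b_packed(const struct packed_fields *p) { return p->b; }
```

Compiling the two accessors with gcc -O2 -S for the target in question
lets you compare the load sequences side by side. I have not reproduced
the ppc405 output here, so treat this only as a way to run the
comparison yourself.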