From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 69625 invoked by alias); 12 Dec 2017 20:12:35 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 69614 invoked by uid 89); 12 Dec 2017 20:12:34 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=aaaa X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 12 Dec 2017 20:12:33 +0000 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id BF2FE800A4; Tue, 12 Dec 2017 20:12:31 +0000 (UTC) Received: from [127.0.0.1] (ovpn04.gateway.prod.ext.ams2.redhat.com [10.39.146.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3BF0D620C5; Tue, 12 Dec 2017 20:12:19 +0000 (UTC) Subject: Re: [PATCH v3 2/2] Implement pahole-like 'ptype /o' option To: Sergio Durigan Junior References: <20171121160709.23248-1-sergiodj@redhat.com> <20171211195757.18044-1-sergiodj@redhat.com> <20171211195757.18044-3-sergiodj@redhat.com> <2b1ed21e-2d41-19d2-a0cf-3b4780cf9060@ericsson.com> <87lgi8c0g6.fsf@redhat.com> <87indczcw9.fsf@redhat.com> <87lgi7x00t.fsf@redhat.com> Cc: Simon Marchi , Simon Marchi , GDB Patches , Tom Tromey , Eli Zaretskii From: Pedro Alves Message-ID: Date: Tue, 12 Dec 2017 20:12:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0 MIME-Version: 1.0 In-Reply-To: <87lgi7x00t.fsf@redhat.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-SW-Source: 2017-12/txt/msg00283.txt.bz2 On 12/12/2017 06:40 PM, Sergio Durigan Junior wrote: > On Tuesday, December 12 2017, Pedro Alves wrote: > >> On 12/12/2017 06:19 AM, Sergio Durigan Junior wrote: >> >>> Apparently the patch can't handle bitfields very well. I've found a few >>> cases where the bitfield handling gets confused, printing wrong >>> offsets/sizes/holes. Bitfields can be extremely complex to deal with >>> when it comes to offsets... >>> >>> I spent hours trying to improve the patch, managed to make some >>> progress, but there are still corner cases to fix. For example, the >>> patch doesn't deal well with this case: >>> >>> struct aa { >>> /* 0 | 1 */ char aa; >>> /* 1: 1 | 1 */ unsigned char a : 7; >>> /* 1:15 | 4 */ int b : 10; >>> } /* total size: 4 bytes */ >>> >>> In this case, the bitfield "b" would be combined with the previous >>> bitfield "a", like pahole reports: >>> >>> struct aa { >>> char aa; /* 0 1 */ >>> unsigned char a:7; /* 1: 1 1 */ >>> >>> /* Bitfield combined with previous fields */ >>> >>> int b:10; /* 0: 7 4 */ >>> } >>> >>> Confusing... I'm not sure why pahole reports b's offset to be 0. >> >> 0 seems right to me. The bitfield's type is int, with size 4, >> and it lives at byte offset 0: >> >> <2><53>: Abbrev Number: 5 (DW_TAG_member) >> <54> DW_AT_name : (indirect string, offset: 0x4ae): b >> <58> DW_AT_decl_file : 1 >> <59> DW_AT_decl_line : 7 >> <5a> DW_AT_type : <0x71> >> <5e> DW_AT_byte_size : 4 >> <5f> DW_AT_bit_size : 10 >> <60> DW_AT_bit_offset : 7 >> <61> DW_AT_data_member_location: 0 <<< >> >> Replacing the 0 with 1, like: >> >> int b:10; /* 1: 7 4 */ >> >> would look incorrect to me, because that'd make '(char*)&aa + 1 + 4' >> (address of containing object + byte offset + byte size) overshoot the >> size of the containing object. > > OK. The GDB patch is printing "1:15" because the offset is calculated > as: > > (bitpos + offset_bitpos) / TARGET_CHAR_BIT > > In this particular case, offset_bitpos is zero because we're not inside > an inner struct, so we don't care about it. bitpos is TYPE_FIELD_BITPOS > (type, field_idx), which is 15 in this case, because sizeof (aa.aa) + > bitsof (aa.a) = 15. Well, actually it's 15 because that's the bit number of the LSB of that field, and x86 is a !gdbarch_bits_big_endian machine. Because we have: <2><53>: Abbrev Number: 4 (DW_TAG_member) <54> DW_AT_name : b <56> DW_AT_decl_file : 1 <57> DW_AT_decl_line : 7 <58> DW_AT_type : <0x6f> <5c> DW_AT_byte_size : 4 <5d> DW_AT_bit_size : 10 <5e> DW_AT_bit_offset : 7 <5f> DW_AT_data_member_location: 0 and with that, we reach here: static void dwarf2_add_field (struct field_info *fip, struct die_info *die, struct dwarf2_cu *cu) { ... attr = dwarf2_attr (die, DW_AT_bit_offset, cu); if (attr) { if (gdbarch_bits_big_endian (gdbarch)) { ... } else { ... attr = dwarf2_attr (die, DW_AT_byte_size, cu); if (attr) { /* The size of the anonymous object containing the bit field is explicit, so use the indicated size (in bytes). */ anonymous_size = DW_UNSND (attr); } else { ... } SET_FIELD_BITPOS (*fp, (FIELD_BITPOS (*fp) + anonymous_size * bits_per_byte - bit_offset - FIELD_BITSIZE (*fp))); } } which gives us: bitpos = DW_AT_byte_size * 8 - DW_AT_bit_offset - DW_AT_bit_size = 4 * 8 - 7 - 10 = 15 and since x86 is a !gdbarch_bits_big_endian machine, 15 goes here: byte: 0 1 2 3 bits: 7-0 15-8 23-16 31-24 ^ which is what we see if we write a "1" to aa.b: (gdb) p aa.b = 1 $1 = 1 (gdb) x /4xb &aa 0x60103c : 0x00 0x80 0x00 0x00 ^^^^ > When we divide it by TARGET_CHAR_BIT, we get 1. It > seems using TYPE_FIELD_BITPOS is not reliable when dealing with > bitfields. I noticed: (gdb) p /x & aa.aa $1 = 0x601040 (gdb) p /x & aa.a $2 = 0x601041 (gdb) p /x & aa.b $3 = 0x601040 (gdb) Ignoring the fact that taking the address of bitfields is not valid C/C++ [ :-) ], it seems like the addresses above match what we'd expect the byte offset to be. At least for this example. So maybe what you need is to factor out or duplicate the logic used by the related value printing code. Here in value_primitive_field: if (TYPE_FIELD_BITSIZE (arg_type, fieldno)) { /* Handle packed fields. ... LONGEST bitpos = TYPE_FIELD_BITPOS (arg_type, fieldno); LONGEST container_bitsize = TYPE_LENGTH (type) * 8; v = allocate_value_lazy (type); v->bitsize = TYPE_FIELD_BITSIZE (arg_type, fieldno); if ((bitpos % container_bitsize) + v->bitsize <= container_bitsize && TYPE_LENGTH (type) <= (int) sizeof (LONGEST)) v->bitpos = bitpos % container_bitsize; else v->bitpos = bitpos % 8; v->offset = (value_embedded_offset (arg1) + offset + (bitpos - v->bitpos) / 8); } Above "v->offset" may be the byte offset that you're after. > > So there is something I'm not accounting here. > >> What is that number after ":" in bitfields supposed to mean in >> pahole's output (and I assume that that's what you're trying >> to emulate)? We're missing documentation for that. > > Yeah. I tried to find documentation on the output, but couldn't. > Yesterday I was reading pahole's code. From what I understand, it is > the number of bits left in the object. > >> It seems like it's supposed to mean the number of bits left in the >> containing anonymous object (i.e., in the 4 bytes of the declared >> int)? Then "0:7" seems right, given: >> >> sizeof (int) * 8 - bitsof(aa.a) - bitsof(aa.a) - bits(aa.b) >> => sizeof (int) * 8 - 7 - 10 >> => 7 > > I guess you meant: > > sizeof (int) * 8 - bitsof (aa.a) - bitsof(aa.b) No, because that's be => 32 - 7 - 10 => 15 I meant: sizeof (int) * 8 - bitsof(aa.aa) - bitsof(aa.a) - bits(aa.b) => sizeof (int) * 8 - 8 - 7 - 10 => 7 which gives us the 7 unused bits that pahole reports. > > Right? > > But yeah, it makes sense when you consider that the first bitfield got > "promoted" from "unsigned char" to "int". I haven't figured out a way > to keep track of these type changes in order to calculate the right > offsets, remaining space and holes. Do we really need to track type changes, or do we simply need to detect that the byte offset moved in the negative direction? I.e., here: struct S { char aa; /* 0 1 */ unsigned char a:7; /* 1: 1 1 */ /* Bitfield combined with previous fields */ int b:10; /* 0: 7 4 */ }; The byte offset went "0 -> 1 -> 0", and the "1 -> 0" would mean that the bitfield was "combined"? Thanks, Pedro Alves