Re: [RFC][PATCH?] fixed some segfaults and bugs in mdebug support

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Andrew Burgess <aburgess@redhat.com>
To: Zeck S <zeck654321@gmail.com>, gdb-patches@sourceware.org
Cc: Zeck S <zeck654321@gmail.com>
Subject: Re: [RFC][PATCH?] fixed some segfaults and bugs in mdebug support
Date: Mon, 11 Dec 2023 14:03:26 +0000	[thread overview]
Message-ID: <87jzpkg7kh.fsf@redhat.com> (raw)
In-Reply-To: <20231211114252.451602-2-zeck654321@gmail.com>

Zeck S <zeck654321@gmail.com> writes:

> I'm not quite sure what to do.
> I have done some testing using the qemu-mips board I made.
> There are a lot of limitations I'm running into.
> I can just run as much of the testsuite as I can and see what happens.
>
>
> Here are the main problems:
> 1. IDO only supports ansi c. This is a problem for some headers in glibc which use c99.
> 2. Test code sometimes has GCC features in them like asm()
> 3. Test expectations sometimes assume features from nicer formats like DWARF.
>
> I knew I probably couldn't get standard library stuff working.
> Even if I hack past the C99 issues (variadic macros),
> there's also problems
> with compiler specific files like stddef.h.
>
> I started testing by just running gdb.base/info-types-c.exp alone,
> since I knew I fixed bugs in that area, and the test didn't use stdio.h.
> Found that the .c file used asm() which IDO doesn't support.
> Commented that code out, ran the test. Failed.
> Fixed a segfault that had been introduced since I started this project.
> Still failed.
> Fixed some typedef handling to get closer to the expected output.
> However, the expected output includes line numbers for types.
> mdebug format doesn't include line information for types.
> There are also differences in output stemming from
> mdebug assuming the debugger knows size information etc about "basic"
> types, unlike dwarf, which includes information about them.
>
> So at this point, I realize, I've modified the .c file for the test
> which already seems not great, and I realize due to limitations of mdebug,
> I can't ever make the test pass.
> Not sure how to demonstrate this as progress, but uh, it feels like
> progress?

So, I know nothing about targets that make use of ECOFF / mdebug format
stuff, and I don't really have time to try setting up a test
environment.

I guess my expectation for at least some of these fixes would be that
you could write a small sample program that does compile with your
compiler, and then a .exp file which triggers the right GDB commands to
expose the bug.

From what you've said, it doesn't sound like the existing test framework
actually has problems invoking the compiler and running the test, it's
just that the test has too many DWARF/ISO-C assumptions within.

Personally, I'm a huge fan of writing tests, I think it is worth the
effort in the long run, so I think it would be worth your time to try
and write some new tests for these issues.

That said, given the nature of these fixes and the difficulties you're
experiencing, if you don't feel that's possible, I'm not going to oppose
merging these fixes without tests...

... but I do have a couple of style issues that would need fixing first,
see inline below:

>
>
> For the curious, this is the bash script I wrapped IDO with.
> stderr is directed to dev null, because qemu-irix puts some non fatal
> errors there, which confuses the test runner.
> Put the script in a directory in PATH.
> Passing an executable path as CC_FOR_TARGET does not work.
> It must be the name of a file in PATH.
>
> #!/bin/sh
>
> echo "$@" >> theargs
>
> for arg do
>   if [ "$arg" = "-static" ]
>   then
>     link=1
>   fi
> done
>
> if [ "$link" = 1 ]
> then
>   mips-unknown-linux-gnu-gcc "$@"
> else
> for arg do
>   shift
>   [ "$arg" = "-fdiagnostics-color=never" ] && continue
>   [ "$arg" = "-g" ] && continue
>   set -- "$@" "$arg"
> done
>
> /path/to/qemu-irix -L /path/to/ido/ido7.1_compiler /path/to/ido/ido7.1_compiler/usr/bin/cc -Xcpluscomm -nostdlib -nostdinc -mips2 -g2 "$@" 2>/dev/null
> fi
>
>
> ---
>  gdb/mdebugread.c | 89 ++++++++++++++++++++++++++++--------------------
>  1 file changed, 52 insertions(+), 37 deletions(-)
>
> diff --git a/gdb/mdebugread.c b/gdb/mdebugread.c
> index cd6638224e7..191932e91cd 100644
> --- a/gdb/mdebugread.c
> +++ b/gdb/mdebugread.c
> @@ -237,7 +237,7 @@ static struct type *new_type (char *);
>  enum block_type { FUNCTION_BLOCK, NON_FUNCTION_BLOCK };
>  
>  static struct block *new_block (struct objfile *objfile,
> -				enum block_type, enum language);
> +				enum block_type, enum language, bool isGlobal);

GDB style is for snake_case rather than camelCase, so:
s/isGlobal/is_global/ throughout.

>  
>  static struct compunit_symtab *new_symtab (const char *, int, struct objfile *);
>  
> @@ -246,7 +246,7 @@ static struct linetable *new_linetable (int);
>  static struct blockvector *new_bvect (int);
>  
>  static struct type *parse_type (int, union aux_ext *, unsigned int, int *,
> -				int, const char *);
> +				int, const char *, bool);
>  
>  static struct symbol *mylookup_symbol (const char *, const struct block *,
>  				       domain_enum, enum address_class);
> @@ -572,7 +572,7 @@ add_data_symbol (SYMR *sh, union aux_ext *ax, int bigend,
>        || sh->sc == scNil || sh->index == indexNil)
>      s->set_type (builtin_type (objfile)->nodebug_data_symbol);
>    else
> -    s->set_type (parse_type (cur_fd, ax, sh->index, 0, bigend, name));
> +    s->set_type (parse_type (cur_fd, ax, sh->index, 0, bigend, name, false));
>    /* Value of a data symbol is its memory address.  */
>  }
>  
> @@ -705,7 +705,7 @@ parse_symbol (SYMR *sh, union aux_ext *ax, char *ext_sh, int bigend,
>  	  break;
>  	}
>        s->set_value_longest (svalue);
> -      s->set_type (parse_type (cur_fd, ax, sh->index, 0, bigend, name));
> +      s->set_type (parse_type (cur_fd, ax, sh->index, 0, bigend, name, false));
>        add_symbol (s, top_stack->cur_st, top_stack->cur_block);
>        break;
>  
> @@ -761,7 +761,7 @@ parse_symbol (SYMR *sh, union aux_ext *ax, char *ext_sh, int bigend,
>  	t = builtin_type (objfile)->builtin_int;
>        else
>  	{
> -	  t = parse_type (cur_fd, ax, sh->index + 1, 0, bigend, name);
> +	  t = parse_type (cur_fd, ax, sh->index + 1, 0, bigend, name, false);
>  	  if (strcmp (name, "malloc") == 0
>  	      && t->code () == TYPE_CODE_VOID)
>  	    {
> @@ -805,7 +805,7 @@ parse_symbol (SYMR *sh, union aux_ext *ax, char *ext_sh, int bigend,
>  	s->type ()->set_is_prototyped (true);
>  
>        /* Create and enter a new lexical context.  */
> -      b = new_block (objfile, FUNCTION_BLOCK, s->language ());
> +      b = new_block (objfile, FUNCTION_BLOCK, s->language (), false);
>        s->set_value_block (b);
>        b->set_function (s);
>        b->set_start (sh->value);
> @@ -1135,7 +1135,7 @@ parse_symbol (SYMR *sh, union aux_ext *ax, char *ext_sh, int bigend,
>  	}
>  
>        top_stack->blocktype = stBlock;
> -      b = new_block (objfile, NON_FUNCTION_BLOCK, psymtab_language);
> +      b = new_block (objfile, NON_FUNCTION_BLOCK, psymtab_language, false);
>        b->set_start (sh->value + top_stack->procadr);
>        b->set_superblock (top_stack->cur_block);
>        top_stack->cur_block = b;
> @@ -1247,7 +1247,7 @@ parse_symbol (SYMR *sh, union aux_ext *ax, char *ext_sh, int bigend,
>  	f->set_loc_bitpos (sh->value);
>  	bitsize = 0;
>  	f->set_type (parse_type (cur_fd, ax, sh->index, &bitsize, bigend,
> -				 name));
> +				 name, false));
>  	f->set_bitsize (bitsize);
>        }
>        break;
> @@ -1269,7 +1269,7 @@ parse_symbol (SYMR *sh, union aux_ext *ax, char *ext_sh, int bigend,
>        pend = is_pending_symbol (cur_fdr, ext_sh);
>        if (pend == NULL)
>  	{
> -	  t = parse_type (cur_fd, ax, sh->index, NULL, bigend, name);
> +	  t = parse_type (cur_fd, ax, sh->index, NULL, bigend, name, true);
>  	  add_pending (cur_fdr, ext_sh, t);
>  	}
>        else
> @@ -1382,7 +1382,7 @@ basic_type (int bt, struct objfile *objfile)
>    if (map_bt[bt])
>      return map_bt[bt];
>  
> -  type_allocator alloc (objfile, get_current_subfile ()->language);
> +  type_allocator alloc (objfile, psymtab_language);

With fixes like this, it can (I think) be useful to mention the commit
that is being fixed, in this case 76fc0f62138e.  Wouldn't be much, I'd
just say:

  Commit 76fc0f62138e added calls to 'get_current_subfile ()->language',
  but at the time this was known to be a bit of a guess, and untested.
  We should actually have used 'psymtab_language' because ....

Then, if in the future, someone looks at your patch and wonders why was
this change made?  There's a super quick way that can find the history
of this code, and understand the original change, and your update.

>  
>    switch (bt)
>      {
> @@ -1514,7 +1514,7 @@ basic_type (int bt, struct objfile *objfile)
>  
>  static struct type *
>  parse_type (int fd, union aux_ext *ax, unsigned int aux_index, int *bs,
> -	    int bigend, const char *sym_name)
> +	    int bigend, const char *sym_name, bool isStTypedef)

Switch to snake_case for isStTypedef.

>  {
>    TIR t[1];
>    struct type *tp = 0;
> @@ -1571,7 +1571,7 @@ parse_type (int fd, union aux_ext *ax, unsigned int aux_index, int *bs,
>  	}
>      }
>  
> -  type_allocator alloc (mdebugread_objfile, get_current_subfile ()->language);
> +  type_allocator alloc (mdebugread_objfile, psymtab_language);
>  
>    /* Move on to next aux.  */
>    ax++;
> @@ -1628,7 +1628,7 @@ parse_type (int fd, union aux_ext *ax, unsigned int aux_index, int *bs,
>        xref_fh = get_rfd (fd, rf);
>        xref_fd = xref_fh - debug_info->fdr;
>        tp = parse_type (xref_fd, debug_info->external_aux + xref_fh->iauxBase,
> -		    rn->index, NULL, xref_fh->fBigendian, sym_name);
> +		    rn->index, NULL, xref_fh->fBigendian, sym_name, false);
>      }
>  
>    /* All these types really point to some (common) MIPS type
> @@ -1785,6 +1785,13 @@ parse_type (int fd, union aux_ext *ax, unsigned int aux_index, int *bs,
>    if (t->continued)
>      complaint (_("illegal TIR continued for %s"), sym_name);
>  
> +  if (isStTypedef)
> +  {
> +	struct type *wrap = alloc.new_type (TYPE_CODE_TYPEDEF, 0, sym_name);
> +	wrap->set_target_type(tp);
> +	tp = wrap;
> +  }

GNU style indents the '{' and '}', this should look like:

  if (isStTypedef)
    {
      struct type *wrap = alloc.new_type (TYPE_CODE_TYPEDEF, 0, sym_name);
      wrap->set_target_type(tp);
      tp = wrap;
    }

> +
>    return tp;
>  }
>  
> @@ -1839,7 +1846,7 @@ upgrade_type (int fd, struct type **tpp, int tq, union aux_ext *ax, int bigend,
>  
>        indx = parse_type (fh - debug_info->fdr,
>  			 debug_info->external_aux + fh->iauxBase,
> -			 id, NULL, bigend, sym_name);
> +			 id, NULL, bigend, sym_name, false);
>  
>        /* The bounds type should be an integer type, but might be anything
>  	 else due to corrupt aux entries.  */
> @@ -2154,8 +2161,7 @@ parse_external (EXTR *es, int bigend, const section_offsets &section_offsets,
>     with that and do not need to reorder our linetables.  */
>  
>  static void
> -parse_lines (FDR *fh, PDR *pr, struct linetable *lt, int maxlines,
> -	     CORE_ADDR lowest_pdr_addr)
> +parse_lines (FDR *fh, PDR *pr, struct linetable *lt, int maxlines)

As above, this seems to be reverting parts of 1acc9dca423f78e4, it would
be great to more of an explanation for why that commit was wrong.

Remember, someone thought that commit was correct once.  You're looking
at their code and thinking they got it wrong.  What's to stop someone
looking at your code and thinking _you_ got it wrong?

Hopefully, what stops that is that you add more details in the commit
message, along with a link to the previous commit.

>  {
>    unsigned char *base;
>    int j, k;
> @@ -2169,7 +2175,6 @@ parse_lines (FDR *fh, PDR *pr, struct linetable *lt, int maxlines,
>    for (j = 0; j < fh->cpd; j++, pr++)
>      {
>        CORE_ADDR l;
> -      CORE_ADDR adr;
>        unsigned char *halt;
>  
>        /* No code for this one.  */
> @@ -2186,9 +2191,7 @@ parse_lines (FDR *fh, PDR *pr, struct linetable *lt, int maxlines,
>  	halt = base + fh->cbLine;
>        base += pr->cbLineOffset;
>  
> -      adr = pr->adr - lowest_pdr_addr;
> -
> -      l = adr >> 2;		/* in words */
> +      l = pr->adr >> 2;		/* in words */
>        for (lineno = pr->lnLow; base < halt;)
>  	{
>  	  count = *base & 0x0f;
> @@ -4053,6 +4056,14 @@ mdebug_expand_psymtab (legacy_psymtab *pst, struct objfile *objfile)
>  	    {
>  	      (*swap_pdr_in) (cur_bfd, pdr_ptr, pdr_in);
>  
> +	      char *sym_ptr = (char *)(debug_info->external_sym) +
> +	        (fh->isymBase + pdr_in->isym) * external_sym_size;

The layout seems wrong here.  There should for sure be a single space
after '(char *)', and the '+' should be at the start of the new line
rather than the end of the previous line.

Additionally, we usually wrap expressions like this within '( ... )' to
force editors to align things better.  Turns out that this pattern of
code is repeated lots in this file.  Search for: '* external_sym_size'
and you'll find lots of example for how to align this code.  I'd go
with:

	      char *sym_ptr = ((char *) debug_info->external_sym
			       + (fh->isymBase + pdr_in->isym)
			       * external_sym_size);

maybe?

> +
> +	      SYMR sh;
> +	      (*swap_sym_in) (cur_bfd, sym_ptr, &sh);
> +
> +	      pdr_in->adr = sh.value;
> +
>  	      /* Determine lowest PDR address, the PDRs are not always
>  		 sorted.  */
>  	      if (pdr_in == pr_block.data ())
> @@ -4154,16 +4165,16 @@ mdebug_expand_psymtab (legacy_psymtab *pst, struct objfile *objfile)
>  		{
>  		  (*swap_pdr_in) (cur_bfd, pdr_ptr, pdr_in);
>  
> -		  /* Determine lowest PDR address, the PDRs are not always
> -		     sorted.  */
> -		  if (pdr_in == pr_block.data ())
> -		    lowest_pdr_addr = pdr_in->adr;
> -		  else if (pdr_in->adr < lowest_pdr_addr)
> -		    lowest_pdr_addr = pdr_in->adr;
> +                  sym_ptr = (char *)(debug_info->external_sym) +
> +                    (fh->isymBase + pdr_in->isym) * external_sym_size;

Similar layout issues to above.

> +
> +                  SYMR sh;
> +                  (*swap_sym_in) (cur_bfd, sym_ptr, &sh);
> +
> +                  pdr_in->adr = sh.value;

These lines are all indented using only spaces.  GDB (unfortunately)
uses a hybrid tab and space approach for indentation.  If you 'git diff'
and look at this file, git _should_ highlight the whitespace issues for
you -- we have a .gitattributes file in the binutils-gdb root that asks
for whitespace issues to be highlighted.

>  		}
>  
> -	      parse_lines (fh, pr_block.data (), lines, maxlines,
> -			   lowest_pdr_addr);
> +	      parse_lines (fh, pr_block.data (), lines, maxlines);
>  	      if (lines->nitems < fh->cline)
>  		lines = shrink_linetable (lines);
>  
> @@ -4291,7 +4302,7 @@ cross_ref (int fd, union aux_ext *ax, struct type **tpp,
>        rf = rn->rfd;
>      }
>  
> -  type_allocator alloc (mdebugread_objfile, get_current_subfile ()->language);
> +  type_allocator alloc (mdebugread_objfile, psymtab_language);
>  
>    /* mips cc uses a rf of -1 for opaque struct definitions.
>       Set TYPE_STUB for these types so that check_typedef will
> @@ -4412,7 +4423,8 @@ cross_ref (int fd, union aux_ext *ax, struct type **tpp,
>  				 sh.index,
>  				 NULL,
>  				 fh->fBigendian,
> -				 debug_info->ss + fh->issBase + sh.iss);
> +				 debug_info->ss + fh->issBase + sh.iss,
> +				 sh.st == stTypedef);
>  	      add_pending (fh, esh, *tpp);
>  	      break;
>  
> @@ -4438,7 +4450,8 @@ cross_ref (int fd, union aux_ext *ax, struct type **tpp,
>  			     sh.index,
>  			     NULL,
>  			     fh->fBigendian,
> -			     debug_info->ss + fh->issBase + sh.iss);
> +			     debug_info->ss + fh->issBase + sh.iss,
> +			     true);
>  	}
>        else
>  	{
> @@ -4542,6 +4555,7 @@ add_line (struct linetable *lt, int lineno, CORE_ADDR adr, int last)
>      return lineno;
>  
>    lt->item[lt->nitems].line = lineno;
> +  lt->item[lt->nitems].is_stmt = 1;
>    lt->item[lt->nitems++].set_unrelocated_pc (unrelocated_addr (adr << 2));
>    return lineno;
>  }
> @@ -4634,9 +4648,10 @@ new_symtab (const char *name, int maxlines, struct objfile *objfile)
>  
>    /* All symtabs must have at least two blocks.  */
>    bv = new_bvect (2);
> -  bv->set_block (GLOBAL_BLOCK, new_block (objfile, NON_FUNCTION_BLOCK, lang));
> -  bv->set_block (STATIC_BLOCK, new_block (objfile, NON_FUNCTION_BLOCK, lang));
> +  bv->set_block (GLOBAL_BLOCK, new_block (objfile, NON_FUNCTION_BLOCK, lang, true));
> +  bv->set_block (STATIC_BLOCK, new_block (objfile, NON_FUNCTION_BLOCK, lang, false));
>    bv->static_block ()->set_superblock (bv->global_block ());
> +  bv->global_block ()->set_compunit_symtab(cust);
>    cust->set_blockvector (bv);
>  
>    cust->set_debugformat ("ECOFF");
> @@ -4723,9 +4738,10 @@ new_bvect (int nblocks)
>  
>  static struct block *
>  new_block (struct objfile *objfile, enum block_type type,
> -	   enum language language)
> +	   enum language language, bool isGlobal)
>  {
> -  struct block *retval = new (&objfile->objfile_obstack) block;
> +  struct block *retval = isGlobal ? new (&objfile->objfile_obstack) global_block
> +    : new (&objfile->objfile_obstack) block;

By the time you've s/isGlobal/is_global/ I think you should format this
as:

  struct block *retval = (is_global
			  ? new (&objfile->objfile_obstack) global_block
			  : new (&objfile->objfile_obstack) block);


Thanks,
Andrew


>  
>    if (type == FUNCTION_BLOCK)
>      retval->set_multidict (mdict_create_linear_expandable (language));
> @@ -4754,8 +4770,7 @@ new_type (char *name)
>  {
>    struct type *t;
>  
> -  t = type_allocator (mdebugread_objfile,
> -		      get_current_subfile ()->language).new_type ();
> +  t = type_allocator (mdebugread_objfile, psymtab_language).new_type ();
>    t->set_name (name);
>    INIT_CPLUS_SPECIFIC (t);
>    return t;
> -- 
> 2.41.0

next prev parent reply	other threads:[~2023-12-11 14:03 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-22  8:13 Zeck S
2023-10-23  9:40 ` Andrew Burgess
2023-10-24  0:25   ` Zeck S
2023-11-11  3:07     ` Zeck S
2023-12-04  3:36       ` Zeck S
2023-12-11 11:42         ` Zeck S
2023-12-11 14:03           ` Andrew Burgess [this message]
2023-12-11 14:48             ` Zeck S
2023-12-15 19:26               ` Tom Tromey
2023-12-18 15:50               ` Andrew Burgess
2023-12-25  5:42                 ` Zeck S
2024-02-07 13:33                   ` [PATCH] mdebug fix Zeck S
2024-02-16  2:45                   ` [RFC][PATCH?] fixed some segfaults and bugs in mdebug support Zeck S
2024-03-13  2:09                     ` Zeck S
2023-12-15 19:27             ` Tom Tromey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87jzpkg7kh.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=zeck654321@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).