public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Jan Beulich <jbeulich@suse.com>
To: Indu Bhagat <indu.bhagat@oracle.com>
Cc: binutils@sourceware.org
Subject: Re: [PATCH, V2 07/10] gas: synthesize CFI for hand-written asm
Date: Thu, 2 Nov 2023 16:53:14 +0100	[thread overview]
Message-ID: <ce890459-be9e-20ad-5769-8fc53db55c04@suse.com> (raw)
In-Reply-To: <20231030165137.2570939-8-indu.bhagat@oracle.com>

On 30.10.2023 17:51, Indu Bhagat wrote:
> --- /dev/null
> +++ b/gas/ginsn.c
> @@ -0,0 +1,1225 @@
> +/* ginsn.h - GAS instruction representation.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +   This file is part of GAS, the GNU Assembler.
> +
> +   GAS is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   GAS is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GAS; see the file COPYING.  If not, write to the Free
> +   Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA
> +   02110-1301, USA.  */
> +
> +#include "as.h"
> +#include "subsegs.h"
> +#include "ginsn.h"
> +#include "scfi.h"
> +
> +#ifdef TARGET_USE_GINSN
> +
> +const char *const ginsn_type_names[] =

This looks like it wants to be static.

> +{
> +#define _GINSN_TYPE_ITEM(NAME, STR) STR,
> +  _GINSN_TYPES
> +#undef _GINSN_TYPE_ITEM
> +};
> +
> +const char *const ginsn_src_type_names[] =

This and ...

> +{
> +#define _GINSN_SRC_TYPE_ITEM(NAME, STR) STR,
> +  _GINSN_SRC_TYPES
> +#undef _GINSN_SRC_TYPE_ITEM
> +};
> +const char *const ginsn_dst_type_names[] =

... this also look so, but then they're also unused inside this CU,
so when making them static the compiler would complain. If they're
needed by a later patch, perhaps they should be moved there? And
if entirely unused, perhaps they should both be deleted (or at least
commented / #ifdef-ed out)?

With the many symbols further down it's also hard to spot whether any
of them should perhaps also be static.

> +{
> +#define _GINSN_DST_TYPE_ITEM(NAME, STR) STR,
> +  _GINSN_DST_TYPES
> +#undef _GINSN_DST_TYPE_ITEM
> +};
> +
> +static
> +ginsnS *ginsn_alloc (void)
> +{
> +  ginsnS *ginsn = XCNEW (ginsnS);
> +  return ginsn;
> +}
> +
> +static ginsnS*
> +ginsn_init (enum ginsn_type type, symbolS *sym, bool real_p)
> +{
> +  ginsnS *ginsn = ginsn_alloc ();
> +  ginsn->type = type;
> +  ginsn->sym = sym;

Is a symbol hanging off of a ginsn ever intended to be altered? If
not, field and function argument would want to be pointer-to-const.

> +  if (real_p)
> +    ginsn->flags |= GINSN_F_INSN_REAL;
> +  return ginsn;
> +}
> +
> +static void
> +ginsn_cleanup (ginsnS **ginsnp)
> +{
> +  ginsnS *ginsn;
> +
> +  if (!ginsnp || !*ginsnp)
> +    return;
> +
> +  ginsn = *ginsnp;
> +  if (ginsn->scfi_ops)
> +    {
> +      scfi_ops_cleanup (ginsn->scfi_ops);
> +      ginsn->scfi_ops = NULL;
> +    }
> +
> +  free (ginsn);
> +  ginsn = NULL;

This doesn't do anything useful. Did you perhaps mean to set *ginsnp to NULL,
and then also ahead of calling free()?

> +}
> +
> +static void
> +ginsn_set_src (struct ginsn_src *src, enum ginsn_src_type type, uint32_t reg,
> +	       int32_t immdisp)

I find the use of fixed-width types suspicious here: For reg, likely it wants
to be unsigned int. For immdisp it's less clear: Immediates can be wider than
32 bits, and they may also be either signed ot unsigned. From an abstract
perspective, assuming the immediate value actually is used for anything, I'd
expect offsetT to be used, following struct expressionS' X_add_number.

> +{
> +  if (!src)
> +    return;
> +
> +  src->type = type;
> +  /* Even when the use-case is SCFI, the value of reg may be > SCFI_NUM_REGS.
> +     E.g., in AMD64, push fs etc.  */
> +  src->reg = reg;
> +
> +  if (type == GINSN_SRC_IMM || type == GINSN_SRC_INDIRECT)
> +    src->immdisp = immdisp;

What's the point of the conditional around the assignment?

> +ginsnS *
> +ginsn_new_add (symbolS *sym, bool real_p,
> +	       enum ginsn_src_type src1_type, uint32_t src1_val, int32_t src1_disp,
> +	       enum ginsn_src_type src2_type, uint32_t src2_val, int32_t src2_disp,
> +	       enum ginsn_dst_type dst_type, uint32_t dst_reg, int32_t dst_disp)
> +{
> +  ginsnS *ginsn = ginsn_init (GINSN_TYPE_ADD, sym, real_p);
> +  /* src info.  */
> +  ginsn_set_src (&ginsn->src[0], src1_type, src1_val, src1_disp);
> +  ginsn_set_src (&ginsn->src[1], src2_type, src2_val, src2_disp);
> +  /* dst info.  */
> +  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
> +
> +  return ginsn;
> +}
> +
> +ginsnS *
> +ginsn_new_and (symbolS *sym, bool real_p,
> +	       enum ginsn_src_type src1_type, uint32_t src1_val, int32_t src1_disp,
> +	       enum ginsn_src_type src2_type, uint32_t src2_val, int32_t src2_disp,
> +	       enum ginsn_dst_type dst_type, uint32_t dst_reg, int32_t dst_disp)
> +{
> +  ginsnS *ginsn = ginsn_init (GINSN_TYPE_AND, sym, real_p);
> +  /* src info.  */
> +  ginsn_set_src (&ginsn->src[0], src1_type, src1_val, src1_disp);
> +  ginsn_set_src (&ginsn->src[1], src2_type, src2_val, src2_disp);
> +  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
> +
> +  return ginsn;
> +}

The comments in the two functions don't look overly useful to me. But if
you have them, please have them be consistent across similar functions.

> +ginsnS *
> +ginsn_new_mov (symbolS *sym, bool real_p,
> +	       enum ginsn_src_type src_type, uint32_t src_reg, int32_t src_disp,
> +	       enum ginsn_dst_type dst_type, uint32_t dst_reg, int32_t dst_disp)
> +{
> +  ginsnS *ginsn = ginsn_init (GINSN_TYPE_MOV, sym, real_p);
> +  /* src info.  */
> +  ginsn_set_src (&ginsn->src[0], src_type, src_reg, src_disp);
> +  /* dst info.  */
> +  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
> +
> +  return ginsn;
> +}

As indicated before, if both src and dst can be indirect here, ...

> +ginsnS *
> +ginsn_new_store (symbolS *sym, bool real_p,
> +		 enum ginsn_src_type src_type, uint32_t src_reg,
> +		 enum ginsn_dst_type dst_type, uint32_t dst_reg, int32_t dst_disp)
> +{
> +  ginsnS *ginsn = ginsn_init (GINSN_TYPE_STORE, sym, real_p);
> +  /* src info.  */
> +  ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0);
> +  /* dst info.  */
> +  gas_assert (dst_type == GINSN_DST_INDIRECT);
> +  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
> +
> +  return ginsn;
> +}
> +
> +ginsnS *
> +ginsn_new_load (symbolS *sym, bool real_p,
> +		enum ginsn_src_type src_type, uint32_t src_reg, int32_t src_disp,
> +		enum ginsn_dst_type dst_type, uint32_t dst_reg)
> +{
> +  ginsnS *ginsn = ginsn_init (GINSN_TYPE_LOAD, sym, real_p);
> +  /* src info.  */
> +  gas_assert (src_type == GINSN_SRC_INDIRECT);
> +  ginsn_set_src (&ginsn->src[0], src_type, src_reg, src_disp);
> +  /* dst info.  */
> +  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0);
> +
> +  return ginsn;
> +}

... I can't see what these are needed for.

> +ginsnS *
> +ginsn_new_sub (symbolS *sym, bool real_p,
> +	       enum ginsn_src_type src1_type, uint32_t src1_val, int32_t src1_disp,
> +	       enum ginsn_src_type src2_type, uint32_t src2_val, int32_t src2_disp,
> +	       enum ginsn_dst_type dst_type, uint32_t dst_reg, int32_t dst_disp)
> +{
> +  ginsnS *ginsn = ginsn_init (GINSN_TYPE_SUB, sym, real_p);
> +  /* src info.  */
> +  ginsn_set_src (&ginsn->src[0], src1_type, src1_val, src1_disp);
> +  ginsn_set_src (&ginsn->src[1], src2_type, src2_val, src2_disp);
> +  /* dst info.  */
> +  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
> +
> +  return ginsn;
> +}
> +
> +/* PS: Note this API does not identify the displacement values of
> +   src1/src2/dst.  At this time, it is unnecessary for correctness to support
> +   the additional argument.  */

And why do the displacements need tracking for add, and, and sub?

> +static bool
> +ginsn_indirect_jump_p (ginsnS *ginsn)
> +{
> +  bool ret_p = false;
> +  if (!ginsn)
> +    return ret_p;
> +
> +  ret_p = (ginsn->type == GINSN_TYPE_JUMP
> +	   && ginsn->src[0].type == GINSN_SRC_REG);

On x86 an indirect jump can also have a memory source operand (i.e. what
you call "indirect").

> +static bool
> +ginsn_direct_local_jump_p (ginsnS *ginsn)
> +{
> +  bool ret_p = false;
> +  if (!ginsn)
> +    return ret_p;
> +
> +  ret_p |= (ginsn->type == GINSN_TYPE_JUMP
> +	    && ginsn->src[0].type == GINSN_SRC_SYMBOL
> +	    && S_IS_LOCAL (ginsn->src[0].sym));

This looks fragile: You may not yet have seen the .globl or .weak directive.

> +static char*
> +ginsn_src_print (struct ginsn_src *src)
> +{
> +  size_t len = 39;
> +  char *src_str = XNEWVEC (char, len);
> +
> +  memset (src_str, 0, len);
> +
> +  if (src->type == GINSN_SRC_REG)
> +    {
> +      char *buf = XNEWVEC (char, 32);
> +      sprintf (buf, "%%r%d, ", ginsn_get_src_reg (src));
> +      strcat (src_str, buf);
> +    }
> +  else if (src->type == GINSN_SRC_IMM)
> +    {
> +      char *buf = XNEWVEC (char, 32);
> +      sprintf (buf, "%d, ", ginsn_get_src_imm (src));

Here and below, if you stick to int32_t as the type, this wants to use
PRId32.

> +static const char*
> +ginsn_type_sym_begin_end_print (ginsnS *ginsn)
> +{
> +  int id = 0;
> +  const char *ginsn_sym_strs[]
> +    = { "", "FUNC_BEGIN", "FUNC_END" };

static and with a 2nd const?

> +static char*
> +ginsn_print (ginsnS *ginsn)
> +{
> +  struct ginsn_src *src;
> +  struct ginsn_dst *dst;
> +  size_t len = GINSN_LISTING_LEN;
> +  char *ginsn_str = XNEWVEC (char, len);
> +
> +  memset (ginsn_str, 0, len);
> +
> +  strcpy (ginsn_str, "ginsn: ");
> +
> +  strcat (ginsn_str, ginsn_type_names[ginsn->type]);
> +  strcat (ginsn_str, " ");
> +
> +  /* For some ginsn types, no further information is printed for now.  */
> +  if (ginsn->type == GINSN_TYPE_CALL
> +      || ginsn->type == GINSN_TYPE_RETURN
> +      || ginsn->type == GINSN_TYPE_OTHER)
> +    goto end;
> +  else if (ginsn->type == GINSN_TYPE_SYMBOL)
> +    {
> +      if (GINSN_F_USER_LABEL_P (ginsn))
> +	strncat (ginsn_str, S_GET_NAME (ginsn->sym), len - 10);
> +      else
> +	strcat (ginsn_str, ginsn_type_sym_begin_end_print (ginsn));
> +      goto end;
> +    }
> +
> +  /* src 1.  */
> +  src = ginsn_get_src1 (ginsn);
> +  strcat (ginsn_str, ginsn_src_print (src));
> +
> +  /* src 2.  */
> +  src = ginsn_get_src2 (ginsn);
> +  strcat (ginsn_str, ginsn_src_print (src));
> +
> +  /* dst.  */
> +  dst = ginsn_get_dst (ginsn);
> +  strcat (ginsn_str, ginsn_dst_print (dst));
> +
> +end:
> +  gas_assert (strlen (ginsn_str) < GINSN_LISTING_LEN);

This check comes too late - if it triggers, you've corrupted memory
already. I wonder anyway why you construct all this "by hand", rather
than using snprintf().

Jan

  parent reply	other threads:[~2023-11-02 15:53 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-30 16:51 [PATCH, V2 00/10] Synthesize " Indu Bhagat
2023-10-30 16:51 ` [PATCH, V2 01/10] gas: dw2gencfi: minor rejig for cfi_sections_set and all_cfi_sections Indu Bhagat
2023-10-30 16:51 ` [PATCH, V2 02/10] gas: dw2gencfi: use all_cfi_sections instead of cfi_sections Indu Bhagat
2023-10-30 16:51 ` [PATCH, V2 03/10] gas: dw2gencfi: expose a new cfi_set_last_fde API Indu Bhagat
2023-10-30 16:51 ` [PATCH, V2 04/10] gas: dw2gencfi: move some tc_* defines to the header file Indu Bhagat
2023-10-30 16:51 ` [PATCH, V2 05/10] gas: add new command line option --scfi[=all,none] Indu Bhagat
2023-10-30 16:51 ` [PATCH, V2 06/10] gas: scfidw2gen: new functionality to prepapre for SCFI Indu Bhagat
2023-10-31 11:28   ` Jan Beulich
2023-10-31 22:06     ` Indu Bhagat
2023-11-02 10:35       ` Jan Beulich
2023-11-02 16:32         ` Indu Bhagat
2023-10-30 16:51 ` [PATCH, V2 07/10] gas: synthesize CFI for hand-written asm Indu Bhagat
2023-10-31 14:10   ` Jan Beulich
2023-11-02  8:15     ` Indu Bhagat
2023-11-02 11:39       ` Jan Beulich
2023-12-10 10:22         ` Indu Bhagat
2023-12-11  7:59           ` Jan Beulich
2023-12-19  7:07             ` Indu Bhagat
2023-11-02 15:53   ` Jan Beulich [this message]
2023-11-04  7:29     ` Indu Bhagat
2023-11-06 11:03       ` Jan Beulich
2023-12-10 10:23         ` Indu Bhagat
2023-12-11  8:02           ` Jan Beulich
2023-10-30 16:51 ` [PATCH, V2 08/10] gas: doc: update documentation for the new listing option Indu Bhagat
2023-10-30 16:51 ` [PATCH, V2 09/10] gas: testsuite: add a x86_64 testsuite for SCFI Indu Bhagat
2023-10-31 16:13   ` Jan Beulich
2023-11-01  6:24     ` Indu Bhagat
2023-11-02 12:28       ` Jan Beulich
2023-11-03  5:45         ` Indu Bhagat
2023-10-30 16:51 ` [PATCH, V2 10/10] gas/NEWS: announce the new command line option Indu Bhagat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ce890459-be9e-20ad-5769-8fc53db55c04@suse.com \
    --to=jbeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=indu.bhagat@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).