public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 02/14] Add D frontend (GDC) implementation.
@ 2018-09-18  0:34 Iain Buclaw
  2018-09-19 20:06 ` Iain Buclaw
  2018-10-15 14:25 ` David Malcolm
  0 siblings, 2 replies; 19+ messages in thread
From: Iain Buclaw @ 2018-09-18  0:34 UTC (permalink / raw)
  To: gcc-patches

This patch adds the D front-end implementation, the only part of the
compiler that interacts with GCC directly, and being the parts that I
maintain, is something that I can talk about more directly.

For the actual code generation pass, that converts the front-end AST
to GCC trees, most parts use a separate Visitor interfaces to do a
certain kind of lowering, for instance, types.cc builds *_TYPE trees
from AST Type's.  The Visitor class is part of the DMD front-end, and
is defined in dfrontend/visitor.h.

There are also a few interfaces which have their headers in the DMD
frontend, but are implemented here because they do something that
requires knowledge of the GCC backend (d-target.cc), does something
that may not be portable, or differ between D compilers
(d-frontend.cc) or are a thin wrapper around something that is managed
by GCC (d-diagnostic.cc).

Many high level operations result in generation of calls to D runtime
library functions (runtime.def), all with require some kind of runtime
type information (typeinfo.cc).  The compiler also generates functions
for registering/deregistering compiled modules with the D runtime
library (modules.cc).

As well as the D language having it's own built-in functions
(intrinsics.cc), we also expose GCC builtins to D code via a
`gcc.builtins' module (d-builtins.cc), and give special treatment to a
number of UDAs that could be applied to functions (d-attribs.cc).


That is roughly the high level jist of how things are currently organized.

ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch

---
 gcc/d/d-attribs.c     |  846 +++++++++++
 gcc/d/d-builtins.cc   | 1162 +++++++++++++++
 gcc/d/d-codegen.cc    | 2730 +++++++++++++++++++++++++++++++++++
 gcc/d/d-convert.cc    |  807 +++++++++++
 gcc/d/d-diagnostic.cc |  347 +++++
 gcc/d/d-frontend.cc   |  534 +++++++
 gcc/d/d-incpath.cc    |  195 +++
 gcc/d/d-lang.cc       | 1847 ++++++++++++++++++++++++
 gcc/d/d-longdouble.cc |  204 +++
 gcc/d/d-spec.c        |  493 +++++++
 gcc/d/d-target-def.h  |   20 +
 gcc/d/d-target.cc     |  502 +++++++
 gcc/d/d-target.def    |   60 +
 gcc/d/d-target.h      |   34 +
 gcc/d/d-tree.def      |   34 +
 gcc/d/d-tree.h        |  669 +++++++++
 gcc/d/decl.cc         | 2177 ++++++++++++++++++++++++++++
 gcc/d/expr.cc         | 3158 +++++++++++++++++++++++++++++++++++++++++
 gcc/d/imports.cc      |  203 +++
 gcc/d/intrinsics.cc   |  895 ++++++++++++
 gcc/d/intrinsics.def  |  154 ++
 gcc/d/lang-specs.h    |   29 +
 gcc/d/lang.opt        |  366 +++++
 gcc/d/longdouble.h    |  136 ++
 gcc/d/modules.cc      |  849 +++++++++++
 gcc/d/runtime.cc      |  315 ++++
 gcc/d/runtime.def     |  224 +++
 gcc/d/toir.cc         | 1438 +++++++++++++++++++
 gcc/d/typeinfo.cc     | 1667 ++++++++++++++++++++++
 gcc/d/types.cc        |  980 +++++++++++++
 gcc/d/verstr.h        |    1 +
 31 files changed, 23076 insertions(+)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-09-18  0:34 [PATCH 02/14] Add D frontend (GDC) implementation Iain Buclaw
@ 2018-09-19 20:06 ` Iain Buclaw
  2018-10-14 19:28   ` Richard Sandiford
  2018-10-16 11:14   ` [PATCH 02/14] Add D frontend (GDC) implementation Richard Sandiford
  2018-10-15 14:25 ` David Malcolm
  1 sibling, 2 replies; 19+ messages in thread
From: Iain Buclaw @ 2018-09-19 20:06 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jeff Law

On 18 September 2018 at 02:33, Iain Buclaw <ibuclaw@gdcproject.org> wrote:
> This patch adds the D front-end implementation, the only part of the
> compiler that interacts with GCC directly, and being the parts that I
> maintain, is something that I can talk about more directly.
>
> For the actual code generation pass, that converts the front-end AST
> to GCC trees, most parts use a separate Visitor interfaces to do a
> certain kind of lowering, for instance, types.cc builds *_TYPE trees
> from AST Type's.  The Visitor class is part of the DMD front-end, and
> is defined in dfrontend/visitor.h.
>
> There are also a few interfaces which have their headers in the DMD
> frontend, but are implemented here because they do something that
> requires knowledge of the GCC backend (d-target.cc), does something
> that may not be portable, or differ between D compilers
> (d-frontend.cc) or are a thin wrapper around something that is managed
> by GCC (d-diagnostic.cc).
>
> Many high level operations result in generation of calls to D runtime
> library functions (runtime.def), all with require some kind of runtime
> type information (typeinfo.cc).  The compiler also generates functions
> for registering/deregistering compiled modules with the D runtime
> library (modules.cc).
>
> As well as the D language having it's own built-in functions
> (intrinsics.cc), we also expose GCC builtins to D code via a
> `gcc.builtins' module (d-builtins.cc), and give special treatment to a
> number of UDAs that could be applied to functions (d-attribs.cc).
>
>
> That is roughly the high level jist of how things are currently organized.
>
> ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch
>

https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00587.html

During the last round, the last comment given was that the reviewer
wasn't going to dive deep into this code, as it's essentially
converting between the different representations and is code that I'd
be maintaining.

As this is code that other gcc maintainers will be potentially looking
after as well, I'd like any glaring problems to be dealt with
immediately.

Iain.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-09-19 20:06 ` Iain Buclaw
@ 2018-10-14 19:28   ` Richard Sandiford
  2018-10-15  7:36     ` Iain Buclaw
  2018-10-16 11:14   ` [PATCH 02/14] Add D frontend (GDC) implementation Richard Sandiford
  1 sibling, 1 reply; 19+ messages in thread
From: Richard Sandiford @ 2018-10-14 19:28 UTC (permalink / raw)
  To: Iain Buclaw; +Cc: gcc-patches, Jeff Law

[Sorry if this turns out to do be a dup]

Iain Buclaw <ibuclaw@gdcproject.org> writes:
> On 18 September 2018 at 02:33, Iain Buclaw <ibuclaw@gdcproject.org> wrote:
>> This patch adds the D front-end implementation, the only part of the
>> compiler that interacts with GCC directly, and being the parts that I
>> maintain, is something that I can talk about more directly.
>>
>> For the actual code generation pass, that converts the front-end AST
>> to GCC trees, most parts use a separate Visitor interfaces to do a
>> certain kind of lowering, for instance, types.cc builds *_TYPE trees
>> from AST Type's.  The Visitor class is part of the DMD front-end, and
>> is defined in dfrontend/visitor.h.
>>
>> There are also a few interfaces which have their headers in the DMD
>> frontend, but are implemented here because they do something that
>> requires knowledge of the GCC backend (d-target.cc), does something
>> that may not be portable, or differ between D compilers
>> (d-frontend.cc) or are a thin wrapper around something that is managed
>> by GCC (d-diagnostic.cc).
>>
>> Many high level operations result in generation of calls to D runtime
>> library functions (runtime.def), all with require some kind of runtime
>> type information (typeinfo.cc).  The compiler also generates functions
>> for registering/deregistering compiled modules with the D runtime
>> library (modules.cc).
>>
>> As well as the D language having it's own built-in functions
>> (intrinsics.cc), we also expose GCC builtins to D code via a
>> `gcc.builtins' module (d-builtins.cc), and give special treatment to a
>> number of UDAs that could be applied to functions (d-attribs.cc).
>>
>>
>> That is roughly the high level jist of how things are currently organized.
>>
>> ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch
>>
>
> https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00587.html
>
> During the last round, the last comment given was that the reviewer
> wasn't going to dive deep into this code, as it's essentially
> converting between the different representations and is code that I'd
> be maintaining.
>
> As this is code that other gcc maintainers will be potentially looking
> after as well, I'd like any glaring problems to be dealt with
> immediately.

I won't claim that I dived deep into the code either, since with a patch
of this size that would take weeks.  But FWIW I read it through trying
to scan for anything that stood out.

I think the patch is OK besides the below.  The comments are in patch
order and so mix very trivial stuff with things that seem more fundamental.
I think the only real blocker is the point below about translations.

> +/* Define attributes that are mutually exclusive with one another.  */
> +static const struct attribute_spec::exclusions attr_noreturn_exclusions[] =
> +{
> +  ATTR_EXCL ("noreturn", true, true, true),

Why does noreturn exclude itself?  Probably worth a comment if deliberate.

> +tree
> +build_attributes (Expressions *eattrs)
> +{
> [...]
> +      tree list = build_tree_list (get_identifier (name), args);
> +      attribs =  chainon (attribs, list);

Too many spaces before "chainon".

> +static tree
> +handle_const_attribute (tree *node, tree ARG_UNUSED (name),
> +			tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			bool * ARG_UNUSED (no_add_attrs))
> +{
> +  tree type = TREE_TYPE (*node);
> +
> +  /* See FIXME comment on noreturn in c_common_attribute_table.  */

Seems strange to be referencing c-family code here, especially since
there's no corresponding comment for handle_noreturn_attribute and
also no FIXME comment in the D version of the table.  Think we should
just drop the comment above.

It's unfortunate that there's so much cut-&-paste with c-attribs.c,
but there again, the current split between target-independent code
(attribs.*) and frontend code (c-attribs.*) isn't very clear-cut,
so I don't think this should hold up acceptance.

I'm going to review it as new code though, so:

> +/* Handle a "malloc" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +tree
> +handle_malloc_attribute (tree *node, tree ARG_UNUSED (name),
> +			 tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			 bool * ARG_UNUSED (no_add_attrs))
> +{
> +  if (TREE_CODE (*node) == FUNCTION_DECL
> +      && POINTER_TYPE_P (TREE_TYPE (TREE_TYPE (*node))))
> +    DECL_IS_MALLOC (*node) = 1;
> +  else
> +    gcc_unreachable ();
> +
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "pure" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_pure_attribute (tree *node, tree ARG_UNUSED (name),
> +		       tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +		       bool * ARG_UNUSED (no_add_attrs))
> +{
> +  if (TREE_CODE (*node) == FUNCTION_DECL)
> +    DECL_PURE_P (*node) = 1;
> +  else
> +    gcc_unreachable ();
> +
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "no vops" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_novops_attribute (tree *node, tree ARG_UNUSED (name),
> +			 tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			 bool * ARG_UNUSED (no_add_attrs))
> +{
> +  gcc_assert (TREE_CODE (*node) == FUNCTION_DECL);
> +  DECL_IS_NOVOPS (*node) = 1;
> +  return NULL_TREE;
> +}

Think the first two should follow the example of novops and use
gcc_assert rather than gcc_unreaachable.  Same for some later functions.

> +/* Build D frontend type from tree TYPE type given.  This will set the backend
> +   type symbol directly for complex types to save build_ctype() the work.
> +   For other types, it is not useful or will causes errors, such as casting

s/causes/cause/ or s/will //

> +  /* For now, we need to tell the D frontend what platform is being targeted.
> +     This should be removed once the frontend has been fixed.  */
> +  if (strcmp (ident, "linux") == 0)
> +    global.params.isLinux = true;
> +  else if (strcmp (ident, "OSX") == 0)
> +    global.params.isOSX = true;
> +  else if (strcmp (ident, "Windows") == 0)
> +    global.params.isWindows = true;
> +  else if (strcmp (ident, "FreeBSD") == 0)
> +    global.params.isFreeBSD = true;
> +  else if (strcmp (ident, "OpenBSD") == 0)
> +    global.params.isOpenBSD = true;
> +  else if (strcmp (ident, "Solaris") == 0)
> +    global.params.isSolaris = true;
> +  else if (strcmp (ident, "X86_64") == 0)
> +    global.params.is64bit = true;

Probably worth adding a comment to say why only X86_64 causes is64bit
to be set, and to say that it's OK to silently ignore other strings
(assuming it is).

> +      tf->purity = DECL_PURE_P (decl) ?   PUREstrong :
> +	TREE_READONLY (decl) ? PUREconst :
> +	DECL_IS_NOVOPS (decl) ? PUREweak :
> +	!DECL_ASSEMBLER_NAME_SET_P (decl) ? PUREweak :
> +	PUREimpure;
> +      tf->trust = !DECL_ASSEMBLER_NAME_SET_P (decl) ? TRUSTsafe :
> +	TREE_NOTHROW (decl) ? TRUSTtrusted :
> +	TRUSTsystem;

Formatting nit, think this should be:

      tf->purity = (DECL_PURE_P (decl) ? PUREstrong
		    : TREE_READONLY (decl) ? PUREconst
		    : DECL_IS_NOVOPS (decl) ? PUREweak
		    : !DECL_ASSEMBLER_NAME_SET_P (decl) ? PUREweak
		    : PUREimpure);

etc.

> +/* Search for any `extern(C)' functions that match any known GCC library builtin
> +   function in D and override it's internal backend symbol.  */

s/it's/its/

> +/* Used to help initialize the builtin-types.def table.  When a type of
> +   the correct size doesn't exist, use error_mark_node instead of NULL.
> +   The later results in segfaults even when a decl using the type doesn't
> +   get invoked.  */

s/later/latter/.  (The typo is in the code you copied this from, but might
as will fix it here.)

> +/* Build nodes that would have be created by the C front-end; necessary
> +   for including builtin-types.def and ultimately builtins.def.  */

"would be" or "would have been"

> +/* Build nodes that are used by the D front-end.
> +   These are distinct from C types.  */
> +
> +static void
> +d_build_d_type_nodes (void)
> +{
> +  /* Integral types.  */
> +  byte_type_node = make_signed_type (8);
> +  ubyte_type_node = make_unsigned_type (8);
> +
> +  short_type_node = make_signed_type (16);
> +  ushort_type_node = make_unsigned_type (16);
> +
> +  int_type_node = make_signed_type (32);
> +  uint_type_node = make_unsigned_type (32);
> +
> +  long_type_node = make_signed_type (64);
> +  ulong_type_node = make_unsigned_type (64);

It's a bit confusing for the D type to be long_type_node but the C/ABI type
to be long_integer_type_node.  The D type is surely an integer too. :-)
With this coming among the handling of built-in functions, it initially
looked related, and I was wondering how it was safe on ILP32 systems
before realising the difference.

Maybe prefixing them all with "d_" would be too ugly, but it would at
least be good to clarify the comment to say that these are "distinct
type nodes" (rather than just distinct definitions, as I'd initially
assumed) and that they're not used outside the frontend, or by the C
imports.

> +/* Return the GCC location for the D frontend location LOC.   */
> +
> +location_t
> +get_linemap (const Loc& loc)

FWIW, I think GCC style is to add a space before "&" rather than
after it, as for pointers.  But since you use this style consistently
(and I'm guessing would instinctively use it in later work), it's
probably better to leave it as-is.  The go interface similarly
has local conventions regarding things like spaces before "(".

> +tree
> +d_array_value (tree type, tree len, tree data)
> +{
> +  /* TODO: Assert type is a D array.  */

Just curious, what makes asserting difficult enough to be a TODO?

> +/* Returns the .funcptr component from the D delegate EXP.  */
> +
> +tree
> +delegate_method (tree exp)
> +{
> +  /* Get the backend type for the array and pick out the array length
> +     field (assumed to be the second field).  */
> +  tree method_field = TREE_CHAIN (TYPE_FIELDS (TREE_TYPE (exp)));
> +  return component_ref (exp, method_field);
> +}
> +
> +/* Returns the .object component from the delegate EXP.  */
> +
> +tree
> +delegate_object (tree exp)
> +{
> +  /* Get the backend type for the array and pick out the array data
> +     pointer field (assumed to be the first field).  */
> +  tree obj_field = TYPE_FIELDS (TREE_TYPE (exp));
> +  return component_ref (exp, obj_field);
> +}

Looks like the body comments might be a cut-&-pasto?  They reference the
right index, but don't match the function comments.

> +/* Return TRUE if EXP is a valid lvalue.  Lvalues references cannot be
> +   made into temporaries, otherwise any assignments will be lost.  */

s/Lvalues references/Lvalue references/?

> +	      /* If the index is NULL, then just assign it to the next field.
> +		 This is expect of layout_typeinfo(), which generates a flat
> +		 list of values that we must shape into the record type.  */
> +	      if (index == field || index == NULL_TREE)
> +		{
> +		  init->ordered_remove (idx);
> +		  if (!finished)
> +		    is_initialized = true;
> +		  break;
> +		}

"This is expect of" looks like a typo.

The function is quadratic in the initialiser size.  It would be
good to avoid that if possible.  That said, it looks like
build_class_instance is also quadratic, so maybe just changing
this on its own wouldn't help much.

> +	  for (size_t i = 0; fd->parameters && i < fd->parameters->dim; i++)
> +	    {
> +	      VarDeclaration *v = (*fd->parameters)[i];
> +	      /* Remove if already in closureVars so can push to front.  */
> +	      for (size_t j = i; j < fd->closureVars.dim; j++)
> +		{
> +		  Dsymbol *s = fd->closureVars[j];
> +		  if (s == v)
> +		    {
> +		      fd->closureVars.remove (j);
> +		      break;
> +		    }
> +		}
> +	      fd->closureVars.insert (i, v);
> +	    }

Looks like this also is quadratic.

> +      /* For nested functions, default to creating a frame.  Even if there is no
> +	 fields to populate the frame, create it anyway, as this will be used as
> +	 the record type instead of `void*` for the this parameter.  */

s/there is no fields/there are no fields/

> +      /* If we are taking the address of a decl that can never be null,
> +	 then the return result is always true.  */
> +      if (DECL_P (TREE_OPERAND (expr, 0))
> +	  && (TREE_CODE (TREE_OPERAND (expr, 0)) == PARM_DECL
> +	      || TREE_CODE (TREE_OPERAND (expr, 0)) == LABEL_DECL
> +	      || !DECL_WEAK (TREE_OPERAND (expr, 0))))
> +	return boolean_true_node;

Does something warn about this?

> +tree
> +convert (tree type, tree expr)
> ...
> +    case COMPLEX_TYPE:
> +      if (TREE_CODE (etype) == REAL_TYPE && TYPE_IMAGINARY_FLOAT (etype))
> +	ret = build2 (COMPLEX_EXPR, type,
> +		      build_zero_cst (TREE_TYPE (type)),
> +		      convert (TREE_TYPE (type), expr));
> +      else
> +	ret = convert_to_complex (type, e);
> +      goto maybe_fold;
> ...
> +    maybe_fold:
> +      if (!TREE_CONSTANT (ret))
> +	ret = fold (ret);
> +      return ret;

etc.  Any reason not to use fold_build* here and in subroutines (for all
cases, not just COMPLEX_TYPE), rather than fold at the end?  Or is the
!TREE_CONSTANT deliberately trying to avoid folding constants that
fold_build* would normally fold?   build* routines set TREE_CONSTANT
for operations with constant operands as well as constant literals,
so is the code trying to avoid folding those?

The routines use fold_build* in some places and plain build* with
a goto in others, but it wasn't obvious why.  Seems like it deserves
a comment.

> +	  /* Strings are treated as dynamic arrays D2.  */

Should that be by/in D2?

> +  return result ? result :
> +    convert (build_ctype (totype), exp);

Odd formatting, fits easily on a line.  If you want to split it,
the ":" should be on the second line rather than the first.

> +/* Apply semantics of assignment to a values of type TOTYPE to EXPR
> +   (e.g., pointer = array -> pointer = &array[0])

Type ("to a values")

> +   Return a TREE representation of EXPR implictly converted to TOTYPE
> +   for use in assignment expressions MODIFY_EXPR, INIT_EXPR.  */

"implicitly"

> +/* Helper routine for all error routines.  Reports a diagnostic specified by
> +   KIND at the explicit location LOC, where the message FORMAT has not yet
> +   been translated by the gcc diagnostic routines.  */
> +
> +static void ATTRIBUTE_GCC_DIAG(3,0)
> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
> +				va_list ap, diagnostic_t kind, bool verbatim)
> +{
> +  va_list argp;
> +  va_copy (argp, ap);
> +
> +  if (loc.filename || !verbatim)
> +    {
> +      rich_location rich_loc (line_table, get_linemap (loc));
> +      diagnostic_info diagnostic;
> +      char *xformat = expand_format (format);
> +
> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);

How does this work with translation?  xgettext will only see the original
format string, not the result of expand_format.  Do you have some scripting
to do the same format mangling when collecting the translation strings?
Same concern:

> +void ATTRIBUTE_GCC_DIAG(2,0)
> +verror (const Loc& loc, const char *format, va_list ap,
> +	const char *p1, const char *p2, const char *)
> +{
> +  if (!global.gag || global.params.showGaggedErrors)
> +    {
> +      char *xformat;
> +
> +      /* Build string and emit.  */
> +      if (p2 != NULL)
> +	xformat = xasprintf ("%s %s %s", p1, p2, format);
> +      else if (p1 != NULL)
> +	xformat = xasprintf ("%s %s", p1, format);
> +      else
> +	xformat = xasprintf ("%s", format);

...with the concatenation here.

It looks at first glance like the callers to d_diagnostic_report_diagnostic
should be doing the conversion themselves via _(format), before doing
any concatenation.  Then d_diagnostic_report_diagnostic can use
expand_format on the translated string and use
diagnostic_set_info_translated.  But I'm not an expert on this stuff.

Also, the function should document what p1 and p2 are.  Do they
need to be translated?

Does xgettext get to see all frontend error messages?  Haven't got
to the later patches yet.

Please add an overview comment somewhere in the diagnostic routines
explaining how translation works wrt errors that come from the D
frontend.

> +void ATTRIBUTE_GCC_DIAG(2,0)
> +vmessage(const Loc& loc, const char *format, va_list ap)

Missing space before "(".

+/* Start gagging. Return the current number of gagged errors.  */

Should be two spaces before "Return"

> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
> +
> +void
> +Port::writelongLE (unsigned value, void *buffer)
> +{
> +    unsigned char *p = (unsigned char*) buffer;
> +
> +    p[0] = (unsigned) value;
> +    p[1] = (unsigned) value >> 8;
> +    p[2] = (unsigned) value >> 16;
> +    p[3] = (unsigned) value >> 24;
> +}
> ...
> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
> +
> +void
> +Port::writelongBE (unsigned value, void *buffer)
> +{
> +    unsigned char *p = (unsigned char*) buffer;
> +
> +    p[0] = (unsigned) value >> 24;
> +    p[1] = (unsigned) value >> 16;
> +    p[2] = (unsigned) value >> 8;
> +    p[3] = (unsigned) value;
> +}

Overindented bodies.  Missing space before "*" in "(unsigned char*)"
in all these functions.

Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
Is it also used in ways that require the target BITS_PER_UNIT to be 8
as well?  That could realistically be a different value (and we've had
ports like that in the past).

> +/* Return a hash value for real_t value R.  */
> +
> +size_t
> +CTFloat::hash (real_t r)
> +{
> +    return real_hash (&r.rv ());
> +}

Overindented body.

> +/* Write out all dependencies of a given MODULE to the specified BUFFER.
> +   COLMAX is the number of columns to word-wrap at (0 means don't wrap).  */
> +
> +static void
> +deps_write (Module *module, OutBuffer *buffer, unsigned colmax = 72)
> +{
> ...
> +  /* Write out all make dependencies.  */
> +  while (modlist.dim > 0)
> +  {

Misindented while body.

> +      else if (empty_modify_p (TREE_TYPE (op0), op1))
> +	{
> +	  /* Remove any copies of empty aggregates.  Also drop volatile
> +	     loads on the RHS to avoid infinite recursion from
> +	     gimplify_expr trying to load the value.  */
> +	  gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, post_p,
> +			 is_gimple_lvalue, fb_lvalue);
> +	  if (TREE_SIDE_EFFECTS (op1))
> +	    {
> +	      if (TREE_THIS_VOLATILE (op1)
> +		  && (REFERENCE_CLASS_P (op1) || DECL_P (op1)))
> +		op1 = build_fold_addr_expr (op1);
> +
> +	      gimplify_and_add (op1, pre_p);
> +	    }
> +	  *expr_p = TREE_OPERAND (*expr_p, 0);
> +	  ret = GS_OK;
> +	}

It seems odd to gimplify the address of a volatile decl.  When would
that have meaningful side-effects?

What goes wrong if you don't have this block and let the gimplifier
handle empty modify_exprs normally?  The comments explain clearly what
the code is trying to do, but it isn't obvious why this needs to be
treated as a special case.

> +/* Implements the lang_hooks.get_alias_set routine for language D.
> +   Get the alias set corresponding the type or expression T.

s/corresponding the/corresponding to/

> +/* Implements the lang_hooks.get_alias_set routine for language D.
> +   Get the alias set corresponding the type or expression T.
> +   Return -1 if we don't do anything special.  */
> +
> +static alias_set_type
> +d_get_alias_set (tree t)
> +{
> +  /* Permit type-punning when accessing a union, provided the access
> +     is directly through the union.  */
> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
> +    {
> +      if (TREE_CODE (u) == COMPONENT_REF
> +	  && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
> +	return 0;
> +    }
> +
> +  /* That's all the expressions we handle.  */
> +  if (!TYPE_P (t))
> +    return get_alias_set (TREE_TYPE (t));
> +
> +  /* For now in D, assume everything aliases everything else,
> +     until we define some solid rules.  */
> +  return 0;
> +}

Any reason for not returning 0 unconditionally?  Won't the queries
deferred to get_alias_set either return 0 directly or come back here
and get 0?

Might be worth adding a comment referencing build_vconvert, which stood
out as a source of what in C would be alias violations.

> +static tree
> +d_eh_personality (void)
> +{
> +  if (!d_eh_personality_decl)
> +    {
> +      d_eh_personality_decl
> +	= build_personality_function ("gdc");
> +    }

Redundant braces, canonical formatting would be:

  if (!d_eh_personality_decl)
    d_eh_personality_decl = build_personality_function ("gdc");

> +/* this bit is set if they did `-lstdc++'.  */

s/this/This/

> +/* What do with libgphobos:
> +   -1 means we should not link in libgphobos
> +   0  means we should link in libgphobos if it is needed
> +   1  means libgphobos is needed and should be linked in.
> +   2  means libgphobos is needed and should be linked statically.
> +   3  means libgphobos is needed and should be linked dynamically. */
> +static int library = 0;

Might be worth using a more specific variable name, since the code
also deals with libgcc and libstdc++.  Maybe add named constants
for the values above?

> +void
> +lang_specific_driver (cl_decoded_option **in_decoded_options,
> +		      unsigned int *in_decoded_options_count,
> +		      int *in_added_libraries)
> +{
> ...
> +  /* If true, the user gave `-g'.  Used by -debuglib */
> ...
> +  /* What default library to use instead of phobos */
> ...
> +  /* What debug library to use instead of phobos */

Comments should end with ".  */"

> +	case OPT_SPECIAL_input_file:
> +	    {

Over-indented block.

> +	      /* Record that this is a D source file.  */
> +	      int len = strlen (arg);
> +	      if (len <= 2 || strcmp (arg + len - 2, ".d") == 0)
> +		{
> +		  if (first_d_file == NULL)
> +		    first_d_file = arg;
> +
> +		  args[i] |= DSOURCE;
> +		}
> +
> +	      /* If we don't know that this is a interface file, we might
> +		 need to link against libphobos library.  */
> +	      if (library == 0)
> +		{
> +		  if (len <= 3 || strcmp (arg + len - 3, ".di") != 0)
> +		    library = 1;
> +		}
> +
> +	      /* If this is a C++ source file, we'll need to link
> +		 against libstdc++ library.  */
> +	      if ((len <= 3 || strcmp (arg + len - 3, ".cc") == 0)
> +		  || (len <= 4 || strcmp (arg + len - 4, ".cpp") == 0)
> +		  || (len <= 4 || strcmp (arg + len - 4, ".c++") == 0))
> +		need_stdcxx = true;
> +
> +	      break;

The len handling looks a bit odd here, since something like "a.d" would
be treated as a c++ file too.  Might be better to use:

    dot = strrchr (arg, '.');

    if (dot && strcmp (dot, ".cc") == 0)

etc.

> +/* Called before linking.  Returns 0 on success and -1 on failure.  */
> +
> +int lang_specific_pre_link (void)

int
lang_specific_pre_link (void)

> +  /* If there is no hardware support, check if we an safely emulate it.  */

s/we an/we can/

> +/* For a vendor-specific type, return a string containing the C++ mangling.
> +   In all other cases, return NULL.  */
> +
> +const char *
> +Target::cppTypeMangle (Type *type)
> +{
> +    if (type->isTypeBasic () || type->ty == Tvector || type->ty == Tstruct)
> +    {
> +        tree ctype = build_ctype (type);
> +        return targetm.mangle_type (ctype);
> +    }
> +
> +    return NULL;
> +}

Misindented if and function body.

> +/* Return the type that will really be used for passing the given parameter
> +   ARG to an extern(C++) function.  */
> +
> +Type *
> +Target::cppParameterType (Parameter *arg)
> +{
> +    Type *t = arg->type->merge2 ();
> +    if (arg->storageClass & (STCout | STCref))
> +      t = t->referenceTo ();
> +    else if (arg->storageClass & STClazy)
> +      {
> +	/* Mangle as delegate.  */
> +	Type *td = TypeFunction::create (NULL, t, 0, LINKd);
> +	td = TypeDelegate::create (td);
> +	t = t->merge2 ();
> +      }
> +
> +    /* Could be a va_list, which we mangle as a pointer.  */
> +    if (t->ty == Tsarray && Type::tvalist->ty == Tsarray)
> +      {
> +	Type *tb = t->toBasetype ()->mutableOf ();
> +	if (tb == Type::tvalist)
> +	  {
> +	    tb = t->nextOf ()->pointerTo ();
> +	    t = tb->castMod (t->mod);
> +	  }
> +      }
> +
> +    return t;
> +}

Function body indented too far.

> +/* Used to represent a D inline assembly statement, an intermediate
> +   representation of an ASM_EXPR.  Reserved for future use and
> +   implementation yet to be defined.  */
> +DEFTREECODE (IASM_EXPR, "iasm_expr", tcc_statement, 5)

Do you need to reserve it?  The codes are local to the frontend
and you can add new ones whenever you want, so it would be good
to leave this out if possible.

> +/* Frame information for a function declaration.  */
> +
> +struct GTY(()) tree_frame_info
> +{
> +    struct tree_common common;
> +    tree frame_type;
> +};

Over-indented body.

> +/* Gate for when the D frontend make an early call into the codegen pass, such

s/make/makes/

> +    /* Anonymous structs/unions only exist as part of others,
> +       do not output forward referenced structs's.  */

s/'s//, unless I've misunderstood what the comment is saying.

> +    /* Ensure all semantic passes have ran.  */

"passes have run" or "passes ran".

> +      /* The real function type may differ from it's declaration.  */

s/it's/its/

> +      /* Initialiser must never be bigger than symbol size.  */

"Initializer" (alas)

> +/* Thunk code is based on g++ */

Comment should end with ".  */"

> +/* Get offset of base class's vtbl[] initialiser from start of csym.  */
> +
> +unsigned
> +base_vtable_offset (ClassDeclaration *cd, BaseClass *bc)
> +{
> ...
> +  return ~0;
> +}

Would be good to document the ~0 return, and probably also assert
that it isn't ~0:

> +	    /* The vtable of the interface length and ptr.  */
> +	    size_t voffset = base_vtable_offset (cd, b);
> +	    value = build_offset (csym, size_int (voffset));

...here.

> +  /* Build an expression of code CODE, data type TYPE, and operands ARG0 and
> +     ARG1.  Perform relevant conversions needs for correct code operations.  */

"relevant conversions needs" looks like a typo.

> +    if (POINTER_TYPE_P (t0) && POINTER_TYPE_P (t1))
> +      {
> +	/* Need to convert pointers to integers because tree-vrp asserts
> +	   against (ptr MINUS ptr).  */
> +	tree ptrtype = lang_hooks.types.type_for_mode (ptr_mode,
> +						       TYPE_UNSIGNED (type));
> +	arg0 = d_convert (ptrtype, arg0);
> +	arg1 = d_convert (ptrtype, arg1);
> +
> +	ret = fold_build2 (code, ptrtype, arg0, arg1);

Should probably now use POINTER_DIFF_EXPR for this.

> +    /* The LHS expression could be an assignment, to which it's operation gets
> +       lost during gimplification.  */
> +    if (TREE_CODE (lhs) == MODIFY_EXPR)
> +      {
> +	lexpr = compound_expr (lexpr, lhs);
> +	lhs = TREE_OPERAND (lhs, 0);
> +      }

s/it's/its/.  But the code looks a bit odd -- won't you end up with
double evaluation of the lhs of the MODIFY_EXPR?  Or is the creator of
the MODIFY_EXPR already guaranteed to have wrapped the lhs in a SAVE_EXPR
where necessary?  Probably worth a comment if so.

> +  /* Build a power expression, which raises its left operand to the power of
> +     its right operand.   */
> +
> +  void visit (PowExp *e)
> +  {
> ...
> +    /* If type is int, implicitly convert to double.  This allows backend
> +       to fold the call into a constant return value.  */
> +    if (e->type->isintegral ())
> +      powtype = double_type_node;

Seems dangerous if pow is defined as an integer operation.
E.g. ((1 << 62) | 1) ** 1 should be (1 << 62) | 1, which isn't
representable in double.

> +  /* Build a assignment operator expression.  The right operand is implicitly
> +     converted to the type of the left operand, and assigned to it.  */

"an assignment operator"

> +  /* Convert for initialising the DECL_RESULT.  */

"initializing"

> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
> +
> +static void
> +clear_intrinsic_flag (tree callexp)
> +{
> +  tree decl = CALL_EXPR_FN (callexp);
> +
> +  if (TREE_CODE (decl) == ADDR_EXPR)
> +    decl = TREE_OPERAND (decl, 0);
> +
> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
> +
> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
> +}

It seems wrong that a particular call to a function would change the
function's properties for all calls, and I don't think there's any
expectation in the midend that this kind of change might happen.

Why's it needed?  Whatever the reason, I think it needs to be done in a
different way.

> +static tree
> +expand_intrinsic_bsf (tree callexp)
> ...
> +  built_in_function code = (argsize <= INT_TYPE_SIZE) ? BUILT_IN_CTZ :
> +    (argsize <= LONG_TYPE_SIZE) ? BUILT_IN_CTZL :
> +    (argsize <= LONG_LONG_TYPE_SIZE) ? BUILT_IN_CTZLL : END_BUILTINS;

":" should be at the start of the line rather than the end.  Same for
later functions.

> +static tree
> +expand_intrinsic_bswap (tree callexp)
> +{
> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> +  int argsize = TYPE_PRECISION (TREE_TYPE (arg));
> +
> +  /* Which variant of __builtin_bswap* should we call?  */
> +  built_in_function code = (argsize == 32) ? BUILT_IN_BSWAP32 :
> +    (argsize == 64) ? BUILT_IN_BSWAP64 : END_BUILTINS;
> +
> +  /* Fallback on runtime implementation, which shouldn't happen as the
> +     argument for bswap() is either 32-bit or 64-bit.  */
> +  if (code == END_BUILTINS)
> +    {
> +      clear_intrinsic_flag (callexp);
> +      return callexp;
> +    }

The earlier intrinsics had this too, but I assumed there it was
because the code was mapping D types to target-dependent ABI types,
and you couldn't guarantee that long long was a 64-bit type.
In the above function there's no doubt that 32 and 64 bits are
the sizes available, so shouldn't this simply assert that
code != END_BUILTINS?

> +static tree
> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
> +{
> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
> +
> +  /* arg is an integral type - use double precision.  */
> +  if (INTEGRAL_TYPE_P (type))
> +    arg = fold_convert (double_type_node, arg);
> +
> +  /* Which variant of __builtin_sqrt* should we call?  */
> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
> +
> +  gcc_assert (code != END_BUILTINS);
> +  return call_builtin_fn (callexp, code, 1, arg);
> +}

Converting integers to double precision isn't right for SQRTF and SQRTL.
But how do such calls reach here?  It looks like the deco strings in the
function entries provide types for the 3 sqrt intrinsics, is that right?
Does that translate to a well-typed FUNCTION_DECL, or do the function
types use some opaque types instead?  If the intrinsic function itself
is well-typed then an incoming CALLEXP with integer arguments would be
invalid.

> +static tree
> +expand_intrinsic_copysign (tree callexp)
> +{
> ...
> +  /* Which variant of __builtin_copysign* should we call?  */
> +  built_in_function code = (type == float_type_node) ? BUILT_IN_COPYSIGNF :
> +    (type == double_type_node) ? BUILT_IN_COPYSIGN :
> +    (type == long_double_type_node) ? BUILT_IN_COPYSIGNL : END_BUILTINS;

You can use mathfn_built_in for this.

> +  /* Assign returned result to overflow parameter, however if overflow is
> +     already true, maintain it's value.  */

s/it's/its/

> +/* DEF_D_INTRINSIC (CODE, ALIAS, NAME, MODULE, DECO, CTFE)
> +   CODE	    The enum code used to refer this intrinsic.
> +   ALIAS    The enum code used to refer the function DECL_FUNCTION_CODE, if

s/refer/refer to/ in both cases.  Same in runtime.texi.

> +/* This is the contribution to the `default_compilers' array in gcc.c
> +   for the D language.  */
> +
> +{".d", "@d", 0, 1, 0 },
> +{".dd", "@d", 0, 1, 0 },
> +{".di", "@d", 0, 1, 0 },
> +{"@d",
> +  "%{!E:cc1d %i %(cc1_options) %I %{nostdinc*} %{i*} %{I*} %{J*} \
> +    %{H} %{Hd*} %{Hf*} %{MD:-MD %b.deps} %{MMD:-MMD %b.deps} \
> +    %{M} %{MM} %{MF*} %{MG} %{MP} %{MQ*} %{MT*} \
> +    %{X:-Xf %b.json} %{Xf*} \
> +    %{v} %{!fsyntax-only:%(invoke_as)}}", 0, 1, 0 },

Guess this has been said before, but cc1d seems a strange name,
since "cc1" is "C Compiler pass 1".

> +fdeps
> +D
> +This switch is deprecated; do not use.
> +
> +fdeps=
> +D Joined RejectNegative
> +This switch is deprecated; do not use.
> ...
> +fintfc
> +D Alias(H)
> +; Deprecated in favour of -H
> +
> +fintfc-dir=
> +D Joined RejectNegative Alias(Hd)
> +; Deprecated in favour of -Hd
> +
> +fintfc-file=
> +D Joined RejectNegative Alias(Hf)
> +; Deprecated in favour of -Hf
> ...
> +ftransition=safe
> +D Alias(ftransition=dip1000)
> +; Deprecated in favor of -ftransition=dip1000

Any chance of just removing them?  Would be better not to add new code
(from GCC's POV) that's already deprecated.

> +ftransition=all
> +D RejectNegative
> +List information on all language changes

All docstrings should end in ".".  This should be tested by adding
the D frontend to:

foreach cls { "ada" "c" "c++" "fortran" "go" \
                    "optimizers" "param" "target" "warnings" } {

in gcc.misc-tests/help.exp

> +struct longdouble
> +{
> +public:
> +   /* Return the hidden real_value from the longdouble type.  */
> +  const real_value& rv (void) const
> +  { return *(const real_value *) this; }

Overindented comment.

> +/* Use ldouble() to explicitely create a longdouble value.  */

Typo: "explicitly"

> +/* D generates module information to inform the runtime library which modules
> +   needs some kind of special handling.  All `static this()', `static ~this()',

s/needs/need/

> +/* Record information about module initialization, termination,
> +   unit testing, and thread local storage in the compilation.  */
> +
> +struct module_info
> +{
> +  vec<tree, va_gc> *ctors;
> +  vec<tree, va_gc> *dtors;
> +  vec<tree, va_gc> *ctorgates;
> +
> +  vec<tree, va_gc> *sharedctors;
> +  vec<tree, va_gc> *shareddtors;
> +  vec<tree, va_gc> *sharedctorgates;
> +
> +  vec<tree, va_gc> *unitTests;
> +};
> ...
> +/* Static constructors and destructors (not D `static this').  */
> +
> +static vec<tree, va_gc> *static_ctor_list;
> +static vec<tree, va_gc> *static_dtor_list;

Looks odd to be using GC data structures without marking them.  Think it
deserves a comment if correct.

> +/* Generate a call to LIBCALL, returning the result as TYPE.  NARGS is the
> +   number of call arguments, the expressions of wwhich are provided in `...'.

s/wwhich/which/

> +/* Finish the statement tree rooted at T.  */
> +
> +tree
> +pop_stmt_list (void)
> +{
> +  tree t = d_function_chain->stmt_list->pop ();
> +
> +  /* If the statement list is completely empty, just return it.  This is just
> +     as good small as build_empty_stmt, with the advantage that statement
> +     lists are merged when they appended to one another.  So using the

Typo: "good small"
s/they appended/they are appended/

> +	/* Convert for initialising the DECL_RESULT.  */

"initializing"

> +  /* D Inline Assembler is not implemented, as would require a writing
> +     an assembly parser for each supported target.  Instead we leverage
> +     GCC extended assembler using the GccAsmStatement class.  */

s/as would require a writing/as it would require writing/

> +  /* Write out the interfacing vtable[] of base class BCD that will be accessed
> +     from the overriding class CD.  If both are the same class, then this will
> +     be it's own vtable.  INDEX is the offset in the interfaces array of the

s/it's/its/

> +    /* Internal reference should be public, but not visible outside the
> +       compilation unit, as itself is referencing public COMDATs.  */

Comment is hard to parse.

> +/* Returns true if the TypeInfo for type should be placed in
> +   the runtime library.  */
> +
> +static bool
> +builtin_typeinfo_p (Type *type)

"TypeInfo for TYPE"

> +  /* Let C++ do the RTTI generation, and just reference the symbol as
> +     extern, the knowing the underlying type is not required.  */

s/the knowing/knowing/?

> +	  if (!tinfo_types[tk])
> +	    {
> +	      make_internal_typeinfo (tk, ident, ptr_type_node,
> +				      array_type_node, NULL);
> +	    }

Redundant braces.

> +/* Implements a visitor interface to check whether a type is speculative.
> +   TypeInfo_Struct would refer the members of the struct it is representing

"refer to" or "reference"

> +Type *
> +get_object_type (void)
> +{
> +  if (ClassDeclaration::object)
> +    return ClassDeclaration::object->type;
> +
> +  ::error ("missing or corrupt object.d");

Why "::"?  Is it needed to avoid shadowing issues?

> +	  /* Insert the field declaration at it's given offset.  */

"its"

> +      /* Strip const modifiers from type before building.  This is done
> +	 to ensure that backend treats i.e: const (T) as a variant of T,
> +	 and not as two distinct types.  */

s/i.e/e.g./

Probably worth running tabify on the sources.  I noticed a couple
of indents by 8 spaces instead of tabs, but it didn't seem worth
listing them individually.  You can use contrib/check_GNU_style.{sh,py}
to find that kind of thing (although it generates a lot of false
positives for other coding style violations, so take with a pinch
of salt).

Also, s/layed out/laid out/ throughout.

I'll try to review the other pending bits this coming week.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-14 19:28   ` Richard Sandiford
@ 2018-10-15  7:36     ` Iain Buclaw
  2018-10-20 10:51       ` Richard Sandiford
  0 siblings, 1 reply; 19+ messages in thread
From: Iain Buclaw @ 2018-10-15  7:36 UTC (permalink / raw)
  To: Iain Buclaw, gcc-patches, Jeff Law, rdsandiford

On 14 October 2018 at 17:29, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> [Sorry if this turns out to do be a dup]
>
> Iain Buclaw <ibuclaw@gdcproject.org> writes:
>> On 18 September 2018 at 02:33, Iain Buclaw <ibuclaw@gdcproject.org> wrote:
>>> This patch adds the D front-end implementation, the only part of the
>>> compiler that interacts with GCC directly, and being the parts that I
>>> maintain, is something that I can talk about more directly.
>>>
>>> For the actual code generation pass, that converts the front-end AST
>>> to GCC trees, most parts use a separate Visitor interfaces to do a
>>> certain kind of lowering, for instance, types.cc builds *_TYPE trees
>>> from AST Type's.  The Visitor class is part of the DMD front-end, and
>>> is defined in dfrontend/visitor.h.
>>>
>>> There are also a few interfaces which have their headers in the DMD
>>> frontend, but are implemented here because they do something that
>>> requires knowledge of the GCC backend (d-target.cc), does something
>>> that may not be portable, or differ between D compilers
>>> (d-frontend.cc) or are a thin wrapper around something that is managed
>>> by GCC (d-diagnostic.cc).
>>>
>>> Many high level operations result in generation of calls to D runtime
>>> library functions (runtime.def), all with require some kind of runtime
>>> type information (typeinfo.cc).  The compiler also generates functions
>>> for registering/deregistering compiled modules with the D runtime
>>> library (modules.cc).
>>>
>>> As well as the D language having it's own built-in functions
>>> (intrinsics.cc), we also expose GCC builtins to D code via a
>>> `gcc.builtins' module (d-builtins.cc), and give special treatment to a
>>> number of UDAs that could be applied to functions (d-attribs.cc).
>>>
>>>
>>> That is roughly the high level jist of how things are currently organized.
>>>
>>> ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch
>>>
>>
>> https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00587.html
>>
>> During the last round, the last comment given was that the reviewer
>> wasn't going to dive deep into this code, as it's essentially
>> converting between the different representations and is code that I'd
>> be maintaining.
>>
>> As this is code that other gcc maintainers will be potentially looking
>> after as well, I'd like any glaring problems to be dealt with
>> immediately.
>
> I won't claim that I dived deep into the code either, since with a patch
> of this size that would take weeks.  But FWIW I read it through trying
> to scan for anything that stood out.
>
> I think the patch is OK besides the below.  The comments are in patch
> order and so mix very trivial stuff with things that seem more fundamental.
> I think the only real blocker is the point below about translations.
>

Having a quick look at the fix-ups, I would say all look reasonable.

>> +/* Define attributes that are mutually exclusive with one another.  */
>> +static const struct attribute_spec::exclusions attr_noreturn_exclusions[] =
>> +{
>> +  ATTR_EXCL ("noreturn", true, true, true),
>
> Why does noreturn exclude itself?  Probably worth a comment if deliberate.
>

That does seem wrong indeed, so I'll remove it.

>> +static tree
>> +handle_const_attribute (tree *node, tree ARG_UNUSED (name),
>> +                     tree ARG_UNUSED (args), int ARG_UNUSED (flags),
>> +                     bool * ARG_UNUSED (no_add_attrs))
>> +{
>> +  tree type = TREE_TYPE (*node);
>> +
>> +  /* See FIXME comment on noreturn in c_common_attribute_table.  */
>
> Seems strange to be referencing c-family code here, especially since
> there's no corresponding comment for handle_noreturn_attribute and
> also no FIXME comment in the D version of the table.  Think we should
> just drop the comment above.
>
> It's unfortunate that there's so much cut-&-paste with c-attribs.c,
> but there again, the current split between target-independent code
> (attribs.*) and frontend code (c-attribs.*) isn't very clear-cut,
> so I don't think this should hold up acceptance.
>

Attributes handlers present only for built-in functions would have
been initially copied from LTO if memory serves right.

> I'm going to review it as new code though, so:
>

All suggestions seem reasonable to me.

>> +  /* For now, we need to tell the D frontend what platform is being targeted.
>> +     This should be removed once the frontend has been fixed.  */
>> +  if (strcmp (ident, "linux") == 0)
>> +    global.params.isLinux = true;
>> +  else if (strcmp (ident, "OSX") == 0)
>> +    global.params.isOSX = true;
>> +  else if (strcmp (ident, "Windows") == 0)
>> +    global.params.isWindows = true;
>> +  else if (strcmp (ident, "FreeBSD") == 0)
>> +    global.params.isFreeBSD = true;
>> +  else if (strcmp (ident, "OpenBSD") == 0)
>> +    global.params.isOpenBSD = true;
>> +  else if (strcmp (ident, "Solaris") == 0)
>> +    global.params.isSolaris = true;
>> +  else if (strcmp (ident, "X86_64") == 0)
>> +    global.params.is64bit = true;
>
> Probably worth adding a comment to say why only X86_64 causes is64bit
> to be set, and to say that it's OK to silently ignore other strings
> (assuming it is).
>

The reference D compiler (DMD) only supports x86 and x86_64.

These global.params.isXXX parameters are something inherited from the
DMD front-end, and its been a long term goal of mine to weed them all
out so this is no longer necessary.


>> +/* Build nodes that are used by the D front-end.
>> +   These are distinct from C types.  */
>> +
>> +static void
>> +d_build_d_type_nodes (void)
>> +{
>> +  /* Integral types.  */
>> +  byte_type_node = make_signed_type (8);
>> +  ubyte_type_node = make_unsigned_type (8);
>> +
>> +  short_type_node = make_signed_type (16);
>> +  ushort_type_node = make_unsigned_type (16);
>> +
>> +  int_type_node = make_signed_type (32);
>> +  uint_type_node = make_unsigned_type (32);
>> +
>> +  long_type_node = make_signed_type (64);
>> +  ulong_type_node = make_unsigned_type (64);
>
> It's a bit confusing for the D type to be long_type_node but the C/ABI type
> to be long_integer_type_node.  The D type is surely an integer too. :-)
> With this coming among the handling of built-in functions, it initially
> looked related, and I was wondering how it was safe on ILP32 systems
> before realising the difference.
>
> Maybe prefixing them all with "d_" would be too ugly, but it would at
> least be good to clarify the comment to say that these are "distinct
> type nodes" (rather than just distinct definitions, as I'd initially
> assumed) and that they're not used outside the frontend, or by the C
> imports.
>

If prefixing with "d_", perhaps dropping the "_node" would make them
sufficiently not ugly (d_long_type, d_uint_type, d_byte_type).

Will add a comment though regardless, as that's something you can't
have too much of.

>> +/* Return the GCC location for the D frontend location LOC.   */
>> +
>> +location_t
>> +get_linemap (const Loc& loc)
>
> FWIW, I think GCC style is to add a space before "&" rather than
> after it, as for pointers.  But since you use this style consistently
> (and I'm guessing would instinctively use it in later work), it's
> probably better to leave it as-is.  The go interface similarly
> has local conventions regarding things like spaces before "(".
>
>> +tree
>> +d_array_value (tree type, tree len, tree data)
>> +{
>> +  /* TODO: Assert type is a D array.  */
>
> Just curious, what makes asserting difficult enough to be a TODO?
>

Should be nothing, as all D dynamic array types should be marked as
TYPE_DYNAMIC_ARRAY starting from early last year.

Looking at the history, it looks like that comment pre-dates 2009.

I'll test the corrective action and remove the comment.


>> +      /* If we are taking the address of a decl that can never be null,
>> +      then the return result is always true.  */
>> +      if (DECL_P (TREE_OPERAND (expr, 0))
>> +       && (TREE_CODE (TREE_OPERAND (expr, 0)) == PARM_DECL
>> +           || TREE_CODE (TREE_OPERAND (expr, 0)) == LABEL_DECL
>> +           || !DECL_WEAK (TREE_OPERAND (expr, 0))))
>> +     return boolean_true_node;
>
> Does something warn about this?
>

Nothing warns about this.  Can look into adding if this is a concern.


>> +tree
>> +convert (tree type, tree expr)
>> ...
>> +    case COMPLEX_TYPE:
>> +      if (TREE_CODE (etype) == REAL_TYPE && TYPE_IMAGINARY_FLOAT (etype))
>> +     ret = build2 (COMPLEX_EXPR, type,
>> +                   build_zero_cst (TREE_TYPE (type)),
>> +                   convert (TREE_TYPE (type), expr));
>> +      else
>> +     ret = convert_to_complex (type, e);
>> +      goto maybe_fold;
>> ...
>> +    maybe_fold:
>> +      if (!TREE_CONSTANT (ret))
>> +     ret = fold (ret);
>> +      return ret;
>
> etc.  Any reason not to use fold_build* here and in subroutines (for all
> cases, not just COMPLEX_TYPE), rather than fold at the end?  Or is the
> !TREE_CONSTANT deliberately trying to avoid folding constants that
> fold_build* would normally fold?   build* routines set TREE_CONSTANT
> for operations with constant operands as well as constant literals,
> so is the code trying to avoid folding those?
>
> The routines use fold_build* in some places and plain build* with
> a goto in others, but it wasn't obvious why.  Seems like it deserves
> a comment.
>

Most places where fold_build* vs. build* have been a trial and error
process.  I couldn't comment on any individual one.  Though will run
your suggestion through the testsuite, as it is a nice clean-up.

>> +/* Helper routine for all error routines.  Reports a diagnostic specified by
>> +   KIND at the explicit location LOC, where the message FORMAT has not yet
>> +   been translated by the gcc diagnostic routines.  */
>> +
>> +static void ATTRIBUTE_GCC_DIAG(3,0)
>> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
>> +                             va_list ap, diagnostic_t kind, bool verbatim)
>> +{
>> +  va_list argp;
>> +  va_copy (argp, ap);
>> +
>> +  if (loc.filename || !verbatim)
>> +    {
>> +      rich_location rich_loc (line_table, get_linemap (loc));
>> +      diagnostic_info diagnostic;
>> +      char *xformat = expand_format (format);
>> +
>> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);
>
> How does this work with translation?  xgettext will only see the original
> format string, not the result of expand_format.  Do you have some scripting
> to do the same format mangling when collecting the translation strings?
> Same concern:
>

These diagnostic routines handle errors coming from the dmd front-end,
which are not translated - all sources are listed under po/EXCLUDES in
another patch.


>> +void ATTRIBUTE_GCC_DIAG(2,0)
>> +verror (const Loc& loc, const char *format, va_list ap,
>> +     const char *p1, const char *p2, const char *)
>> +{
>> +  if (!global.gag || global.params.showGaggedErrors)
>> +    {
>> +      char *xformat;
>> +
>> +      /* Build string and emit.  */
>> +      if (p2 != NULL)
>> +     xformat = xasprintf ("%s %s %s", p1, p2, format);
>> +      else if (p1 != NULL)
>> +     xformat = xasprintf ("%s %s", p1, format);
>> +      else
>> +     xformat = xasprintf ("%s", format);
>
> ...with the concatenation here.
>
> It looks at first glance like the callers to d_diagnostic_report_diagnostic
> should be doing the conversion themselves via _(format), before doing
> any concatenation.  Then d_diagnostic_report_diagnostic can use
> expand_format on the translated string and use
> diagnostic_set_info_translated.  But I'm not an expert on this stuff.
>
> Also, the function should document what p1 and p2 are.  Do they
> need to be translated?
>
> Does xgettext get to see all frontend error messages?  Haven't got
> to the later patches yet.
>
> Please add an overview comment somewhere in the diagnostic routines
> explaining how translation works wrt errors that come from the D
> frontend.
>

OK, will do.

>> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
>> +
>> +void
>> +Port::writelongLE (unsigned value, void *buffer)
>> +{
>> +    unsigned char *p = (unsigned char*) buffer;
>> +
>> +    p[0] = (unsigned) value;
>> +    p[1] = (unsigned) value >> 8;
>> +    p[2] = (unsigned) value >> 16;
>> +    p[3] = (unsigned) value >> 24;
>> +}
>> ...
>> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
>> +
>> +void
>> +Port::writelongBE (unsigned value, void *buffer)
>> +{
>> +    unsigned char *p = (unsigned char*) buffer;
>> +
>> +    p[0] = (unsigned) value >> 24;
>> +    p[1] = (unsigned) value >> 16;
>> +    p[2] = (unsigned) value >> 8;
>> +    p[3] = (unsigned) value;
>> +}
>
> Overindented bodies.  Missing space before "*" in "(unsigned char*)"
> in all these functions.
>
> Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
> Is it also used in ways that require the target BITS_PER_UNIT to be 8
> as well?  That could realistically be a different value (and we've had
> ports like that in the past).
>

These read(long|word)(BE|LE) functions should only ever be used when
reading the BOM of a UTF-16 or UTF-32 file.

I've done a grep, and the write(long|word)(BE|LE) are no longer used
by the dmd frontend, so there's little point keeping them around.

If there's any utility in libiberty or another location then I'd be
more than happy to delegate this action to that here.


>> +      else if (empty_modify_p (TREE_TYPE (op0), op1))
>> +     {
>> +       /* Remove any copies of empty aggregates.  Also drop volatile
>> +          loads on the RHS to avoid infinite recursion from
>> +          gimplify_expr trying to load the value.  */
>> +       gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, post_p,
>> +                      is_gimple_lvalue, fb_lvalue);
>> +       if (TREE_SIDE_EFFECTS (op1))
>> +         {
>> +           if (TREE_THIS_VOLATILE (op1)
>> +               && (REFERENCE_CLASS_P (op1) || DECL_P (op1)))
>> +             op1 = build_fold_addr_expr (op1);
>> +
>> +           gimplify_and_add (op1, pre_p);
>> +         }
>> +       *expr_p = TREE_OPERAND (*expr_p, 0);
>> +       ret = GS_OK;
>> +     }
>
> It seems odd to gimplify the address of a volatile decl.  When would
> that have meaningful side-effects?
>
> What goes wrong if you don't have this block and let the gimplifier
> handle empty modify_exprs normally?  The comments explain clearly what
> the code is trying to do, but it isn't obvious why this needs to be
> treated as a special case.
>

I'll remove it and report back.  I suspect I won't hit any problems
though as "volatile" has been removed from the D language.  There
should be no decls marked as such anymore.

>> +/* Implements the lang_hooks.get_alias_set routine for language D.
>> +   Get the alias set corresponding the type or expression T.
>> +   Return -1 if we don't do anything special.  */
>> +
>> +static alias_set_type
>> +d_get_alias_set (tree t)
>> +{
>> +  /* Permit type-punning when accessing a union, provided the access
>> +     is directly through the union.  */
>> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
>> +    {
>> +      if (TREE_CODE (u) == COMPONENT_REF
>> +       && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
>> +     return 0;
>> +    }
>> +
>> +  /* That's all the expressions we handle.  */
>> +  if (!TYPE_P (t))
>> +    return get_alias_set (TREE_TYPE (t));
>> +
>> +  /* For now in D, assume everything aliases everything else,
>> +     until we define some solid rules.  */
>> +  return 0;
>> +}
>
> Any reason for not returning 0 unconditionally?  Won't the queries
> deferred to get_alias_set either return 0 directly or come back here
> and get 0?
>
> Might be worth adding a comment referencing build_vconvert, which stood
> out as a source of what in C would be alias violations.
>

All previous attempts at doing more with this has caused silent
corruption at runtime, the existence of build_vconvert likely doesn't
help either.

Although it isn't in the spec, D should be "strict aliasing".  But
there's a lot of production code that looks like `intvar = *(int
*)&floatvar;` that is currently expected to just work.

By rule of thumb in D, the C behaviour should be followed if it looks
like C code.  The only places where we deviate are made to be a
compiler error.

>> +/* Used to represent a D inline assembly statement, an intermediate
>> +   representation of an ASM_EXPR.  Reserved for future use and
>> +   implementation yet to be defined.  */
>> +DEFTREECODE (IASM_EXPR, "iasm_expr", tcc_statement, 5)
>
> Do you need to reserve it?  The codes are local to the frontend
> and you can add new ones whenever you want, so it would be good
> to leave this out if possible.
>

At this point in time, I never intend to add support for DMD-style
inline asm, so it's better removed.

>> +    /* The LHS expression could be an assignment, to which it's operation gets
>> +       lost during gimplification.  */
>> +    if (TREE_CODE (lhs) == MODIFY_EXPR)
>> +      {
>> +     lexpr = compound_expr (lexpr, lhs);
>> +     lhs = TREE_OPERAND (lhs, 0);
>> +      }
>
> s/it's/its/.  But the code looks a bit odd -- won't you end up with
> double evaluation of the lhs of the MODIFY_EXPR?  Or is the creator of
> the MODIFY_EXPR already guaranteed to have wrapped the lhs in a SAVE_EXPR
> where necessary?  Probably worth a comment if so.
>

It should be guaranteed to be saved if there's a side effect, but that
can be easily checked.

>> +  /* Build a power expression, which raises its left operand to the power of
>> +     its right operand.   */
>> +
>> +  void visit (PowExp *e)
>> +  {
>> ...
>> +    /* If type is int, implicitly convert to double.  This allows backend
>> +       to fold the call into a constant return value.  */
>> +    if (e->type->isintegral ())
>> +      powtype = double_type_node;
>
> Seems dangerous if pow is defined as an integer operation.
> E.g. ((1 << 62) | 1) ** 1 should be (1 << 62) | 1, which isn't
> representable in double.
>

I'll have to check when this could ever be the case and leave an explanation.

The operation: `x ^^ y` in D is rewritten as `std.math.pow(x, y)` if
there's any templates that match argument constraints.  All integral
pow() operations in user code should never reach here, so I suspect
this is something special for CTFE or static var initialisers.


>> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
>> +
>> +static void
>> +clear_intrinsic_flag (tree callexp)
>> +{
>> +  tree decl = CALL_EXPR_FN (callexp);
>> +
>> +  if (TREE_CODE (decl) == ADDR_EXPR)
>> +    decl = TREE_OPERAND (decl, 0);
>> +
>> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
>> +
>> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
>> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
>> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
>> +}
>
> It seems wrong that a particular call to a function would change the
> function's properties for all calls, and I don't think there's any
> expectation in the midend that this kind of change might happen.
>
> Why's it needed?  Whatever the reason, I think it needs to be done in a
> different way.
>

There are some D library math functions that are only treated as
built-in during compile-time only.  When it comes to run-time code
generation, we want to call the D library functions, not the C library
(or built-ins).


>> +static tree
>> +expand_intrinsic_bswap (tree callexp)
>> +{
>> +  tree arg = CALL_EXPR_ARG (callexp, 0);
>> +  int argsize = TYPE_PRECISION (TREE_TYPE (arg));
>> +
>> +  /* Which variant of __builtin_bswap* should we call?  */
>> +  built_in_function code = (argsize == 32) ? BUILT_IN_BSWAP32 :
>> +    (argsize == 64) ? BUILT_IN_BSWAP64 : END_BUILTINS;
>> +
>> +  /* Fallback on runtime implementation, which shouldn't happen as the
>> +     argument for bswap() is either 32-bit or 64-bit.  */
>> +  if (code == END_BUILTINS)
>> +    {
>> +      clear_intrinsic_flag (callexp);
>> +      return callexp;
>> +    }
>
> The earlier intrinsics had this too, but I assumed there it was
> because the code was mapping D types to target-dependent ABI types,
> and you couldn't guarantee that long long was a 64-bit type.
> In the above function there's no doubt that 32 and 64 bits are
> the sizes available, so shouldn't this simply assert that
> code != END_BUILTINS?
>

I'll run that through the testsuite, I can only think that it may trip
if integer promotions weren't done as they should.


>> +static tree
>> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
>> +{
>> +  tree arg = CALL_EXPR_ARG (callexp, 0);
>> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
>> +
>> +  /* arg is an integral type - use double precision.  */
>> +  if (INTEGRAL_TYPE_P (type))
>> +    arg = fold_convert (double_type_node, arg);
>> +
>> +  /* Which variant of __builtin_sqrt* should we call?  */
>> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
>> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
>> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
>> +
>> +  gcc_assert (code != END_BUILTINS);
>> +  return call_builtin_fn (callexp, code, 1, arg);
>> +}
>
> Converting integers to double precision isn't right for SQRTF and SQRTL.
> But how do such calls reach here?  It looks like the deco strings in the
> function entries provide types for the 3 sqrt intrinsics, is that right?
> Does that translate to a well-typed FUNCTION_DECL, or do the function
> types use some opaque types instead?  If the intrinsic function itself
> is well-typed then an incoming CALLEXP with integer arguments would be
> invalid.
>

As with pow(), I'll have to check by running this through the
testsuite to see what fails and why.

From memory, the reason is likely some requirement of CTFE, where this
call *must* be const folded at compile-time, which may not happen if
the types are wrong.

>> +/* This is the contribution to the `default_compilers' array in gcc.c
>> +   for the D language.  */
>> +
>> +{".d", "@d", 0, 1, 0 },
>> +{".dd", "@d", 0, 1, 0 },
>> +{".di", "@d", 0, 1, 0 },
>> +{"@d",
>> +  "%{!E:cc1d %i %(cc1_options) %I %{nostdinc*} %{i*} %{I*} %{J*} \
>> +    %{H} %{Hd*} %{Hf*} %{MD:-MD %b.deps} %{MMD:-MMD %b.deps} \
>> +    %{M} %{MM} %{MF*} %{MG} %{MP} %{MQ*} %{MT*} \
>> +    %{X:-Xf %b.json} %{Xf*} \
>> +    %{v} %{!fsyntax-only:%(invoke_as)}}", 0, 1, 0 },
>
> Guess this has been said before, but cc1d seems a strange name,
> since "cc1" is "C Compiler pass 1".
>

It was brought up at the Cauldron.  This predates me, and I've never
had any reason to change it.

I have no problems giving it a better name, I think it's only
Debian/Ubuntu package maintainers that need to be updated on the name
change.

>> +fdeps
>> +D
>> +This switch is deprecated; do not use.
>> +
>> +fdeps=
>> +D Joined RejectNegative
>> +This switch is deprecated; do not use.
>> ...
>> +fintfc
>> +D Alias(H)
>> +; Deprecated in favour of -H
>> +
>> +fintfc-dir=
>> +D Joined RejectNegative Alias(Hd)
>> +; Deprecated in favour of -Hd
>> +
>> +fintfc-file=
>> +D Joined RejectNegative Alias(Hf)
>> +; Deprecated in favour of -Hf
>> ...
>> +ftransition=safe
>> +D Alias(ftransition=dip1000)
>> +; Deprecated in favor of -ftransition=dip1000
>
> Any chance of just removing them?  Would be better not to add new code
> (from GCC's POV) that's already deprecated.
>

I don't think there'd be any tooling that uses these options, so
that's fine by me.


>> +ftransition=all
>> +D RejectNegative
>> +List information on all language changes
>
> All docstrings should end in ".".  This should be tested by adding
> the D frontend to:
>
> foreach cls { "ada" "c" "c++" "fortran" "go" \
>                     "optimizers" "param" "target" "warnings" } {
>
> in gcc.misc-tests/help.exp
>

Ah, good to know!


>> +/* Record information about module initialization, termination,
>> +   unit testing, and thread local storage in the compilation.  */
>> +
>> +struct module_info
>> +{
>> +  vec<tree, va_gc> *ctors;
>> +  vec<tree, va_gc> *dtors;
>> +  vec<tree, va_gc> *ctorgates;
>> +
>> +  vec<tree, va_gc> *sharedctors;
>> +  vec<tree, va_gc> *shareddtors;
>> +  vec<tree, va_gc> *sharedctorgates;
>> +
>> +  vec<tree, va_gc> *unitTests;
>> +};
>> ...
>> +/* Static constructors and destructors (not D `static this').  */
>> +
>> +static vec<tree, va_gc> *static_ctor_list;
>> +static vec<tree, va_gc> *static_dtor_list;
>
> Looks odd to be using GC data structures without marking them.  Think it
> deserves a comment if correct.
>

No, they should be marked.  Will fix that.


>> +Type *
>> +get_object_type (void)
>> +{
>> +  if (ClassDeclaration::object)
>> +    return ClassDeclaration::object->type;
>> +
>> +  ::error ("missing or corrupt object.d");
>
> Why "::"?  Is it needed to avoid shadowing issues?
>

Likely a case of function was converted from a method to a free
function without the body being updated.  I'll correct this.


>> +      /* Strip const modifiers from type before building.  This is done
>> +      to ensure that backend treats i.e: const (T) as a variant of T,
>> +      and not as two distinct types.  */
>
> s/i.e/e.g./
>
> Probably worth running tabify on the sources.  I noticed a couple
> of indents by 8 spaces instead of tabs, but it didn't seem worth
> listing them individually.  You can use contrib/check_GNU_style.{sh,py}
> to find that kind of thing (although it generates a lot of false
> positives for other coding style violations, so take with a pinch
> of salt).
>

Sure, I'll run it though that, thanks for pointing it out.


And thanks again for your time checking this, I'll try to get all
comments addressed in a day or two.

-- 
Iain

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-09-18  0:34 [PATCH 02/14] Add D frontend (GDC) implementation Iain Buclaw
  2018-09-19 20:06 ` Iain Buclaw
@ 2018-10-15 14:25 ` David Malcolm
  2018-10-15 16:46   ` Iain Buclaw
  1 sibling, 1 reply; 19+ messages in thread
From: David Malcolm @ 2018-10-15 14:25 UTC (permalink / raw)
  To: Iain Buclaw, gcc-patches

On Tue, 2018-09-18 at 02:33 +0200, Iain Buclaw wrote:
> This patch adds the D front-end implementation, the only part of the
> compiler that interacts with GCC directly, and being the parts that I
> maintain, is something that I can talk about more directly.
> 
> For the actual code generation pass, that converts the front-end AST
> to GCC trees, most parts use a separate Visitor interfaces to do a
> certain kind of lowering, for instance, types.cc builds *_TYPE trees
> from AST Type's.  The Visitor class is part of the DMD front-end, and
> is defined in dfrontend/visitor.h.
> 
> There are also a few interfaces which have their headers in the DMD
> frontend, but are implemented here because they do something that
> requires knowledge of the GCC backend (d-target.cc), does something
> that may not be portable, or differ between D compilers
> (d-frontend.cc) or are a thin wrapper around something that is
> managed
> by GCC (d-diagnostic.cc).
> 
> Many high level operations result in generation of calls to D runtime
> library functions (runtime.def), all with require some kind of
> runtime
> type information (typeinfo.cc).  The compiler also generates
> functions
> for registering/deregistering compiled modules with the D runtime
> library (modules.cc).
> 
> As well as the D language having it's own built-in functions
> (intrinsics.cc), we also expose GCC builtins to D code via a
> `gcc.builtins' module (d-builtins.cc), and give special treatment to
> a
> number of UDAs that could be applied to functions (d-attribs.cc).
> 
> 
> That is roughly the high level jist of how things are currently
> organized.
> 
> ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch

Hi Iain.  I took at look at this patch, focusing on the diagnostics
side of things.

These are more suggestions than hard review blockers.

diff --git a/gcc/d/d-attribs.c b/gcc/d/d-attribs.c
new file mode 100644
index 00000000000..6c65b8cad9e
--- /dev/null
+++ b/gcc/d/d-attribs.c

I believe all new C++ source files are meant to be .cc, rather than .c,
so this should be d-attribs.cc, rather that d-attribs.c.

[...snip...]

diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc
new file mode 100644
index 00000000000..c698890ba07
--- /dev/null
+++ b/gcc/d/d-codegen.cc

[...snip...]

+/* Return the GCC location for the D frontend location LOC.   */
+
+location_t
+get_linemap (const Loc& loc)
+{

I don't like the name "get_linemap", as it suggests to me that it's
getting the "struct line_map *" for LOC, rather than a location_t.

How about "get_location_t" instead, or "make_location_t"?  The latter
feels more appropriate, as it's doing non-trivial work.

+  location_t gcc_location = input_location;
+
+  if (loc.filename)
+    {
+      linemap_add (line_table, LC_ENTER, 0, loc.filename, loc.linnum);
+      linemap_line_start (line_table, loc.linnum, 0);
+      gcc_location = linemap_position_for_column (line_table, loc.charnum);
+      linemap_add (line_table, LC_LEAVE, 0, NULL, 0);
+    }
+
+  return gcc_location;
+}

[...snip...]

diff --git a/gcc/d/d-diagnostic.cc b/gcc/d/d-diagnostic.cc
new file mode 100644
index 00000000000..8b8a21e31b6
--- /dev/null
+++ b/gcc/d/d-diagnostic.cc
@@ -0,0 +1,347 @@
+/* d-diagnostics.cc -- D frontend interface to gcc diagnostics.
+   Copyright (C) 2017-2018 Free Software Foundation, Inc.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+
+#include "dmd/globals.h"
+#include "dmd/errors.h"
+
+#include "tree.h"
+#include "options.h"
+#include "diagnostic.h"
+
+#include "d-tree.h"
+
+
+/* Rewrite the format string FORMAT to deal with any format extensions not
+   supported by pp_format().  The result should be freed by the caller.  */
+static char *
+expand_format (const char *format)

Am I right in thinking this is to handle FORMAT strings coming from the
upstream D frontend, and this has its own formatting conventions?
Please can the leading comment have example(s) of the format, and what
it becomes (e.g. the backticks thing).

Maybe adopt a naming convention in the file, to distinguish d format
strings from pp format strings?  Maybe "d_format" vs "gcc_format"
?(though given the verbatim vs !verbatim below, am not sure how
feasible that is).

Maybe rename this function to expand_d_format??

(Might be nice to add some unittesting of this function via "selftest",
but that's definitely not a requirement; it's awkward to add right now)

+{
+  OutBuffer buf;
+  bool inbacktick = false;
+
+  for (const char *p = format; *p; )
+    {
+      while (*p != '\0' && *p != '%' && *p != '`')
+	{
+	  buf.writeByte (*p);
+	  p++;
+	}
+
+      if (*p == '\0')
+	break;
+
+      if (*p == '`')
+	{
+	  /* Text enclosed by `...` are translated as a quoted string.  */
+	  if (inbacktick)
+	    {
+	      buf.writestring ("%>");
+	      inbacktick = false;
+	    }
+	  else
+	    {
+	      buf.writestring ("%<");
+	      inbacktick = true;
+	    }
+	  p++;
+	  continue;
+	}
+
+      /* Check the conversion specification for unhandled flags.  */
+      buf.writeByte (*p);
+      p++;
+
+    Lagain:
+      switch (*p)
+	{
+	case '\0':
+	  /* Malformed format string.  */
+	  gcc_unreachable ();
+
+	case '-':
+	  /* Remove whitespace formatting. */
+	  p++;
+	  while (ISDIGIT (*p))
+	    p++;
+	  goto Lagain;
+
+	case '0':
+	  /* Remove zero padding from format string.  */
+	  while (ISDIGIT (*p))
+	    p++;
+	  goto Lagain;
+
+	case 'X':
+	  /* Hex format only supports lower-case.  */
+	  buf.writeByte ('x');
+	  p++;
+	  break;
+
+	default:
+	  break;
+	}
+    }
+
+  gcc_assert (!inbacktick);
+  return buf.extractString ();
+}
+
+/* Helper routine for all error routines.  Reports a diagnostic specified by
+   KIND at the explicit location LOC, where the message FORMAT has not yet
+   been translated by the gcc diagnostic routines.  */
+
+static void ATTRIBUTE_GCC_DIAG(3,0)
+d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
+				va_list ap, diagnostic_t kind, bool verbatim)

Given the call to expand_format for the !verbatim case, it seems that
ATTRIBUTE_GCC_DIAG isn't necessarily quite right here, as FORMAT would
then be a D format string.  But it's probably close enough for now.

+{
+  va_list argp;
+  va_copy (argp, ap);
+
+  if (loc.filename || !verbatim)
+    {
+      rich_location rich_loc (line_table, get_linemap (loc));
+      diagnostic_info diagnostic;
+      char *xformat = expand_format (format);
+
+      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);
+      if (opt != 0)
+	diagnostic.option_index = opt;
+
+      diagnostic_report_diagnostic (global_dc, &diagnostic);
+      free (xformat);
+    }
+  else
+    {
+      /* Write verbatim messages with no location direct to stream.  */
+      text_info text;
+      text.err_no = errno;
+      text.args_ptr = &argp;
+      text.format_spec = expand_format (format);
+      text.x_data = NULL;
+
+      pp_format_verbatim (global_dc->printer, &text);
+      pp_newline_and_flush (global_dc->printer);
+    }
+
+  va_end (argp);
+}
+
+/* Print a hard error message with explicit location LOC, increasing the
+   global or gagged error count.  */
+
+void ATTRIBUTE_GCC_DIAG(2,3)
+error (const Loc& loc, const char *format, ...)
+{
+  va_list ap;
+  va_start (ap, format);
+  verror (loc, format, ap);
+  va_end (ap);
+}
+
+void ATTRIBUTE_GCC_DIAG(2,0)
+verror (const Loc& loc, const char *format, va_list ap,
+	const char *p1, const char *p2, const char *)

This one needs a leading comment: what's the deal with P1 and P2?

+{
+  if (!global.gag || global.params.showGaggedErrors)
+    {
+      char *xformat;
+
+      /* Build string and emit.  */
+      if (p2 != NULL)
+	xformat = xasprintf ("%s %s %s", p1, p2, format);
+      else if (p1 != NULL)
+	xformat = xasprintf ("%s %s", p1, format);
+      else
+	xformat = xasprintf ("%s", format);
+
+      d_diagnostic_report_diagnostic (loc, 0, xformat, ap,
+				      global.gag ? DK_ANACHRONISM : DK_ERROR,
+				      false);
+      free (xformat);
+    }
+
+  if (global.gag)
+    global.gaggedErrors++;
+
+  global.errors++;
+}
+
+/* Print supplementary message about the last error with explicit location LOC.
+   This doesn't increase the global error count.  */
+
+void ATTRIBUTE_GCC_DIAG(2,3)
+errorSupplemental (const Loc& loc, const char *format, ...)
+{
+  va_list ap;
+  va_start (ap, format);
+  verrorSupplemental (loc, format, ap);
+  va_end (ap);
+}
+
+void ATTRIBUTE_GCC_DIAG(2,0)
+verrorSupplemental (const Loc& loc, const char *format, va_list ap)
+{
+  if (global.gag && !global.params.showGaggedErrors)
+    return;
+
+  d_diagnostic_report_diagnostic (loc, 0, format, ap, DK_NOTE, false);
+}
+
+/* Print a warning message with explicit location LOC, increasing the
+   global warning count.  */
+
+void ATTRIBUTE_GCC_DIAG(2,3)
+warning (const Loc& loc, const char *format, ...)
+{
+  va_list ap;
+  va_start (ap, format);
+  vwarning (loc, format, ap);
+  va_end (ap);
+}
+
+void ATTRIBUTE_GCC_DIAG(2,0)
+vwarning (const Loc& loc, const char *format, va_list ap)
+{
+  if (global.params.warnings && !global.gag)
+    {
+      /* Warnings don't count if gagged.  */
+      if (global.params.warnings == 1)

What's the magic "1" value above?  Can it be an enum?  (or is this
something from the upstream parsing code?)

+	global.warnings++;
+
+      d_diagnostic_report_diagnostic (loc, 0, format, ap, DK_WARNING, false);
+    }
+}
+
+/* Print supplementary message about the last warning with explicit location
+   LOC.  This doesn't increase the global warning count.  */
+
+void ATTRIBUTE_GCC_DIAG(2,3)
+warningSupplemental (const Loc& loc, const char *format, ...)
+{
+  va_list ap;
+  va_start (ap, format);
+  vwarningSupplemental (loc, format, ap);
+  va_end (ap);
+}
+
+void ATTRIBUTE_GCC_DIAG(2,0)
+vwarningSupplemental (const Loc& loc, const char *format, va_list ap)
+{
+  if (!global.params.warnings || global.gag)
+    return;
+
+  d_diagnostic_report_diagnostic (loc, 0, format, ap, DK_NOTE, false);
+}
+
+/* Print a deprecation message with explicit location LOC, increasing the
+   global warning or error count depending on how deprecations are treated.  */
+
+void ATTRIBUTE_GCC_DIAG(2,3)
+deprecation (const Loc& loc, const char *format, ...)
+{
+  va_list ap;
+  va_start (ap, format);
+  vdeprecation (loc, format, ap);
+  va_end (ap);
+}
+
+void ATTRIBUTE_GCC_DIAG(2,0)
+vdeprecation (const Loc& loc, const char *format, va_list ap,
+	      const char *p1, const char *p2)
+{

Again: please can this have a leading comment explaining P1 and P2.

+  if (global.params.useDeprecated == 0)
+    verror (loc, format, ap, p1, p2);
+  else if (global.params.useDeprecated == 2 && !global.gag)

Can these "0" and "2" values be enums?

+    {
+      char *xformat;
+
+      /* Build string and emit.  */
+      if (p2 != NULL)
+	xformat = xasprintf ("%s %s %s", p1, p2, format);
+      else if (p1 != NULL)
+	xformat = xasprintf ("%s %s", p1, format);
+      else
+	xformat = xasprintf ("%s", format);
+
+      d_diagnostic_report_diagnostic (loc, OPT_Wdeprecated, xformat, ap,
+				      DK_WARNING, false);
+      free (xformat);
+    }
+}
+
+/* Print supplementary message about the last deprecation with explicit
+   location LOC.  This does not increase the global error count.  */
+
+void ATTRIBUTE_GCC_DIAG(2,3)
+deprecationSupplemental (const Loc& loc, const char *format, ...)
+{
+  va_list ap;
+  va_start (ap, format);
+  vdeprecationSupplemental (loc, format, ap);
+  va_end (ap);
+}
+
+void ATTRIBUTE_GCC_DIAG(2,0)
+vdeprecationSupplemental (const Loc& loc, const char *format, va_list ap)
+{
+  if (global.params.useDeprecated == 0)
+    verrorSupplemental (loc, format, ap);
+  else if (global.params.useDeprecated == 2 && !global.gag)
+    d_diagnostic_report_diagnostic (loc, 0, format, ap, DK_NOTE, false);

As before for the "0" and "2".

+}
+
+/* Print a verbose message with explicit location LOC.  */
+
+void ATTRIBUTE_GCC_DIAG(2, 3)
+message (const Loc& loc, const char *format, ...)
+{
+  va_list ap;
+  va_start (ap, format);
+  vmessage (loc, format, ap);
+  va_end (ap);
+}
+
+void ATTRIBUTE_GCC_DIAG(2,0)
+vmessage(const Loc& loc, const char *format, va_list ap)
+{
+  d_diagnostic_report_diagnostic (loc, 0, format, ap, DK_NOTE, true);
+}
+
+/* Same as above, but doesn't take a location argument.  */
+
+void ATTRIBUTE_GCC_DIAG(1, 2)
+message (const char *format, ...)
+{
+  va_list ap;
+  va_start (ap, format);
+  vmessage (Loc (), format, ap);
+  va_end (ap);
+}
+
+/* Call this after printing out fatal error messages to clean up and
+   exit the compiler.  */
+
+void
+fatal (void)
+{
+  exit (FATAL_EXIT_CODE);
+}
[...snip...]

diff --git a/gcc/d/d-spec.c b/gcc/d/d-spec.c
new file mode 100644
index 00000000000..6a68ebbcf24
--- /dev/null
+++ b/gcc/d/d-spec.c

As above, this new C++ file should be .cc, rather than .c.

[...snip...]

Hope this is constructive
Dave

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-15 14:25 ` David Malcolm
@ 2018-10-15 16:46   ` Iain Buclaw
  0 siblings, 0 replies; 19+ messages in thread
From: Iain Buclaw @ 2018-10-15 16:46 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

On Mon, 15 Oct 2018 at 16:19, David Malcolm <dmalcolm@redhat.com> wrote:
>
> On Tue, 2018-09-18 at 02:33 +0200, Iain Buclaw wrote:
> > This patch adds the D front-end implementation, the only part of the
> > compiler that interacts with GCC directly, and being the parts that I
> > maintain, is something that I can talk about more directly.
> >
> > For the actual code generation pass, that converts the front-end AST
> > to GCC trees, most parts use a separate Visitor interfaces to do a
> > certain kind of lowering, for instance, types.cc builds *_TYPE trees
> > from AST Type's.  The Visitor class is part of the DMD front-end, and
> > is defined in dfrontend/visitor.h.
> >
> > There are also a few interfaces which have their headers in the DMD
> > frontend, but are implemented here because they do something that
> > requires knowledge of the GCC backend (d-target.cc), does something
> > that may not be portable, or differ between D compilers
> > (d-frontend.cc) or are a thin wrapper around something that is
> > managed
> > by GCC (d-diagnostic.cc).
> >
> > Many high level operations result in generation of calls to D runtime
> > library functions (runtime.def), all with require some kind of
> > runtime
> > type information (typeinfo.cc).  The compiler also generates
> > functions
> > for registering/deregistering compiled modules with the D runtime
> > library (modules.cc).
> >
> > As well as the D language having it's own built-in functions
> > (intrinsics.cc), we also expose GCC builtins to D code via a
> > `gcc.builtins' module (d-builtins.cc), and give special treatment to
> > a
> > number of UDAs that could be applied to functions (d-attribs.cc).
> >
> >
> > That is roughly the high level jist of how things are currently
> > organized.
> >
> > ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch
>
> Hi Iain.  I took at look at this patch, focusing on the diagnostics
> side of things.
>
> These are more suggestions than hard review blockers.
>
> diff --git a/gcc/d/d-attribs.c b/gcc/d/d-attribs.c
> new file mode 100644
> index 00000000000..6c65b8cad9e
> --- /dev/null
> +++ b/gcc/d/d-attribs.c
>
> I believe all new C++ source files are meant to be .cc, rather than .c,
> so this should be d-attribs.cc, rather that d-attribs.c.
>
> [...snip...]
>
> diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc
> new file mode 100644
> index 00000000000..c698890ba07
> --- /dev/null
> +++ b/gcc/d/d-codegen.cc
>
> [...snip...]
>
> +/* Return the GCC location for the D frontend location LOC.   */
> +
> +location_t
> +get_linemap (const Loc& loc)
> +{
>
> I don't like the name "get_linemap", as it suggests to me that it's
> getting the "struct line_map *" for LOC, rather than a location_t.
>
> How about "get_location_t" instead, or "make_location_t"?  The latter
> feels more appropriate, as it's doing non-trivial work.
>

OK.

> +/* Rewrite the format string FORMAT to deal with any format extensions not
> +   supported by pp_format().  The result should be freed by the caller.  */
> +static char *
> +expand_format (const char *format)
>
> Am I right in thinking this is to handle FORMAT strings coming from the
> upstream D frontend, and this has its own formatting conventions?
> Please can the leading comment have example(s) of the format, and what
> it becomes (e.g. the backticks thing).
>
> Maybe adopt a naming convention in the file, to distinguish d format
> strings from pp format strings?  Maybe "d_format" vs "gcc_format"
> ?(though given the verbatim vs !verbatim below, am not sure how
> feasible that is).
>
> Maybe rename this function to expand_d_format??
>
> (Might be nice to add some unittesting of this function via "selftest",
> but that's definitely not a requirement; it's awkward to add right now)
>

Yes, you are right, it's handling format specifiers coming from the
upstream front-end.

The only peculiar convention is the use of backticks to quote parts of
the message (dmd actually does syntax highlighting).

Others in this function just rewrite format specifiers that GCC
diagnostics don't understand.

> +
> +void ATTRIBUTE_GCC_DIAG(2,0)
> +verror (const Loc& loc, const char *format, va_list ap,
> +       const char *p1, const char *p2, const char *)
>
> This one needs a leading comment: what's the deal with P1 and P2?
>

I'll rename them to prefix1 and prefix2 with appropriate comments.

> +
> +void ATTRIBUTE_GCC_DIAG(2,0)
> +vwarning (const Loc& loc, const char *format, va_list ap)
> +{
> +  if (global.params.warnings && !global.gag)
> +    {
> +      /* Warnings don't count if gagged.  */
> +      if (global.params.warnings == 1)
>
> What's the magic "1" value above?  Can it be an enum?  (or is this
> something from the upstream parsing code?)
>

This is from upstream, there are a few places that use numbers instead
of enums.  This is one such place, I've not really given myself any
time to go through them and correct them.  If you feel strongly about
this, I can set aside some time to get this fixed in upstream.

> +  if (global.params.useDeprecated == 0)
> +    verror (loc, format, ap, p1, p2);
> +  else if (global.params.useDeprecated == 2 && !global.gag)
>
> Can these "0" and "2" values be enums?
>

Same as above, I'll push this upstream and trickle it down here.

>
> Hope this is constructive

Thanks
-- 
Iain

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-09-19 20:06 ` Iain Buclaw
  2018-10-14 19:28   ` Richard Sandiford
@ 2018-10-16 11:14   ` Richard Sandiford
  2018-10-16 11:39     ` Richard Sandiford
  2018-10-21  0:02     ` Iain Buclaw
  1 sibling, 2 replies; 19+ messages in thread
From: Richard Sandiford @ 2018-10-16 11:14 UTC (permalink / raw)
  To: Iain Buclaw; +Cc: gcc-patches, Jeff Law

Iain Buclaw <ibuclaw@gdcproject.org> writes:
> On 18 September 2018 at 02:33, Iain Buclaw <ibuclaw@gdcproject.org> wrote:
>> This patch adds the D front-end implementation, the only part of the
>> compiler that interacts with GCC directly, and being the parts that I
>> maintain, is something that I can talk about more directly.
>>
>> For the actual code generation pass, that converts the front-end AST
>> to GCC trees, most parts use a separate Visitor interfaces to do a
>> certain kind of lowering, for instance, types.cc builds *_TYPE trees
>> from AST Type's.  The Visitor class is part of the DMD front-end, and
>> is defined in dfrontend/visitor.h.
>>
>> There are also a few interfaces which have their headers in the DMD
>> frontend, but are implemented here because they do something that
>> requires knowledge of the GCC backend (d-target.cc), does something
>> that may not be portable, or differ between D compilers
>> (d-frontend.cc) or are a thin wrapper around something that is managed
>> by GCC (d-diagnostic.cc).
>>
>> Many high level operations result in generation of calls to D runtime
>> library functions (runtime.def), all with require some kind of runtime
>> type information (typeinfo.cc).  The compiler also generates functions
>> for registering/deregistering compiled modules with the D runtime
>> library (modules.cc).
>>
>> As well as the D language having it's own built-in functions
>> (intrinsics.cc), we also expose GCC builtins to D code via a
>> `gcc.builtins' module (d-builtins.cc), and give special treatment to a
>> number of UDAs that could be applied to functions (d-attribs.cc).
>>
>>
>> That is roughly the high level jist of how things are currently organized.
>>
>> ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch
>>
>
> https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00587.html
>
> During the last round, the last comment given was that the reviewer
> wasn't going to dive deep into this code, as it's essentially
> converting between the different representations and is code that I'd
> be maintaining.
>
> As this is code that other gcc maintainers will be potentially looking
> after as well, I'd like any glaring problems to be dealt with
> immediately.

I won't claim that I dived deep into the code either, since with a patch
of this size that would take weeks.  But FWIW I read it through trying
to scan for anything that stood out.

I think the patch is OK besides the below.  The comments are in patch
order and so mix very trivial stuff with things that seem more fundamental.
I think the only real blocker is the point below about translations.

> +/* Define attributes that are mutually exclusive with one another.  */
> +static const struct attribute_spec::exclusions attr_noreturn_exclusions[] =
> +{
> +  ATTR_EXCL ("noreturn", true, true, true),

Why does noreturn exclude itself?  Probably worth a comment if deliberate.

> +tree
> +build_attributes (Expressions *eattrs)
> +{
> [...]
> +      tree list = build_tree_list (get_identifier (name), args);
> +      attribs =  chainon (attribs, list);

Too many spaces before "chainon".

> +static tree
> +handle_const_attribute (tree *node, tree ARG_UNUSED (name),
> +			tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			bool * ARG_UNUSED (no_add_attrs))
> +{
> +  tree type = TREE_TYPE (*node);
> +
> +  /* See FIXME comment on noreturn in c_common_attribute_table.  */

Seems strange to be referencing c-family code here, especially since
there's no corresponding comment for handle_noreturn_attribute and
also no FIXME comment in the D version of the table.  Think we should
just drop the comment above.

It's unfortunate that there's so much cut-&-paste with c-attribs.c,
but there again, the current split between target-independent code
(attribs.*) and frontend code (c-attribs.*) isn't very clear-cut,
so I don't think this should hold up acceptance.

I'm going to review it as new code though, so:

> +/* Handle a "malloc" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +tree
> +handle_malloc_attribute (tree *node, tree ARG_UNUSED (name),
> +			 tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			 bool * ARG_UNUSED (no_add_attrs))
> +{
> +  if (TREE_CODE (*node) == FUNCTION_DECL
> +      && POINTER_TYPE_P (TREE_TYPE (TREE_TYPE (*node))))
> +    DECL_IS_MALLOC (*node) = 1;
> +  else
> +    gcc_unreachable ();
> +
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "pure" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_pure_attribute (tree *node, tree ARG_UNUSED (name),
> +		       tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +		       bool * ARG_UNUSED (no_add_attrs))
> +{
> +  if (TREE_CODE (*node) == FUNCTION_DECL)
> +    DECL_PURE_P (*node) = 1;
> +  else
> +    gcc_unreachable ();
> +
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "no vops" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_novops_attribute (tree *node, tree ARG_UNUSED (name),
> +			 tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			 bool * ARG_UNUSED (no_add_attrs))
> +{
> +  gcc_assert (TREE_CODE (*node) == FUNCTION_DECL);
> +  DECL_IS_NOVOPS (*node) = 1;
> +  return NULL_TREE;
> +}

Think the first two should follow the example of novops and use
gcc_assert rather than gcc_unreaachable.  Same for some later functions.

> +/* Build D frontend type from tree TYPE type given.  This will set the backend
> +   type symbol directly for complex types to save build_ctype() the work.
> +   For other types, it is not useful or will causes errors, such as casting

s/causes/cause/ or s/will //

> +  /* For now, we need to tell the D frontend what platform is being targeted.
> +     This should be removed once the frontend has been fixed.  */
> +  if (strcmp (ident, "linux") == 0)
> +    global.params.isLinux = true;
> +  else if (strcmp (ident, "OSX") == 0)
> +    global.params.isOSX = true;
> +  else if (strcmp (ident, "Windows") == 0)
> +    global.params.isWindows = true;
> +  else if (strcmp (ident, "FreeBSD") == 0)
> +    global.params.isFreeBSD = true;
> +  else if (strcmp (ident, "OpenBSD") == 0)
> +    global.params.isOpenBSD = true;
> +  else if (strcmp (ident, "Solaris") == 0)
> +    global.params.isSolaris = true;
> +  else if (strcmp (ident, "X86_64") == 0)
> +    global.params.is64bit = true;

Probably worth adding a comment to say why only X86_64 causes is64bit
to be set, and to say that it's OK to silently ignore other strings
(assuming it is).

> +      tf->purity = DECL_PURE_P (decl) ?   PUREstrong :
> +	TREE_READONLY (decl) ? PUREconst :
> +	DECL_IS_NOVOPS (decl) ? PUREweak :
> +	!DECL_ASSEMBLER_NAME_SET_P (decl) ? PUREweak :
> +	PUREimpure;
> +      tf->trust = !DECL_ASSEMBLER_NAME_SET_P (decl) ? TRUSTsafe :
> +	TREE_NOTHROW (decl) ? TRUSTtrusted :
> +	TRUSTsystem;

Formatting nit, think this should be:

      tf->purity = (DECL_PURE_P (decl) ? PUREstrong
		    : TREE_READONLY (decl) ? PUREconst
		    : DECL_IS_NOVOPS (decl) ? PUREweak
		    : !DECL_ASSEMBLER_NAME_SET_P (decl) ? PUREweak
		    : PUREimpure);

etc.

> +/* Search for any `extern(C)' functions that match any known GCC library builtin
> +   function in D and override it's internal backend symbol.  */

s/it's/its/

> +/* Used to help initialize the builtin-types.def table.  When a type of
> +   the correct size doesn't exist, use error_mark_node instead of NULL.
> +   The later results in segfaults even when a decl using the type doesn't
> +   get invoked.  */

s/later/latter/.  (The typo is in the code you copied this from, but might
as will fix it here.)

> +/* Build nodes that would have be created by the C front-end; necessary
> +   for including builtin-types.def and ultimately builtins.def.  */

"would be" or "would have been"

> +/* Build nodes that are used by the D front-end.
> +   These are distinct from C types.  */
> +
> +static void
> +d_build_d_type_nodes (void)
> +{
> +  /* Integral types.  */
> +  byte_type_node = make_signed_type (8);
> +  ubyte_type_node = make_unsigned_type (8);
> +
> +  short_type_node = make_signed_type (16);
> +  ushort_type_node = make_unsigned_type (16);
> +
> +  int_type_node = make_signed_type (32);
> +  uint_type_node = make_unsigned_type (32);
> +
> +  long_type_node = make_signed_type (64);
> +  ulong_type_node = make_unsigned_type (64);

It's a bit confusing for the D type to be long_type_node but the C/ABI type
to be long_integer_type_node.  The D type is surely an integer too. :-)
With this coming among the handling of built-in functions, it initially
looked related, and I was wondering how it was safe on ILP32 systems
before realising the difference.

Maybe prefixing them all with "d_" would be too ugly, but it would at
least be good to clarify the comment to say that these are "distinct
type nodes" (rather than just distinct definitions, as I'd initially
assumed) and that they're not used outside the frontend, or by the C
imports.

> +/* Return the GCC location for the D frontend location LOC.   */
> +
> +location_t
> +get_linemap (const Loc& loc)

FWIW, I think GCC style is to add a space before "&" rather than
after it, as for pointers.  But since you use this style consistently
(and I'm guessing would instinctively use it in later work), it's
probably better to leave it as-is.  The go interface similarly
has local conventions regarding things like spaces before "(".

> +tree
> +d_array_value (tree type, tree len, tree data)
> +{
> +  /* TODO: Assert type is a D array.  */

Just curious, what makes asserting difficult enough to be a TODO?

> +/* Returns the .funcptr component from the D delegate EXP.  */
> +
> +tree
> +delegate_method (tree exp)
> +{
> +  /* Get the backend type for the array and pick out the array length
> +     field (assumed to be the second field).  */
> +  tree method_field = TREE_CHAIN (TYPE_FIELDS (TREE_TYPE (exp)));
> +  return component_ref (exp, method_field);
> +}
> +
> +/* Returns the .object component from the delegate EXP.  */
> +
> +tree
> +delegate_object (tree exp)
> +{
> +  /* Get the backend type for the array and pick out the array data
> +     pointer field (assumed to be the first field).  */
> +  tree obj_field = TYPE_FIELDS (TREE_TYPE (exp));
> +  return component_ref (exp, obj_field);
> +}

Looks like the body comments might be a cut-&-pasto?  They reference the
right index, but don't match the function comments.

> +/* Return TRUE if EXP is a valid lvalue.  Lvalues references cannot be
> +   made into temporaries, otherwise any assignments will be lost.  */

s/Lvalues references/Lvalue references/?

> +	      /* If the index is NULL, then just assign it to the next field.
> +		 This is expect of layout_typeinfo(), which generates a flat
> +		 list of values that we must shape into the record type.  */
> +	      if (index == field || index == NULL_TREE)
> +		{
> +		  init->ordered_remove (idx);
> +		  if (!finished)
> +		    is_initialized = true;
> +		  break;
> +		}

"This is expect of" looks like a typo.

The function is quadratic in the initialiser size.  It would be
good to avoid that if possible.  That said, it looks like
build_class_instance is also quadratic, so maybe just changing
this on its own wouldn't help much.

> +	  for (size_t i = 0; fd->parameters && i < fd->parameters->dim; i++)
> +	    {
> +	      VarDeclaration *v = (*fd->parameters)[i];
> +	      /* Remove if already in closureVars so can push to front.  */
> +	      for (size_t j = i; j < fd->closureVars.dim; j++)
> +		{
> +		  Dsymbol *s = fd->closureVars[j];
> +		  if (s == v)
> +		    {
> +		      fd->closureVars.remove (j);
> +		      break;
> +		    }
> +		}
> +	      fd->closureVars.insert (i, v);
> +	    }

Looks like this also is quadratic.

> +      /* For nested functions, default to creating a frame.  Even if there is no
> +	 fields to populate the frame, create it anyway, as this will be used as
> +	 the record type instead of `void*` for the this parameter.  */

s/there is no fields/there are no fields/

> +      /* If we are taking the address of a decl that can never be null,
> +	 then the return result is always true.  */
> +      if (DECL_P (TREE_OPERAND (expr, 0))
> +	  && (TREE_CODE (TREE_OPERAND (expr, 0)) == PARM_DECL
> +	      || TREE_CODE (TREE_OPERAND (expr, 0)) == LABEL_DECL
> +	      || !DECL_WEAK (TREE_OPERAND (expr, 0))))
> +	return boolean_true_node;

Does something warn about this?

> +tree
> +convert (tree type, tree expr)
> ...
> +    case COMPLEX_TYPE:
> +      if (TREE_CODE (etype) == REAL_TYPE && TYPE_IMAGINARY_FLOAT (etype))
> +	ret = build2 (COMPLEX_EXPR, type,
> +		      build_zero_cst (TREE_TYPE (type)),
> +		      convert (TREE_TYPE (type), expr));
> +      else
> +	ret = convert_to_complex (type, e);
> +      goto maybe_fold;
> ...
> +    maybe_fold:
> +      if (!TREE_CONSTANT (ret))
> +	ret = fold (ret);
> +      return ret;

etc.  Any reason not to use fold_build* here and in subroutines (for all
cases, not just COMPLEX_TYPE), rather than fold at the end?  Or is the
!TREE_CONSTANT deliberately trying to avoid folding constants that
fold_build* would normally fold?   build* routines set TREE_CONSTANT
for operations with constant operands as well as constant literals,
so is the code trying to avoid folding those?

The routines use fold_build* in some places and plain build* with
a goto in others, but it wasn't obvious why.  Seems like it deserves
a comment.

> +	  /* Strings are treated as dynamic arrays D2.  */

Should that be by/in D2?

> +  return result ? result :
> +    convert (build_ctype (totype), exp);

Odd formatting, fits easily on a line.  If you want to split it,
the ":" should be on the second line rather than the first.

> +/* Apply semantics of assignment to a values of type TOTYPE to EXPR
> +   (e.g., pointer = array -> pointer = &array[0])

Type ("to a values")

> +   Return a TREE representation of EXPR implictly converted to TOTYPE
> +   for use in assignment expressions MODIFY_EXPR, INIT_EXPR.  */

"implicitly"

> +/* Helper routine for all error routines.  Reports a diagnostic specified by
> +   KIND at the explicit location LOC, where the message FORMAT has not yet
> +   been translated by the gcc diagnostic routines.  */
> +
> +static void ATTRIBUTE_GCC_DIAG(3,0)
> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
> +				va_list ap, diagnostic_t kind, bool verbatim)
> +{
> +  va_list argp;
> +  va_copy (argp, ap);
> +
> +  if (loc.filename || !verbatim)
> +    {
> +      rich_location rich_loc (line_table, get_linemap (loc));
> +      diagnostic_info diagnostic;
> +      char *xformat = expand_format (format);
> +
> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);

How does this work with translation?  xgettext will only see the original
format string, not the result of expand_format.  Do you have some scripting
to do the same format mangling when collecting the translation strings?
Same concern:

> +void ATTRIBUTE_GCC_DIAG(2,0)
> +verror (const Loc& loc, const char *format, va_list ap,
> +	const char *p1, const char *p2, const char *)
> +{
> +  if (!global.gag || global.params.showGaggedErrors)
> +    {
> +      char *xformat;
> +
> +      /* Build string and emit.  */
> +      if (p2 != NULL)
> +	xformat = xasprintf ("%s %s %s", p1, p2, format);
> +      else if (p1 != NULL)
> +	xformat = xasprintf ("%s %s", p1, format);
> +      else
> +	xformat = xasprintf ("%s", format);

...with the concatenation here.

It looks at first glance like the callers to d_diagnostic_report_diagnostic
should be doing the conversion themselves via _(format), before doing
any concatenation.  Then d_diagnostic_report_diagnostic can use
expand_format on the translated string and use
diagnostic_set_info_translated.  But I'm not an expert on this stuff.

Also, the function should document what p1 and p2 are.  Do they
need to be translated?

Does xgettext get to see all frontend error messages?  Haven't got
to the later patches yet.

Please add an overview comment somewhere in the diagnostic routines
explaining how translation works wrt errors that come from the D
frontend.

> +void ATTRIBUTE_GCC_DIAG(2,0)
> +vmessage(const Loc& loc, const char *format, va_list ap)

Missing space before "(".

+/* Start gagging. Return the current number of gagged errors.  */

Should be two spaces before "Return"

> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
> +
> +void
> +Port::writelongLE (unsigned value, void *buffer)
> +{
> +    unsigned char *p = (unsigned char*) buffer;
> +
> +    p[0] = (unsigned) value;
> +    p[1] = (unsigned) value >> 8;
> +    p[2] = (unsigned) value >> 16;
> +    p[3] = (unsigned) value >> 24;
> +}
> ...
> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
> +
> +void
> +Port::writelongBE (unsigned value, void *buffer)
> +{
> +    unsigned char *p = (unsigned char*) buffer;
> +
> +    p[0] = (unsigned) value >> 24;
> +    p[1] = (unsigned) value >> 16;
> +    p[2] = (unsigned) value >> 8;
> +    p[3] = (unsigned) value;
> +}

Overindented bodies.  Missing space before "*" in "(unsigned char*)"
in all these functions.

Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
Is it also used in ways that require the target BITS_PER_UNIT to be 8
as well?  That could realistically be a different value (and we've had
ports like that in the past).

> +/* Return a hash value for real_t value R.  */
> +
> +size_t
> +CTFloat::hash (real_t r)
> +{
> +    return real_hash (&r.rv ());
> +}

Overindented body.

> +/* Write out all dependencies of a given MODULE to the specified BUFFER.
> +   COLMAX is the number of columns to word-wrap at (0 means don't wrap).  */
> +
> +static void
> +deps_write (Module *module, OutBuffer *buffer, unsigned colmax = 72)
> +{
> ...
> +  /* Write out all make dependencies.  */
> +  while (modlist.dim > 0)
> +  {

Misindented while body.

> +      else if (empty_modify_p (TREE_TYPE (op0), op1))
> +	{
> +	  /* Remove any copies of empty aggregates.  Also drop volatile
> +	     loads on the RHS to avoid infinite recursion from
> +	     gimplify_expr trying to load the value.  */
> +	  gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, post_p,
> +			 is_gimple_lvalue, fb_lvalue);
> +	  if (TREE_SIDE_EFFECTS (op1))
> +	    {
> +	      if (TREE_THIS_VOLATILE (op1)
> +		  && (REFERENCE_CLASS_P (op1) || DECL_P (op1)))
> +		op1 = build_fold_addr_expr (op1);
> +
> +	      gimplify_and_add (op1, pre_p);
> +	    }
> +	  *expr_p = TREE_OPERAND (*expr_p, 0);
> +	  ret = GS_OK;
> +	}

It seems odd to gimplify the address of a volatile decl.  When would
that have meaningful side-effects?

What goes wrong if you don't have this block and let the gimplifier
handle empty modify_exprs normally?  The comments explain clearly what
the code is trying to do, but it isn't obvious why this needs to be
treated as a special case.

> +/* Implements the lang_hooks.get_alias_set routine for language D.
> +   Get the alias set corresponding the type or expression T.

s/corresponding the/corresponding to/

> +/* Implements the lang_hooks.get_alias_set routine for language D.
> +   Get the alias set corresponding the type or expression T.
> +   Return -1 if we don't do anything special.  */
> +
> +static alias_set_type
> +d_get_alias_set (tree t)
> +{
> +  /* Permit type-punning when accessing a union, provided the access
> +     is directly through the union.  */
> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
> +    {
> +      if (TREE_CODE (u) == COMPONENT_REF
> +	  && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
> +	return 0;
> +    }
> +
> +  /* That's all the expressions we handle.  */
> +  if (!TYPE_P (t))
> +    return get_alias_set (TREE_TYPE (t));
> +
> +  /* For now in D, assume everything aliases everything else,
> +     until we define some solid rules.  */
> +  return 0;
> +}

Any reason for not returning 0 unconditionally?  Won't the queries
deferred to get_alias_set either return 0 directly or come back here
and get 0?

Might be worth adding a comment referencing build_vconvert, which stood
out as a source of what in C would be alias violations.

> +static tree
> +d_eh_personality (void)
> +{
> +  if (!d_eh_personality_decl)
> +    {
> +      d_eh_personality_decl
> +	= build_personality_function ("gdc");
> +    }

Redundant braces, canonical formatting would be:

  if (!d_eh_personality_decl)
    d_eh_personality_decl = build_personality_function ("gdc");

> +/* this bit is set if they did `-lstdc++'.  */

s/this/This/

> +/* What do with libgphobos:
> +   -1 means we should not link in libgphobos
> +   0  means we should link in libgphobos if it is needed
> +   1  means libgphobos is needed and should be linked in.
> +   2  means libgphobos is needed and should be linked statically.
> +   3  means libgphobos is needed and should be linked dynamically. */
> +static int library = 0;

Might be worth using a more specific variable name, since the code
also deals with libgcc and libstdc++.  Maybe add named constants
for the values above?

> +void
> +lang_specific_driver (cl_decoded_option **in_decoded_options,
> +		      unsigned int *in_decoded_options_count,
> +		      int *in_added_libraries)
> +{
> ...
> +  /* If true, the user gave `-g'.  Used by -debuglib */
> ...
> +  /* What default library to use instead of phobos */
> ...
> +  /* What debug library to use instead of phobos */

Comments should end with ".  */"

> +	case OPT_SPECIAL_input_file:
> +	    {

Over-indented block.

> +	      /* Record that this is a D source file.  */
> +	      int len = strlen (arg);
> +	      if (len <= 2 || strcmp (arg + len - 2, ".d") == 0)
> +		{
> +		  if (first_d_file == NULL)
> +		    first_d_file = arg;
> +
> +		  args[i] |= DSOURCE;
> +		}
> +
> +	      /* If we don't know that this is a interface file, we might
> +		 need to link against libphobos library.  */
> +	      if (library == 0)
> +		{
> +		  if (len <= 3 || strcmp (arg + len - 3, ".di") != 0)
> +		    library = 1;
> +		}
> +
> +	      /* If this is a C++ source file, we'll need to link
> +		 against libstdc++ library.  */
> +	      if ((len <= 3 || strcmp (arg + len - 3, ".cc") == 0)
> +		  || (len <= 4 || strcmp (arg + len - 4, ".cpp") == 0)
> +		  || (len <= 4 || strcmp (arg + len - 4, ".c++") == 0))
> +		need_stdcxx = true;
> +
> +	      break;

The len handling looks a bit odd here, since something like "a.d" would
be treated as a c++ file too.  Might be better to use:

    dot = strrchr (arg, '.');

    if (dot && strcmp (dot, ".cc") == 0)

etc.

> +/* Called before linking.  Returns 0 on success and -1 on failure.  */
> +
> +int lang_specific_pre_link (void)

int
lang_specific_pre_link (void)

> +  /* If there is no hardware support, check if we an safely emulate it.  */

s/we an/we can/

> +/* For a vendor-specific type, return a string containing the C++ mangling.
> +   In all other cases, return NULL.  */
> +
> +const char *
> +Target::cppTypeMangle (Type *type)
> +{
> +    if (type->isTypeBasic () || type->ty == Tvector || type->ty == Tstruct)
> +    {
> +        tree ctype = build_ctype (type);
> +        return targetm.mangle_type (ctype);
> +    }
> +
> +    return NULL;
> +}

Misindented if and function body.

> +/* Return the type that will really be used for passing the given parameter
> +   ARG to an extern(C++) function.  */
> +
> +Type *
> +Target::cppParameterType (Parameter *arg)
> +{
> +    Type *t = arg->type->merge2 ();
> +    if (arg->storageClass & (STCout | STCref))
> +      t = t->referenceTo ();
> +    else if (arg->storageClass & STClazy)
> +      {
> +	/* Mangle as delegate.  */
> +	Type *td = TypeFunction::create (NULL, t, 0, LINKd);
> +	td = TypeDelegate::create (td);
> +	t = t->merge2 ();
> +      }
> +
> +    /* Could be a va_list, which we mangle as a pointer.  */
> +    if (t->ty == Tsarray && Type::tvalist->ty == Tsarray)
> +      {
> +	Type *tb = t->toBasetype ()->mutableOf ();
> +	if (tb == Type::tvalist)
> +	  {
> +	    tb = t->nextOf ()->pointerTo ();
> +	    t = tb->castMod (t->mod);
> +	  }
> +      }
> +
> +    return t;
> +}

Function body indented too far.

> +/* Used to represent a D inline assembly statement, an intermediate
> +   representation of an ASM_EXPR.  Reserved for future use and
> +   implementation yet to be defined.  */
> +DEFTREECODE (IASM_EXPR, "iasm_expr", tcc_statement, 5)

Do you need to reserve it?  The codes are local to the frontend
and you can add new ones whenever you want, so it would be good
to leave this out if possible.

> +/* Frame information for a function declaration.  */
> +
> +struct GTY(()) tree_frame_info
> +{
> +    struct tree_common common;
> +    tree frame_type;
> +};

Over-indented body.

> +/* Gate for when the D frontend make an early call into the codegen pass, such

s/make/makes/

> +    /* Anonymous structs/unions only exist as part of others,
> +       do not output forward referenced structs's.  */

s/'s//, unless I've misunderstood what the comment is saying.

> +    /* Ensure all semantic passes have ran.  */

"passes have run" or "passes ran".

> +      /* The real function type may differ from it's declaration.  */

s/it's/its/

> +      /* Initialiser must never be bigger than symbol size.  */

"Initializer" (alas)

> +/* Thunk code is based on g++ */

Comment should end with ".  */"

> +/* Get offset of base class's vtbl[] initialiser from start of csym.  */
> +
> +unsigned
> +base_vtable_offset (ClassDeclaration *cd, BaseClass *bc)
> +{
> ...
> +  return ~0;
> +}

Would be good to document the ~0 return, and probably also assert
that it isn't ~0:

> +	    /* The vtable of the interface length and ptr.  */
> +	    size_t voffset = base_vtable_offset (cd, b);
> +	    value = build_offset (csym, size_int (voffset));

...here.

> +  /* Build an expression of code CODE, data type TYPE, and operands ARG0 and
> +     ARG1.  Perform relevant conversions needs for correct code operations.  */

"relevant conversions needs" looks like a typo.

> +    if (POINTER_TYPE_P (t0) && POINTER_TYPE_P (t1))
> +      {
> +	/* Need to convert pointers to integers because tree-vrp asserts
> +	   against (ptr MINUS ptr).  */
> +	tree ptrtype = lang_hooks.types.type_for_mode (ptr_mode,
> +						       TYPE_UNSIGNED (type));
> +	arg0 = d_convert (ptrtype, arg0);
> +	arg1 = d_convert (ptrtype, arg1);
> +
> +	ret = fold_build2 (code, ptrtype, arg0, arg1);

Should probably now use POINTER_DIFF_EXPR for this.

> +    /* The LHS expression could be an assignment, to which it's operation gets
> +       lost during gimplification.  */
> +    if (TREE_CODE (lhs) == MODIFY_EXPR)
> +      {
> +	lexpr = compound_expr (lexpr, lhs);
> +	lhs = TREE_OPERAND (lhs, 0);
> +      }

s/it's/its/.  But the code looks a bit odd -- won't you end up with
double evaluation of the lhs of the MODIFY_EXPR?  Or is the creator of
the MODIFY_EXPR already guaranteed to have wrapped the lhs in a SAVE_EXPR
where necessary?  Probably worth a comment if so.

> +  /* Build a power expression, which raises its left operand to the power of
> +     its right operand.   */
> +
> +  void visit (PowExp *e)
> +  {
> ...
> +    /* If type is int, implicitly convert to double.  This allows backend
> +       to fold the call into a constant return value.  */
> +    if (e->type->isintegral ())
> +      powtype = double_type_node;

Seems dangerous if pow is defined as an integer operation.
E.g. ((1 << 62) | 1) ** 1 should be (1 << 62) | 1, which isn't
representable in double.

> +  /* Build a assignment operator expression.  The right operand is implicitly
> +     converted to the type of the left operand, and assigned to it.  */

"an assignment operator"

> +  /* Convert for initialising the DECL_RESULT.  */

"initializing"

> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
> +
> +static void
> +clear_intrinsic_flag (tree callexp)
> +{
> +  tree decl = CALL_EXPR_FN (callexp);
> +
> +  if (TREE_CODE (decl) == ADDR_EXPR)
> +    decl = TREE_OPERAND (decl, 0);
> +
> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
> +
> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
> +}

It seems wrong that a particular call to a function would change the
function's properties for all calls, and I don't think there's any
expectation in the midend that this kind of change might happen.

Why's it needed?  Whatever the reason, I think it needs to be done in a
different way.

> +static tree
> +expand_intrinsic_bsf (tree callexp)
> ...
> +  built_in_function code = (argsize <= INT_TYPE_SIZE) ? BUILT_IN_CTZ :
> +    (argsize <= LONG_TYPE_SIZE) ? BUILT_IN_CTZL :
> +    (argsize <= LONG_LONG_TYPE_SIZE) ? BUILT_IN_CTZLL : END_BUILTINS;

":" should be at the start of the line rather than the end.  Same for
later functions.

> +static tree
> +expand_intrinsic_bswap (tree callexp)
> +{
> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> +  int argsize = TYPE_PRECISION (TREE_TYPE (arg));
> +
> +  /* Which variant of __builtin_bswap* should we call?  */
> +  built_in_function code = (argsize == 32) ? BUILT_IN_BSWAP32 :
> +    (argsize == 64) ? BUILT_IN_BSWAP64 : END_BUILTINS;
> +
> +  /* Fallback on runtime implementation, which shouldn't happen as the
> +     argument for bswap() is either 32-bit or 64-bit.  */
> +  if (code == END_BUILTINS)
> +    {
> +      clear_intrinsic_flag (callexp);
> +      return callexp;
> +    }

The earlier intrinsics had this too, but I assumed there it was
because the code was mapping D types to target-dependent ABI types,
and you couldn't guarantee that long long was a 64-bit type.
In the above function there's no doubt that 32 and 64 bits are
the sizes available, so shouldn't this simply assert that
code != END_BUILTINS?

> +static tree
> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
> +{
> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
> +
> +  /* arg is an integral type - use double precision.  */
> +  if (INTEGRAL_TYPE_P (type))
> +    arg = fold_convert (double_type_node, arg);
> +
> +  /* Which variant of __builtin_sqrt* should we call?  */
> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
> +
> +  gcc_assert (code != END_BUILTINS);
> +  return call_builtin_fn (callexp, code, 1, arg);
> +}

Converting integers to double precision isn't right for SQRTF and SQRTL.
But how do such calls reach here?  It looks like the deco strings in the
function entries provide types for the 3 sqrt intrinsics, is that right?
Does that translate to a well-typed FUNCTION_DECL, or do the function
types use some opaque types instead?  If the intrinsic function itself
is well-typed then an incoming CALLEXP with integer arguments would be
invalid.

> +static tree
> +expand_intrinsic_copysign (tree callexp)
> +{
> ...
> +  /* Which variant of __builtin_copysign* should we call?  */
> +  built_in_function code = (type == float_type_node) ? BUILT_IN_COPYSIGNF :
> +    (type == double_type_node) ? BUILT_IN_COPYSIGN :
> +    (type == long_double_type_node) ? BUILT_IN_COPYSIGNL : END_BUILTINS;

You can use mathfn_built_in for this.

> +  /* Assign returned result to overflow parameter, however if overflow is
> +     already true, maintain it's value.  */

s/it's/its/

> +/* DEF_D_INTRINSIC (CODE, ALIAS, NAME, MODULE, DECO, CTFE)
> +   CODE	    The enum code used to refer this intrinsic.
> +   ALIAS    The enum code used to refer the function DECL_FUNCTION_CODE, if

s/refer/refer to/ in both cases.  Same in runtime.texi.

> +/* This is the contribution to the `default_compilers' array in gcc.c
> +   for the D language.  */
> +
> +{".d", "@d", 0, 1, 0 },
> +{".dd", "@d", 0, 1, 0 },
> +{".di", "@d", 0, 1, 0 },
> +{"@d",
> +  "%{!E:cc1d %i %(cc1_options) %I %{nostdinc*} %{i*} %{I*} %{J*} \
> +    %{H} %{Hd*} %{Hf*} %{MD:-MD %b.deps} %{MMD:-MMD %b.deps} \
> +    %{M} %{MM} %{MF*} %{MG} %{MP} %{MQ*} %{MT*} \
> +    %{X:-Xf %b.json} %{Xf*} \
> +    %{v} %{!fsyntax-only:%(invoke_as)}}", 0, 1, 0 },

Guess this has been said before, but cc1d seems a strange name,
since "cc1" is "C Compiler pass 1".

> +fdeps
> +D
> +This switch is deprecated; do not use.
> +
> +fdeps=
> +D Joined RejectNegative
> +This switch is deprecated; do not use.
> ...
> +fintfc
> +D Alias(H)
> +; Deprecated in favour of -H
> +
> +fintfc-dir=
> +D Joined RejectNegative Alias(Hd)
> +; Deprecated in favour of -Hd
> +
> +fintfc-file=
> +D Joined RejectNegative Alias(Hf)
> +; Deprecated in favour of -Hf
> ...
> +ftransition=safe
> +D Alias(ftransition=dip1000)
> +; Deprecated in favor of -ftransition=dip1000

Any chance of just removing them?  Would be better not to add new code
(from GCC's POV) that's already deprecated.

> +ftransition=all
> +D RejectNegative
> +List information on all language changes

All docstrings should end in ".".  This should be tested by adding
the D frontend to:

foreach cls { "ada" "c" "c++" "fortran" "go" \
                    "optimizers" "param" "target" "warnings" } {

in gcc.misc-tests/help.exp

> +struct longdouble
> +{
> +public:
> +   /* Return the hidden real_value from the longdouble type.  */
> +  const real_value& rv (void) const
> +  { return *(const real_value *) this; }

Overindented comment.

> +/* Use ldouble() to explicitely create a longdouble value.  */

Typo: "explicitly"

> +/* D generates module information to inform the runtime library which modules
> +   needs some kind of special handling.  All `static this()', `static ~this()',

s/needs/need/

> +/* Record information about module initialization, termination,
> +   unit testing, and thread local storage in the compilation.  */
> +
> +struct module_info
> +{
> +  vec<tree, va_gc> *ctors;
> +  vec<tree, va_gc> *dtors;
> +  vec<tree, va_gc> *ctorgates;
> +
> +  vec<tree, va_gc> *sharedctors;
> +  vec<tree, va_gc> *shareddtors;
> +  vec<tree, va_gc> *sharedctorgates;
> +
> +  vec<tree, va_gc> *unitTests;
> +};
> ...
> +/* Static constructors and destructors (not D `static this').  */
> +
> +static vec<tree, va_gc> *static_ctor_list;
> +static vec<tree, va_gc> *static_dtor_list;

Looks odd to be using GC data structures without marking them.  Think it
deserves a comment if correct.

> +/* Generate a call to LIBCALL, returning the result as TYPE.  NARGS is the
> +   number of call arguments, the expressions of wwhich are provided in `...'.

s/wwhich/which/

> +/* Finish the statement tree rooted at T.  */
> +
> +tree
> +pop_stmt_list (void)
> +{
> +  tree t = d_function_chain->stmt_list->pop ();
> +
> +  /* If the statement list is completely empty, just return it.  This is just
> +     as good small as build_empty_stmt, with the advantage that statement
> +     lists are merged when they appended to one another.  So using the

Typo: "good small"
s/they appended/they are appended/

> +	/* Convert for initialising the DECL_RESULT.  */

"initializing"

> +  /* D Inline Assembler is not implemented, as would require a writing
> +     an assembly parser for each supported target.  Instead we leverage
> +     GCC extended assembler using the GccAsmStatement class.  */

s/as would require a writing/as it would require writing/

> +  /* Write out the interfacing vtable[] of base class BCD that will be accessed
> +     from the overriding class CD.  If both are the same class, then this will
> +     be it's own vtable.  INDEX is the offset in the interfaces array of the

s/it's/its/

> +    /* Internal reference should be public, but not visible outside the
> +       compilation unit, as itself is referencing public COMDATs.  */

Comment is hard to parse.

> +/* Returns true if the TypeInfo for type should be placed in
> +   the runtime library.  */
> +
> +static bool
> +builtin_typeinfo_p (Type *type)

"TypeInfo for TYPE"

> +  /* Let C++ do the RTTI generation, and just reference the symbol as
> +     extern, the knowing the underlying type is not required.  */

s/the knowing/knowing/?

> +	  if (!tinfo_types[tk])
> +	    {
> +	      make_internal_typeinfo (tk, ident, ptr_type_node,
> +				      array_type_node, NULL);
> +	    }

Redundant braces.

> +/* Implements a visitor interface to check whether a type is speculative.
> +   TypeInfo_Struct would refer the members of the struct it is representing

"refer to" or "reference"

> +Type *
> +get_object_type (void)
> +{
> +  if (ClassDeclaration::object)
> +    return ClassDeclaration::object->type;
> +
> +  ::error ("missing or corrupt object.d");

Why "::"?  Is it needed to avoid shadowing issues?

> +	  /* Insert the field declaration at it's given offset.  */

"its"

> +      /* Strip const modifiers from type before building.  This is done
> +	 to ensure that backend treats i.e: const (T) as a variant of T,
> +	 and not as two distinct types.  */

s/i.e/e.g./

Probably worth running tabify on the sources.  I noticed a couple
of indents by 8 spaces instead of tabs, but it didn't seem worth
listing them individually.  You can use contrib/check_GNU_style.{sh,py}
to find that kind of thing (although it generates a lot of false
positives for other coding style violations, so take with a pinch
of salt).

Also, s/layed out/laid out/ throughout.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-16 11:14   ` [PATCH 02/14] Add D frontend (GDC) implementation Richard Sandiford
@ 2018-10-16 11:39     ` Richard Sandiford
  2018-10-21  0:02     ` Iain Buclaw
  1 sibling, 0 replies; 19+ messages in thread
From: Richard Sandiford @ 2018-10-16 11:39 UTC (permalink / raw)
  To: Iain Buclaw; +Cc: gcc-patches, Jeff Law

Iain Buclaw <ibuclaw@gdcproject.org> writes:
> On 18 September 2018 at 02:33, Iain Buclaw <ibuclaw@gdcproject.org> wrote:
>> This patch adds the D front-end implementation, the only part of the
>> compiler that interacts with GCC directly, and being the parts that I
>> maintain, is something that I can talk about more directly.
>>
>> For the actual code generation pass, that converts the front-end AST
>> to GCC trees, most parts use a separate Visitor interfaces to do a
>> certain kind of lowering, for instance, types.cc builds *_TYPE trees
>> from AST Type's.  The Visitor class is part of the DMD front-end, and
>> is defined in dfrontend/visitor.h.
>>
>> There are also a few interfaces which have their headers in the DMD
>> frontend, but are implemented here because they do something that
>> requires knowledge of the GCC backend (d-target.cc), does something
>> that may not be portable, or differ between D compilers
>> (d-frontend.cc) or are a thin wrapper around something that is managed
>> by GCC (d-diagnostic.cc).
>>
>> Many high level operations result in generation of calls to D runtime
>> library functions (runtime.def), all with require some kind of runtime
>> type information (typeinfo.cc).  The compiler also generates functions
>> for registering/deregistering compiled modules with the D runtime
>> library (modules.cc).
>>
>> As well as the D language having it's own built-in functions
>> (intrinsics.cc), we also expose GCC builtins to D code via a
>> `gcc.builtins' module (d-builtins.cc), and give special treatment to a
>> number of UDAs that could be applied to functions (d-attribs.cc).
>>
>>
>> That is roughly the high level jist of how things are currently organized.
>>
>> ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch
>>
>
> https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00587.html
>
> During the last round, the last comment given was that the reviewer
> wasn't going to dive deep into this code, as it's essentially
> converting between the different representations and is code that I'd
> be maintaining.
>
> As this is code that other gcc maintainers will be potentially looking
> after as well, I'd like any glaring problems to be dealt with
> immediately.

I won't claim that I dived deep into the code either, since with a patch
of this size that would take weeks.  But FWIW I read it through trying
to scan for anything that stood out.

I think the patch is OK besides the below.  The comments are in patch
order and so mix very trivial stuff with things that seem more fundamental.
I think the only real blocker is the point below about translations.

> +/* Define attributes that are mutually exclusive with one another.  */
> +static const struct attribute_spec::exclusions attr_noreturn_exclusions[] =
> +{
> +  ATTR_EXCL ("noreturn", true, true, true),

Why does noreturn exclude itself?  Probably worth a comment if deliberate.

> +tree
> +build_attributes (Expressions *eattrs)
> +{
> [...]
> +      tree list = build_tree_list (get_identifier (name), args);
> +      attribs =  chainon (attribs, list);

Too many spaces before "chainon".

> +static tree
> +handle_const_attribute (tree *node, tree ARG_UNUSED (name),
> +			tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			bool * ARG_UNUSED (no_add_attrs))
> +{
> +  tree type = TREE_TYPE (*node);
> +
> +  /* See FIXME comment on noreturn in c_common_attribute_table.  */

Seems strange to be referencing c-family code here, especially since
there's no corresponding comment for handle_noreturn_attribute and
also no FIXME comment in the D version of the table.  Think we should
just drop the comment above.

It's unfortunate that there's so much cut-&-paste with c-attribs.c,
but there again, the current split between target-independent code
(attribs.*) and frontend code (c-attribs.*) isn't very clear-cut,
so I don't think this should hold up acceptance.

I'm going to review it as new code though, so:

> +/* Handle a "malloc" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +tree
> +handle_malloc_attribute (tree *node, tree ARG_UNUSED (name),
> +			 tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			 bool * ARG_UNUSED (no_add_attrs))
> +{
> +  if (TREE_CODE (*node) == FUNCTION_DECL
> +      && POINTER_TYPE_P (TREE_TYPE (TREE_TYPE (*node))))
> +    DECL_IS_MALLOC (*node) = 1;
> +  else
> +    gcc_unreachable ();
> +
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "pure" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_pure_attribute (tree *node, tree ARG_UNUSED (name),
> +		       tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +		       bool * ARG_UNUSED (no_add_attrs))
> +{
> +  if (TREE_CODE (*node) == FUNCTION_DECL)
> +    DECL_PURE_P (*node) = 1;
> +  else
> +    gcc_unreachable ();
> +
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "no vops" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_novops_attribute (tree *node, tree ARG_UNUSED (name),
> +			 tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			 bool * ARG_UNUSED (no_add_attrs))
> +{
> +  gcc_assert (TREE_CODE (*node) == FUNCTION_DECL);
> +  DECL_IS_NOVOPS (*node) = 1;
> +  return NULL_TREE;
> +}

Think the first two should follow the example of novops and use
gcc_assert rather than gcc_unreaachable.  Same for some later functions.

> +/* Build D frontend type from tree TYPE type given.  This will set the backend
> +   type symbol directly for complex types to save build_ctype() the work.
> +   For other types, it is not useful or will causes errors, such as casting

s/causes/cause/ or s/will //

> +  /* For now, we need to tell the D frontend what platform is being targeted.
> +     This should be removed once the frontend has been fixed.  */
> +  if (strcmp (ident, "linux") == 0)
> +    global.params.isLinux = true;
> +  else if (strcmp (ident, "OSX") == 0)
> +    global.params.isOSX = true;
> +  else if (strcmp (ident, "Windows") == 0)
> +    global.params.isWindows = true;
> +  else if (strcmp (ident, "FreeBSD") == 0)
> +    global.params.isFreeBSD = true;
> +  else if (strcmp (ident, "OpenBSD") == 0)
> +    global.params.isOpenBSD = true;
> +  else if (strcmp (ident, "Solaris") == 0)
> +    global.params.isSolaris = true;
> +  else if (strcmp (ident, "X86_64") == 0)
> +    global.params.is64bit = true;

Probably worth adding a comment to say why only X86_64 causes is64bit
to be set, and to say that it's OK to silently ignore other strings
(assuming it is).

> +      tf->purity = DECL_PURE_P (decl) ?   PUREstrong :
> +	TREE_READONLY (decl) ? PUREconst :
> +	DECL_IS_NOVOPS (decl) ? PUREweak :
> +	!DECL_ASSEMBLER_NAME_SET_P (decl) ? PUREweak :
> +	PUREimpure;
> +      tf->trust = !DECL_ASSEMBLER_NAME_SET_P (decl) ? TRUSTsafe :
> +	TREE_NOTHROW (decl) ? TRUSTtrusted :
> +	TRUSTsystem;

Formatting nit, think this should be:

      tf->purity = (DECL_PURE_P (decl) ? PUREstrong
		    : TREE_READONLY (decl) ? PUREconst
		    : DECL_IS_NOVOPS (decl) ? PUREweak
		    : !DECL_ASSEMBLER_NAME_SET_P (decl) ? PUREweak
		    : PUREimpure);

etc.

> +/* Search for any `extern(C)' functions that match any known GCC library builtin
> +   function in D and override it's internal backend symbol.  */

s/it's/its/

> +/* Used to help initialize the builtin-types.def table.  When a type of
> +   the correct size doesn't exist, use error_mark_node instead of NULL.
> +   The later results in segfaults even when a decl using the type doesn't
> +   get invoked.  */

s/later/latter/.  (The typo is in the code you copied this from, but might
as will fix it here.)

> +/* Build nodes that would have be created by the C front-end; necessary
> +   for including builtin-types.def and ultimately builtins.def.  */

"would be" or "would have been"

> +/* Build nodes that are used by the D front-end.
> +   These are distinct from C types.  */
> +
> +static void
> +d_build_d_type_nodes (void)
> +{
> +  /* Integral types.  */
> +  byte_type_node = make_signed_type (8);
> +  ubyte_type_node = make_unsigned_type (8);
> +
> +  short_type_node = make_signed_type (16);
> +  ushort_type_node = make_unsigned_type (16);
> +
> +  int_type_node = make_signed_type (32);
> +  uint_type_node = make_unsigned_type (32);
> +
> +  long_type_node = make_signed_type (64);
> +  ulong_type_node = make_unsigned_type (64);

It's a bit confusing for the D type to be long_type_node but the C/ABI type
to be long_integer_type_node.  The D type is surely an integer too. :-)
With this coming among the handling of built-in functions, it initially
looked related, and I was wondering how it was safe on ILP32 systems
before realising the difference.

Maybe prefixing them all with "d_" would be too ugly, but it would at
least be good to clarify the comment to say that these are "distinct
type nodes" (rather than just distinct definitions, as I'd initially
assumed) and that they're not used outside the frontend, or by the C
imports.

> +/* Return the GCC location for the D frontend location LOC.   */
> +
> +location_t
> +get_linemap (const Loc& loc)

FWIW, I think GCC style is to add a space before "&" rather than
after it, as for pointers.  But since you use this style consistently
(and I'm guessing would instinctively use it in later work), it's
probably better to leave it as-is.  The go interface similarly
has local conventions regarding things like spaces before "(".

> +tree
> +d_array_value (tree type, tree len, tree data)
> +{
> +  /* TODO: Assert type is a D array.  */

Just curious, what makes asserting difficult enough to be a TODO?

> +/* Returns the .funcptr component from the D delegate EXP.  */
> +
> +tree
> +delegate_method (tree exp)
> +{
> +  /* Get the backend type for the array and pick out the array length
> +     field (assumed to be the second field).  */
> +  tree method_field = TREE_CHAIN (TYPE_FIELDS (TREE_TYPE (exp)));
> +  return component_ref (exp, method_field);
> +}
> +
> +/* Returns the .object component from the delegate EXP.  */
> +
> +tree
> +delegate_object (tree exp)
> +{
> +  /* Get the backend type for the array and pick out the array data
> +     pointer field (assumed to be the first field).  */
> +  tree obj_field = TYPE_FIELDS (TREE_TYPE (exp));
> +  return component_ref (exp, obj_field);
> +}

Looks like the body comments might be a cut-&-pasto?  They reference the
right index, but don't match the function comments.

> +/* Return TRUE if EXP is a valid lvalue.  Lvalues references cannot be
> +   made into temporaries, otherwise any assignments will be lost.  */

s/Lvalues references/Lvalue references/?

> +	      /* If the index is NULL, then just assign it to the next field.
> +		 This is expect of layout_typeinfo(), which generates a flat
> +		 list of values that we must shape into the record type.  */
> +	      if (index == field || index == NULL_TREE)
> +		{
> +		  init->ordered_remove (idx);
> +		  if (!finished)
> +		    is_initialized = true;
> +		  break;
> +		}

"This is expect of" looks like a typo.

The function is quadratic in the initialiser size.  It would be
good to avoid that if possible.  That said, it looks like
build_class_instance is also quadratic, so maybe just changing
this on its own wouldn't help much.

> +	  for (size_t i = 0; fd->parameters && i < fd->parameters->dim; i++)
> +	    {
> +	      VarDeclaration *v = (*fd->parameters)[i];
> +	      /* Remove if already in closureVars so can push to front.  */
> +	      for (size_t j = i; j < fd->closureVars.dim; j++)
> +		{
> +		  Dsymbol *s = fd->closureVars[j];
> +		  if (s == v)
> +		    {
> +		      fd->closureVars.remove (j);
> +		      break;
> +		    }
> +		}
> +	      fd->closureVars.insert (i, v);
> +	    }

Looks like this also is quadratic.

> +      /* For nested functions, default to creating a frame.  Even if there is no
> +	 fields to populate the frame, create it anyway, as this will be used as
> +	 the record type instead of `void*` for the this parameter.  */

s/there is no fields/there are no fields/

> +      /* If we are taking the address of a decl that can never be null,
> +	 then the return result is always true.  */
> +      if (DECL_P (TREE_OPERAND (expr, 0))
> +	  && (TREE_CODE (TREE_OPERAND (expr, 0)) == PARM_DECL
> +	      || TREE_CODE (TREE_OPERAND (expr, 0)) == LABEL_DECL
> +	      || !DECL_WEAK (TREE_OPERAND (expr, 0))))
> +	return boolean_true_node;

Does something warn about this?

> +tree
> +convert (tree type, tree expr)
> ...
> +    case COMPLEX_TYPE:
> +      if (TREE_CODE (etype) == REAL_TYPE && TYPE_IMAGINARY_FLOAT (etype))
> +	ret = build2 (COMPLEX_EXPR, type,
> +		      build_zero_cst (TREE_TYPE (type)),
> +		      convert (TREE_TYPE (type), expr));
> +      else
> +	ret = convert_to_complex (type, e);
> +      goto maybe_fold;
> ...
> +    maybe_fold:
> +      if (!TREE_CONSTANT (ret))
> +	ret = fold (ret);
> +      return ret;

etc.  Any reason not to use fold_build* here and in subroutines (for all
cases, not just COMPLEX_TYPE), rather than fold at the end?  Or is the
!TREE_CONSTANT deliberately trying to avoid folding constants that
fold_build* would normally fold?   build* routines set TREE_CONSTANT
for operations with constant operands as well as constant literals,
so is the code trying to avoid folding those?

The routines use fold_build* in some places and plain build* with
a goto in others, but it wasn't obvious why.  Seems like it deserves
a comment.

> +	  /* Strings are treated as dynamic arrays D2.  */

Should that be by/in D2?

> +  return result ? result :
> +    convert (build_ctype (totype), exp);

Odd formatting, fits easily on a line.  If you want to split it,
the ":" should be on the second line rather than the first.

> +/* Apply semantics of assignment to a values of type TOTYPE to EXPR
> +   (e.g., pointer = array -> pointer = &array[0])

Type ("to a values")

> +   Return a TREE representation of EXPR implictly converted to TOTYPE
> +   for use in assignment expressions MODIFY_EXPR, INIT_EXPR.  */

"implicitly"

> +/* Helper routine for all error routines.  Reports a diagnostic specified by
> +   KIND at the explicit location LOC, where the message FORMAT has not yet
> +   been translated by the gcc diagnostic routines.  */
> +
> +static void ATTRIBUTE_GCC_DIAG(3,0)
> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
> +				va_list ap, diagnostic_t kind, bool verbatim)
> +{
> +  va_list argp;
> +  va_copy (argp, ap);
> +
> +  if (loc.filename || !verbatim)
> +    {
> +      rich_location rich_loc (line_table, get_linemap (loc));
> +      diagnostic_info diagnostic;
> +      char *xformat = expand_format (format);
> +
> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);

How does this work with translation?  xgettext will only see the original
format string, not the result of expand_format.  Do you have some scripting
to do the same format mangling when collecting the translation strings?
Same concern:

> +void ATTRIBUTE_GCC_DIAG(2,0)
> +verror (const Loc& loc, const char *format, va_list ap,
> +	const char *p1, const char *p2, const char *)
> +{
> +  if (!global.gag || global.params.showGaggedErrors)
> +    {
> +      char *xformat;
> +
> +      /* Build string and emit.  */
> +      if (p2 != NULL)
> +	xformat = xasprintf ("%s %s %s", p1, p2, format);
> +      else if (p1 != NULL)
> +	xformat = xasprintf ("%s %s", p1, format);
> +      else
> +	xformat = xasprintf ("%s", format);

...with the concatenation here.

It looks at first glance like the callers to d_diagnostic_report_diagnostic
should be doing the conversion themselves via _(format), before doing
any concatenation.  Then d_diagnostic_report_diagnostic can use
expand_format on the translated string and use
diagnostic_set_info_translated.  But I'm not an expert on this stuff.

Also, the function should document what p1 and p2 are.  Do they
need to be translated?

Does xgettext get to see all frontend error messages?  Haven't got
to the later patches yet.

Please add an overview comment somewhere in the diagnostic routines
explaining how translation works wrt errors that come from the D
frontend.

> +void ATTRIBUTE_GCC_DIAG(2,0)
> +vmessage(const Loc& loc, const char *format, va_list ap)

Missing space before "(".

+/* Start gagging. Return the current number of gagged errors.  */

Should be two spaces before "Return"

> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
> +
> +void
> +Port::writelongLE (unsigned value, void *buffer)
> +{
> +    unsigned char *p = (unsigned char*) buffer;
> +
> +    p[0] = (unsigned) value;
> +    p[1] = (unsigned) value >> 8;
> +    p[2] = (unsigned) value >> 16;
> +    p[3] = (unsigned) value >> 24;
> +}
> ...
> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
> +
> +void
> +Port::writelongBE (unsigned value, void *buffer)
> +{
> +    unsigned char *p = (unsigned char*) buffer;
> +
> +    p[0] = (unsigned) value >> 24;
> +    p[1] = (unsigned) value >> 16;
> +    p[2] = (unsigned) value >> 8;
> +    p[3] = (unsigned) value;
> +}

Overindented bodies.  Missing space before "*" in "(unsigned char*)"
in all these functions.

Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
Is it also used in ways that require the target BITS_PER_UNIT to be 8
as well?  That could realistically be a different value (and we've had
ports like that in the past).

> +/* Return a hash value for real_t value R.  */
> +
> +size_t
> +CTFloat::hash (real_t r)
> +{
> +    return real_hash (&r.rv ());
> +}

Overindented body.

> +/* Write out all dependencies of a given MODULE to the specified BUFFER.
> +   COLMAX is the number of columns to word-wrap at (0 means don't wrap).  */
> +
> +static void
> +deps_write (Module *module, OutBuffer *buffer, unsigned colmax = 72)
> +{
> ...
> +  /* Write out all make dependencies.  */
> +  while (modlist.dim > 0)
> +  {

Misindented while body.

> +      else if (empty_modify_p (TREE_TYPE (op0), op1))
> +	{
> +	  /* Remove any copies of empty aggregates.  Also drop volatile
> +	     loads on the RHS to avoid infinite recursion from
> +	     gimplify_expr trying to load the value.  */
> +	  gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, post_p,
> +			 is_gimple_lvalue, fb_lvalue);
> +	  if (TREE_SIDE_EFFECTS (op1))
> +	    {
> +	      if (TREE_THIS_VOLATILE (op1)
> +		  && (REFERENCE_CLASS_P (op1) || DECL_P (op1)))
> +		op1 = build_fold_addr_expr (op1);
> +
> +	      gimplify_and_add (op1, pre_p);
> +	    }
> +	  *expr_p = TREE_OPERAND (*expr_p, 0);
> +	  ret = GS_OK;
> +	}

It seems odd to gimplify the address of a volatile decl.  When would
that have meaningful side-effects?

What goes wrong if you don't have this block and let the gimplifier
handle empty modify_exprs normally?  The comments explain clearly what
the code is trying to do, but it isn't obvious why this needs to be
treated as a special case.

> +/* Implements the lang_hooks.get_alias_set routine for language D.
> +   Get the alias set corresponding the type or expression T.

s/corresponding the/corresponding to/

> +/* Implements the lang_hooks.get_alias_set routine for language D.
> +   Get the alias set corresponding the type or expression T.
> +   Return -1 if we don't do anything special.  */
> +
> +static alias_set_type
> +d_get_alias_set (tree t)
> +{
> +  /* Permit type-punning when accessing a union, provided the access
> +     is directly through the union.  */
> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
> +    {
> +      if (TREE_CODE (u) == COMPONENT_REF
> +	  && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
> +	return 0;
> +    }
> +
> +  /* That's all the expressions we handle.  */
> +  if (!TYPE_P (t))
> +    return get_alias_set (TREE_TYPE (t));
> +
> +  /* For now in D, assume everything aliases everything else,
> +     until we define some solid rules.  */
> +  return 0;
> +}

Any reason for not returning 0 unconditionally?  Won't the queries
deferred to get_alias_set either return 0 directly or come back here
and get 0?

Might be worth adding a comment referencing build_vconvert, which stood
out as a source of what in C would be alias violations.

> +static tree
> +d_eh_personality (void)
> +{
> +  if (!d_eh_personality_decl)
> +    {
> +      d_eh_personality_decl
> +	= build_personality_function ("gdc");
> +    }

Redundant braces, canonical formatting would be:

  if (!d_eh_personality_decl)
    d_eh_personality_decl = build_personality_function ("gdc");

> +/* this bit is set if they did `-lstdc++'.  */

s/this/This/

> +/* What do with libgphobos:
> +   -1 means we should not link in libgphobos
> +   0  means we should link in libgphobos if it is needed
> +   1  means libgphobos is needed and should be linked in.
> +   2  means libgphobos is needed and should be linked statically.
> +   3  means libgphobos is needed and should be linked dynamically. */
> +static int library = 0;

Might be worth using a more specific variable name, since the code
also deals with libgcc and libstdc++.  Maybe add named constants
for the values above?

> +void
> +lang_specific_driver (cl_decoded_option **in_decoded_options,
> +		      unsigned int *in_decoded_options_count,
> +		      int *in_added_libraries)
> +{
> ...
> +  /* If true, the user gave `-g'.  Used by -debuglib */
> ...
> +  /* What default library to use instead of phobos */
> ...
> +  /* What debug library to use instead of phobos */

Comments should end with ".  */"

> +	case OPT_SPECIAL_input_file:
> +	    {

Over-indented block.

> +	      /* Record that this is a D source file.  */
> +	      int len = strlen (arg);
> +	      if (len <= 2 || strcmp (arg + len - 2, ".d") == 0)
> +		{
> +		  if (first_d_file == NULL)
> +		    first_d_file = arg;
> +
> +		  args[i] |= DSOURCE;
> +		}
> +
> +	      /* If we don't know that this is a interface file, we might
> +		 need to link against libphobos library.  */
> +	      if (library == 0)
> +		{
> +		  if (len <= 3 || strcmp (arg + len - 3, ".di") != 0)
> +		    library = 1;
> +		}
> +
> +	      /* If this is a C++ source file, we'll need to link
> +		 against libstdc++ library.  */
> +	      if ((len <= 3 || strcmp (arg + len - 3, ".cc") == 0)
> +		  || (len <= 4 || strcmp (arg + len - 4, ".cpp") == 0)
> +		  || (len <= 4 || strcmp (arg + len - 4, ".c++") == 0))
> +		need_stdcxx = true;
> +
> +	      break;

The len handling looks a bit odd here, since something like "a.d" would
be treated as a c++ file too.  Might be better to use:

    dot = strrchr (arg, '.');

    if (dot && strcmp (dot, ".cc") == 0)

etc.

> +/* Called before linking.  Returns 0 on success and -1 on failure.  */
> +
> +int lang_specific_pre_link (void)

int
lang_specific_pre_link (void)

> +  /* If there is no hardware support, check if we an safely emulate it.  */

s/we an/we can/

> +/* For a vendor-specific type, return a string containing the C++ mangling.
> +   In all other cases, return NULL.  */
> +
> +const char *
> +Target::cppTypeMangle (Type *type)
> +{
> +    if (type->isTypeBasic () || type->ty == Tvector || type->ty == Tstruct)
> +    {
> +        tree ctype = build_ctype (type);
> +        return targetm.mangle_type (ctype);
> +    }
> +
> +    return NULL;
> +}

Misindented if and function body.

> +/* Return the type that will really be used for passing the given parameter
> +   ARG to an extern(C++) function.  */
> +
> +Type *
> +Target::cppParameterType (Parameter *arg)
> +{
> +    Type *t = arg->type->merge2 ();
> +    if (arg->storageClass & (STCout | STCref))
> +      t = t->referenceTo ();
> +    else if (arg->storageClass & STClazy)
> +      {
> +	/* Mangle as delegate.  */
> +	Type *td = TypeFunction::create (NULL, t, 0, LINKd);
> +	td = TypeDelegate::create (td);
> +	t = t->merge2 ();
> +      }
> +
> +    /* Could be a va_list, which we mangle as a pointer.  */
> +    if (t->ty == Tsarray && Type::tvalist->ty == Tsarray)
> +      {
> +	Type *tb = t->toBasetype ()->mutableOf ();
> +	if (tb == Type::tvalist)
> +	  {
> +	    tb = t->nextOf ()->pointerTo ();
> +	    t = tb->castMod (t->mod);
> +	  }
> +      }
> +
> +    return t;
> +}

Function body indented too far.

> +/* Used to represent a D inline assembly statement, an intermediate
> +   representation of an ASM_EXPR.  Reserved for future use and
> +   implementation yet to be defined.  */
> +DEFTREECODE (IASM_EXPR, "iasm_expr", tcc_statement, 5)

Do you need to reserve it?  The codes are local to the frontend
and you can add new ones whenever you want, so it would be good
to leave this out if possible.

> +/* Frame information for a function declaration.  */
> +
> +struct GTY(()) tree_frame_info
> +{
> +    struct tree_common common;
> +    tree frame_type;
> +};

Over-indented body.

> +/* Gate for when the D frontend make an early call into the codegen pass, such

s/make/makes/

> +    /* Anonymous structs/unions only exist as part of others,
> +       do not output forward referenced structs's.  */

s/'s//, unless I've misunderstood what the comment is saying.

> +    /* Ensure all semantic passes have ran.  */

"passes have run" or "passes ran".

> +      /* The real function type may differ from it's declaration.  */

s/it's/its/

> +      /* Initialiser must never be bigger than symbol size.  */

"Initializer" (alas)

> +/* Thunk code is based on g++ */

Comment should end with ".  */"

> +/* Get offset of base class's vtbl[] initialiser from start of csym.  */
> +
> +unsigned
> +base_vtable_offset (ClassDeclaration *cd, BaseClass *bc)
> +{
> ...
> +  return ~0;
> +}

Would be good to document the ~0 return, and probably also assert
that it isn't ~0:

> +	    /* The vtable of the interface length and ptr.  */
> +	    size_t voffset = base_vtable_offset (cd, b);
> +	    value = build_offset (csym, size_int (voffset));

...here.

> +  /* Build an expression of code CODE, data type TYPE, and operands ARG0 and
> +     ARG1.  Perform relevant conversions needs for correct code operations.  */

"relevant conversions needs" looks like a typo.

> +    if (POINTER_TYPE_P (t0) && POINTER_TYPE_P (t1))
> +      {
> +	/* Need to convert pointers to integers because tree-vrp asserts
> +	   against (ptr MINUS ptr).  */
> +	tree ptrtype = lang_hooks.types.type_for_mode (ptr_mode,
> +						       TYPE_UNSIGNED (type));
> +	arg0 = d_convert (ptrtype, arg0);
> +	arg1 = d_convert (ptrtype, arg1);
> +
> +	ret = fold_build2 (code, ptrtype, arg0, arg1);

Should probably now use POINTER_DIFF_EXPR for this.

> +    /* The LHS expression could be an assignment, to which it's operation gets
> +       lost during gimplification.  */
> +    if (TREE_CODE (lhs) == MODIFY_EXPR)
> +      {
> +	lexpr = compound_expr (lexpr, lhs);
> +	lhs = TREE_OPERAND (lhs, 0);
> +      }

s/it's/its/.  But the code looks a bit odd -- won't you end up with
double evaluation of the lhs of the MODIFY_EXPR?  Or is the creator of
the MODIFY_EXPR already guaranteed to have wrapped the lhs in a SAVE_EXPR
where necessary?  Probably worth a comment if so.

> +  /* Build a power expression, which raises its left operand to the power of
> +     its right operand.   */
> +
> +  void visit (PowExp *e)
> +  {
> ...
> +    /* If type is int, implicitly convert to double.  This allows backend
> +       to fold the call into a constant return value.  */
> +    if (e->type->isintegral ())
> +      powtype = double_type_node;

Seems dangerous if pow is defined as an integer operation.
E.g. ((1 << 62) | 1) ** 1 should be (1 << 62) | 1, which isn't
representable in double.

> +  /* Build a assignment operator expression.  The right operand is implicitly
> +     converted to the type of the left operand, and assigned to it.  */

"an assignment operator"

> +  /* Convert for initialising the DECL_RESULT.  */

"initializing"

> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
> +
> +static void
> +clear_intrinsic_flag (tree callexp)
> +{
> +  tree decl = CALL_EXPR_FN (callexp);
> +
> +  if (TREE_CODE (decl) == ADDR_EXPR)
> +    decl = TREE_OPERAND (decl, 0);
> +
> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
> +
> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
> +}

It seems wrong that a particular call to a function would change the
function's properties for all calls, and I don't think there's any
expectation in the midend that this kind of change might happen.

Why's it needed?  Whatever the reason, I think it needs to be done in a
different way.

> +static tree
> +expand_intrinsic_bsf (tree callexp)
> ...
> +  built_in_function code = (argsize <= INT_TYPE_SIZE) ? BUILT_IN_CTZ :
> +    (argsize <= LONG_TYPE_SIZE) ? BUILT_IN_CTZL :
> +    (argsize <= LONG_LONG_TYPE_SIZE) ? BUILT_IN_CTZLL : END_BUILTINS;

":" should be at the start of the line rather than the end.  Same for
later functions.

> +static tree
> +expand_intrinsic_bswap (tree callexp)
> +{
> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> +  int argsize = TYPE_PRECISION (TREE_TYPE (arg));
> +
> +  /* Which variant of __builtin_bswap* should we call?  */
> +  built_in_function code = (argsize == 32) ? BUILT_IN_BSWAP32 :
> +    (argsize == 64) ? BUILT_IN_BSWAP64 : END_BUILTINS;
> +
> +  /* Fallback on runtime implementation, which shouldn't happen as the
> +     argument for bswap() is either 32-bit or 64-bit.  */
> +  if (code == END_BUILTINS)
> +    {
> +      clear_intrinsic_flag (callexp);
> +      return callexp;
> +    }

The earlier intrinsics had this too, but I assumed there it was
because the code was mapping D types to target-dependent ABI types,
and you couldn't guarantee that long long was a 64-bit type.
In the above function there's no doubt that 32 and 64 bits are
the sizes available, so shouldn't this simply assert that
code != END_BUILTINS?

> +static tree
> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
> +{
> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
> +
> +  /* arg is an integral type - use double precision.  */
> +  if (INTEGRAL_TYPE_P (type))
> +    arg = fold_convert (double_type_node, arg);
> +
> +  /* Which variant of __builtin_sqrt* should we call?  */
> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
> +
> +  gcc_assert (code != END_BUILTINS);
> +  return call_builtin_fn (callexp, code, 1, arg);
> +}

Converting integers to double precision isn't right for SQRTF and SQRTL.
But how do such calls reach here?  It looks like the deco strings in the
function entries provide types for the 3 sqrt intrinsics, is that right?
Does that translate to a well-typed FUNCTION_DECL, or do the function
types use some opaque types instead?  If the intrinsic function itself
is well-typed then an incoming CALLEXP with integer arguments would be
invalid.

> +static tree
> +expand_intrinsic_copysign (tree callexp)
> +{
> ...
> +  /* Which variant of __builtin_copysign* should we call?  */
> +  built_in_function code = (type == float_type_node) ? BUILT_IN_COPYSIGNF :
> +    (type == double_type_node) ? BUILT_IN_COPYSIGN :
> +    (type == long_double_type_node) ? BUILT_IN_COPYSIGNL : END_BUILTINS;

You can use mathfn_built_in for this.

> +  /* Assign returned result to overflow parameter, however if overflow is
> +     already true, maintain it's value.  */

s/it's/its/

> +/* DEF_D_INTRINSIC (CODE, ALIAS, NAME, MODULE, DECO, CTFE)
> +   CODE	    The enum code used to refer this intrinsic.
> +   ALIAS    The enum code used to refer the function DECL_FUNCTION_CODE, if

s/refer/refer to/ in both cases.  Same in runtime.texi.

> +/* This is the contribution to the `default_compilers' array in gcc.c
> +   for the D language.  */
> +
> +{".d", "@d", 0, 1, 0 },
> +{".dd", "@d", 0, 1, 0 },
> +{".di", "@d", 0, 1, 0 },
> +{"@d",
> +  "%{!E:cc1d %i %(cc1_options) %I %{nostdinc*} %{i*} %{I*} %{J*} \
> +    %{H} %{Hd*} %{Hf*} %{MD:-MD %b.deps} %{MMD:-MMD %b.deps} \
> +    %{M} %{MM} %{MF*} %{MG} %{MP} %{MQ*} %{MT*} \
> +    %{X:-Xf %b.json} %{Xf*} \
> +    %{v} %{!fsyntax-only:%(invoke_as)}}", 0, 1, 0 },

Guess this has been said before, but cc1d seems a strange name,
since "cc1" is "C Compiler pass 1".

> +fdeps
> +D
> +This switch is deprecated; do not use.
> +
> +fdeps=
> +D Joined RejectNegative
> +This switch is deprecated; do not use.
> ...
> +fintfc
> +D Alias(H)
> +; Deprecated in favour of -H
> +
> +fintfc-dir=
> +D Joined RejectNegative Alias(Hd)
> +; Deprecated in favour of -Hd
> +
> +fintfc-file=
> +D Joined RejectNegative Alias(Hf)
> +; Deprecated in favour of -Hf
> ...
> +ftransition=safe
> +D Alias(ftransition=dip1000)
> +; Deprecated in favor of -ftransition=dip1000

Any chance of just removing them?  Would be better not to add new code
(from GCC's POV) that's already deprecated.

> +ftransition=all
> +D RejectNegative
> +List information on all language changes

All docstrings should end in ".".  This should be tested by adding
the D frontend to:

foreach cls { "ada" "c" "c++" "fortran" "go" \
                    "optimizers" "param" "target" "warnings" } {

in gcc.misc-tests/help.exp

> +struct longdouble
> +{
> +public:
> +   /* Return the hidden real_value from the longdouble type.  */
> +  const real_value& rv (void) const
> +  { return *(const real_value *) this; }

Overindented comment.

> +/* Use ldouble() to explicitely create a longdouble value.  */

Typo: "explicitly"

> +/* D generates module information to inform the runtime library which modules
> +   needs some kind of special handling.  All `static this()', `static ~this()',

s/needs/need/

> +/* Record information about module initialization, termination,
> +   unit testing, and thread local storage in the compilation.  */
> +
> +struct module_info
> +{
> +  vec<tree, va_gc> *ctors;
> +  vec<tree, va_gc> *dtors;
> +  vec<tree, va_gc> *ctorgates;
> +
> +  vec<tree, va_gc> *sharedctors;
> +  vec<tree, va_gc> *shareddtors;
> +  vec<tree, va_gc> *sharedctorgates;
> +
> +  vec<tree, va_gc> *unitTests;
> +};
> ...
> +/* Static constructors and destructors (not D `static this').  */
> +
> +static vec<tree, va_gc> *static_ctor_list;
> +static vec<tree, va_gc> *static_dtor_list;

Looks odd to be using GC data structures without marking them.  Think it
deserves a comment if correct.

> +/* Generate a call to LIBCALL, returning the result as TYPE.  NARGS is the
> +   number of call arguments, the expressions of wwhich are provided in `...'.

s/wwhich/which/

> +/* Finish the statement tree rooted at T.  */
> +
> +tree
> +pop_stmt_list (void)
> +{
> +  tree t = d_function_chain->stmt_list->pop ();
> +
> +  /* If the statement list is completely empty, just return it.  This is just
> +     as good small as build_empty_stmt, with the advantage that statement
> +     lists are merged when they appended to one another.  So using the

Typo: "good small"
s/they appended/they are appended/

> +	/* Convert for initialising the DECL_RESULT.  */

"initializing"

> +  /* D Inline Assembler is not implemented, as would require a writing
> +     an assembly parser for each supported target.  Instead we leverage
> +     GCC extended assembler using the GccAsmStatement class.  */

s/as would require a writing/as it would require writing/

> +  /* Write out the interfacing vtable[] of base class BCD that will be accessed
> +     from the overriding class CD.  If both are the same class, then this will
> +     be it's own vtable.  INDEX is the offset in the interfaces array of the

s/it's/its/

> +    /* Internal reference should be public, but not visible outside the
> +       compilation unit, as itself is referencing public COMDATs.  */

Comment is hard to parse.

> +/* Returns true if the TypeInfo for type should be placed in
> +   the runtime library.  */
> +
> +static bool
> +builtin_typeinfo_p (Type *type)

"TypeInfo for TYPE"

> +  /* Let C++ do the RTTI generation, and just reference the symbol as
> +     extern, the knowing the underlying type is not required.  */

s/the knowing/knowing/?

> +	  if (!tinfo_types[tk])
> +	    {
> +	      make_internal_typeinfo (tk, ident, ptr_type_node,
> +				      array_type_node, NULL);
> +	    }

Redundant braces.

> +/* Implements a visitor interface to check whether a type is speculative.
> +   TypeInfo_Struct would refer the members of the struct it is representing

"refer to" or "reference"

> +Type *
> +get_object_type (void)
> +{
> +  if (ClassDeclaration::object)
> +    return ClassDeclaration::object->type;
> +
> +  ::error ("missing or corrupt object.d");

Why "::"?  Is it needed to avoid shadowing issues?

> +	  /* Insert the field declaration at it's given offset.  */

"its"

> +      /* Strip const modifiers from type before building.  This is done
> +	 to ensure that backend treats i.e: const (T) as a variant of T,
> +	 and not as two distinct types.  */

s/i.e/e.g./

Probably worth running tabify on the sources.  I noticed a couple
of indents by 8 spaces instead of tabs, but it didn't seem worth
listing them individually.  You can use contrib/check_GNU_style.{sh,py}
to find that kind of thing (although it generates a lot of false
positives for other coding style violations, so take with a pinch
of salt).

Also, s/layed out/laid out/ throughout.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-15  7:36     ` Iain Buclaw
@ 2018-10-20 10:51       ` Richard Sandiford
  2018-10-20 15:53         ` Iain Buclaw
                           ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Richard Sandiford @ 2018-10-20 10:51 UTC (permalink / raw)
  To: Iain Buclaw; +Cc: gcc-patches, Jeff Law

Iain Buclaw <ibuclaw@gdcproject.org> writes:
> On 14 October 2018 at 17:29, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> [Sorry if this turns out to do be a dup]
>>
>> Iain Buclaw <ibuclaw@gdcproject.org> writes:
>>> +/* Build nodes that are used by the D front-end.
>>> +   These are distinct from C types.  */
>>> +
>>> +static void
>>> +d_build_d_type_nodes (void)
>>> +{
>>> +  /* Integral types.  */
>>> +  byte_type_node = make_signed_type (8);
>>> +  ubyte_type_node = make_unsigned_type (8);
>>> +
>>> +  short_type_node = make_signed_type (16);
>>> +  ushort_type_node = make_unsigned_type (16);
>>> +
>>> +  int_type_node = make_signed_type (32);
>>> +  uint_type_node = make_unsigned_type (32);
>>> +
>>> +  long_type_node = make_signed_type (64);
>>> +  ulong_type_node = make_unsigned_type (64);
>>
>> It's a bit confusing for the D type to be long_type_node but the C/ABI type
>> to be long_integer_type_node.  The D type is surely an integer too. :-)
>> With this coming among the handling of built-in functions, it initially
>> looked related, and I was wondering how it was safe on ILP32 systems
>> before realising the difference.
>>
>> Maybe prefixing them all with "d_" would be too ugly, but it would at
>> least be good to clarify the comment to say that these are "distinct
>> type nodes" (rather than just distinct definitions, as I'd initially
>> assumed) and that they're not used outside the frontend, or by the C
>> imports.
>>
>
> If prefixing with "d_", perhaps dropping the "_node" would make them
> sufficiently not ugly (d_long_type, d_uint_type, d_byte_type).

Sounds good to me FWIW.

>>> +/* Helper routine for all error routines.  Reports a diagnostic specified by
>>> +   KIND at the explicit location LOC, where the message FORMAT has not yet
>>> +   been translated by the gcc diagnostic routines.  */
>>> +
>>> +static void ATTRIBUTE_GCC_DIAG(3,0)
>>> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
>>> +                             va_list ap, diagnostic_t kind, bool verbatim)
>>> +{
>>> +  va_list argp;
>>> +  va_copy (argp, ap);
>>> +
>>> +  if (loc.filename || !verbatim)
>>> +    {
>>> +      rich_location rich_loc (line_table, get_linemap (loc));
>>> +      diagnostic_info diagnostic;
>>> +      char *xformat = expand_format (format);
>>> +
>>> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);
>>
>> How does this work with translation?  xgettext will only see the original
>> format string, not the result of expand_format.  Do you have some scripting
>> to do the same format mangling when collecting the translation strings?
>> Same concern:
>>
>
> These diagnostic routines handle errors coming from the dmd front-end,
> which are not translated - all sources are listed under po/EXCLUDES in
> another patch.

OK.  In that case I think you want to use diagnostic_set_info_translated
instead of diagnostic_set_info, so that we don't try to translate things
that aren't meant to be translated.  Also it would be good to reword the
comment above the function, since "where the message FORMAT has not yet
been translated by the gcc diagnostic routines" made it sound like these
messages were supposed to be translated at some point, which is where the
confusion started. :-)

>>> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
>>> +
>>> +void
>>> +Port::writelongLE (unsigned value, void *buffer)
>>> +{
>>> +    unsigned char *p = (unsigned char*) buffer;
>>> +
>>> +    p[0] = (unsigned) value;
>>> +    p[1] = (unsigned) value >> 8;
>>> +    p[2] = (unsigned) value >> 16;
>>> +    p[3] = (unsigned) value >> 24;
>>> +}
>>> ...
>>> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
>>> +
>>> +void
>>> +Port::writelongBE (unsigned value, void *buffer)
>>> +{
>>> +    unsigned char *p = (unsigned char*) buffer;
>>> +
>>> +    p[0] = (unsigned) value >> 24;
>>> +    p[1] = (unsigned) value >> 16;
>>> +    p[2] = (unsigned) value >> 8;
>>> +    p[3] = (unsigned) value;
>>> +}
>>
>> Overindented bodies.  Missing space before "*" in "(unsigned char*)"
>> in all these functions.
>>
>> Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
>> Is it also used in ways that require the target BITS_PER_UNIT to be 8
>> as well?  That could realistically be a different value (and we've had
>> ports like that in the past).
>>
>
> These read(long|word)(BE|LE) functions should only ever be used when
> reading the BOM of a UTF-16 or UTF-32 file.
>
> I've done a grep, and the write(long|word)(BE|LE) are no longer used
> by the dmd frontend, so there's little point keeping them around.
>
> If there's any utility in libiberty or another location then I'd be
> more than happy to delegate this action to that here.

If it's for UTF-16 and UTF-32 then I think it's fine as-is (if you keep it).
Was just asking in case.

>>> +/* Implements the lang_hooks.get_alias_set routine for language D.
>>> +   Get the alias set corresponding the type or expression T.
>>> +   Return -1 if we don't do anything special.  */
>>> +
>>> +static alias_set_type
>>> +d_get_alias_set (tree t)
>>> +{
>>> +  /* Permit type-punning when accessing a union, provided the access
>>> +     is directly through the union.  */
>>> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
>>> +    {
>>> +      if (TREE_CODE (u) == COMPONENT_REF
>>> +       && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
>>> +     return 0;
>>> +    }
>>> +
>>> +  /* That's all the expressions we handle.  */
>>> +  if (!TYPE_P (t))
>>> +    return get_alias_set (TREE_TYPE (t));
>>> +
>>> +  /* For now in D, assume everything aliases everything else,
>>> +     until we define some solid rules.  */
>>> +  return 0;
>>> +}
>>
>> Any reason for not returning 0 unconditionally?  Won't the queries
>> deferred to get_alias_set either return 0 directly or come back here
>> and get 0?
>>
>> Might be worth adding a comment referencing build_vconvert, which stood
>> out as a source of what in C would be alias violations.
>>
>
> All previous attempts at doing more with this has caused silent
> corruption at runtime, the existence of build_vconvert likely doesn't
> help either.
>
> Although it isn't in the spec, D should be "strict aliasing".  But
> there's a lot of production code that looks like `intvar = *(int
> *)&floatvar;` that is currently expected to just work.
>
> By rule of thumb in D, the C behaviour should be followed if it looks
> like C code.  The only places where we deviate are made to be a
> compiler error.

But my point was more that the function seems to have the effect of
returning 0 all the time, so wouldn't it be better to get rid of the
code before the final "return 0"?  Or is there a case I missed?

>>> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
>>> +
>>> +static void
>>> +clear_intrinsic_flag (tree callexp)
>>> +{
>>> +  tree decl = CALL_EXPR_FN (callexp);
>>> +
>>> +  if (TREE_CODE (decl) == ADDR_EXPR)
>>> +    decl = TREE_OPERAND (decl, 0);
>>> +
>>> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
>>> +
>>> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
>>> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
>>> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
>>> +}
>>
>> It seems wrong that a particular call to a function would change the
>> function's properties for all calls, and I don't think there's any
>> expectation in the midend that this kind of change might happen.
>>
>> Why's it needed?  Whatever the reason, I think it needs to be done in a
>> different way.
>>
>
> There are some D library math functions that are only treated as
> built-in during compile-time only.  When it comes to run-time code
> generation, we want to call the D library functions, not the C library
> (or built-ins).

But e.g. we can treat printf as a built-in and still call the real printf
without doing this.

>>> +static tree
>>> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
>>> +{
>>> +  tree arg = CALL_EXPR_ARG (callexp, 0);
>>> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
>>> +
>>> +  /* arg is an integral type - use double precision.  */
>>> +  if (INTEGRAL_TYPE_P (type))
>>> +    arg = fold_convert (double_type_node, arg);
>>> +
>>> +  /* Which variant of __builtin_sqrt* should we call?  */
>>> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
>>> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
>>> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
>>> +
>>> +  gcc_assert (code != END_BUILTINS);
>>> +  return call_builtin_fn (callexp, code, 1, arg);
>>> +}
>>
>> Converting integers to double precision isn't right for SQRTF and SQRTL.
>> But how do such calls reach here?  It looks like the deco strings in the
>> function entries provide types for the 3 sqrt intrinsics, is that right?
>> Does that translate to a well-typed FUNCTION_DECL, or do the function
>> types use some opaque types instead?  If the intrinsic function itself
>> is well-typed then an incoming CALLEXP with integer arguments would be
>> invalid.
>>
>
> As with pow(), I'll have to check by running this through the
> testsuite to see what fails and why.
>
> From memory, the reason is likely some requirement of CTFE, where this
> call *must* be const folded at compile-time, which may not happen if
> the types are wrong.

The reason for picking this one in particular is that the INTRINSIC_SQRT{,F,L}
FUNCTION_DECLs looked like they would have correct types, so any CALL_EXPR
tree with mismatched types would be invalid in terms of GCC internals.
If that happens, it sounds like there's a bug earlier on.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-20 10:51       ` Richard Sandiford
@ 2018-10-20 15:53         ` Iain Buclaw
  2018-10-20 22:57         ` Iain Buclaw
  2018-10-22  1:13         ` Iain Buclaw
  2 siblings, 0 replies; 19+ messages in thread
From: Iain Buclaw @ 2018-10-20 15:53 UTC (permalink / raw)
  To: richard.sandiford; +Cc: gcc-patches, Jeff Law

On Sat, 20 Oct 2018 at 11:03, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > On 14 October 2018 at 17:29, Richard Sandiford
> > <rdsandiford@googlemail.com> wrote:
> >> [Sorry if this turns out to do be a dup]
> >>
> >> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> >>> +/* Helper routine for all error routines.  Reports a diagnostic specified by
> >>> +   KIND at the explicit location LOC, where the message FORMAT has not yet
> >>> +   been translated by the gcc diagnostic routines.  */
> >>> +
> >>> +static void ATTRIBUTE_GCC_DIAG(3,0)
> >>> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
> >>> +                             va_list ap, diagnostic_t kind, bool verbatim)
> >>> +{
> >>> +  va_list argp;
> >>> +  va_copy (argp, ap);
> >>> +
> >>> +  if (loc.filename || !verbatim)
> >>> +    {
> >>> +      rich_location rich_loc (line_table, get_linemap (loc));
> >>> +      diagnostic_info diagnostic;
> >>> +      char *xformat = expand_format (format);
> >>> +
> >>> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);
> >>
> >> How does this work with translation?  xgettext will only see the original
> >> format string, not the result of expand_format.  Do you have some scripting
> >> to do the same format mangling when collecting the translation strings?
> >> Same concern:
> >>
> >
> > These diagnostic routines handle errors coming from the dmd front-end,
> > which are not translated - all sources are listed under po/EXCLUDES in
> > another patch.
>
> OK.  In that case I think you want to use diagnostic_set_info_translated
> instead of diagnostic_set_info, so that we don't try to translate things
> that aren't meant to be translated.  Also it would be good to reword the
> comment above the function, since "where the message FORMAT has not yet
> been translated by the gcc diagnostic routines" made it sound like these
> messages were supposed to be translated at some point, which is where the
> confusion started. :-)
>

OK, done.


> >>> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
> >>> +
> >>> +void
> >>> +Port::writelongLE (unsigned value, void *buffer)
> >>> +{
> >>> +    unsigned char *p = (unsigned char*) buffer;
> >>> +
> >>> +    p[0] = (unsigned) value;
> >>> +    p[1] = (unsigned) value >> 8;
> >>> +    p[2] = (unsigned) value >> 16;
> >>> +    p[3] = (unsigned) value >> 24;
> >>> +}
> >>> ...
> >>> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
> >>> +
> >>> +void
> >>> +Port::writelongBE (unsigned value, void *buffer)
> >>> +{
> >>> +    unsigned char *p = (unsigned char*) buffer;
> >>> +
> >>> +    p[0] = (unsigned) value >> 24;
> >>> +    p[1] = (unsigned) value >> 16;
> >>> +    p[2] = (unsigned) value >> 8;
> >>> +    p[3] = (unsigned) value;
> >>> +}
> >>
> >> Overindented bodies.  Missing space before "*" in "(unsigned char*)"
> >> in all these functions.
> >>
> >> Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
> >> Is it also used in ways that require the target BITS_PER_UNIT to be 8
> >> as well?  That could realistically be a different value (and we've had
> >> ports like that in the past).
> >>
> >
> > These read(long|word)(BE|LE) functions should only ever be used when
> > reading the BOM of a UTF-16 or UTF-32 file.
> >
> > I've done a grep, and the write(long|word)(BE|LE) are no longer used
> > by the dmd frontend, so there's little point keeping them around.
> >
> > If there's any utility in libiberty or another location then I'd be
> > more than happy to delegate this action to that here.
>
> If it's for UTF-16 and UTF-32 then I think it's fine as-is (if you keep it).
> Was just asking in case.
>

OK.

> >>> +/* Implements the lang_hooks.get_alias_set routine for language D.
> >>> +   Get the alias set corresponding the type or expression T.
> >>> +   Return -1 if we don't do anything special.  */
> >>> +
> >>> +static alias_set_type
> >>> +d_get_alias_set (tree t)
> >>> +{
> >>> +  /* Permit type-punning when accessing a union, provided the access
> >>> +     is directly through the union.  */
> >>> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
> >>> +    {
> >>> +      if (TREE_CODE (u) == COMPONENT_REF
> >>> +       && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
> >>> +     return 0;
> >>> +    }
> >>> +
> >>> +  /* That's all the expressions we handle.  */
> >>> +  if (!TYPE_P (t))
> >>> +    return get_alias_set (TREE_TYPE (t));
> >>> +
> >>> +  /* For now in D, assume everything aliases everything else,
> >>> +     until we define some solid rules.  */
> >>> +  return 0;
> >>> +}
> >>
> >> Any reason for not returning 0 unconditionally?  Won't the queries
> >> deferred to get_alias_set either return 0 directly or come back here
> >> and get 0?
> >>
> >> Might be worth adding a comment referencing build_vconvert, which stood
> >> out as a source of what in C would be alias violations.
> >>
> >
> > All previous attempts at doing more with this has caused silent
> > corruption at runtime, the existence of build_vconvert likely doesn't
> > help either.
> >
> > Although it isn't in the spec, D should be "strict aliasing".  But
> > there's a lot of production code that looks like `intvar = *(int
> > *)&floatvar;` that is currently expected to just work.
> >
> > By rule of thumb in D, the C behaviour should be followed if it looks
> > like C code.  The only places where we deviate are made to be a
> > compiler error.
>
> But my point was more that the function seems to have the effect of
> returning 0 all the time, so wouldn't it be better to get rid of the
> code before the final "return 0"?  Or is there a case I missed?
>

Yes, it only ever returns 0.  I'll just remove the other parts, I have
it in commit history should I ever attempt to support this proper.

Regards
-- 
Iain

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-20 10:51       ` Richard Sandiford
  2018-10-20 15:53         ` Iain Buclaw
@ 2018-10-20 22:57         ` Iain Buclaw
  2018-10-22  1:13         ` Iain Buclaw
  2 siblings, 0 replies; 19+ messages in thread
From: Iain Buclaw @ 2018-10-20 22:57 UTC (permalink / raw)
  To: richard.sandiford; +Cc: gcc-patches, Jeff Law

On Sat, 20 Oct 2018 at 11:03, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > On 14 October 2018 at 17:29, Richard Sandiford
> > <rdsandiford@googlemail.com> wrote:
> >> [Sorry if this turns out to do be a dup]
> >>
> >> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> >>> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
> >>> +
> >>> +static void
> >>> +clear_intrinsic_flag (tree callexp)
> >>> +{
> >>> +  tree decl = CALL_EXPR_FN (callexp);
> >>> +
> >>> +  if (TREE_CODE (decl) == ADDR_EXPR)
> >>> +    decl = TREE_OPERAND (decl, 0);
> >>> +
> >>> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
> >>> +
> >>> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
> >>> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
> >>> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
> >>> +}
> >>
> >> It seems wrong that a particular call to a function would change the
> >> function's properties for all calls, and I don't think there's any
> >> expectation in the midend that this kind of change might happen.
> >>
> >> Why's it needed?  Whatever the reason, I think it needs to be done in a
> >> different way.
> >>
> >
> > There are some D library math functions that are only treated as
> > built-in during compile-time only.  When it comes to run-time code
> > generation, we want to call the D library functions, not the C library
> > (or built-ins).
>
> But e.g. we can treat printf as a built-in and still call the real printf
> without doing this.
>

There's a doing_semantic_analysis_p variable I added a year after I
did this particular function.  Said variable is set to true from the
beginning till the end of the semantic passes, so that can be used as
a gate for expanding CTFE-only intrinsics.  So I think that would be
problem solved here.

> >>> +static tree
> >>> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
> >>> +{
> >>> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> >>> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
> >>> +
> >>> +  /* arg is an integral type - use double precision.  */
> >>> +  if (INTEGRAL_TYPE_P (type))
> >>> +    arg = fold_convert (double_type_node, arg);
> >>> +
> >>> +  /* Which variant of __builtin_sqrt* should we call?  */
> >>> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
> >>> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
> >>> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
> >>> +
> >>> +  gcc_assert (code != END_BUILTINS);
> >>> +  return call_builtin_fn (callexp, code, 1, arg);
> >>> +}
> >>
> >> Converting integers to double precision isn't right for SQRTF and SQRTL.
> >> But how do such calls reach here?  It looks like the deco strings in the
> >> function entries provide types for the 3 sqrt intrinsics, is that right?
> >> Does that translate to a well-typed FUNCTION_DECL, or do the function
> >> types use some opaque types instead?  If the intrinsic function itself
> >> is well-typed then an incoming CALLEXP with integer arguments would be
> >> invalid.
> >>
> >
> > As with pow(), I'll have to check by running this through the
> > testsuite to see what fails and why.
> >
> > From memory, the reason is likely some requirement of CTFE, where this
> > call *must* be const folded at compile-time, which may not happen if
> > the types are wrong.
>
> The reason for picking this one in particular is that the INTRINSIC_SQRT{,F,L}
> FUNCTION_DECLs looked like they would have correct types, so any CALL_EXPR
> tree with mismatched types would be invalid in terms of GCC internals.
> If that happens, it sounds like there's a bug earlier on.
>

I think for this, it's just an initial lack of trust that D's CTFE
will pass to us the right thing.

Yes, the deco (function signature) matches explictly the float,
double, and real overloads, and calling `sqrt(2)` will result in a
compile-time error, as `int` is an ambiguous match.

I've just removed this though stopped should of adding hard asserts
that `!INTEGRAL_TYPE_P (type)`.  If the parameter passed somehow ends
up being an integer, no folding will occur, resulting in a compiler
error that the built-in was not evaluated.

Regards.
-- 
Iain

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-16 11:14   ` [PATCH 02/14] Add D frontend (GDC) implementation Richard Sandiford
  2018-10-16 11:39     ` Richard Sandiford
@ 2018-10-21  0:02     ` Iain Buclaw
  1 sibling, 0 replies; 19+ messages in thread
From: Iain Buclaw @ 2018-10-21  0:02 UTC (permalink / raw)
  To: richard.sandiford; +Cc: gcc-patches, Jeff Law

On Tue, 16 Oct 2018 at 11:48, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > +    /* The LHS expression could be an assignment, to which it's operation gets
> > +       lost during gimplification.  */
> > +    if (TREE_CODE (lhs) == MODIFY_EXPR)
> > +      {
> > +     lexpr = compound_expr (lexpr, lhs);
> > +     lhs = TREE_OPERAND (lhs, 0);
> > +      }
>
> s/it's/its/.  But the code looks a bit odd -- won't you end up with
> double evaluation of the lhs of the MODIFY_EXPR?  Or is the creator of
> the MODIFY_EXPR already guaranteed to have wrapped the lhs in a SAVE_EXPR
> where necessary?  Probably worth a comment if so.
>

So, this particular block, the sort of code that we're handling looks like this:

(a = 42) += 64;
(foo() = 42) += 64;  // foo() returns a reference

Where the LHS of the left assignment is stripped out and re-written as:

a = 42, a = a + 64;
*SAVE_EXPR <foo ()> = 42, *SAVE_EXPR <foo ()> = *SAVE_EXPR <foo ()> + 64;

LHS is guaranteed to be side-effect free here, else it's a bug.  I'll
add a comment about this however.

Regards
-- 
Iain

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-20 10:51       ` Richard Sandiford
  2018-10-20 15:53         ` Iain Buclaw
  2018-10-20 22:57         ` Iain Buclaw
@ 2018-10-22  1:13         ` Iain Buclaw
  2018-10-23 14:47           ` Richard Sandiford
  2 siblings, 1 reply; 19+ messages in thread
From: Iain Buclaw @ 2018-10-22  1:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford, David Malcolm

[-- Attachment #1: Type: text/plain, Size: 20669 bytes --]

On Sat, 20 Oct 2018 at 11:03, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > On 14 October 2018 at 17:29, Richard Sandiford
> > <rdsandiford@googlemail.com> wrote:
> >> [Sorry if this turns out to do be a dup]
> >>
> >> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> >>> +/* Build nodes that are used by the D front-end.
> >>> +   These are distinct from C types.  */
> >>> +
> >>> +static void
> >>> +d_build_d_type_nodes (void)
> >>> +{
> >>> +  /* Integral types.  */
> >>> +  byte_type_node = make_signed_type (8);
> >>> +  ubyte_type_node = make_unsigned_type (8);
> >>> +
> >>> +  short_type_node = make_signed_type (16);
> >>> +  ushort_type_node = make_unsigned_type (16);
> >>> +
> >>> +  int_type_node = make_signed_type (32);
> >>> +  uint_type_node = make_unsigned_type (32);
> >>> +
> >>> +  long_type_node = make_signed_type (64);
> >>> +  ulong_type_node = make_unsigned_type (64);
> >>
> >> It's a bit confusing for the D type to be long_type_node but the C/ABI type
> >> to be long_integer_type_node.  The D type is surely an integer too. :-)
> >> With this coming among the handling of built-in functions, it initially
> >> looked related, and I was wondering how it was safe on ILP32 systems
> >> before realising the difference.
> >>
> >> Maybe prefixing them all with "d_" would be too ugly, but it would at
> >> least be good to clarify the comment to say that these are "distinct
> >> type nodes" (rather than just distinct definitions, as I'd initially
> >> assumed) and that they're not used outside the frontend, or by the C
> >> imports.
> >>
> >
> > If prefixing with "d_", perhaps dropping the "_node" would make them
> > sufficiently not ugly (d_long_type, d_uint_type, d_byte_type).
>
> Sounds good to me FWIW.
>
> >>> +/* Helper routine for all error routines.  Reports a diagnostic specified by
> >>> +   KIND at the explicit location LOC, where the message FORMAT has not yet
> >>> +   been translated by the gcc diagnostic routines.  */
> >>> +
> >>> +static void ATTRIBUTE_GCC_DIAG(3,0)
> >>> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
> >>> +                             va_list ap, diagnostic_t kind, bool verbatim)
> >>> +{
> >>> +  va_list argp;
> >>> +  va_copy (argp, ap);
> >>> +
> >>> +  if (loc.filename || !verbatim)
> >>> +    {
> >>> +      rich_location rich_loc (line_table, get_linemap (loc));
> >>> +      diagnostic_info diagnostic;
> >>> +      char *xformat = expand_format (format);
> >>> +
> >>> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);
> >>
> >> How does this work with translation?  xgettext will only see the original
> >> format string, not the result of expand_format.  Do you have some scripting
> >> to do the same format mangling when collecting the translation strings?
> >> Same concern:
> >>
> >
> > These diagnostic routines handle errors coming from the dmd front-end,
> > which are not translated - all sources are listed under po/EXCLUDES in
> > another patch.
>
> OK.  In that case I think you want to use diagnostic_set_info_translated
> instead of diagnostic_set_info, so that we don't try to translate things
> that aren't meant to be translated.  Also it would be good to reword the
> comment above the function, since "where the message FORMAT has not yet
> been translated by the gcc diagnostic routines" made it sound like these
> messages were supposed to be translated at some point, which is where the
> confusion started. :-)
>
> >>> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
> >>> +
> >>> +void
> >>> +Port::writelongLE (unsigned value, void *buffer)
> >>> +{
> >>> +    unsigned char *p = (unsigned char*) buffer;
> >>> +
> >>> +    p[0] = (unsigned) value;
> >>> +    p[1] = (unsigned) value >> 8;
> >>> +    p[2] = (unsigned) value >> 16;
> >>> +    p[3] = (unsigned) value >> 24;
> >>> +}
> >>> ...
> >>> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
> >>> +
> >>> +void
> >>> +Port::writelongBE (unsigned value, void *buffer)
> >>> +{
> >>> +    unsigned char *p = (unsigned char*) buffer;
> >>> +
> >>> +    p[0] = (unsigned) value >> 24;
> >>> +    p[1] = (unsigned) value >> 16;
> >>> +    p[2] = (unsigned) value >> 8;
> >>> +    p[3] = (unsigned) value;
> >>> +}
> >>
> >> Overindented bodies.  Missing space before "*" in "(unsigned char*)"
> >> in all these functions.
> >>
> >> Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
> >> Is it also used in ways that require the target BITS_PER_UNIT to be 8
> >> as well?  That could realistically be a different value (and we've had
> >> ports like that in the past).
> >>
> >
> > These read(long|word)(BE|LE) functions should only ever be used when
> > reading the BOM of a UTF-16 or UTF-32 file.
> >
> > I've done a grep, and the write(long|word)(BE|LE) are no longer used
> > by the dmd frontend, so there's little point keeping them around.
> >
> > If there's any utility in libiberty or another location then I'd be
> > more than happy to delegate this action to that here.
>
> If it's for UTF-16 and UTF-32 then I think it's fine as-is (if you keep it).
> Was just asking in case.
>
> >>> +/* Implements the lang_hooks.get_alias_set routine for language D.
> >>> +   Get the alias set corresponding the type or expression T.
> >>> +   Return -1 if we don't do anything special.  */
> >>> +
> >>> +static alias_set_type
> >>> +d_get_alias_set (tree t)
> >>> +{
> >>> +  /* Permit type-punning when accessing a union, provided the access
> >>> +     is directly through the union.  */
> >>> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
> >>> +    {
> >>> +      if (TREE_CODE (u) == COMPONENT_REF
> >>> +       && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
> >>> +     return 0;
> >>> +    }
> >>> +
> >>> +  /* That's all the expressions we handle.  */
> >>> +  if (!TYPE_P (t))
> >>> +    return get_alias_set (TREE_TYPE (t));
> >>> +
> >>> +  /* For now in D, assume everything aliases everything else,
> >>> +     until we define some solid rules.  */
> >>> +  return 0;
> >>> +}
> >>
> >> Any reason for not returning 0 unconditionally?  Won't the queries
> >> deferred to get_alias_set either return 0 directly or come back here
> >> and get 0?
> >>
> >> Might be worth adding a comment referencing build_vconvert, which stood
> >> out as a source of what in C would be alias violations.
> >>
> >
> > All previous attempts at doing more with this has caused silent
> > corruption at runtime, the existence of build_vconvert likely doesn't
> > help either.
> >
> > Although it isn't in the spec, D should be "strict aliasing".  But
> > there's a lot of production code that looks like `intvar = *(int
> > *)&floatvar;` that is currently expected to just work.
> >
> > By rule of thumb in D, the C behaviour should be followed if it looks
> > like C code.  The only places where we deviate are made to be a
> > compiler error.
>
> But my point was more that the function seems to have the effect of
> returning 0 all the time, so wouldn't it be better to get rid of the
> code before the final "return 0"?  Or is there a case I missed?
>
> >>> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
> >>> +
> >>> +static void
> >>> +clear_intrinsic_flag (tree callexp)
> >>> +{
> >>> +  tree decl = CALL_EXPR_FN (callexp);
> >>> +
> >>> +  if (TREE_CODE (decl) == ADDR_EXPR)
> >>> +    decl = TREE_OPERAND (decl, 0);
> >>> +
> >>> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
> >>> +
> >>> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
> >>> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
> >>> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
> >>> +}
> >>
> >> It seems wrong that a particular call to a function would change the
> >> function's properties for all calls, and I don't think there's any
> >> expectation in the midend that this kind of change might happen.
> >>
> >> Why's it needed?  Whatever the reason, I think it needs to be done in a
> >> different way.
> >>
> >
> > There are some D library math functions that are only treated as
> > built-in during compile-time only.  When it comes to run-time code
> > generation, we want to call the D library functions, not the C library
> > (or built-ins).
>
> But e.g. we can treat printf as a built-in and still call the real printf
> without doing this.
>
> >>> +static tree
> >>> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
> >>> +{
> >>> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> >>> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
> >>> +
> >>> +  /* arg is an integral type - use double precision.  */
> >>> +  if (INTEGRAL_TYPE_P (type))
> >>> +    arg = fold_convert (double_type_node, arg);
> >>> +
> >>> +  /* Which variant of __builtin_sqrt* should we call?  */
> >>> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
> >>> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
> >>> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
> >>> +
> >>> +  gcc_assert (code != END_BUILTINS);
> >>> +  return call_builtin_fn (callexp, code, 1, arg);
> >>> +}
> >>
> >> Converting integers to double precision isn't right for SQRTF and SQRTL.
> >> But how do such calls reach here?  It looks like the deco strings in the
> >> function entries provide types for the 3 sqrt intrinsics, is that right?
> >> Does that translate to a well-typed FUNCTION_DECL, or do the function
> >> types use some opaque types instead?  If the intrinsic function itself
> >> is well-typed then an incoming CALLEXP with integer arguments would be
> >> invalid.
> >>
> >
> > As with pow(), I'll have to check by running this through the
> > testsuite to see what fails and why.
> >
> > From memory, the reason is likely some requirement of CTFE, where this
> > call *must* be const folded at compile-time, which may not happen if
> > the types are wrong.
>
> The reason for picking this one in particular is that the INTRINSIC_SQRT{,F,L}
> FUNCTION_DECLs looked like they would have correct types, so any CALL_EXPR
> tree with mismatched types would be invalid in terms of GCC internals.
> If that happens, it sounds like there's a bug earlier on.
>

I'm just going to post the diff since the original here, just to show
what's been done since review comments.

I think I've covered all that's been addressed, except for the couple
of notes about the quadratic parts (though I think one of them is
actually O(N^2)).  I've raised bug reports on improving them later.

I've also rebased them against trunk, so there's a couple new things
present that are just to support build.

Regards
--
Iain

---
gcc/d/:
2018-10-22  Iain Buclaw  <ibuclaw@gdcproject.org>

    * Make-lang.in (selftest-d): New.

2018-10-22  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-spec.cc (lang_specific_driver): Always link against phobos if any
    input file is given.

2018-10-21  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-lang.cc (d_get_alias_set): Always return zero.

2018-10-21  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (maybe_set_intrinsic): Don't set built-in flag on
    unsupported pow() overloads.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * expr.cc (ExprVisitor::binop_assignment): Call stabilize_reference on
    LHS construct if it has side effects.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (clear_intrinsic_flag): Remove function.
    (maybe_expand_intrinsic): Remove clear_intrinsic_flag call.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (expand_intrinsic_copysign): Use mathfn_built_in to
    determine correct built-in to call.
    (expand_intrinsic_pow): Likewise.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (expand_intrinsic_sqrt): Remove implicit int to double
    conversion.
    (expand_intrinsic_pow): Likewise.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (get_frame_for_symbol): Use error_at.
    (build_frame_type): Likewise.
    (get_framedecl): Likewise.
    * d-lang.cc (d_parse_file): Likewise.
    * decl.cc (DeclVisitor::visit(StructDeclaration)): Likewise.
    (DeclVisitor::finish_vtable): Likewise.
    (DeclVisitor::visit(ClassDeclaration)): Likewise.
    (DeclVisitor::visit(InterfaceDeclaration)): Likewise.
    (DeclVisitor::visit(EnumDeclaration)): Likewise.
    (DeclVisitor::visit(VarDeclaration)): Likewise.
    * toir.cc (IRVisitor::check_goto): Likewise.
    (IRVisitor::check_previous_goto): Likewise.
    (IRVisitor::visit(ThrowStatement)): Likewise.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (get_array_length): Use quoted format flag in message.
    (d_build_call): Likewise.
    * d-lang.cc (d_handle_option): Likewise.
    * decl.cc (DeclVisitor::finish_vtable): Likewise.
    * expr.cc (ExprVisitor::visit(ArrayLengthExp)): Likewise.
    (ExprVisitor::visit(DeleteExp)): Likewise.
    (ExprVisitor::visit(RemoveExp)): Likewise.
    (ExprVisitor::visit(RemoveExp)): Likewise.
    (ExprVisitor::visit(CallExp)): Likewise.
    (ExprVisitor::visit(DotVarExp)): Likewise.
    (ExprVisitor::visit(VarExp)): Likewise.
    (ExprVisitor::visit(ScopeExp)): Likewise.
    (ExprVisitor::visit(TypeExp)): Likewise.
    (build_expr): Likewise.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-diagnostic.cc (d_diagnostic_report_diagnostic): Skip translation
    by instead calling diagnostic_set_info_translated.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-tree.h (bool_type_node): Rename to d_bool_type.
    (byte_type_node): Rename to d_byte_type.
    (ubyte_type_node): Rename to d_ubyte_type.
    (short_type_node): Rename to d_short_type.
    (ushort_type_node): Rename to d_ushort_type.
    (int_type_node): Rename to d_int_type.
    (uint_type_node): Rename to d_uint_type.
    (long_type_node): Rename to d_long_type.
    (ulong_type_node): Rename to d_ulong_type.
    (cent_type_node): Rename to d_cent_type.
    (ucent_type_node): Rename to d_ucent_type.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * expr.cc (ExprVisitor::visit(PowExp)): Remove function.

2018-10-19  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-attribs.c: Rename to d-attribs.cc.
    * d-spec.c: Rename to d-spec.cc.

2018-10-19  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-lang.cc (d_gimplify_expr): Don't handle TREE_THIS_VOLATILE.

2018-10-19  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-diagnostic.cc (vwarning): Update to use Diagnostic enum.
    (vdeprecation): Likewise.
    (vdeprecationSupplemental): Likewise.
    * d-lang.cc (d_init_options): Explicitly set warnings and deprecations
    as DIAGNOSTICoff.
    (d_handle_option): Update to use Diagnostic enum.
    (d_post_options): Likewise.

2018-10-18  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-diagnostic.cc (expand_format): Rename to expand_d_format.
    Updated all callers.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (get_linemap): Rename function to make_location_t.
    Updated all callers.
    * d-tree.h (get_linemap): Rename declaration to make_location_t.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * expr.cc (ExprVisitor::binary_op): Use POINTER_DIFF_EXPR.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (expand_intrinsic_bsf): Assert that built-in function
    code is not END_BUILTINS.
    (expand_intrinsic_bsr): Likewise.
    (expand_intrinsic_bswap): Likewise.
    (expand_intrinsic_popcnt): Likewise.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * config-lang.in (gtfiles): Add modules.cc.
    * modules.cc: Include gt-d-modules.h.
    (module_info): Mark with GTY.
    (static_ctor_list): Likewise.
    (static_dtor_list): Likewise.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-spec.c (lang_specific_driver): Use strrchr and strcmp to check
    input file suffix.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-spec.c (phobos_action): New enum.
    (library): Rename to phobos_library.
    (lang_specific_driver): Update to use phobos_library.
    (lang_specific_pre_link): Likewise.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-frontend.cc (Port::writelongLE): Remove function.
    (Port::writelongBE): Remove function.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-convert.cc (convert): Remove goto maybe_fold.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (warn_for_null_address): New function.
    (build_boolop): Warn about comparing address of decl to null.
    * d-convert.cc (decl_with_nonnull_addr_p): New function.
    (d_truthvalue_conversion): Warn about evaluating address as boolean.
    * d-tree.h (decl_with_nonnull_addr_p): Add declaration.
    * lang.opt (Waddress): New option.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (d_array_length): Assert that argument type is a
    dynamic array.
    (d_array_ptr): Likewise.
    (d_array_value): Likewise.
    (delegate_method): Assert that argument type is a delegate.
    (delegate_object): Likewise.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-attribs.c (handle_malloc_attribute): Use gcc_assert instead of
    gcc_unreachable.
    (handle_pure_attribute): Likewise.
    (handle_nothrow_attribute): Likewise.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * Make-lang.in: Rename compiler proper to d21.
    * config-lang.in (compilers): Rename compiler to d21.
    * d-spec.c (lang_specific_driver): Update comments.
    * lang-specs.h: Rename compiler to d21.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * lang.opt: Add missing periods to the ends of sentences.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-lang.cc (d_handle_option): Remove handling of -fdeps.
    (d_parse_file): Don't generate module dependencies.
    * lang.opt (fdeps, fdeps=): Remove options.
    (fintfc, fintfc-dir=, fintfc-file=): Remove options.
    (ftransition=safe): Remove option.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-lang.cc (d_init_ts): Remove handling of IASM_EXPR.
    (d_gimplify_expr): Likewise.
    * d-tree.def (IASM_EXPR): Remove tree code.

2018-10-15  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-attrib.c (attr_noreturn_exclusions): Attribute not mutually
    exclusive with self.
    * typeinfo.cc (TypeInfoVisitor::layout_interfaces): Assert that
    base class vtable is found in interface.

2018-10-08  Iain Buclaw  <ibuclaw@gdcproject.org>

    * decl.cc (DeclVisitor): Add using Visitor::visit.
    * expr.cc (ExprVisitor): Likewise.
    * imports.cc (ImportVisitor): Likewise.
    * toir.cc (IRVisitor): Likewise.
    * typeinfo.cc (TypeInfoVisitor): Likewise.
    (TypeInfoDeclVisitor): Likewise.
    (SpeculativeTypeVisitor): Likewise.
    * types.cc (TypeVisitor): Likewise.

2018-10-01  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-frontend.cc: Include compiler.h, errors.h, expression.h.
    (genCmain): Rename function to Compiler::genCmain.
    (Compiler::paintAsType): New function.
    (Compiler::loadModule): New function.
    (getTypeInfoType): Call error function directly.
    * d-lang.cc (deps_write): Use hash_set for dependency tracking.
    (d_parse_file): Call Compiler::loadModule.
    * d-target.cc: Remove include identifier.h, module.h.
    (Target::paintAsType): Remove function.
    (Target::loadModule): Remove function.
    (Target::getTargetInfo): New function.

2018-10-01  Eugene Wissner  <belka@caraus.de>

    * decl.cc (finish_thunk): Adjust call to cgraph_node::create_thunk.

2018-09-25  Eugene Wissner  <belka@caraus.de>

    * d-codegen.cc (d_assert_call): Don't make STRING_CSTs larger than they
    are.
    * expr.cc (ExprVisitor::visit(StringExp)): Likewise.
    * typeinfo.cc (TypeInfoVisitor::layout_string): Likewise.

2018-09-24  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-builtins.cc: Include expression.h, identifier.h.
    * d-codegen.cc: Include identifier.h.
    * d-convert.cc: Include declaration.h.
    * d-frontend.cc: Include identifier.h.
    * d-lang.cc: Include declaration.h, expression.h, identifier.h.
    (d_parse_file): Call moduleToBuffer to get string dump of contents.
    * d-target.cc: Include declaration.h, expression.h, identifier.h.
    * expr.cc: Include identifier.h.
    * imports.cc: Include identifier.h.
    * intrinsics.cc: Include identifier.h.
    * modules.cc: Include identifier.h.
    * toir.cc: Include expression.h, identifier.h.
    * typeinfo.cc: Include expression.h, identifier.h.
    * types.cc: Include expression.h, identifier.h.
---

[-- Attachment #2: 02-v4v5-d-frontend-gdc.patch.xz --]
[-- Type: application/octet-stream, Size: 45444 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-22  1:13         ` Iain Buclaw
@ 2018-10-23 14:47           ` Richard Sandiford
  2018-10-23 19:49             ` Iain Buclaw
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Sandiford @ 2018-10-23 14:47 UTC (permalink / raw)
  To: Iain Buclaw; +Cc: gcc-patches, David Malcolm

Iain Buclaw <ibuclaw@gdcproject.org> writes:
> I'm just going to post the diff since the original here, just to show
> what's been done since review comments.
>
> I think I've covered all that's been addressed, except for the couple
> of notes about the quadratic parts (though I think one of them is
> actually O(N^2)).  I've raised bug reports on improving them later.
>
> I've also rebased them against trunk, so there's a couple new things
> present that are just to support build.

Thanks, this is OK when the frontend is accepted in principle
(can't remember where things stand with that).

Richard

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 02/14] Add D frontend (GDC) implementation.
  2018-10-23 14:47           ` Richard Sandiford
@ 2018-10-23 19:49             ` Iain Buclaw
  2018-10-25 14:12               ` Is the D frontend good to go? (was Re: [PATCH 02/14] Add D frontend (GDC) implementation.) David Malcolm
  0 siblings, 1 reply; 19+ messages in thread
From: Iain Buclaw @ 2018-10-23 19:49 UTC (permalink / raw)
  To: gcc-patches, David Malcolm, richard.sandiford

On Tue, 23 Oct 2018 at 15:48, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > I'm just going to post the diff since the original here, just to show
> > what's been done since review comments.
> >
> > I think I've covered all that's been addressed, except for the couple
> > of notes about the quadratic parts (though I think one of them is
> > actually O(N^2)).  I've raised bug reports on improving them later.
> >
> > I've also rebased them against trunk, so there's a couple new things
> > present that are just to support build.
>
> Thanks, this is OK when the frontend is accepted in principle
> (can't remember where things stand with that).
>

As discussed, the front-end has already been approved by the SC.

I'm not sure if there's anything else further required, or if any
final review needs to be done.

Thanks.
-- 
Iain

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Is the D frontend good to go? (was Re: [PATCH 02/14] Add D frontend (GDC) implementation.)
  2018-10-23 19:49             ` Iain Buclaw
@ 2018-10-25 14:12               ` David Malcolm
  2018-10-25 15:01                 ` Iain Buclaw
  0 siblings, 1 reply; 19+ messages in thread
From: David Malcolm @ 2018-10-25 14:12 UTC (permalink / raw)
  To: Iain Buclaw, gcc-patches, richard.sandiford

On Tue, 2018-10-23 at 19:21 +0200, Iain Buclaw wrote:
> On Tue, 23 Oct 2018 at 15:48, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
> > 
> > Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > > I'm just going to post the diff since the original here, just to
> > > show
> > > what's been done since review comments.
> > > 
> > > I think I've covered all that's been addressed, except for the
> > > couple
> > > of notes about the quadratic parts (though I think one of them is
> > > actually O(N^2)).  I've raised bug reports on improving them
> > > later.
> > > 
> > > I've also rebased them against trunk, so there's a couple new
> > > things
> > > present that are just to support build.
> > 
> > Thanks, this is OK when the frontend is accepted in principle
> > (can't remember where things stand with that).
> > 
> 
> As discussed, the front-end has already been approved by the SC.
> 
> I'm not sure if there's anything else further required, or if any
> final review needs to be done.
> 
> Thanks.

I'm wondering what the state of this is [1]

Iain: are all of the patches individually approved, with the necessary
issues fixed?

IIRC, the front-end as a whole was approved, pending approval of all of
the individual patches (URL?).  If that's done, then presumably this is
good to go in - unless there was still some license discussion pending?

I take it that you've already got your contributor paperwork in place,
right?  I see from your maintainers commit that you presumably have svn
access.

I'm not a global reviewer or steering committee member though; would be
nice to get a "go for it" from one of those.  Richard is a global
reviewer.

I'm not sure if it should be one big mega-commit, or split out the same
way you split things out for review.

Thanks for all your work on this
Dave

[1] I've been checking the git mirror every few hours to look for a
massive commit from you, if I'm honest :)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Is the D frontend good to go? (was Re: [PATCH 02/14] Add D frontend (GDC) implementation.)
  2018-10-25 14:12               ` Is the D frontend good to go? (was Re: [PATCH 02/14] Add D frontend (GDC) implementation.) David Malcolm
@ 2018-10-25 15:01                 ` Iain Buclaw
  2018-10-26 10:07                   ` Richard Biener
  0 siblings, 1 reply; 19+ messages in thread
From: Iain Buclaw @ 2018-10-25 15:01 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches, richard.sandiford

On Thu, 25 Oct 2018 at 15:06, David Malcolm <dmalcolm@redhat.com> wrote:
>
> On Tue, 2018-10-23 at 19:21 +0200, Iain Buclaw wrote:
> > On Tue, 23 Oct 2018 at 15:48, Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> > >
> > > Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > > > I'm just going to post the diff since the original here, just to
> > > > show
> > > > what's been done since review comments.
> > > >
> > > > I think I've covered all that's been addressed, except for the
> > > > couple
> > > > of notes about the quadratic parts (though I think one of them is
> > > > actually O(N^2)).  I've raised bug reports on improving them
> > > > later.
> > > >
> > > > I've also rebased them against trunk, so there's a couple new
> > > > things
> > > > present that are just to support build.
> > >
> > > Thanks, this is OK when the frontend is accepted in principle
> > > (can't remember where things stand with that).
> > >
> >
> > As discussed, the front-end has already been approved by the SC.
> >
> > I'm not sure if there's anything else further required, or if any
> > final review needs to be done.
> >
> > Thanks.
>
> I'm wondering what the state of this is [1]
>
> Iain: are all of the patches individually approved, with the necessary
> issues fixed?
>

I've posted diffs a few days back that cover all requested changes.

> IIRC, the front-end as a whole was approved, pending approval of all of
> the individual patches (URL?).  If that's done, then presumably this is
> good to go in - unless there was still some license discussion pending?
>

I have on tab responses from each patch, from what I see, they have
all been OK'd.

02: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01432.html
03: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00734.html
04: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00928.html
05: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00592.html
06: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00609.html
07: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00955.html
08: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01270.html
09: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01264.html
10: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01269.html
12: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00735.html
14: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00970.html

1, 11, and 13 are DMD, Druntime, and Phobos, which are mirrored from
upstream dlang repositories.  I spoke with Richard Stallman a couple
days after this years GNU Cauldron, and he said there is no problem
with regards to their license.

> I take it that you've already got your contributor paperwork in place,
> right?  I see from your maintainers commit that you presumably have svn
> access.
>
> I'm not a global reviewer or steering committee member though; would be
> nice to get a "go for it" from one of those.  Richard is a global
> reviewer.
>

Yes, I was going to wait a couple days to make sure that there's no
objection, before pressing on with it.

Having a "go for it" from one of the reviewers would be nice though.

> I'm not sure if it should be one big mega-commit, or split out the same
> way you split things out for review.
>

I think splitting makes sense, though not necessarily in 14 pieces,
there are only a few distinct parts.

- D language front-end.
- D standard and runtime libraries.
- D language testsuite
- D language support in GCC proper
- D language support in GCC targets
- Toplevel configure/makefile patches that add front-end and library
to the build

The first three can be squashed into one commit, as it's only adding new files.

> Thanks for all your work on this
> Dave
>
> [1] I've been checking the git mirror every few hours to look for a
> massive commit from you, if I'm honest :)

Oops, I didn't realise there were some who are so eager. :-)

-- 
Iain

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Is the D frontend good to go? (was Re: [PATCH 02/14] Add D frontend (GDC) implementation.)
  2018-10-25 15:01                 ` Iain Buclaw
@ 2018-10-26 10:07                   ` Richard Biener
  2018-10-28 23:55                     ` Iain Buclaw
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Biener @ 2018-10-26 10:07 UTC (permalink / raw)
  To: ibuclaw; +Cc: David Malcolm, GCC Patches, Richard Sandiford

On Thu, Oct 25, 2018 at 4:13 PM Iain Buclaw <ibuclaw@gdcproject.org> wrote:
>
> On Thu, 25 Oct 2018 at 15:06, David Malcolm <dmalcolm@redhat.com> wrote:
> >
> > On Tue, 2018-10-23 at 19:21 +0200, Iain Buclaw wrote:
> > > On Tue, 23 Oct 2018 at 15:48, Richard Sandiford
> > > <richard.sandiford@arm.com> wrote:
> > > >
> > > > Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > > > > I'm just going to post the diff since the original here, just to
> > > > > show
> > > > > what's been done since review comments.
> > > > >
> > > > > I think I've covered all that's been addressed, except for the
> > > > > couple
> > > > > of notes about the quadratic parts (though I think one of them is
> > > > > actually O(N^2)).  I've raised bug reports on improving them
> > > > > later.
> > > > >
> > > > > I've also rebased them against trunk, so there's a couple new
> > > > > things
> > > > > present that are just to support build.
> > > >
> > > > Thanks, this is OK when the frontend is accepted in principle
> > > > (can't remember where things stand with that).
> > > >
> > >
> > > As discussed, the front-end has already been approved by the SC.
> > >
> > > I'm not sure if there's anything else further required, or if any
> > > final review needs to be done.
> > >
> > > Thanks.
> >
> > I'm wondering what the state of this is [1]
> >
> > Iain: are all of the patches individually approved, with the necessary
> > issues fixed?
> >
>
> I've posted diffs a few days back that cover all requested changes.
>
> > IIRC, the front-end as a whole was approved, pending approval of all of
> > the individual patches (URL?).  If that's done, then presumably this is
> > good to go in - unless there was still some license discussion pending?
> >
>
> I have on tab responses from each patch, from what I see, they have
> all been OK'd.
>
> 02: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01432.html
> 03: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00734.html
> 04: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00928.html
> 05: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00592.html
> 06: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00609.html
> 07: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00955.html
> 08: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01270.html
> 09: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01264.html
> 10: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01269.html
> 12: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00735.html
> 14: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00970.html
>
> 1, 11, and 13 are DMD, Druntime, and Phobos, which are mirrored from
> upstream dlang repositories.  I spoke with Richard Stallman a couple
> days after this years GNU Cauldron, and he said there is no problem
> with regards to their license.
>
> > I take it that you've already got your contributor paperwork in place,
> > right?  I see from your maintainers commit that you presumably have svn
> > access.
> >
> > I'm not a global reviewer or steering committee member though; would be
> > nice to get a "go for it" from one of those.  Richard is a global
> > reviewer.
> >
>
> Yes, I was going to wait a couple days to make sure that there's no
> objection, before pressing on with it.
>
> Having a "go for it" from one of the reviewers would be nice though.
>
> > I'm not sure if it should be one big mega-commit, or split out the same
> > way you split things out for review.
> >
>
> I think splitting makes sense, though not necessarily in 14 pieces,
> there are only a few distinct parts.
>
> - D language front-end.
> - D standard and runtime libraries.
> - D language testsuite
> - D language support in GCC proper
> - D language support in GCC targets
> - Toplevel configure/makefile patches that add front-end and library
> to the build
>
> The first three can be squashed into one commit, as it's only adding new files.

If you make sure each individual commit still builds splitting is OK, though
technically I see no need for splitting.

Go for it!

Richard.

> > Thanks for all your work on this
> > Dave
> >
> > [1] I've been checking the git mirror every few hours to look for a
> > massive commit from you, if I'm honest :)
>
> Oops, I didn't realise there were some who are so eager. :-)
>
> --
> Iain

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Is the D frontend good to go? (was Re: [PATCH 02/14] Add D frontend (GDC) implementation.)
  2018-10-26 10:07                   ` Richard Biener
@ 2018-10-28 23:55                     ` Iain Buclaw
  0 siblings, 0 replies; 19+ messages in thread
From: Iain Buclaw @ 2018-10-28 23:55 UTC (permalink / raw)
  To: Richard Guenther; +Cc: David Malcolm, gcc-patches, richard.sandiford

On Fri, 26 Oct 2018 at 11:02, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Thu, Oct 25, 2018 at 4:13 PM Iain Buclaw <ibuclaw@gdcproject.org> wrote:
> >
> > On Thu, 25 Oct 2018 at 15:06, David Malcolm <dmalcolm@redhat.com> wrote:
> > >
> > > On Tue, 2018-10-23 at 19:21 +0200, Iain Buclaw wrote:
> > > > On Tue, 23 Oct 2018 at 15:48, Richard Sandiford
> > > > <richard.sandiford@arm.com> wrote:
> > > > >
> > > > > Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > > > > > I'm just going to post the diff since the original here, just to
> > > > > > show
> > > > > > what's been done since review comments.
> > > > > >
> > > > > > I think I've covered all that's been addressed, except for the
> > > > > > couple
> > > > > > of notes about the quadratic parts (though I think one of them is
> > > > > > actually O(N^2)).  I've raised bug reports on improving them
> > > > > > later.
> > > > > >
> > > > > > I've also rebased them against trunk, so there's a couple new
> > > > > > things
> > > > > > present that are just to support build.
> > > > >
> > > > > Thanks, this is OK when the frontend is accepted in principle
> > > > > (can't remember where things stand with that).
> > > > >
> > > >
> > > > As discussed, the front-end has already been approved by the SC.
> > > >
> > > > I'm not sure if there's anything else further required, or if any
> > > > final review needs to be done.
> > > >
> > > > Thanks.
> > >
> > > I'm wondering what the state of this is [1]
> > >
> > > Iain: are all of the patches individually approved, with the necessary
> > > issues fixed?
> > >
> >
> > I've posted diffs a few days back that cover all requested changes.
> >
> > > IIRC, the front-end as a whole was approved, pending approval of all of
> > > the individual patches (URL?).  If that's done, then presumably this is
> > > good to go in - unless there was still some license discussion pending?
> > >
> >
> > I have on tab responses from each patch, from what I see, they have
> > all been OK'd.
> >
> > 02: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01432.html
> > 03: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00734.html
> > 04: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00928.html
> > 05: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00592.html
> > 06: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00609.html
> > 07: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00955.html
> > 08: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01270.html
> > 09: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01264.html
> > 10: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01269.html
> > 12: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00735.html
> > 14: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00970.html
> >
> > 1, 11, and 13 are DMD, Druntime, and Phobos, which are mirrored from
> > upstream dlang repositories.  I spoke with Richard Stallman a couple
> > days after this years GNU Cauldron, and he said there is no problem
> > with regards to their license.
> >
> > > I take it that you've already got your contributor paperwork in place,
> > > right?  I see from your maintainers commit that you presumably have svn
> > > access.
> > >
> > > I'm not a global reviewer or steering committee member though; would be
> > > nice to get a "go for it" from one of those.  Richard is a global
> > > reviewer.
> > >
> >
> > Yes, I was going to wait a couple days to make sure that there's no
> > objection, before pressing on with it.
> >
> > Having a "go for it" from one of the reviewers would be nice though.
> >
> > > I'm not sure if it should be one big mega-commit, or split out the same
> > > way you split things out for review.
> > >
> >
> > I think splitting makes sense, though not necessarily in 14 pieces,
> > there are only a few distinct parts.
> >
> > - D language front-end.
> > - D standard and runtime libraries.
> > - D language testsuite
> > - D language support in GCC proper
> > - D language support in GCC targets
> > - Toplevel configure/makefile patches that add front-end and library
> > to the build
> >
> > The first three can be squashed into one commit, as it's only adding new files.
>
> If you make sure each individual commit still builds splitting is OK, though
> technically I see no need for splitting.
>
> Go for it!
>

Bootstrapped with --enable-languages=all both before and after, and
ran all front-end testsuites on x86_64-linux-gnu, and saw no new
regressions.

So I'm going for it.  Just rebased on trunk and running just the D
testsuite one last time for my own sanity and committing.

Thanks.
--
Iain

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-10-28 18:12 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-18  0:34 [PATCH 02/14] Add D frontend (GDC) implementation Iain Buclaw
2018-09-19 20:06 ` Iain Buclaw
2018-10-14 19:28   ` Richard Sandiford
2018-10-15  7:36     ` Iain Buclaw
2018-10-20 10:51       ` Richard Sandiford
2018-10-20 15:53         ` Iain Buclaw
2018-10-20 22:57         ` Iain Buclaw
2018-10-22  1:13         ` Iain Buclaw
2018-10-23 14:47           ` Richard Sandiford
2018-10-23 19:49             ` Iain Buclaw
2018-10-25 14:12               ` Is the D frontend good to go? (was Re: [PATCH 02/14] Add D frontend (GDC) implementation.) David Malcolm
2018-10-25 15:01                 ` Iain Buclaw
2018-10-26 10:07                   ` Richard Biener
2018-10-28 23:55                     ` Iain Buclaw
2018-10-16 11:14   ` [PATCH 02/14] Add D frontend (GDC) implementation Richard Sandiford
2018-10-16 11:39     ` Richard Sandiford
2018-10-21  0:02     ` Iain Buclaw
2018-10-15 14:25 ` David Malcolm
2018-10-15 16:46   ` Iain Buclaw

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).