public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [00/10][RFC] Splitting the C and C++ concept of "complete type"
@ 2018-10-15 14:32 Richard Sandiford
  2018-10-15 14:33 ` [01/10] Expand COMPLETE_TYPE_P in obvious checks for null Richard Sandiford
                   ` (11 more replies)
  0 siblings, 12 replies; 33+ messages in thread
From: Richard Sandiford @ 2018-10-15 14:32 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, jason, nathan, nd

The C standard says:

    At various points within a translation unit an object type may be
    "incomplete" (lacking sufficient information to determine the size of
    objects of that type) or "complete" (having sufficient information).

For AArch64 SVE, we'd like to split this into two concepts:

  * has the type been fully defined?
  * would fully-defining the type determine its size?

This is because we'd like to be able to represent SVE vectors as C and C++
types.  Since SVE is a "vector-length agnostic" architecture, the size
of the vectors is determined by the runtime environment rather than the
programmer or compiler.  In that sense, defining an SVE vector type does
not determine its size.  It's nevertheless possible to use SVE vector types
in meaningful ways, such as having automatic vector variables and passing
vectors between functions.

The main questions in the RFC are:

  1) is splitting the definition like this OK in principle?
  2) are the specific rules described below OK?
  3) coding-wise, how should the split be represented in GCC?

Terminology
-----------

Going back to the second bullet above:

  * would fully-defining the type determine its size?

the rest of the RFC calls a type "sized" if fully defining it would
determine its size.  The type is "sizeless" otherwise.

Contents
--------

The RFC is organised as follows.  I've erred on the side of including
detail rather than leaving it out, but each section is meant to be
self-contained and skippable:

  - An earlier RFC
  - Quick overview of SVE
  - Why we need SVE types in C and C++
  - How we ended up with this definition
  - The SVE types in more detail
  - Outline of the type system changes
  - Sizeless structures (and testing on non-SVE targets)
  - Other variable-length vector architectures
  - Edits to the C standard
    - Base changes
    - Updates for consistency
    - Sizeless structures
  - Edits to the C++ standard
  - GCC implementation questions

I'll follow up with patches that implement the split.



An earlier RFC
==============

For the record (in case this sounds familiar) I sent an RFC about the
sizeless type extension a while ago:

    https://gcc.gnu.org/ml/gcc/2017-08/msg00012.html

The rules haven't changed since then, but this version includes more
information and includes support for sizeless structures.


Quick overview of SVE
=====================

SVE is a vector extension to AArch64.  A detailed description is
available here:

    https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf

but the only feature that really matters for this RFC is that SVE has no
fixed or preferred vector length.  Implementations can instead choose
from a range of possible vector lengths, with 128 bits being the minimum
and 2048 bits being the maximum.  Priveleged software can further
constrain the vector length within the range offered by the implementation;
e.g. linux currently provides per-thread control of the vector length.


Why we need SVE types in C and C++
==================================

SVE was designed to be an easy target for autovectorising normal scalar
code.  There are also various language extensions that support explicit
data parallelism or that make explicit vector chunking easier to do in
an architecture-neutral way (e.g. C++ P0214).  This means that many users
won't need to do anything SVE-specific.

Even so, there's always going to be a place for writing SVE-specific
optimisations, with full access to the underlying ISA.  As for other
vector architectures, we'd like users to be able to write such routines
in C and C++ rather than force them to go all the way to assembly.

We'd also like C and C++ functions to be able to take SVE vector
parameters and return SVE vector results, which is particularly useful
when implementing things like vector math routines.  In this case in
particular, the types need to map directly to something that fits in
an SVE register, so that passing and returning vectors has minimal
overhead.


How we ended up with this definition
====================================

Requirements
------------

We need the SVE vector types to define and use SVE intrinsic functions
and to write SVE vector library routines.  The key requirements when
defining the types were:

  * They must be available in both C and C++ (because we want to be able
    add SVE optimisations to C-only codebases).

  * They must fit in an SVE vector register (so there can be no on-the-side
    information).

  * It must be possible to define automatic variables with these types.

  * It must be possible to pass and return objects of these types
    (since that's what intrinsics and vector library routines need to do).

  * It must be possible to use the types in _Generic associations
    (so that _Generic can be used to provide tgmath.h-style overloads).

  * It must be possible to use pointers or references to the types
    (for passing or returning by pointer or reference, and because not
    allowing references would be semantically difficult in C++).

Ideally, there'd also be a way of grouping SVE vectors together into tuples,
since the ISA has instructions like LD2 that return multiple vectors.
It would be good if users could also define their own tuple types, on top
of the ones needed by the intrinsics, although that's more "nice to have".

Possible approaches
-------------------

The main complication is that the size of an SVE vector is not a
compile-time constant.  It seems that any approach to handling this
would fall into one of three categories:

  (1) Limit the types in such a way that there is no concept of size.

  (2) Define the size of the types to be variable.

  (3) Define the size of the types to be constant, either with the
      constant being large enough for all possible vector lengths or
      with the types pointing to separate memory (as for C++ classes
      like std::string).

Why (2) seemed like a bad idea
------------------------------

(2) seemed initially appealing since C already has the concept of
variable-length arrays.  However, variable-length built-in types
would work in a significantly different way.  Arrays often decay to
pointers (which of course are fixed-length types), whereas vector
types never would.  Unlike arrays, it should be possible to pass
variable-length vectors to functions, return them from functions,
and assign them by value.

One particular difficulty is that the semantics of variable-length arrays
rely on having a point at which the array size is evaluated.  It would
be difficult to extend this approach to declarations of functions that
pass or return variable-length types.

As well as the extension itself being relatively complex (especially
for C++), it might be difficult to define it in a way that interacts
naturally with other (unseen) extensions, even those that are aware of
variable-length arrays.  Also, AIUI, variable-length arrays were added
to an early draft of C++14, but were later removed as too controversial
and didn't make it into the final standard.  C++17 still requires sizeof
to be constant and C11 makes variable-length arrays optional.

(2) therefore felt like a complicated dead-end.

Why (3) seemed like a bad idea
------------------------------

(3) can be divided into two:

(3a) The vector types have a constant size and are large enough for all
     possible vector lengths.

    The main problem with this is that the maximum size of an SVE
    vector (2048 bits) is much larger than the minimum size (128 bits).
    Using a fixed size of 2048 bits would be extremely inefficient for
    smaller vector lengths, and of course the whole point of using
    vectors is to make things *more* efficient.

    Also, we would need to define the types such that only the bytes
    associated with the actual vector length are significant.  This would
    make it possible to pass or return the types in registers and treat
    them as register values when copying.  This perhaps has some similarity
    with overaligned structures such as:

	struct s { _Alignas(16) int i; };

    except that the amount of padding would only be known at runtime.

    There's also a significant conceptual problem: encoding a fixed size
    goes against a guiding principle of SVE, in which there is no preferred
    vector length.  There's nothing particularly magical about the current
    limit of 2048 bits and it would be better to avoid an ABI break if the
    maximum ever did increase in future.

(3b) The vector types have a constant size and refer to separate storage
     (as for std::string etc.)

    This would be difficult to do without C++-style constructor, destructor,
    copy and move semantics, so wouldn't work well in C.  And in C++ it would
    be less efficient than the proposed approach, since presumably an Allocator
    would be needed to allocate the separate storage.  It would also require
    a complicated ABI mapping to ensure that the vectors can still be passed
    and returned in registers.

Chosen approach
---------------

We therefore took approach (1) and classified C and C++ types as "sized"
(having a measurable size when fully-defined) and "sizeless" (never
having a measurable size).  Sizeless types have no defined size,
alignment or layout at the language level, with those things becoming
purely an ABI-level detail.

We then treated all sizeless types as permanently incomplete.
On its own, this would put them in a similar situation to "void"
(although they wouldn't be exactly the same, since there are some
specific rules for "void" that don't apply to incomplete types in
general).  We then relaxed specific rules until the types were actually
useful.

Things in favour of (1)
-----------------------

The reasons above were mostly negative, arriving at (1) by elimination.
A more positive justification of this approach is that it seems
to meet the requirements in the most efficient way possible.  The
vectors can use their natural (native) representation, and the type
system prevents uses that would make that representation problematic.

Also, the approach of starting with very restricted types and then
specifically allowing certain things should be more future-proof
and interact better with other language extensions.  By default,
any language extension would treat the new types like other incomplete
types and choose conservatively-correct behaviour.  It would then be
possible to relax the language extension if this default behaviour
turns out to be too restrictive.

(That said, treating the types as permanently incomplete still won't
avoid all clashes with other extensions.  For example, we need to
allow objects of automatic storage duration to have certain forms of
incomplete type, whereas an extension might implicitly assume that all
such objects must already have complete type.  The approach should still
avoid the worst effects though.)


The SVE types in more detail
============================

Arm has published an SVE "ACLE" that specifies the SVE types and intrinsic
functions in detail.  For reference this is available without registration
at:

    https://static.docs.arm.com/100987/0000/acle_sve_100987_0000_00_en.pdf

but I'll try to keep this self-contained.

The ACLE defines a vector type sv<base>_t for each supported element type
<base>_t, so that the complete set is:

    svint8_t      svint16_t     svint32_t     svint64_t
    svuint8_t     svuint16_t    svuint32_t    svuint64_t
                  svfloat16_t   svfloat32_t   svfloat64_t

The types in each column have the same number of lanes and have twice
as many lanes as those in the column to the right.  Every vector has
the same number of bytes in total, with the number of bytes being
determined at runtime.

The ACLE also defines a single predicate type:

    svbool_t

that has the same number of lanes as svint8_t and svuint8_t.

All these types are opaque builtin types and are only expected to
be used with the associated ACLE intrinsics.  There are intrinsics for
creating vectors from scalars, loading from scalars, storing to scalars,
reinterpreting one type as another, etc.

The idea is that the vector types would only be used for short-term
register-sized working data.  Longer-term data would typically be stored
out to arrays.

For example, the vector function underlying:

    #pragma omp declare simd
    double sin(double);

would be:

    svfloat64_t mangled_sin(svfloat64_t, svbool_t);

(The svbool_t is because SVE functions should be predicated by default,
to avoid the need for a scalar epilogue loop.)

The ACLE also defines x2, x3 and x4 tuple types for each vector type;
for example, svint8x3_t is a tuple of 3 svint8_ts.  The tuples are
structure-like types with fields v0, v1, v2 and v3, up to the number
required.


Outline of the type system changes
==================================

Going back to the summary at the start of the RFC, C classifies types as
"complete" (the size of objects can be calculated) or "incomplete" (the
size of objects can't be calculated).  There's very little you can do
with a type until it becomes complete.

The approach we took was to treat all the SVE types as permanently
incomplete.  We then went through the standard relaxing specific
rules until the types were actually useful.

The first step was to classify types as:

  * "indefinite" (lacking sufficient information to create an object of
    that type) or "definite" (having sufficient information)

  * "sized" (will have a known size when definite) or "sizeless" (will
    never have a known size)

  * "incomplete" (lacking sufficient information to determine the size of
    objects of that type) or "complete" (having sufficient information)

where the wording for the final bullet is unchanged from the standard.
Thus a "definite type" is one that has been fully-defined rather than
simply declared, and "complete" is now equivalent to "sized and definite".
All standard types are "sized" (even "void", although it's always
indefinite and incomplete).

We then needed to make some rules use the distinction between "indefinite"
and "definite" rather than "incomplete" and "complete".  The specific
things we wanted to allow were:

  * defining automatic variables with sizeless definite type
  * defining functions whose parameters have sizeless definite type
  * defining functions that return a sizeless definite type
  * using sizeless definite types in _Generic associations
  * dereferencing pointers to sizeless definite types

Specific things we wanted to remain invalid -- by inheriting the rules from
incomplete types -- were:

  * creating or accessing arrays that have sizeless element types
  * doing pointer arithmetic on pointers to sizeless types
  * using sizeof and _Alignof with a sizeless type (or object of sizeless type)
  * defining (sized) unions or structures with sizeless members

It also seemed worth adding an extra restriction:

  * variables with sizeless type must not have static or thread-local
    storage duration

In practice it's impossible to define such variables with incomplete type,
but having an explicit rule means that things like:

    extern svint8_t foo;  // An SVE vector of int8_t elements.

are outright invalid rather than simply useless (because no other
translation unit could ever define foo).  Similarly, without an
explicit rule:

    svint8_t foo;         // An SVE vector of int8_t elements.

would be a valid tentative definition at the point it occurs and only
become invalid at the end of the translation unit, because svint8_t is
never completed.

This restriction isn't critical but it gives better diagnostics.


Sizeless structures (and testing on non-SVE targets)
====================================================

We're planning to build all SVE intrinsic types directly into GCC
(patches already written).  SVE therefore doesn't strictly need a syntax
for creating new sizeless types in C and C++.  However, having a way of
creating new structure-like "sizeless" types would be useful for three
reasons:

  - Functions could return arbitrary data by value.  The SVE ABI allows
    a function to return up to 8 vectors and 4 predicates in registers,
    which is far more flexible than the intrinsic types.

  - We could use these sizeless structure types to test the functionality
    on all targets.

  - A lot of the C++ frontend is concerned with classes, and having
    a way of creating sizeless classes would help make the C++ changes
    more consistent.

The patches therefore add a new "__sizeless_struct" keyword to denote
structures that are sizeless rather than sized.  Unlike normal
structures, these structures can have members of sizeless type in
addition to members of sized type.  On the other hand, they have all
the same limitations as other sizeless types (described in earlier
sections).

E.g., a sizeless structure definition might look like:

    __sizeless_struct data {
      double *array;
      svuint64_t indices;  // An SVE vector of uint64_t elements.
      svbool_t active;     // An SVE predicate.
    };

Adding a new keyword seemed better than using an attribute because it
means that the sized vs. sizeless distinction is fixed by the declaration.
E.g.:

    struct data;                     // Is it sized or sizeless?
    extern struct data global_data;  // OK if sized, not if sizeless.
    struct __attribute__((sizeless)) data {
      double *array;
      svuint64_t indices;            // An SVE vector of uint64_t elements.
      svbool_t active;               // An SVE predicate.
    };

would lead to the declaration of "global_data" sneaking through
despite being invalid when "data" is sizeless.

The tests in the patches all use these __sizeless_structs; they contain
nothing SVE- or AArch64-specific.


Other variable-length vector architectures
==========================================

The proposed RISC-V vector extension also has variable-length vectors.
When this language change was discussed on the clang developers' list,
Bruce Hoult (from SiFive, but speaking personally) replied with:

    http://lists.llvm.org/pipermail/cfe-dev/2018-May/057943.html

That message covers some of the background about the vector extension.
On the language changes, Bruce said:

    > However, even though the length is variable, the concept of a
    > "register-sized" C and C++ vector type makes just as much sense for SVE
    > as it does for other vector architectures.  Vector library functions
    > take such register-sized vectors as input and return them as results.
    > Intrinsic functions are also just as useful as they are for other vector
    > architectures, and they too take register-sized vectors as input and
    > return them as results.

    Intrinsic functions are absolutely required, and are I think the main
    reason for such a low-level register-sized vector type to exist.

[ Bruce went on to say:

    I'm not sure whether user-written functions operating on register-sized
    vectors are useful enough to support. User-written functions would normally
    take and return a higher-level vector type, and would implement the desired
    functionality in terms of calls to other user-written functions (operating
    on the high level vector as a whole) and/or explicit loops iterating
    through the high level vector type using intrinsic functions on the
    register-sized vector type proposed here.

But this use case is very important for SVE, since it will allow us
to implement vector math routines in a way that works with the OpenMP
"declare simd" construct.  There was also talk on gcc@ recently about
supporting this style of interface for RISC-V. ]

[...]

    > All these types are opaque builtin types and are only intended to be
    > used with the associated ACLE intrinsics.  There are intrinsics for
    > creating vectors from scalars, loading from scalars, storing to scalars,
    > reinterpreting one type as another, etc.
    >
    > The idea is that the vector types would only be used for short-term
    > register-sized working data.  Longer-term data would typically be stored
    > out to arrays.

    I agree with this.

[...]

    > The approach we took was to treat all the SVE types as permanently
    > incomplete.

    This seems reasonable.

So it looks like this extension would be useful for at least one
architecture besides SVE.


Edits to the C standard
=======================

This section specifies the behaviour for sizeless types as an edit to N1570.
There are three stages:

  - base changes, which add enough support for built-in sizeless
    vector types

  - updates for consistency, which change some of the wording without
    changing the meaning

  - support for sizeless structures

In each case, -strikethrough- indicates deleted text and *bold*
includes additional text.


Base changes
------------

These changes are enough to support sizeless built-in vector types.

    6.2.5 Types
    -----------

    1. The meaning of a value stored in an object or returned by a
    function is determined by the type of the expression used to access
    it. … Types are partitioned into object types (types that
    describe objects) and function types (types that describe
    functions).  -At various points within a translation unit an object
    type may be incomplete (lacking sufficient information to determine
    the size of objects of that type) or complete (having sufficient
    information).37)- *Object types are further partitioned into sized and
    sizeless; all basic and derived types defined in this standard are
    sized, but an implementation may provide additional sizeless types.*

    1A. *At various points within a translation unit an object type may
    be indefinite (lacking sufficient information to construct an object
    of that type) or definite (having sufficient information).37) An
    object type is said to be complete if it is both sized and definite;
    all other object types are said to be incomplete.  Complete types
    have sufficient information to determine the size of an object of
    that type while incomplete types do not.*

    1B. *Arrays, structures, unions and enumerated types are always
    sized, so for them the term incomplete is equivalent to (and used
    interchangeably with) the term indefinite.*

    …

    19. The void type comprises an empty set of values; it is -an
    incomplete- *a sized indefinite* object type that cannot be completed
    *(made definite)*.

    …

    37) A type may be -incomplete- *indefinite* or -complete- *definite*
    throughout an entire translation unit, or it may change states at
    different points within a translation unit.

    …

    6.3.2.1 Lvalues, arrays, and function designators
    -------------------------------------------------

    1.  An lvalue is an expression (with an object type other than void)
    that potentially designates an object;64) … A modifiable lvalue is
    an lvalue that does not have array type, does not have an
    -incomplete- *indefinite* type, does not have a const-qualified
    type, …

    2.  Except when it is the operand of the sizeof operator, the
    _Alignof operator, the unary & operator, the ++ operator, the --
    operator, or the left operand of the . operator or an assignment
    operator, an lvalue that does not have array type is converted to
    the value stored in the designated object (and is no longer an
    lvalue); this is called lvalue conversion. … If the lvalue has an
    -incomplete- *indefinite* type and does not have array type, the
    behavior is undefined. …

    …

    6.5.1.1 Generic selection
    -------------------------

    …

    Constraints

    2. A generic selection shall have no more than one default generic
    association. The type name in a generic association shall specify a
    -complete- *definite* object type other than a variably modified
    type. …

    …

    6.5.2.2 Function calls
    ----------------------

    Constraints

    1. The expression that denotes the called function92) shall have
    type pointer to function returning void or returning a -complete-
    *definite* object type other than an array type.

    …

    Semantics

    …

    4. An argument may be an expression of any -complete- *definite* object
    type. …

    …

    6.5.2.5 Compound literals
    -------------------------

    Constraints

    1. The type name shall specify a -complete- *definite* object type or an
    array of unknown size, but not a variable length array type.

    …

    6.7 Declarations
    ----------------

    Constraints

    …

    4A. *If an identifier for an object does not have automatic storage
    duration, its type must be sized rather than sizeless.*

    Semantics

    …

    7. If an identifier for an object is declared with no linkage, the
    type for the object shall be -complete- *definite* by the end of its
    declarator, or by the end of its init-declarator if it has an
    initializer; in the case of function parameters (including in
    prototypes), it is the adjusted type (see 6.7.6.3) that is required
    to be -complete- *definite*.

    …
     
    6.7.6.3 Function declarators (including prototypes) 
    ---------------------------------------------------

    Constraints

    …

    4. After adjustment, the parameters in a parameter type list in a
    function declarator that is part of a definition of that function
    shall not have -incomplete- *indefinite* type.

    …

    6.7.9 Initialization
    --------------------

    Constraints

    …

    3. The type of the entity to be initialized shall be an array of
    unknown size or a -complete- *definite* object type that is not a
    variable length array type.

    …

    6.9.1 Function definitions
    --------------------------

    Constraints

    …

    3. The return type of a function shall be void or a -complete-
    *definite* object type other than array type.

    …

    Semantics

    …

    7. The declarator in a function definition specifies the name of the
    function being defined and the identifiers of its parameters. …
    [T]he type of each parameter is adjusted as described in
    6.7.6.3 for a parameter type list; the resulting type shall be a
    -complete- *definite* object type.

    …

    J.2 Undefined behavior
    ----------------------

        …
      * A non-array lvalue with -an incomplete- *an indefinite* type is used
        in a context that requires the value of the designated object
        (6.3.2.1).
        …
      * An identifier for an object is declared with no linkage and the
        type of the object is -incomplete- *indefinite* after its
        declarator, or after its init-declarator if it has an
        initializer (6.7).
        …
      * An adjusted parameter type in a function definition is not a
        -complete- *definite* object type (6.9.1).
        …

Updates for consistency
-----------------------

These changes are a prerequisite for sizeless structures.  They have no
effect otherwise, but might be preferred anyway because they make the
terminology more consistent.  They apply on top of the previous edits.

    6.2.5 Types
    -----------

    …

    22. An array type of unknown size is an -incomplete- *indefinite*
    type. It is -completed- *made definite*, for an identifier of that type,
    by specifying the size in a later declaration (with internal or
    external linkage). A structure or union type of unknown content (as
    described in 6.7.2.3) is an -incomplete- *indefinite* type. It is
    -completed- *made definite*, for all declarations of that type, by
    declaring the same structure or union tag with its defining content
    later in the same scope.

    …

    6.2.7 Compatible type and composite type
    ----------------------------------------

    1. Two types have compatible type if their types are the same. …
    Moreover, two structure, union, or enumerated types declared in
    separate translation units are compatible if their tags and members
    satisfy the following requirements: If one is declared with a tag,
    the other shall be declared with the same tag. If both are
    -completed- *made definite* anywhere within their respective
    translation units, then the following additional requirements apply: …

    …

    6.7.2.1 Structure and union specifiers
    --------------------------------------

    …

    Semantics

    …

    8. The presence of a struct-declaration-list in a
    struct-or-union-specifier declares a new type, within a translation
    unit. The struct-declaration-list is a sequence of declarations for
    the members of the structure or union.  If the struct-declaration-list
    does not contain any named members, either directly or via an anonymous
    structure or anonymous union, the behavior is undefined.  The type is
    -incomplete- *indefinite* until immediately after the } that terminates
    the list, and -complete- *definite* thereafter.

    …

    6.7.2.2 Enumeration specifiers
    ------------------------------

    …

    Semantics

    …

    4. … The enumerated type is -incomplete- *indefinite* until
    immediately after the } that terminates the list of enumerator
    declarations, and -complete- *definite* thereafter.

    …

    6.7.2.3 Tags
    ------------

    …

    Semantics

    4. All declarations of structure, union, or enumerated types that
    have the same scope and use the same tag declare the same
    type. Irrespective of whether there is a tag or what other
    declarations of the type are in the same translation unit, the type
    is -incomplete- *indefinite* 129) until immediately after the closing
    brace of the list defining the content, and -complete- *definite*
    thereafter.

    …

    8. If a type specifier of the form

    struct-or-union identifier

    occurs other than as part of one of the above forms, and no other
    declaration of the identifier as a tag is visible, then it declares
    an -incomplete- *indefinite* structure or union type, and declares the
    identifier as the tag of that type.131)

    …

    129) An -incomplete- *indefinite* type may only by used when -the
    size of an object- *the ability to create an object* of that type
    is not needed.  It is not needed, for example, when a typedef name
    is declared to be a specifier for a structure or union, or when a
    pointer to or a function returning a structure or union is being
    declared. (See -incomplete- *indefinite* types in 6.2.5.) The
    specification has to be -complete- *definite* before such a function
    is called or defined.

    6.7.6.3 Function declarators (including prototypes) 
    ---------------------------------------------------

    …

    Semantics

    …

    12.  If the function declarator is not part of a definition of that
    function, parameters may have -incomplete- *indefinite* type and may use
    the [*] notation in their sequences of declarator specifiers to
    specify variable length array types.

    …

    J.2 Undefined behavior
    ----------------------

        …
      * When the -complete- *definite* type is needed, an -incomplete-
        *indefinite* structure or union type is not completed in the same
        scope by another declaration of the tag that defines the content
        (6.7.2.3).
        …

Sizeless structures
-------------------

These additional changes to N1570 add the concept of a sizeless structure.
Again they apply on top of the edits above:

    6.2.3 Name spaces of identifiers
    --------------------------------

    1. If more than one declaration of a particular identifier is
    visible at any point in a translation unit, the syntactic context
    disambiguates uses that refer to different entities. Thus, there
    are separate name spaces for various categories of identifiers, as
    follows:

	…

      * the tags of *sized* structures, *sizeless structures,* unions, and
	enumerations (disambiguated by following any32) of the keywords
	struct, *__sizeless_struct,* union, or enum);

	…

    6.2.5 Types
    -----------

    1. … Types are partitioned into object types (types that describe
    objects) and function types (types that describe functions).
    Object types are further partitioned into sized and sizeless;
    -all basic and derived types defined in this standard are
    sized, but an implementation may provide additional sizeless types.-
    *the only sizeless types defined by this standard are __sizeless_structs,
    but an implementation may provide additional sizeless types.*

    …

    1B. Arrays, -structures,- unions and enumerated types are always
    sized, so for them the term incomplete is equivalent to (and used
    interchangeably with) the term indefinite.

    …

    20. Any number of derived types can be constructed from the object
    and function types, as follows: …

      * A *sized* structure type describes a sequentially allocated
        nonempty set of sized member objects (and, in certain
        circumstances, an incomplete array), each of which has an
        optionally specified name and possibly distinct type.

      * *A sizeless structure type describes a set of non-overlapping
        member objects whose types may be sizeless and whose relative
        positions are unspecified.  It is also unspecified whether the
        structure occupies a single contiguous piece of storage or
        whether it requires several disjoint pieces.*

    …

    *20A. The term structure type refers collectively to sized structure
    types and sizeless structure types.*

    …

    6.4.1 Keywords
    --------------

    Syntax

    1. *(Add __sizeless_struct to the list and update the copy in A.1.2)*

    …

    6.5.8 Relational operators
    --------------------------

    …

    Semantics

    …

    5. When two pointers are compared, the result depends on the
    relative locations in the address space of the objects pointed to.
    … If the objects pointed to are members of the same aggregate object,
    pointers to *sized* structure members declared later compare greater
    than pointers to members declared earlier in the structure, and
    pointers to array elements with larger subscript values compare
    greater than pointers to elements of the same array with lower
    subscript values. …

    …

    6.7.2.1 Structure and union specifiers
    --------------------------------------

    Syntax

    struct-or-union-specifier:
        struct-or-union identifieropt { struct-declaration-list }
        struct-or-union identifier

    struct-or-union:
        struct
        *__sizeless_struct*
        union

    …

    3. A *sized* structure or union shall not contain a member with
    incomplete or function type …, except that the last member of a
    structure with more than one named member may have incomplete array
    type; such a structure (and any union containing, possibly
    recursively, a member that is such a structure) shall not be a
    member of a structure or an element of an array.  *Simlarly, a
    sizeless structure shall not contain a member with indefinite or
    function type; the exception for incomplete array types does not
    apply.*

    …

    Semantics

    6. As discussed in 6.2.5, a *sized* structure is a type consisting
    of a sequence of members, whose storage is allocated in an ordered
    sequence; *a sizeless structure is a type consisting of
    non-overlapping members whose relative position is unspecified,*
    and a union is a type consisting of a sequence of members whose
    storage overlap.

    7. Structure and union specifiers have the same form. The keywords
    struct, *__sizeless_struct* and union indicate that the type being
    specified is, respectively, a *sized* structure type, *a sizeless
    structure type,* or a union type.

    …[8 is as above]…

    9. A member of a structure or union may have any complete object
    type other than a variably modified type.123)  *A member of a sizeless
    structure may also have a sizeless definite type.*  In addition, a
    member *of a structure or union* may be declared to consist of a
    specified number of bits (including a sign bit, if any). Such a
    member is called a bit-field;124) its width is preceded by a colon.

    …

    15. Within a *sized* structure object, the non-bit-field members and
    the units in which bit-fields reside have addresses that increase in
    the order in which they are declared. A pointer to a *sized* structure
    object, suitably converted, points to its initial member (or if that
    member is a bit-field, then to the unit in which it resides), and
    vice versa. There may be unnamed padding within a *sized* structure
    object, but not at its beginning.

    15A. *The representation of a sizeless structure object is
    unspecified.  It is possible to form pointers to the structure
    itself and to its individual members, but the relationship between
    their addresses is unspecified.  The structure may occupy a single
    piece of contiguous storage or it may occupy several disjoint
    pieces.*

    …

    18 As a special case, the last element of a *sized* structure with
    more than one named member may have an incomplete array type; this
    is called a flexible array member. …

    …

    6.7.2.3 Tags
    ------------

    Constraints

    …

    2. Where two declarations that use the same tag declare the same
    type, they shall both use the same choice of struct, *__sizeless_struct,*
    union, or enum.

    …


Edits to the C++ standard
=========================

We have a similar set of changes to the C++ standard, but this RFC is
long enough already, so I've not included them here.  I also didn't find
them to be particularly useful when writing the C++ patches, since most
of the changes were obvious given a few basic rules.  Those rules are:

  - type traits can be used with sizeless types (unlike incomplete types)

  - sizeless structures cannot have base classes or be used as base classes

  - sizeless structures cannot have virtual members

  - pointers to member variables are invalid for sizeless structures
    (although taking the address of a member of a specific sizeless object
    is fine, as for C)

  - sizeless types are not literal types

  - sizeless types cannot be created by operator new (as for incomplete types)

  - sizeless types cannot be deleted (so, unlike for incomplete types,
    this is an error rather than a warning)

  - sizeless types cannot be thrown or caught (as for incomplete types)

  - sizeless types cannot be used with typeid() (as for incomplete types)


GCC implementation questions
============================

The GCC patches are pretty simple in principle.  The language changes
involve going through the standard replacing "complete" with "definite"
and most of the GCC patches go through the frontend code making the
same kind of change.

New type flag for sizeless types
--------------------------------

The patches add a new flag TYPE_SIZELESS_P to represent the negative of:

  * would fully-defining the type determine its size?

from the summary above.  Negative names are usually a bad thing,
but the natural default is for the flag to be off.

There are currently 17 bits free in tree_type_common, so the patches
steal one of those.  Is that OK?

The effect on COMPLETE_TYPE_P
-----------------------------

The current definition of COMPLETE_TYPE_P is:

    /* Nonzero if this type is a complete type.  */
    #define COMPLETE_TYPE_P(NODE) (TYPE_SIZE (NODE) != NULL_TREE)

Although the SVE types don't have a measurable size at the language
level, they still have a TYPE_SIZE and TYPE_SIZE_UNIT, with the sizes
using placeholders for the runtime vector size.  So after the split
described in the summary, TYPE_SIZE (NODE) != NULL_TREE means
"the type is fully defined" rather than "the type is complete".
With TYPE_SIZELESS_P, the definition of "complete type" would be:

    #define COMPLETE_TYPE_P(NODE) \
      (TYPE_SIZE (NODE) != NULL_TREE && !TYPE_SIZELESS_P (NODE))

i.e. the type is fully-defined, and fully-defining it determines
its size at the language level.

Uses of COMPLETE_TYPE_P outside the frontends
---------------------------------------------

The main complication is that the concept of "complete type" is exposed
outside the frontends, with COMPLETE_TYPE_P being defined in tree.h.

I tried to audit all uses outside the frontends and it looks like
they're all testing whether "the type is fully defined" and don't
care about the distinction between sized and sizeless.  This means
that the current definition (rather than the new definition)
should be correct in all cases.

In some cases the tests are simple null checks, like:

     /* Try to approach equal type sizes.  */
     if (!COMPLETE_TYPE_P (type_a)
         || !COMPLETE_TYPE_P (type_b)
         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
       break;

IMO it's more obvious to test TYPE_SIZE_UNIT directly for null here.
Having a wrapper doesn't add much.

In places like:

  if (!COMPLETE_TYPE_P (t))
    layout_type (t);

and:

  if (COMPLETE_TYPE_P (t) && TYPE_CANONICAL (t)
      && TYPE_MODE (t) != TYPE_MODE (TYPE_CANONICAL (t)))
    ...

it's testing whether the type has been laid out already.

So the patches do two things:

  * Expand the definition of the current COMPLETE_TYPE_P macro outside
    the frontends if the macro is simply protecting against a null
    dereference.

  * Make COMPLETE_TYPE_P local to the frontends and rename all uses
    outside the frontends.

As far as the second point goes, I wasn't sure what new name to use
outside the front ends.  Possibilities include:

  - DEFINITE_TYPE_P
  - USABLE_TYPE_P
  - VALID_VAR_TYPE_P
  - TYPE_LAID_OUT_P
  - TYPE_DEFINED_P
  - TYPE_FULLY_DEFINED_P
  - TYPE_READY_P
  ...other suggestions welcome...

I went for DEFINITE_TYPE_P because that's what the SVE specification
uses, but something more neutral like TYPE_DEFINED_P might be better.

Frontend changes
----------------

The frontend patches change COMPLETE_TYPE_P to DEFINITE_TYPE_P where
necessary.  I've tried where possible to accompany each individual
change with a test.

This worked fairly naturally (IMO) for C, and most of the changes could
be tied directly to the language edits above.

For C++ it was more difficult (not surprisingly).  There are a lot of
tests for COMPLETE_TYPE_P that are obviously testing whether a class
has been fully defined, and are more concerned with name lookup than
TYPE_SIZE.  The same goes for COMPLETE_OR_OPEN_TYPE_P and whether the
definition has been started.  So while the C changes were relatively
small and self-contained, the C++ changes replace many more uses of
COMPLETE_TYPE_P than they keep.  This makes me wonder whether it's a
good idea to keep COMPLETE_TYPE_P at all, or whether it would be better
to replace the remaining uses with something more explicit like:

  TYPE_SIZE_KNOWN_P
  TYPE_SIZE_DEFINED_P
  TYPE_SIZE_MEASURABLE_P
  TYPE_SIZE_COMPLETE_P
  ...suggestions again welcome...

Thanks,
Richard

^ permalink raw reply	[flat|nested] 33+ messages in thread
* [00/10][RFC] Splitting the C and C++ concept of "complete type"
@ 2018-10-16 12:06 Richard Sandiford
  2018-10-16 12:10 ` Richard Sandiford
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Sandiford @ 2018-10-16 12:06 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, jason, nathan, nd

The C standard says:

    At various points within a translation unit an object type may be
    "incomplete" (lacking sufficient information to determine the size of
    objects of that type) or "complete" (having sufficient information).

For AArch64 SVE, we'd like to split this into two concepts:

  * has the type been fully defined?
  * would fully-defining the type determine its size?

This is because we'd like to be able to represent SVE vectors as C and C++
types.  Since SVE is a "vector-length agnostic" architecture, the size
of the vectors is determined by the runtime environment rather than the
programmer or compiler.  In that sense, defining an SVE vector type does
not determine its size.  It's nevertheless possible to use SVE vector types
in meaningful ways, such as having automatic vector variables and passing
vectors between functions.

The main questions in the RFC are:

  1) is splitting the definition like this OK in principle?
  2) are the specific rules described below OK?
  3) coding-wise, how should the split be represented in GCC?

Terminology
-----------

Going back to the second bullet above:

  * would fully-defining the type determine its size?

the rest of the RFC calls a type "sized" if fully defining it would
determine its size.  The type is "sizeless" otherwise.

Contents
--------

The RFC is organised as follows.  I've erred on the side of including
detail rather than leaving it out, but each section is meant to be
self-contained and skippable:

  - An earlier RFC
  - Quick overview of SVE
  - Why we need SVE types in C and C++
  - How we ended up with this definition
  - The SVE types in more detail
  - Outline of the type system changes
  - Sizeless structures (and testing on non-SVE targets)
  - Other variable-length vector architectures
  - Edits to the C standard
    - Base changes
    - Updates for consistency
    - Sizeless structures
  - Edits to the C++ standard
  - GCC implementation questions

I'll follow up with patches that implement the split.



An earlier RFC
==============

For the record (in case this sounds familiar) I sent an RFC about the
sizeless type extension a while ago:

    https://gcc.gnu.org/ml/gcc/2017-08/msg00012.html

The rules haven't changed since then, but this version includes more
information and includes support for sizeless structures.


Quick overview of SVE
=====================

SVE is a vector extension to AArch64.  A detailed description is
available here:

    https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf

but the only feature that really matters for this RFC is that SVE has no
fixed or preferred vector length.  Implementations can instead choose
from a range of possible vector lengths, with 128 bits being the minimum
and 2048 bits being the maximum.  Priveleged software can further
constrain the vector length within the range offered by the implementation;
e.g. linux currently provides per-thread control of the vector length.


Why we need SVE types in C and C++
==================================

SVE was designed to be an easy target for autovectorising normal scalar
code.  There are also various language extensions that support explicit
data parallelism or that make explicit vector chunking easier to do in
an architecture-neutral way (e.g. C++ P0214).  This means that many users
won't need to do anything SVE-specific.

Even so, there's always going to be a place for writing SVE-specific
optimisations, with full access to the underlying ISA.  As for other
vector architectures, we'd like users to be able to write such routines
in C and C++ rather than force them to go all the way to assembly.

We'd also like C and C++ functions to be able to take SVE vector
parameters and return SVE vector results, which is particularly useful
when implementing things like vector math routines.  In this case in
particular, the types need to map directly to something that fits in
an SVE register, so that passing and returning vectors has minimal
overhead.


How we ended up with this definition
====================================

Requirements
------------

We need the SVE vector types to define and use SVE intrinsic functions
and to write SVE vector library routines.  The key requirements when
defining the types were:

  * They must be available in both C and C++ (because we want to be able
    add SVE optimisations to C-only codebases).

  * They must fit in an SVE vector register (so there can be no on-the-side
    information).

  * It must be possible to define automatic variables with these types.

  * It must be possible to pass and return objects of these types
    (since that's what intrinsics and vector library routines need to do).

  * It must be possible to use the types in _Generic associations
    (so that _Generic can be used to provide tgmath.h-style overloads).

  * It must be possible to use pointers or references to the types
    (for passing or returning by pointer or reference, and because not
    allowing references would be semantically difficult in C++).

Ideally, there'd also be a way of grouping SVE vectors together into tuples,
since the ISA has instructions like LD2 that return multiple vectors.
It would be good if users could also define their own tuple types, on top
of the ones needed by the intrinsics, although that's more "nice to have".

Possible approaches
-------------------

The main complication is that the size of an SVE vector is not a
compile-time constant.  It seems that any approach to handling this
would fall into one of three categories:

  (1) Limit the types in such a way that there is no concept of size.

  (2) Define the size of the types to be variable.

  (3) Define the size of the types to be constant, either with the
      constant being large enough for all possible vector lengths or
      with the types pointing to separate memory (as for C++ classes
      like std::string).

Why (2) seemed like a bad idea
------------------------------

(2) seemed initially appealing since C already has the concept of
variable-length arrays.  However, variable-length built-in types
would work in a significantly different way.  Arrays often decay to
pointers (which of course are fixed-length types), whereas vector
types never would.  Unlike arrays, it should be possible to pass
variable-length vectors to functions, return them from functions,
and assign them by value.

One particular difficulty is that the semantics of variable-length arrays
rely on having a point at which the array size is evaluated.  It would
be difficult to extend this approach to declarations of functions that
pass or return variable-length types.

As well as the extension itself being relatively complex (especially
for C++), it might be difficult to define it in a way that interacts
naturally with other (unseen) extensions, even those that are aware of
variable-length arrays.  Also, AIUI, variable-length arrays were added
to an early draft of C++14, but were later removed as too controversial
and didn't make it into the final standard.  C++17 still requires sizeof
to be constant and C11 makes variable-length arrays optional.

(2) therefore felt like a complicated dead-end.

Why (3) seemed like a bad idea
------------------------------

(3) can be divided into two:

(3a) The vector types have a constant size and are large enough for all
     possible vector lengths.

    The main problem with this is that the maximum size of an SVE
    vector (2048 bits) is much larger than the minimum size (128 bits).
    Using a fixed size of 2048 bits would be extremely inefficient for
    smaller vector lengths, and of course the whole point of using
    vectors is to make things *more* efficient.

    Also, we would need to define the types such that only the bytes
    associated with the actual vector length are significant.  This would
    make it possible to pass or return the types in registers and treat
    them as register values when copying.  This perhaps has some similarity
    with overaligned structures such as:

	struct s { _Alignas(16) int i; };

    except that the amount of padding would only be known at runtime.

    There's also a significant conceptual problem: encoding a fixed size
    goes against a guiding principle of SVE, in which there is no preferred
    vector length.  There's nothing particularly magical about the current
    limit of 2048 bits and it would be better to avoid an ABI break if the
    maximum ever did increase in future.

(3b) The vector types have a constant size and refer to separate storage
     (as for std::string etc.)

    This would be difficult to do without C++-style constructor, destructor,
    copy and move semantics, so wouldn't work well in C.  And in C++ it would
    be less efficient than the proposed approach, since presumably an Allocator
    would be needed to allocate the separate storage.  It would also require
    a complicated ABI mapping to ensure that the vectors can still be passed
    and returned in registers.

Chosen approach
---------------

We therefore took approach (1) and classified C and C++ types as "sized"
(having a measurable size when fully-defined) and "sizeless" (never
having a measurable size).  Sizeless types have no defined size,
alignment or layout at the language level, with those things becoming
purely an ABI-level detail.

We then treated all sizeless types as permanently incomplete.
On its own, this would put them in a similar situation to "void"
(although they wouldn't be exactly the same, since there are some
specific rules for "void" that don't apply to incomplete types in
general).  We then relaxed specific rules until the types were actually
useful.

Things in favour of (1)
-----------------------

The reasons above were mostly negative, arriving at (1) by elimination.
A more positive justification of this approach is that it seems
to meet the requirements in the most efficient way possible.  The
vectors can use their natural (native) representation, and the type
system prevents uses that would make that representation problematic.

Also, the approach of starting with very restricted types and then
specifically allowing certain things should be more future-proof
and interact better with other language extensions.  By default,
any language extension would treat the new types like other incomplete
types and choose conservatively-correct behaviour.  It would then be
possible to relax the language extension if this default behaviour
turns out to be too restrictive.

(That said, treating the types as permanently incomplete still won't
avoid all clashes with other extensions.  For example, we need to
allow objects of automatic storage duration to have certain forms of
incomplete type, whereas an extension might implicitly assume that all
such objects must already have complete type.  The approach should still
avoid the worst effects though.)


The SVE types in more detail
============================

Arm has published an SVE "ACLE" that specifies the SVE types and intrinsic
functions in detail.  For reference this is available without registration
at:

    https://static.docs.arm.com/100987/0000/acle_sve_100987_0000_00_en.pdf

but I'll try to keep this self-contained.

The ACLE defines a vector type sv<base>_t for each supported element type
<base>_t, so that the complete set is:

    svint8_t      svint16_t     svint32_t     svint64_t
    svuint8_t     svuint16_t    svuint32_t    svuint64_t
                  svfloat16_t   svfloat32_t   svfloat64_t

The types in each column have the same number of lanes and have twice
as many lanes as those in the column to the right.  Every vector has
the same number of bytes in total, with the number of bytes being
determined at runtime.

The ACLE also defines a single predicate type:

    svbool_t

that has the same number of lanes as svint8_t and svuint8_t.

All these types are opaque builtin types and are only expected to
be used with the associated ACLE intrinsics.  There are intrinsics for
creating vectors from scalars, loading from scalars, storing to scalars,
reinterpreting one type as another, etc.

The idea is that the vector types would only be used for short-term
register-sized working data.  Longer-term data would typically be stored
out to arrays.

For example, the vector function underlying:

    #pragma omp declare simd
    double sin(double);

would be:

    svfloat64_t mangled_sin(svfloat64_t, svbool_t);

(The svbool_t is because SVE functions should be predicated by default,
to avoid the need for a scalar epilogue loop.)

The ACLE also defines x2, x3 and x4 tuple types for each vector type;
for example, svint8x3_t is a tuple of 3 svint8_ts.  The tuples are
structure-like types with fields v0, v1, v2 and v3, up to the number
required.


Outline of the type system changes
==================================

Going back to the summary at the start of the RFC, C classifies types as
"complete" (the size of objects can be calculated) or "incomplete" (the
size of objects can't be calculated).  There's very little you can do
with a type until it becomes complete.

The approach we took was to treat all the SVE types as permanently
incomplete.  We then went through the standard relaxing specific
rules until the types were actually useful.

The first step was to classify types as:

  * "indefinite" (lacking sufficient information to create an object of
    that type) or "definite" (having sufficient information)

  * "sized" (will have a known size when definite) or "sizeless" (will
    never have a known size)

  * "incomplete" (lacking sufficient information to determine the size of
    objects of that type) or "complete" (having sufficient information)

where the wording for the final bullet is unchanged from the standard.
Thus a "definite type" is one that has been fully-defined rather than
simply declared, and "complete" is now equivalent to "sized and definite".
All standard types are "sized" (even "void", although it's always
indefinite and incomplete).

We then needed to make some rules use the distinction between "indefinite"
and "definite" rather than "incomplete" and "complete".  The specific
things we wanted to allow were:

  * defining automatic variables with sizeless definite type
  * defining functions whose parameters have sizeless definite type
  * defining functions that return a sizeless definite type
  * using sizeless definite types in _Generic associations
  * dereferencing pointers to sizeless definite types

Specific things we wanted to remain invalid -- by inheriting the rules from
incomplete types -- were:

  * creating or accessing arrays that have sizeless element types
  * doing pointer arithmetic on pointers to sizeless types
  * using sizeof and _Alignof with a sizeless type (or object of sizeless type)
  * defining (sized) unions or structures with sizeless members

It also seemed worth adding an extra restriction:

  * variables with sizeless type must not have static or thread-local
    storage duration

In practice it's impossible to define such variables with incomplete type,
but having an explicit rule means that things like:

    extern svint8_t foo;  // An SVE vector of int8_t elements.

are outright invalid rather than simply useless (because no other
translation unit could ever define foo).  Similarly, without an
explicit rule:

    svint8_t foo;         // An SVE vector of int8_t elements.

would be a valid tentative definition at the point it occurs and only
become invalid at the end of the translation unit, because svint8_t is
never completed.

This restriction isn't critical but it gives better diagnostics.


Sizeless structures (and testing on non-SVE targets)
====================================================

We're planning to build all SVE intrinsic types directly into GCC
(patches already written).  SVE therefore doesn't strictly need a syntax
for creating new sizeless types in C and C++.  However, having a way of
creating new structure-like "sizeless" types would be useful for three
reasons:

  - Functions could return arbitrary data by value.  The SVE ABI allows
    a function to return up to 8 vectors and 4 predicates in registers,
    which is far more flexible than the intrinsic types.

  - We could use these sizeless structure types to test the functionality
    on all targets.

  - A lot of the C++ frontend is concerned with classes, and having
    a way of creating sizeless classes would help make the C++ changes
    more consistent.

The patches therefore add a new "__sizeless_struct" keyword to denote
structures that are sizeless rather than sized.  Unlike normal
structures, these structures can have members of sizeless type in
addition to members of sized type.  On the other hand, they have all
the same limitations as other sizeless types (described in earlier
sections).

E.g., a sizeless structure definition might look like:

    __sizeless_struct data {
      double *array;
      svuint64_t indices;  // An SVE vector of uint64_t elements.
      svbool_t active;     // An SVE predicate.
    };

Adding a new keyword seemed better than using an attribute because it
means that the sized vs. sizeless distinction is fixed by the declaration.
E.g.:

    struct data;                     // Is it sized or sizeless?
    extern struct data global_data;  // OK if sized, not if sizeless.
    struct __attribute__((sizeless)) data {
      double *array;
      svuint64_t indices;            // An SVE vector of uint64_t elements.
      svbool_t active;               // An SVE predicate.
    };

would lead to the declaration of "global_data" sneaking through
despite being invalid when "data" is sizeless.

The tests in the patches all use these __sizeless_structs; they contain
nothing SVE- or AArch64-specific.


Other variable-length vector architectures
==========================================

The proposed RISC-V vector extension also has variable-length vectors.
When this language change was discussed on the clang developers' list,
Bruce Hoult (from SiFive, but speaking personally) replied with:

    http://lists.llvm.org/pipermail/cfe-dev/2018-May/057943.html

That message covers some of the background about the vector extension.
On the language changes, Bruce said:

    > However, even though the length is variable, the concept of a
    > "register-sized" C and C++ vector type makes just as much sense for SVE
    > as it does for other vector architectures.  Vector library functions
    > take such register-sized vectors as input and return them as results.
    > Intrinsic functions are also just as useful as they are for other vector
    > architectures, and they too take register-sized vectors as input and
    > return them as results.

    Intrinsic functions are absolutely required, and are I think the main
    reason for such a low-level register-sized vector type to exist.

[ Bruce went on to say:

    I'm not sure whether user-written functions operating on register-sized
    vectors are useful enough to support. User-written functions would normally
    take and return a higher-level vector type, and would implement the desired
    functionality in terms of calls to other user-written functions (operating
    on the high level vector as a whole) and/or explicit loops iterating
    through the high level vector type using intrinsic functions on the
    register-sized vector type proposed here.

But this use case is very important for SVE, since it will allow us
to implement vector math routines in a way that works with the OpenMP
"declare simd" construct.  There was also talk on gcc@ recently about
supporting this style of interface for RISC-V. ]

[...]

    > All these types are opaque builtin types and are only intended to be
    > used with the associated ACLE intrinsics.  There are intrinsics for
    > creating vectors from scalars, loading from scalars, storing to scalars,
    > reinterpreting one type as another, etc.
    >
    > The idea is that the vector types would only be used for short-term
    > register-sized working data.  Longer-term data would typically be stored
    > out to arrays.

    I agree with this.

[...]

    > The approach we took was to treat all the SVE types as permanently
    > incomplete.

    This seems reasonable.

So it looks like this extension would be useful for at least one
architecture besides SVE.


Edits to the C standard
=======================

This section specifies the behaviour for sizeless types as an edit to N1570.
There are three stages:

  - base changes, which add enough support for built-in sizeless
    vector types

  - updates for consistency, which change some of the wording without
    changing the meaning

  - support for sizeless structures

In each case, -strikethrough- indicates deleted text and *bold*
includes additional text.


Base changes
------------

These changes are enough to support sizeless built-in vector types.

    6.2.5 Types
    -----------

    1. The meaning of a value stored in an object or returned by a
    function is determined by the type of the expression used to access
    it. … Types are partitioned into object types (types that
    describe objects) and function types (types that describe
    functions).  -At various points within a translation unit an object
    type may be incomplete (lacking sufficient information to determine
    the size of objects of that type) or complete (having sufficient
    information).37)- *Object types are further partitioned into sized and
    sizeless; all basic and derived types defined in this standard are
    sized, but an implementation may provide additional sizeless types.*

    1A. *At various points within a translation unit an object type may
    be indefinite (lacking sufficient information to construct an object
    of that type) or definite (having sufficient information).37) An
    object type is said to be complete if it is both sized and definite;
    all other object types are said to be incomplete.  Complete types
    have sufficient information to determine the size of an object of
    that type while incomplete types do not.*

    1B. *Arrays, structures, unions and enumerated types are always
    sized, so for them the term incomplete is equivalent to (and used
    interchangeably with) the term indefinite.*

    …

    19. The void type comprises an empty set of values; it is -an
    incomplete- *a sized indefinite* object type that cannot be completed
    *(made definite)*.

    …

    37) A type may be -incomplete- *indefinite* or -complete- *definite*
    throughout an entire translation unit, or it may change states at
    different points within a translation unit.

    …

    6.3.2.1 Lvalues, arrays, and function designators
    -------------------------------------------------

    1.  An lvalue is an expression (with an object type other than void)
    that potentially designates an object;64) … A modifiable lvalue is
    an lvalue that does not have array type, does not have an
    -incomplete- *indefinite* type, does not have a const-qualified
    type, …

    2.  Except when it is the operand of the sizeof operator, the
    _Alignof operator, the unary & operator, the ++ operator, the --
    operator, or the left operand of the . operator or an assignment
    operator, an lvalue that does not have array type is converted to
    the value stored in the designated object (and is no longer an
    lvalue); this is called lvalue conversion. … If the lvalue has an
    -incomplete- *indefinite* type and does not have array type, the
    behavior is undefined. …

    …

    6.5.1.1 Generic selection
    -------------------------

    …

    Constraints

    2. A generic selection shall have no more than one default generic
    association. The type name in a generic association shall specify a
    -complete- *definite* object type other than a variably modified
    type. …

    …

    6.5.2.2 Function calls
    ----------------------

    Constraints

    1. The expression that denotes the called function92) shall have
    type pointer to function returning void or returning a -complete-
    *definite* object type other than an array type.

    …

    Semantics

    …

    4. An argument may be an expression of any -complete- *definite* object
    type. …

    …

    6.5.2.5 Compound literals
    -------------------------

    Constraints

    1. The type name shall specify a -complete- *definite* object type or an
    array of unknown size, but not a variable length array type.

    …

    6.7 Declarations
    ----------------

    Constraints

    …

    4A. *If an identifier for an object does not have automatic storage
    duration, its type must be sized rather than sizeless.*

    Semantics

    …

    7. If an identifier for an object is declared with no linkage, the
    type for the object shall be -complete- *definite* by the end of its
    declarator, or by the end of its init-declarator if it has an
    initializer; in the case of function parameters (including in
    prototypes), it is the adjusted type (see 6.7.6.3) that is required
    to be -complete- *definite*.

    …
     
    6.7.6.3 Function declarators (including prototypes) 
    ---------------------------------------------------

    Constraints

    …

    4. After adjustment, the parameters in a parameter type list in a
    function declarator that is part of a definition of that function
    shall not have -incomplete- *indefinite* type.

    …

    6.7.9 Initialization
    --------------------

    Constraints

    …

    3. The type of the entity to be initialized shall be an array of
    unknown size or a -complete- *definite* object type that is not a
    variable length array type.

    …

    6.9.1 Function definitions
    --------------------------

    Constraints

    …

    3. The return type of a function shall be void or a -complete-
    *definite* object type other than array type.

    …

    Semantics

    …

    7. The declarator in a function definition specifies the name of the
    function being defined and the identifiers of its parameters. …
    [T]he type of each parameter is adjusted as described in
    6.7.6.3 for a parameter type list; the resulting type shall be a
    -complete- *definite* object type.

    …

    J.2 Undefined behavior
    ----------------------

        …
      * A non-array lvalue with -an incomplete- *an indefinite* type is used
        in a context that requires the value of the designated object
        (6.3.2.1).
        …
      * An identifier for an object is declared with no linkage and the
        type of the object is -incomplete- *indefinite* after its
        declarator, or after its init-declarator if it has an
        initializer (6.7).
        …
      * An adjusted parameter type in a function definition is not a
        -complete- *definite* object type (6.9.1).
        …

Updates for consistency
-----------------------

These changes are a prerequisite for sizeless structures.  They have no
effect otherwise, but might be preferred anyway because they make the
terminology more consistent.  They apply on top of the previous edits.

    6.2.5 Types
    -----------

    …

    22. An array type of unknown size is an -incomplete- *indefinite*
    type. It is -completed- *made definite*, for an identifier of that type,
    by specifying the size in a later declaration (with internal or
    external linkage). A structure or union type of unknown content (as
    described in 6.7.2.3) is an -incomplete- *indefinite* type. It is
    -completed- *made definite*, for all declarations of that type, by
    declaring the same structure or union tag with its defining content
    later in the same scope.

    …

    6.2.7 Compatible type and composite type
    ----------------------------------------

    1. Two types have compatible type if their types are the same. …
    Moreover, two structure, union, or enumerated types declared in
    separate translation units are compatible if their tags and members
    satisfy the following requirements: If one is declared with a tag,
    the other shall be declared with the same tag. If both are
    -completed- *made definite* anywhere within their respective
    translation units, then the following additional requirements apply: …

    …

    6.7.2.1 Structure and union specifiers
    --------------------------------------

    …

    Semantics

    …

    8. The presence of a struct-declaration-list in a
    struct-or-union-specifier declares a new type, within a translation
    unit. The struct-declaration-list is a sequence of declarations for
    the members of the structure or union.  If the struct-declaration-list
    does not contain any named members, either directly or via an anonymous
    structure or anonymous union, the behavior is undefined.  The type is
    -incomplete- *indefinite* until immediately after the } that terminates
    the list, and -complete- *definite* thereafter.

    …

    6.7.2.2 Enumeration specifiers
    ------------------------------

    …

    Semantics

    …

    4. … The enumerated type is -incomplete- *indefinite* until
    immediately after the } that terminates the list of enumerator
    declarations, and -complete- *definite* thereafter.

    …

    6.7.2.3 Tags
    ------------

    …

    Semantics

    4. All declarations of structure, union, or enumerated types that
    have the same scope and use the same tag declare the same
    type. Irrespective of whether there is a tag or what other
    declarations of the type are in the same translation unit, the type
    is -incomplete- *indefinite* 129) until immediately after the closing
    brace of the list defining the content, and -complete- *definite*
    thereafter.

    …

    8. If a type specifier of the form

    struct-or-union identifier

    occurs other than as part of one of the above forms, and no other
    declaration of the identifier as a tag is visible, then it declares
    an -incomplete- *indefinite* structure or union type, and declares the
    identifier as the tag of that type.131)

    …

    129) An -incomplete- *indefinite* type may only by used when -the
    size of an object- *the ability to create an object* of that type
    is not needed.  It is not needed, for example, when a typedef name
    is declared to be a specifier for a structure or union, or when a
    pointer to or a function returning a structure or union is being
    declared. (See -incomplete- *indefinite* types in 6.2.5.) The
    specification has to be -complete- *definite* before such a function
    is called or defined.

    6.7.6.3 Function declarators (including prototypes) 
    ---------------------------------------------------

    …

    Semantics

    …

    12.  If the function declarator is not part of a definition of that
    function, parameters may have -incomplete- *indefinite* type and may use
    the [*] notation in their sequences of declarator specifiers to
    specify variable length array types.

    …

    J.2 Undefined behavior
    ----------------------

        …
      * When the -complete- *definite* type is needed, an -incomplete-
        *indefinite* structure or union type is not completed in the same
        scope by another declaration of the tag that defines the content
        (6.7.2.3).
        …

Sizeless structures
-------------------

These additional changes to N1570 add the concept of a sizeless structure.
Again they apply on top of the edits above:

    6.2.3 Name spaces of identifiers
    --------------------------------

    1. If more than one declaration of a particular identifier is
    visible at any point in a translation unit, the syntactic context
    disambiguates uses that refer to different entities. Thus, there
    are separate name spaces for various categories of identifiers, as
    follows:

	…

      * the tags of *sized* structures, *sizeless structures,* unions, and
	enumerations (disambiguated by following any32) of the keywords
	struct, *__sizeless_struct,* union, or enum);

	…

    6.2.5 Types
    -----------

    1. … Types are partitioned into object types (types that describe
    objects) and function types (types that describe functions).
    Object types are further partitioned into sized and sizeless;
    -all basic and derived types defined in this standard are
    sized, but an implementation may provide additional sizeless types.-
    *the only sizeless types defined by this standard are __sizeless_structs,
    but an implementation may provide additional sizeless types.*

    …

    1B. Arrays, -structures,- unions and enumerated types are always
    sized, so for them the term incomplete is equivalent to (and used
    interchangeably with) the term indefinite.

    …

    20. Any number of derived types can be constructed from the object
    and function types, as follows: …

      * A *sized* structure type describes a sequentially allocated
        nonempty set of sized member objects (and, in certain
        circumstances, an incomplete array), each of which has an
        optionally specified name and possibly distinct type.

      * *A sizeless structure type describes a set of non-overlapping
        member objects whose types may be sizeless and whose relative
        positions are unspecified.  It is also unspecified whether the
        structure occupies a single contiguous piece of storage or
        whether it requires several disjoint pieces.*

    …

    *20A. The term structure type refers collectively to sized structure
    types and sizeless structure types.*

    …

    6.4.1 Keywords
    --------------

    Syntax

    1. *(Add __sizeless_struct to the list and update the copy in A.1.2)*

    …

    6.5.8 Relational operators
    --------------------------

    …

    Semantics

    …

    5. When two pointers are compared, the result depends on the
    relative locations in the address space of the objects pointed to.
    … If the objects pointed to are members of the same aggregate object,
    pointers to *sized* structure members declared later compare greater
    than pointers to members declared earlier in the structure, and
    pointers to array elements with larger subscript values compare
    greater than pointers to elements of the same array with lower
    subscript values. …

    …

    6.7.2.1 Structure and union specifiers
    --------------------------------------

    Syntax

    struct-or-union-specifier:
        struct-or-union identifieropt { struct-declaration-list }
        struct-or-union identifier

    struct-or-union:
        struct
        *__sizeless_struct*
        union

    …

    3. A *sized* structure or union shall not contain a member with
    incomplete or function type …, except that the last member of a
    structure with more than one named member may have incomplete array
    type; such a structure (and any union containing, possibly
    recursively, a member that is such a structure) shall not be a
    member of a structure or an element of an array.  *Simlarly, a
    sizeless structure shall not contain a member with indefinite or
    function type; the exception for incomplete array types does not
    apply.*

    …

    Semantics

    6. As discussed in 6.2.5, a *sized* structure is a type consisting
    of a sequence of members, whose storage is allocated in an ordered
    sequence; *a sizeless structure is a type consisting of
    non-overlapping members whose relative position is unspecified,*
    and a union is a type consisting of a sequence of members whose
    storage overlap.

    7. Structure and union specifiers have the same form. The keywords
    struct, *__sizeless_struct* and union indicate that the type being
    specified is, respectively, a *sized* structure type, *a sizeless
    structure type,* or a union type.

    …[8 is as above]…

    9. A member of a structure or union may have any complete object
    type other than a variably modified type.123)  *A member of a sizeless
    structure may also have a sizeless definite type.*  In addition, a
    member *of a structure or union* may be declared to consist of a
    specified number of bits (including a sign bit, if any). Such a
    member is called a bit-field;124) its width is preceded by a colon.

    …

    15. Within a *sized* structure object, the non-bit-field members and
    the units in which bit-fields reside have addresses that increase in
    the order in which they are declared. A pointer to a *sized* structure
    object, suitably converted, points to its initial member (or if that
    member is a bit-field, then to the unit in which it resides), and
    vice versa. There may be unnamed padding within a *sized* structure
    object, but not at its beginning.

    15A. *The representation of a sizeless structure object is
    unspecified.  It is possible to form pointers to the structure
    itself and to its individual members, but the relationship between
    their addresses is unspecified.  The structure may occupy a single
    piece of contiguous storage or it may occupy several disjoint
    pieces.*

    …

    18 As a special case, the last element of a *sized* structure with
    more than one named member may have an incomplete array type; this
    is called a flexible array member. …

    …

    6.7.2.3 Tags
    ------------

    Constraints

    …

    2. Where two declarations that use the same tag declare the same
    type, they shall both use the same choice of struct, *__sizeless_struct,*
    union, or enum.

    …


Edits to the C++ standard
=========================

We have a similar set of changes to the C++ standard, but this RFC is
long enough already, so I've not included them here.  I also didn't find
them to be particularly useful when writing the C++ patches, since most
of the changes were obvious given a few basic rules.  Those rules are:

  - type traits can be used with sizeless types (unlike incomplete types)

  - sizeless structures cannot have base classes or be used as base classes

  - sizeless structures cannot have virtual members

  - pointers to member variables are invalid for sizeless structures
    (although taking the address of a member of a specific sizeless object
    is fine, as for C)

  - sizeless types are not literal types

  - sizeless types cannot be created by operator new (as for incomplete types)

  - sizeless types cannot be deleted (so, unlike for incomplete types,
    this is an error rather than a warning)

  - sizeless types cannot be thrown or caught (as for incomplete types)

  - sizeless types cannot be used with typeid() (as for incomplete types)


GCC implementation questions
============================

The GCC patches are pretty simple in principle.  The language changes
involve going through the standard replacing "complete" with "definite"
and most of the GCC patches go through the frontend code making the
same kind of change.

New type flag for sizeless types
--------------------------------

The patches add a new flag TYPE_SIZELESS_P to represent the negative of:

  * would fully-defining the type determine its size?

from the summary above.  Negative names are usually a bad thing,
but the natural default is for the flag to be off.

There are currently 17 bits free in tree_type_common, so the patches
steal one of those.  Is that OK?

The effect on COMPLETE_TYPE_P
-----------------------------

The current definition of COMPLETE_TYPE_P is:

    /* Nonzero if this type is a complete type.  */
    #define COMPLETE_TYPE_P(NODE) (TYPE_SIZE (NODE) != NULL_TREE)

Although the SVE types don't have a measurable size at the language
level, they still have a TYPE_SIZE and TYPE_SIZE_UNIT, with the sizes
using placeholders for the runtime vector size.  So after the split
described in the summary, TYPE_SIZE (NODE) != NULL_TREE means
"the type is fully defined" rather than "the type is complete".
With TYPE_SIZELESS_P, the definition of "complete type" would be:

    #define COMPLETE_TYPE_P(NODE) \
      (TYPE_SIZE (NODE) != NULL_TREE && !TYPE_SIZELESS_P (NODE))

i.e. the type is fully-defined, and fully-defining it determines
its size at the language level.

Uses of COMPLETE_TYPE_P outside the frontends
---------------------------------------------

The main complication is that the concept of "complete type" is exposed
outside the frontends, with COMPLETE_TYPE_P being defined in tree.h.

I tried to audit all uses outside the frontends and it looks like
they're all testing whether "the type is fully defined" and don't
care about the distinction between sized and sizeless.  This means
that the current definition (rather than the new definition)
should be correct in all cases.

In some cases the tests are simple null checks, like:

     /* Try to approach equal type sizes.  */
     if (!COMPLETE_TYPE_P (type_a)
         || !COMPLETE_TYPE_P (type_b)
         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
       break;

IMO it's more obvious to test TYPE_SIZE_UNIT directly for null here.
Having a wrapper doesn't add much.

In places like:

  if (!COMPLETE_TYPE_P (t))
    layout_type (t);

and:

  if (COMPLETE_TYPE_P (t) && TYPE_CANONICAL (t)
      && TYPE_MODE (t) != TYPE_MODE (TYPE_CANONICAL (t)))
    ...

it's testing whether the type has been laid out already.

So the patches do two things:

  * Expand the definition of the current COMPLETE_TYPE_P macro outside
    the frontends if the macro is simply protecting against a null
    dereference.

  * Make COMPLETE_TYPE_P local to the frontends and rename all uses
    outside the frontends.

As far as the second point goes, I wasn't sure what new name to use
outside the front ends.  Possibilities include:

  - DEFINITE_TYPE_P
  - USABLE_TYPE_P
  - VALID_VAR_TYPE_P
  - TYPE_LAID_OUT_P
  - TYPE_DEFINED_P
  - TYPE_FULLY_DEFINED_P
  - TYPE_READY_P
  ...other suggestions welcome...

I went for DEFINITE_TYPE_P because that's what the SVE specification
uses, but something more neutral like TYPE_DEFINED_P might be better.

Frontend changes
----------------

The frontend patches change COMPLETE_TYPE_P to DEFINITE_TYPE_P where
necessary.  I've tried where possible to accompany each individual
change with a test.

This worked fairly naturally (IMO) for C, and most of the changes could
be tied directly to the language edits above.

For C++ it was more difficult (not surprisingly).  There are a lot of
tests for COMPLETE_TYPE_P that are obviously testing whether a class
has been fully defined, and are more concerned with name lookup than
TYPE_SIZE.  The same goes for COMPLETE_OR_OPEN_TYPE_P and whether the
definition has been started.  So while the C changes were relatively
small and self-contained, the C++ changes replace many more uses of
COMPLETE_TYPE_P than they keep.  This makes me wonder whether it's a
good idea to keep COMPLETE_TYPE_P at all, or whether it would be better
to replace the remaining uses with something more explicit like:

  TYPE_SIZE_KNOWN_P
  TYPE_SIZE_DEFINED_P
  TYPE_SIZE_MEASURABLE_P
  TYPE_SIZE_COMPLETE_P
  ...suggestions again welcome...

Thanks,
Richard

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2018-10-19 13:14 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-15 14:32 [00/10][RFC] Splitting the C and C++ concept of "complete type" Richard Sandiford
2018-10-15 14:33 ` [01/10] Expand COMPLETE_TYPE_P in obvious checks for null Richard Sandiford
2018-10-18 19:32   ` Jeff Law
2018-10-15 14:34 ` [03/10] Move COMPLETE_OR_VOID_TYPE_P to the C and C++ frontends Richard Sandiford
2018-10-15 14:34 ` [02/10] Replace most uses of COMPLETE_TYPE_P outside the frontends Richard Sandiford
2018-10-15 14:35 ` [04/10] Move COMPLETE_OR_UNBOUND_ARRAY_TYPE_P to the C and C++ frontends Richard Sandiford
2018-10-15 14:36 ` [06/10] Move COMPLETE_TYPE_P " Richard Sandiford
2018-10-15 14:36 ` [05/10] Move complete_or_array_type_p " Richard Sandiford
2018-10-15 14:37 ` [07/10] Use COMPLETE_TYPE_P instead of TYPE_SIZE Richard Sandiford
2018-10-15 14:38 ` [08/10] Add a TYPE_SIZELESS_P property to types Richard Sandiford
2018-10-15 14:50 ` [09/10] C support for sizeless types Richard Sandiford
2018-10-15 15:01 ` [10/10] C++ " Richard Sandiford
2018-10-15 15:14 ` [00/10][RFC] Splitting the C and C++ concept of "complete type" Joseph Myers
2018-10-15 18:57 ` Uecker, Martin
2018-10-16  8:51   ` Richard Biener
2018-10-16 12:55   ` Richard Sandiford
2018-10-16 23:07     ` Joseph Myers
2018-10-17 12:54       ` Richard Sandiford
2018-10-17 14:39         ` Uecker, Martin
2018-10-17 15:06           ` Richard Sandiford
2018-10-17 15:25             ` Joseph Myers
2018-10-18 13:33               ` Richard Sandiford
2018-10-18 20:06                 ` Uecker, Martin
2018-10-18 20:12                   ` Richard Sandiford
2018-10-18 21:09                     ` Uecker, Martin
2018-10-18 21:05                   ` Joseph Myers
2018-10-17 14:39         ` Joseph Myers
2018-10-18 12:01           ` Richard Sandiford
2018-10-18 20:34             ` Joseph Myers
2018-10-19 12:34               ` Richard Sandiford
2018-10-19 13:42                 ` Joseph Myers
2018-10-16 12:06 Richard Sandiford
2018-10-16 12:10 ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).