From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-388424-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 31341 invoked by alias); 6 Jan 2015 16:56:19 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 31319 invoked by uid 89); 6 Jan 2015 16:56:17 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=2.0 required=5.0 tests=AWL,BAYES_50,MEDICAL_SUBJECT,SPF_PASS,T_RP_MATCHES_RCVD autolearn=no version=3.3.2
X-HELO: cam-smtp0.cambridge.arm.com
Received: from fw-tnat.cambridge.arm.com (HELO cam-smtp0.cambridge.arm.com) (217.140.96.140) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Tue, 06 Jan 2015 16:56:15 +0000
Received: from arm.com (e106375-lin.cambridge.arm.com [10.1.203.138])	by cam-smtp0.cambridge.arm.com (8.13.8/8.13.8) with ESMTP id t06Gu0Z4009666;	Tue, 6 Jan 2015 16:56:01 GMT
Date: Tue, 06 Jan 2015 16:56:00 -0000
From: James Greenhalgh <james.greenhalgh@arm.com>
To: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: "gerald@pfeifer.com" <gerald@pfeifer.com>,        "joseph@codesourcery.com" <joseph@codesourcery.com>
Subject: Re: [Patch docs 1/5] Update the first section of md.texi
Message-ID: <20150106165600.GA25265@arm.com>
References: <1420543302-11008-1-git-send-email-james.greenhalgh@arm.com> <1420543302-11008-2-git-send-email-james.greenhalgh@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1420543302-11008-2-git-send-email-james.greenhalgh@arm.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-IsSubscribed: yes
X-SW-Source: 2015-01/txt/msg00246.txt.bz2

As requested, please find a paragraph-by-paragraph justification for
the changes in this patch below.

I hope this aids review.

Of particular interest for this patch is the removal of the text
describing the semicolon for a comment syntax. I believe that a
description of the syntax of a .md file does not belong in the
introduction. I don't have a good suggestion as to where it should
move to.

Thanks,
James

On Tue, Jan 06, 2015 at 11:21:38AM +0000, James Greenhalgh wrote:
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 7f0426c..0277f14 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -7,39 +7,39 @@
>  @chapter Machine Descriptions
>  @cindex machine descriptions
>  
> -A machine description has two parts: a file of instruction patterns
> -(@file{.md} file) and a C header file of macro definitions.
> +A machine description has two parts: one or more files describing the
> +instructions available on the target (@file{.md} files) and one or more
> +C/C++ source files implementing target hooks and macros.  These are
> +described in the next chapter (@pxref{Target Macros}).

The original text was inaccurate. .md files can be included in one
another, so it is wrong to say "a file". Real-world backends are not
just C header files of macro definitions.

Rather than just mentioning the next chapter, cross-reference it.

> -The @file{.md} file for a target machine contains a pattern for each
> -instruction that the target machine supports (or at least each instruction
> -that is worth telling the compiler about).  It may also contain comments.
> -A semicolon causes the rest of the line to be a comment, unless the semicolon
> -is inside a quoted string.

A detailed description of the syntax of a .md file does not belong in an
introduction. It is a deficiency of this patch set that this text is not
added back elsewhere. Do you have any suggestions of where the text could
sensibly go?

> -See the next chapter for information on the C header file.

Moved above.

> +This chapter describes the @file{.md} files for a target machine.  These
> +contain the instruction patterns which are used by the compiler when
> +expanding gimple to RTL, during the RTL optimization passes, and when
> +emitting assembler code.

New text. Give a high level overview of what is in an md file and why
we want it.

>  
>  @menu
>  * Overview::            How the machine description is used.
>  * Patterns::            How to write instruction patterns.
>  * Example::             An explained example of a @code{define_insn} pattern.
> -* RTL Template::        The RTL template defines what insns match a pattern.
> +* RTL Template::        The RTL template defines which insns may match a
> +                        pattern.

s/what/which/ -- This reads better to me
s/match/may match/ -- Many things can result in an RTL Template not
matching; pattern ordering, predicates, failing conditions, etc.

>  * Output Template::     The output template says how to make assembler code
> -                        from such an insn.
> +                        from an insn matched to an instruction pattern.

The previous text did not make sense. "such an insn" is not bound to
anything if I am reading the contents non-linearly (as I am likely to do).

>  * Output Statement::    For more generality, write C code to output
>                          the assembler code.
> -* Predicates::          Controlling what kinds of operands can be used
> -                        for an insn.
> +* Predicates::          Controlling which kinds of operands can be used
> +                        when matching an insn to an instruction pattern.

s/what/which/ -- Reads better to me.

Add text qualifying when predicates are used - predicates can't be used
to instruct register allocation after pattern matching, they are only
used in pattern matching.

>  * Constraints::         Fine-tuning operand selection.
>  * Standard Names::      Names mark patterns to use for code generation.
>  * Pattern Ordering::    When the order of patterns makes a difference.
>  * Dependent Patterns::  Having one pattern may make you need another.
>  * Jump Patterns::       Special considerations for patterns for jump insns.
>  * Looping Patterns::    How to define patterns for special looping insns.
> -* Insn Canonicalizations::Canonicalization of Instructions
> +* Insn Canonicalizations::Canonicalization of instructions

We don't use title case elsewhere in the descriptions.

>  * Expander Definitions::Generating a sequence of several RTL insns
>                          for a standard operation.
> -* Insn Splitting::      Splitting Instructions into Multiple Instructions.
> -* Including Patterns::  Including Patterns in Machine Descriptions.
> +* Insn Splitting::      Splitting insns into multiple insns.
> +* Including Patterns::  Including patterns in machine descriptions.

Insn splitting acts on insns (a compiler concept) not instructions (a
machine concept).

We don't use title case elsewhere in the descriptions.

>  * Peephole Definitions::Defining machine-specific peephole optimizations.
>  * Insn Attributes::     Specifying the value of attributes for generated insns.
>  * Conditional Execution::Generating @code{define_insn} patterns for
> @@ -54,50 +54,60 @@ See the next chapter for information on the C header file.
>  @node Overview
>  @section Overview of How the Machine Description is Used
>  
> -There are three main conversions that happen in the compiler:
> +There are four main conversions that happen in the compiler:

The previous text is inaccurate.

>  
>  @enumerate
>  
>  @item
> -The front end reads the source code and builds a parse tree.
> +The front end reads the source code and converts it to an intermediate,
> +front end specific, tree based representation.

The previous text is imprecise.

>  @item
> -The parse tree is used to generate an RTL insn list based on named
> -instruction patterns.
> +This tree based representation is lowered (gimplified) to a gimple
> +representation.

The previous text is inaccurate.

>  
>  @item
> -The insn list is matched against the RTL templates to produce assembler
> -code.
> +The gimple representation is transformed (expanded) to an RTL
> +representation.  This RTL representation is a doubly-linked chain of
> +objects called insns, known as the insn list.

New text describing expansion from gimple to RTL.

> +
> +@item
> +The insn list is optimized, and the optimized insn list is matched
> +against @code{define_insn} instruction patterns in the machine description
> +to produce assembler code.

Add text describing that the insn list is further optimized before
instruction selection.

>  
>  @end enumerate
>  
> -For the generate pass, only the names of the insns matter, from either a
> -named @code{define_insn} or a @code{define_expand}.  The compiler will
> +When expanding from gimple to RTL, only named @code{define_insn}
> +constructs and @code{define_expand} constructs are used.  The compiler will
>  choose the pattern with the right name and apply the operands according
>  to the documentation later in this chapter, without regard for the RTL
> -template or operand constraints.  Note that the names the compiler looks
> -for are hard-coded in the compiler---it will ignore unnamed patterns and
> -patterns with names it doesn't know about, but if you don't provide a
> -named pattern it needs, it will abort.
> +template or operand constraints.  Note that the names which are used for
> +expansion are hard-coded in the compiler---unnamed patterns and patterns
> +with names which do not have a standard meaning are ignored during
> +expansion.  If you don't provide a named pattern that the compiler is
> +trying to expand, it may try a different expansion strategy.  If no
> +other expansion strategies are possible, the compiler will abort.

This is a general cleanup to make the text less casual.  This could do
with a second pass over it to remove the repeated use of "you" and the
anthropomorphism of the compiler.

I prefer the text below. We lose "without regard for the RTL template or
operand constraints", but this is contradictory anyway, as the RTL template
is used when expanding a define_insn.

  When expanding from gimple to RTL, only named @code{define_insn}
  constructs and @code{define_expand} constructs are used.  The set of
  names which are meaningful during expansion are detailed later in this
  chapter (@pxref{Standard Names}.  Note that the names which are used for
  expansion are hard-coded in the compiler---unnamed patterns and patterns
  with names which do not have a standard meaning are ignored during
  expansion.  If a named pattern that the compiler is trying to expand,
  is not provided, it may try a different expansion strategy.  If no
  other expansion strategies are possible, the compiler will abort.


>  
>  If a @code{define_insn} is used, the template given is inserted into the
> -insn list.  If a @code{define_expand} is used, one of three things
> -happens, based on the condition logic.  The condition logic may manually
> -create new insns for the insn list, say via @code{emit_insn()}, and
> -invoke @code{DONE}.  For certain named patterns, it may invoke @code{FAIL} to tell the
> -compiler to use an alternate way of performing that task.  If it invokes
> -neither @code{DONE} nor @code{FAIL}, the template given in the pattern
> +insn list.
> +
> +If a @code{define_expand} is used, one of three things happens, encoded in
> +the condition logic.  The condition logic may manually create new insns
> +for the insn list, say via @code{emit_insn ()}, and invoke @code{DONE},
> +indicating a successful expansion.  If the standard pattern name permits
> +it, the condition logic may invoke @code{FAIL} to express that an alternate
> +strategy should be used to performing the expansion.  If the condition logic
> +invokes neither @code{DONE} nor @code{FAIL}, the template given in the pattern
>  is inserted, as if the @code{define_expand} were a @code{define_insn}.

Split one paragraph to two. One dealing with define_insn expressions, one
for define_expand expressions. This aids readability.

s/based on the condition logic/encoded in the condition logic/ -- I
don't like the use of "based on" to describe code, the code explicitly
says what to do, the compiler doesn't "base" what to do on the code.

I dislike that the condition logic is the actor in the paragraph, but I
struggled to rewrite it, so I gave up and just replaced the unclear use of
"it" with "the condition logic".

>  Once the insn list is generated, various optimization passes convert,
> -replace, and rearrange the insns in the insn list.  This is where the
> -@code{define_split} and @code{define_peephole} patterns get used, for
> -example.

These examples don't add anything useful and introduce new keywords we've
not mentioned before.

> -Finally, the insn list's RTL is matched up with the RTL templates in the
> -@code{define_insn} patterns, and those patterns are used to emit the
> -final assembly code.  For this purpose, each named @code{define_insn}
> -acts like it's unnamed, since the names are ignored.
> +replace, and rearrange the insns in the insn list.
> +
> +Finally, the insn list's RTL is matched with the RTL templates from the
> +@code{define_insn} instruction patterns, and those patterns are used to
> +emit the final assembly code.  For this purpose, names are ignored.  All
> +@code{define_insn} are considered for matching.

Just a rewording. The previous text is unnecessarily verbose.