From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <binutils-return-91618-listarch-binutils=sources.redhat.com@sourceware.org>
Received: (qmail 11184 invoked by alias); 7 Apr 2016 12:57:18 -0000
Mailing-List: contact binutils-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <binutils.sourceware.org>
List-Subscribe: <mailto:binutils-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/binutils/>
List-Post: <mailto:binutils@sourceware.org>
List-Help: <mailto:binutils-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: binutils-owner@sourceware.org
Received: (qmail 11028 invoked by uid 89); 7 Apr 2016 12:57:17 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.2 spammy=sk:f182318, limm, sk:arc_opc, organised
X-HELO: mail-wm0-f49.google.com
Received: from mail-wm0-f49.google.com (HELO mail-wm0-f49.google.com) (74.125.82.49) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Thu, 07 Apr 2016 12:57:01 +0000
Received: by mail-wm0-f49.google.com with SMTP id n3so103602384wmn.0        for <binutils@sourceware.org>; Thu, 07 Apr 2016 05:57:00 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;        d=1e100.net; s=20130820;        h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to         :references:in-reply-to:references;        bh=vYXdU6JJ/0IiiBC5iFqWzBUjJtkpu84ut4FS/CSelms=;        b=V6TdjVXppD5/oWn9fLWFWl7AsIEaBhumCGcN10s1n0o9yz0o26ViFUAb0gaUhXDLJp         +ZyM/agpHSFYZlhiPZB6ttbnvaljXhtY3CTuRZRJ0oFYX84wD/IdgEbI2F28iMjuSqj3         TeIwzW0LPjUSvOobw6DpM5E7IqC251NUo7TCZQsz88s/gafLytgvO7JDC9OfoprSFYOb         LzSvM3GtNlpgtN0z8AAghqh954fsBl5RdzyjOgGvtsHV0eHJsqOTlIFF+5jFJhkGbDLS         I8u5zSdhuChDYJnjIWhWi2DHQBZJRgx2+oVk5N3MinWQVjDjDmnEg2fE/FNEvLjrCyjz         1nog==
X-Gm-Message-State: AD7BkJKyZMDRoB2BPdw80hmHj734l5ZJ+AepbsXGYUxKjtGQmAlYWsUgijaaj96T38S1Dw==
X-Received: by 10.194.78.178 with SMTP id c18mr3791317wjx.114.1460033818363;        Thu, 07 Apr 2016 05:56:58 -0700 (PDT)
Received: from localhost (host81-140-212-51.range81-140.btcentralplus.com. [81.140.212.51])        by smtp.gmail.com with ESMTPSA id gl8sm8401781wjb.30.2016.04.07.05.56.57        (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);        Thu, 07 Apr 2016 05:56:57 -0700 (PDT)
From: Andrew Burgess <andrew.burgess@embecosm.com>
To: binutils@sourceware.org
Cc: Claudiu.Zissulescu@synopsys.com,	Cupertino.Miranda@synopsys.com,	noamca@mellanox.com,	Andrew Burgess <andrew.burgess@embecosm.com>
Subject: [PATCHv2 3/4] gas/arc: Handle multiple arc_opcode chains for same mnemonic
Date: Thu, 07 Apr 2016 12:57:00 -0000
Message-Id: <1460033811-17805-4-git-send-email-andrew.burgess@embecosm.com>
In-Reply-To: <1460033811-17805-1-git-send-email-andrew.burgess@embecosm.com>
References: <1460033811-17805-1-git-send-email-andrew.burgess@embecosm.com>
In-Reply-To: <1459637470-30538-1-git-send-email-andrew.burgess@embecosm.com>
References: <1459637470-30538-1-git-send-email-andrew.burgess@embecosm.com>
X-IsSubscribed: yes
X-SW-Source: 2016-04/txt/msg00109.txt.bz2

This commit completes support for having multiple instructions with the
same mnemonic in non-contiguous blocks within the arc_opcodes table.

The commit adds an iterator mechanism for the arc_opcode_hash_entry
structure, which is then used in find_opcode_match to consider all
arc_opcode entries with the same mnemonic, even when these instructions
are stored in non-contiguous blocks.

I extend the comment on the arc_opcodes table to discuss how entries
within the table are organised, and to mention how instructions can be
split into multiple groups if needed, but that the table is still
searched in table order.

There should be no user visible changes after this commit.

gas/ChangeLog:

	* config/tc-arc.c (struct arc_opcode_hash_entry_iterator): New
	structure.
	(arc_opcode_hash_entry_iterator_init): New function.
	(arc_opcode_hash_entry_iterator_next): New function.
	(find_opcode_match): Iterate over all arc_opcode entries
	referenced by the arc_opcode_hash_entry passed in as a parameter.

opcodes/ChangeLog:

	* arc-opc.c (arc_opcodes): Extend comment to discus table layout.
---
 gas/ChangeLog       |  9 +++++++
 gas/config/tc-arc.c | 72 ++++++++++++++++++++++++++++++++++++++++++++---------
 opcodes/ChangeLog   |  4 +++
 opcodes/arc-opc.c   | 32 +++++++++++++++++++++++-
 4 files changed, 104 insertions(+), 13 deletions(-)

diff --git a/gas/config/tc-arc.c b/gas/config/tc-arc.c
index 5b76a07..7efc630 100644
--- a/gas/config/tc-arc.c
+++ b/gas/config/tc-arc.c
@@ -318,6 +318,17 @@ struct arc_opcode_hash_entry
   const struct arc_opcode **opcode;
 };
 
+/* Structure used for iterating through an arc_opcode_hash_entry.  */
+struct arc_opcode_hash_entry_iterator
+{
+  /* Index into the OPCODE element of the arc_opcode_hash_entry.  */
+  size_t index;
+
+  /* The specific ARC_OPCODE from the ARC_OPCODES table that was last
+     returned by this iterator.  */
+  const struct arc_opcode *opcode;
+};
+
 /* Forward declaration.  */
 static void assemble_insn
   (const struct arc_opcode *, const expressionS *, int,
@@ -577,6 +588,47 @@ arc_find_opcode (const char *name)
   return entry;
 }
 
+/* Initialise the iterator ITER.  */
+
+static void
+arc_opcode_hash_entry_iterator_init (struct arc_opcode_hash_entry_iterator *iter)
+{
+  iter->index = 0;
+  iter->opcode = NULL;
+}
+
+/* Return the next ARC_OPCODE from ENTRY, using ITER to hold state between
+   calls to this function.  Return NULL when all ARC_OPCODE entries have
+   been returned.  */
+
+static const struct arc_opcode *
+arc_opcode_hash_entry_iterator_next (const struct arc_opcode_hash_entry *entry,
+				     struct arc_opcode_hash_entry_iterator *iter)
+{
+  if (iter->opcode == NULL && iter->index == 0)
+    {
+      gas_assert (entry->count > 0);
+      iter->opcode = entry->opcode[iter->index];
+    }
+  else if (iter->opcode != NULL)
+    {
+      const char *old_name = iter->opcode->name;
+
+      iter->opcode++;
+      if ((iter->opcode - arc_opcodes >= (int) arc_num_opcodes)
+	  || (strcmp (old_name, iter->opcode->name) != 0))
+	{
+	  iter->index++;
+	  if (iter->index == entry->count)
+	    iter->opcode = NULL;
+	  else
+	    iter->opcode = entry->opcode[iter->index];
+	}
+    }
+
+  return iter->opcode;
+}
+
 /* Like md_number_to_chars but used for limms.  The 4-byte limm value,
    is encoded as 'middle-endian' for a little-endian target.  FIXME!
    this function is used for regular 4 byte instructions as well.  */
@@ -1406,23 +1458,22 @@ find_opcode_match (const struct arc_opcode_hash_entry *entry,
 		   int nflgs,
 		   int *pcpumatch)
 {
-  const struct arc_opcode *first_opcode = entry->opcode[0];
-  const struct arc_opcode *opcode = first_opcode;
+  const struct arc_opcode *opcode;
+  struct arc_opcode_hash_entry_iterator iter;
   int ntok = *pntok;
   int got_cpu_match = 0;
   expressionS bktok[MAX_INSN_ARGS];
   int bkntok;
   expressionS emptyE;
 
-  gas_assert (entry->count > 0);
-  if (entry->count > 1)
-    as_fatal (_("unable to lookup `%s', too many opcode chains"),
-	      first_opcode->name);
+  arc_opcode_hash_entry_iterator_init (&iter);
   memset (&emptyE, 0, sizeof (emptyE));
   memcpy (bktok, tok, MAX_INSN_ARGS * sizeof (*tok));
   bkntok = ntok;
 
-  do
+  for (opcode = arc_opcode_hash_entry_iterator_next (entry, &iter);
+       opcode != NULL;
+       opcode = arc_opcode_hash_entry_iterator_next (entry, &iter))
     {
       const unsigned char *opidx;
       const unsigned char *flgidx;
@@ -1743,8 +1794,6 @@ find_opcode_match (const struct arc_opcode_hash_entry *entry,
       memcpy (tok, bktok, MAX_INSN_ARGS * sizeof (*tok));
       ntok = bkntok;
     }
-  while (++opcode - arc_opcodes < (int) arc_num_opcodes
-	 && !strcmp (opcode->name, first_opcode->name));
 
   if (*pcpumatch)
     *pcpumatch = got_cpu_match;
@@ -2054,9 +2103,8 @@ assemble_tokens (const char *opname,
     {
       const struct arc_opcode *opcode;
 
-      pr_debug ("%s:%d: assemble_tokens: %s trying opcode 0x%08X\n",
-		frag_now->fr_file, frag_now->fr_line, opcode->name,
-		opcode->opcode);
+      pr_debug ("%s:%d: assemble_tokens: %s\n",
+		frag_now->fr_file, frag_now->fr_line, opname);
       found_something = TRUE;
       opcode = find_opcode_match (entry, tok, &ntok, pflags,
 				  nflgs, &cpumatch);
diff --git a/opcodes/arc-opc.c b/opcodes/arc-opc.c
index f182318..69c65fc 100644
--- a/opcodes/arc-opc.c
+++ b/opcodes/arc-opc.c
@@ -1453,7 +1453,37 @@ const unsigned arc_NToperand = FKT_NT;
 
    The format of the opcode table is:
 
-   NAME OPCODE MASK CPU CLASS SUBCLASS { OPERANDS } { FLAGS }.  */
+   NAME OPCODE MASK CPU CLASS SUBCLASS { OPERANDS } { FLAGS }.
+
+   The table is organised such that, where possible, all instructions with
+   the same mnemonic are together in a block.  When the assembler searches
+   for a suitable instruction the entries are checked in table order, so
+   more specific, or specialised cases should appear earlier in the table.
+
+   As an example, consider two instructions 'add a,b,u6' and 'add
+   a,b,limm'.  The first takes a 6-bit immediate that is encoded within the
+   32-bit instruction, while the second takes a 32-bit immediate that is
+   encoded in a follow-on 32-bit, making the total instruction length
+   64-bits.  In this case the u6 variant must appear first in the table, as
+   all u6 immediates could also be encoded using the 'limm' extension,
+   however, we want to use the shorter instruction wherever possible.
+
+   It is possible though to split instructions with the same mnemonic into
+   multiple groups.  However, the instructions are still checked in table
+   order, even across groups.  The only time that instructions with the
+   same mnemonic should be split into different groups is when different
+   variants of the instruction appear in different architectures, in which
+   case, grouping all instructions from a particular architecture together
+   might be preferable to merging the instruction into the main instruction
+   table.
+
+   An example of this split instruction groups can be found with the 'sync'
+   instruction.  The core arc architecture provides a 'sync' instruction,
+   while the nps instruction set extension provides 'sync.rd' and
+   'sync.wr'.  The rd/wr flags are instruction flags, not part of the
+   mnemonic, so we end up with two groups for the sync instruction, the
+   first within the core arc instruction table, and the second within the
+   nps extension instructions.  */
 const struct arc_opcode arc_opcodes[] =
 {
 #include "arc-tbl.h"
-- 
2.5.1