From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 29099 invoked by alias); 2 Jan 2002 23:03:33 -0000 Mailing-List: contact sid-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: sid-owner@sources.redhat.com Received: (qmail 28985 invoked from network); 2 Jan 2002 23:03:30 -0000 Message-ID: <3C3391D7.6050301@redhat.com> Date: Wed, 02 Jan 2002 15:03:00 -0000 From: Dave Brolley User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.2) Gecko/20010726 Netscape6/6.1 X-Accept-Language: en-us MIME-Version: 1.0 To: sid@sources.redhat.com, cgen@sources.redhat.com Subject: [patch][rfa]: Decoding (not-so) ambiguous insns in sid/sim Content-Type: multipart/mixed; boundary="------------000906080908080808060608" X-SW-Source: 2002-q1/txt/msg00000.txt.bz2 This is a multi-part message in MIME format. --------------000906080908080808060608 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-length: 6350 Hi, You may recall a patch which I submitted (and committed, after approval) which allowed the decoder for the opcodes-based disassembler to to distinguish between insns when the set of decodable bits of one insn is a superset of the decodable bits of another. http://sources.redhat.com/ml/binutils/2001-11/msg00406.html This situation occurs when one insn is a specialization of another. This patch adds the same capability to the decoders used by the cgen-based simulators in the sim and sid source trees. While the disassembler solution was as simple as sorting a list of insns, the simulators use a switch statement which is generated by a tree constructed by a cgen application to decode the insns. This patch modifes the cgen application which generates the tree. The patch can be divided into several pieces: 1) The current tree generator calls a function, filter-harmlessly-ambiguous-insns, on each tree node before creating further sub-trees from the insns remaining in the node. This function removes any insn whose decodable bits are a strict superset of another insn in the list. It turns out that these insns are not so "harmlessly ambiguous" for all ISAs. For example, an architecture may define memory access using a base register, but may also define that the base register be incremented if a particular register (stack pointer) is used. That particular instance (specialization) of the insn would have the same decodable bits as the more general insn and would also require a particular bit pattern for the field representing the base register. The current tree generator removes the specialized insns completely by calling "filter-harmlessly-ambiguous-insns". This patch removes that call (decode.scm:516) and adds additional code to handle the apparently ambiguous insns (decode.scm:574, decode.scm:595). This additional code filters identical insns and chooses a specialized insn over its more general cousin in the same tree node (The same preference used in the previous opcodes patch). The more general insn will still be decodable, since it will appear in other tree nodes which do not contain the specialized insn. 2) Supporting code for the above is in the new functions in insn.scm which perform the filtering. Note that "filter-harmlessly-ambiguous-insns" is no longer used, but I did not remove it (yet). I will remove it, if no one thinks it could possibly be useful. These changes alone result in a tree generator that works, however, not filtering the previously-believed-to-be-harmlessly-ambiguous insns exposes a problem in the tree generator which is that, for these insns it goes on to attempt to use every single insn bit in an effort to distinguish the insns before reaching the point where it applies the additional logic that I added. For a 32 bit ISA, this results in 2^n tree nodes generated (where n is the number of non-decodable bits), which is clearly unacceptable, not to mention that the additional tree nodes are all identical and resolve nothing! For one particular port, the tree generation, which previously took a few minutes now took over 12 hours. The root of the problem is that the heuristic which chooses bits to use will eventually choose bits which have no effect on the decoding of the insn. There is a heuristic function which computes a value representing the usefulness of each bit to the decoding process. It does correctly rate these bits as the least useful, but the fact remains that they will eventually be chosen anyway, as the other usful bits are exhausted. It turns out that the heuristic function assigns these bits a value of zero. Unfortunately, the existing heuristic function also assigns the value zero to the bits representing the specialized insn fields that were interested in. Two changes were necessary: 1) The change to decode.scm:317 ignores bits which are assigned the heuristic value of zero. 2) The heuristic function was changed to prioritize bits into 3 categories: a) bits which are true decodable bits b) bits which are used in specialization of an insn c) useless bits The previous heuristic counted the number of times in the ISA that a bit was required to be zero (p0) and the number of times it was required to be one (p1). The function then computed the geometric mean (sqrt (p0 * p1)). You can see that the more often a bit is constant in an insn, the higher this value will be. You can also see that if a bit is never constant in an insn (useless for decoding), the result will be zero. For a bit which is in a specialized field of an insn, however, one of p0 or p1 will be zero, resulting in an overal heurstic value of zero. We need to modify the function so that these bits receive a value greater than zero, but less that the value assigned to a truly decodable bit. The new function (decode.scm:241) works as follows: if (p0 + p1 == 0) // useless for decoding result = 0; else if (p0 * p1 == 0) // specialization bit: heuristic range: 0 <= h < num_insns result = num_insns - (p0 + p1); else // decodable bit -- heuristic range: num_insns < h result = num_insns + (sqrt (p0 * p1)); So the heuristic now chooses decodable bits first, followed by specialization bits and ignores the useless bits. Note also that a bit which is always 1 or always 0 is also useless for decoding and will be assigned a value of zero (since p0 or p1 will be equal to num_insns). This brought the build time for the port which had ballooned to 12 hours back down to its normal few minutes. The result of this patch should be: o No effect for ports with no specialized insns. The new heuristic will result in the same prioritization of bits as before. o Slightly larger decoder tree for ports with specialized insns. One extra tree level for each node containing a specialized insn. I know of only one such port. For that port, the specializations are meaningless since the the semantics of the specialized insn are the as those of the general insn. This is why the previous filtering implementation was not a problem. I have tested this port with no regressions. o The port I'm developing now works as expected :-) I know this is a complex patch, so please feel free to ask questions. I'm seeking approval to commit this. Dave --------------000906080908080808060608 Content-Type: text/plain; name="ambiguous.ChangeLog" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="ambiguous.ChangeLog" Content-length: 723 2002-01-02 Dave Brolley * decode.scm (-distinguishing-bit-population): Compute num-insns, the number of insns in the list. Update the population count function to identify and prioritize 3 catgories of useful bits. (-population-top-few): Don't consider bits with a population count of zero. (-build-decode-table-entry): Don't call filter-harmlessly-ambiguous-insns. Filter out non-specialized and identical insns at the next tree level. * insn.scm (filter-harmlessly-ambiguous-insns): Note in a comment that this function is no longer used. (filter-non-specialized-ambiguous-insns): New function. (filter-identical-ambiguous-insns): New function. (find-identical-insn): New function. --------------000906080908080808060608 Content-Type: text/plain; name="ambiguous.txt" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="ambiguous.txt" Content-length: 10132 Index: cgen/decode.scm =================================================================== RCS file: /cvs/src/src/cgen/decode.scm,v retrieving revision 1.3 diff -c -p -b -r1.3 decode.scm *** cgen/decode.scm 2000/11/10 16:43:21 1.3 --- cgen/decode.scm 2002/01/02 20:58:04 *************** *** 222,228 **** (define (-distinguishing-bit-population masks mask-lens values lsb0?) (let* ((max-length (apply max mask-lens)) (0-population (make-vector max-length 0)) ! (1-population (make-vector max-length 0))) ; Compute the 1- and 0-population vectors (for-each (lambda (mask len value) (logit 5 " population count mask=" (number->hex mask) " len=" len "\n") --- 222,229 ---- (define (-distinguishing-bit-population masks mask-lens values lsb0?) (let* ((max-length (apply max mask-lens)) (0-population (make-vector max-length 0)) ! (1-population (make-vector max-length 0)) ! (num-insns (length masks))) ; Compute the 1- and 0-population vectors (for-each (lambda (mask len value) (logit 5 " population count mask=" (number->hex mask) " len=" len "\n") *************** *** 240,247 **** (list->vector (map (lambda (p0 p1) (logit 4 p0 "/" p1 " ") ! ; (sqrt (+ p0 p1 (* p0 p1))) ; funny function - nice curve ! (sqrt (* p0 p1))) ; geometric mean (vector->list 0-population) (vector->list 1-population)))) ) --- 241,262 ---- (list->vector (map (lambda (p0 p1) (logit 4 p0 "/" p1 " ") ! ; The most useful bits for decoding are those with counts in both ! ; p0 and p1. These are the bits which distinguish one insn from ! ; another. Assign these bits a high value (greater than num-insns). ! ; ! ; The next most useful bits are those with counts in either p0 ! ; or p1. These bits represent specializations of other insns. ! ; Assign these bits a value between 0 and (num-insns - 1). Note that ! ; p0 + p1 is guaranteed to be <= num-insns. ! ; ! ; Bits with no count in either p0 or p1 are useless for decoding ! ; and should never be considered. Assigning these bits a value of ! ; 0 ensures this. ! (cond ! ((= (+ p0 p1) 0) 0) ! ((= (* p0 p1) 0) (- num-insns (+ p0 p1))) ! (else (+ num-insns (sqrt (* p0 p1)))))) (vector->list 0-population) (vector->list 1-population)))) ) *************** *** 302,311 **** " picks=(" old-picks ") pop=(" remaining-population ")" " threshold=" count-threshold " new-picks=(" new-picks ")\n") (cond ; No new matches? ((null? new-picks) (if (null? old-picks) ! (logit 1 "-population-top-few: No bits left to pick from!\n")) old-picks) ; Way too many matches? ((> (+ (length new-picks) (length old-picks)) (+ size 3)) --- 317,332 ---- " picks=(" old-picks ") pop=(" remaining-population ")" " threshold=" count-threshold " new-picks=(" new-picks ")\n") (cond + ; No point picking bits with population count of zero. This leads to + ; the generation of layers of subtables which resolve nothing. Generating + ; these tables can slow the build by several orders of magnitude. + ((= 0 count-threshold) + (logit 2 "-population-top-few: count-threshold is zero!\n") + old-picks) ; No new matches? ((null? new-picks) (if (null? old-picks) ! (logit 2 "-population-top-few: No bits left to pick from!\n")) old-picks) ; Way too many matches? ((> (+ (length new-picks) (length old-picks)) (+ size 3)) *************** *** 495,501 **** (define -build-decode-table-entry-args #f) (define (-build-decode-table-entry insn-vec startbit decode-bitsize index index-list lsb0? invalid-insn) ! (let ((slot (filter-harmlessly-ambiguous-insns (vector-ref insn-vec index)))) (logit 2 "Processing decode entry " (number->string index) " in " --- 516,522 ---- (define -build-decode-table-entry-args #f) (define (-build-decode-table-entry insn-vec startbit decode-bitsize index index-list lsb0? invalid-insn) ! (let ((slot (vector-ref insn-vec index))) (logit 2 "Processing decode entry " (number->string index) " in " *************** *** 553,559 **** ; If bitnums is still nil there is an ambiguity. (if (null? bitnums) ! (begin ; If all insns are marked as DECODE-SPLIT, don't warn. (if (not (all-true? (map (lambda (insn) --- 574,589 ---- ; If bitnums is still nil there is an ambiguity. (if (null? bitnums) ! (begin ! ; Try filtering out insns which are more general cases of ! ; other insns in the slot. The filtered insns will appear ! ; in other slots as appropriate. ! (set! slot (filter-non-specialized-ambiguous-insns slot)) ! ! (if (= 1 (length slot)) ! ; Only 1 insn left in the slot, so take it. ! (dtable-entry-make index 'insn (car slot)) ! ; There is still more than one insn in 'slot', so there is still an ambiguity. (begin ; If all insns are marked as DECODE-SPLIT, don't warn. (if (not (all-true? (map (lambda (insn) *************** *** 565,571 **** (string-append ", " (obj:name insn))) slot)) "\n")) ! ; Things aren't entirely hopeless. See if any ifield-assertion ; specs are present. ; FIXME: For now we assume that if they all have an ; ifield-assertion spec, then there is no ambiguity (it's left --- 595,608 ---- (string-append ", " (obj:name insn))) slot)) "\n")) ! ; Things aren't entirely hopeless. We've warned about the ambiguity. ! ; Now, if there are any identical insns, filter them out. If only one ! ; remains, then use it. ! (set! slot (filter-identical-ambiguous-insns slot)) ! (if (= 1 (length slot)) ! ; Only 1 insn left in the slot, so take it. ! (dtable-entry-make index 'insn (car slot)) ! ; Otherwise, see if any ifield-assertion ; specs are present. ; FIXME: For now we assume that if they all have an ; ifield-assertion spec, then there is no ambiguity (it's left *************** *** 588,594 **** (dtable-entry-make index 'expr (exprtable-make (-gen-exprtable-name exprtable-entries) ! exprtable-entries))))) ; There is no ambiguity so generate the subtable. ; Need to build `subtable' separately because we --- 625,631 ---- (dtable-entry-make index 'expr (exprtable-make (-gen-exprtable-name exprtable-entries) ! exprtable-entries)))))))) ; There is no ambiguity so generate the subtable. ; Need to build `subtable' separately because we Index: cgen/insn.scm =================================================================== RCS file: /cvs/src/src/cgen/insn.scm,v retrieving revision 1.5 diff -c -p -b -r1.5 insn.scm *** cgen/insn.scm 2001/09/18 03:17:12 1.5 --- cgen/insn.scm 2002/01/02 20:58:04 *************** *** 708,715 **** ; Filter out instructions whose ifield patterns are strict subsets of ! ; another. For decoding purpose, it is sufficient to consider the ; more general cousin. (define (filter-harmlessly-ambiguous-insns insn-list) (logit 3 "Filtering " (length insn-list) " instructions.\n") --- 708,720 ---- ; Filter out instructions whose ifield patterns are strict subsets of ! ; another. For decoding purposes, it is sufficient to consider the ; more general cousin. + ; + ; NOTE: Not currently used. This filtering is not harmless for ISAs + ; in which certain values of a given ifield change the semantics of + ; the insn. For example, encoding register 15 in a field normally + ; used for specifying a register may result in a different addressing mode. (define (filter-harmlessly-ambiguous-insns insn-list) (logit 3 "Filtering " (length insn-list) " instructions.\n") *************** *** 735,740 **** --- 740,799 ---- insn-list) ) + ; Filter out instructions whose ifield patterns are strict supersets of + ; another, keeping the less general cousin. Used to resolve ambiguity + ; when there are no more bits to consider. + + (define (filter-non-specialized-ambiguous-insns insn-list) + (logit 3 "Filtering " (length insn-list) " instructions for non specializations.\n") + (find (lambda (insn) + (let* ((i-mask (insn-base-mask insn)) + (i-mask-len (insn-base-mask-length insn)) + (i-value (insn-value insn)) + (subset-insn (find-first + (lambda (insn2) ; insn2: possible submatch (more mask bits) + (let ((i2-mask (insn-base-mask insn2)) + (i2-mask-len (insn-base-mask-length insn2)) + (i2-value (insn-value insn2))) + (and (not (eq? insn insn2)) + (= i-mask-len i2-mask-len) + (mask-superset? i-mask i-value i2-mask i2-value)))) + insn-list)) + (keep? (not subset-insn))) + (if (not keep?) + (logit 2 + "Instruction " (obj:name insn) " specialization-filtered by " + (obj:name subset-insn) "\n")) + keep?)) + insn-list) + ) + + ; Filter out instructions whose ifield patterns are identical. + + (define (filter-identical-ambiguous-insns insn-list) + (logit 3 "Filtering " (length insn-list) " instructions for identical variants.\n") + (let loop ((l insn-list) (result nil)) + (cond ((null? l) (reverse! result)) + ((find-identical-insn (car l) (cdr l)) (loop (cdr l) result)) + (else (loop (cdr l) (cons (car l) result))) + ) + ) + ) + + (define (find-identical-insn insn insn-list) + (let ((i-mask (insn-base-mask insn)) + (i-mask-len (insn-base-mask-length insn)) + (i-value (insn-value insn))) + (find-first + (lambda (insn2) + (let ((i2-mask (insn-base-mask insn2)) + (i2-mask-len (insn-base-mask-length insn2)) + (i2-value (insn-value insn2))) + (and (= i-mask-len i2-mask-len) + (= i-mask i2-mask) + (= i-value i2-value)))) + insn-list)) + ) ; Helper function for above: does (m1,v1) match a STRICT superset of (m2,v2) ? ; --------------000906080908080808060608--