Message-ID: <3DC8F596.40500@picochip.com>
Date: Wed, 06 Nov 2002 02:56:00 -0000
From: Dan Towner
Organization: picoChip Designs Ltd.
To: gcc@gcc.gnu.org
Subject: DFA scheduler producing sub-optimal code.

Hi,

I am using the DFA scheduler for a 16-bit VLIW. The VLIW has 3 instruction slots, and one constant value slot.
On one of my test cases, I get the following schedule (I've added lines to show VLIW packets):

;; 4--> 10 R9=[R5+0x2] :slot1,nothing
------------------------------------------------------------------------
;; 5--> 11 R10=[R5+0x4] :slot1,nothing
------------------------------------------------------------------------
;; 6--> 12 R11=[R5+0x6] :slot1,nothing
;; 6--> 22 R3=R9 :slot0|slot1
------------------------------------------------------------------------
;; 7--> 23 R4=R10 :slot0|slot1
------------------------------------------------------------------------
;; 8--> 24 R5=R11 :slot0|slot1
------------------------------------------------------------------------
;; 9--> 70 R0=FP+0x4 :(slot0+slot1+slot2+slotC)

Notice that the instructions in cycles 7 and 8 could each be scheduled into either slot0 or slot1. These instructions have been `split' from an SI register-to-register move, which has been decomposed into its constituent sub-register moves. In their new form, however, the two instructions use different registers. Why doesn't the scheduler combine these two instructions into the same cycle?

I've tried using the sched-verbose option (see output below), but I can't see any reason why it has chosen a new cycle for insn 24.

Thanks,

Dan.

----------------------
;; Clock 4
;; Ready list (t = 4): 70 12 11 10
;; 4--> 10 R9=[R5+0x2] :slot1,nothing
;; dependences resolved: insn 22 into queue with cost=2
;; Ready-->Q: insn 22: queued for 2 cycles.
;; Ready list (t = 4): 70 12 11
;; Ready-->Q: insn 11: queued for 1 cycles.
;; Ready list (t = 4): 70 12
;; Ready-->Q: insn 12: queued for 1 cycles.
;; Ready list (t = 4): 70
;; Ready-->Q: insn 70: queued for 1 cycles.
;; Ready list (t = 4):
;; Q-->Ready: insn 70: moving to ready without stalls
;; Q-->Ready: insn 12: moving to ready without stalls
;; Q-->Ready: insn 11: moving to ready without stalls
;; Ready list after queue_to_ready: 11 12 70
;; Clock 5
;; Ready list (t = 5): 70 12 11
;; 5--> 11 R10=[R5+0x4] :slot1,nothing
;; dependences resolved: insn 23 into queue with cost=2
;; Ready-->Q: insn 23: queued for 2 cycles.
;; Ready list (t = 5): 70 12
;; Ready-->Q: insn 12: queued for 1 cycles.
;; Ready list (t = 5): 70
;; Ready-->Q: insn 70: queued for 1 cycles.
;; Ready list (t = 5):
;; Q-->Ready: insn 70: moving to ready without stalls
;; Q-->Ready: insn 12: moving to ready without stalls
;; Q-->Ready: insn 22: moving to ready without stalls
;; Ready list after queue_to_ready: 22 12 70
;; Clock 6
;; Ready list (t = 6): 70 22 12
;; 6--> 12 R11=[R5+0x6] :slot1,nothing
;; dependences resolved: insn 24 into queue with cost=2
;; Ready-->Q: insn 24: queued for 2 cycles.
;; Ready list (t = 6): 70 22
;; 6--> 22 R3=R9 :slot0|slot1
;; Ready list (t = 6): 70
;; Ready-->Q: insn 70: queued for 1 cycles.
;; Ready list (t = 6):
;; Q-->Ready: insn 70: moving to ready without stalls
;; Q-->Ready: insn 23: moving to ready without stalls
;; Ready list after queue_to_ready: 23 70
;; Clock 7
;; Ready list (t = 7): 70 23
;; 7--> 23 R4=R10 :slot0|slot1
;; Ready list (t = 7): 70
;; Ready-->Q: insn 70: queued for 1 cycles.
;; Ready list (t = 7):
;; Q-->Ready: insn 70: moving to ready without stalls
;; Q-->Ready: insn 24: moving to ready without stalls
;; Ready list after queue_to_ready: 24 70
;; Clock 8
;; Ready list (t = 8): 70 24
;; 8--> 24 R5=R11 :slot0|slot1
;; Ready list (t = 8): 70
;; Ready-->Q: insn 70: queued for 1 cycles.
;; Ready list (t = 8):
;; Q-->Ready: insn 70: moving to ready without stalls
;; Ready list after queue_to_ready: 70
;; Clock 9
;; Ready list (t = 9): 70
;; 9--> 70 R0=FP+0x4 :(slot0+slot1+slot2+slotC)
;; dependences resolved: insn 25 into queue with cost=1
;; Ready-->Q: insn 25: queued for 1 cycles.
;; Ready list (t = 9):
;; Q-->Ready: insn 25: moving to ready without stalls
;; Ready list after queue_to_ready: 25
=============================================================================
Daniel Towner
picoChip Designs Ltd., Riverside Buildings, 108, Walcot Street, BATH, BA1 5BG
dant@picochip.com
07786 702589
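
P.S. As a cross-check on the trace, here is a small Python sketch that pulls the issue lines and the "dependences resolved" lines out of a sched-verbose excerpt and compares, for each insn, the cycle it was issued in with the first cycle the queue would have released it. It is only a reading aid, not a diagnosis. The helper name `analyse` and the regular expressions are my own, and the sketch assumes that "into queue with cost=N" means the insn cannot become ready before the current clock plus N, which is what the "queued for N cycles" lines suggest.

```python
import re

# Excerpt of the sched-verbose output above (issue and "resolved" lines only).
TRACE = """\
;; 4--> 10 R9=[R5+0x2] :slot1,nothing
;; dependences resolved: insn 22 into queue with cost=2
;; 5--> 11 R10=[R5+0x4] :slot1,nothing
;; dependences resolved: insn 23 into queue with cost=2
;; 6--> 12 R11=[R5+0x6] :slot1,nothing
;; dependences resolved: insn 24 into queue with cost=2
;; 6--> 22 R3=R9 :slot0|slot1
;; 7--> 23 R4=R10 :slot0|slot1
;; 8--> 24 R5=R11 :slot0|slot1
;; 9--> 70 R0=FP+0x4 :(slot0+slot1+slot2+slotC)
;; dependences resolved: insn 25 into queue with cost=1
"""

# ";; <cycle>--> <insn> <pattern> :<reservation>"
ISSUE = re.compile(r"^;;\s*(\d+)-->\s*(\d+)\s+(.*?)\s*:(\S+)\s*$")
# ";; dependences resolved: insn <insn> into queue with cost=<cost>"
RESOLVED = re.compile(r"dependences resolved: insn (\d+) into queue with cost=(\d+)")


def analyse(trace):
    """Return (issued, earliest): insn -> issue cycle, insn -> earliest ready cycle."""
    issued = {}
    earliest = {}
    clock = None  # cycle of the most recent issue line
    for line in trace.splitlines():
        m = ISSUE.match(line)
        if m:
            clock = int(m.group(1))
            issued[int(m.group(2))] = clock
            continue
        m = RESOLVED.search(line)
        if m and clock is not None:
            # Assumption: ready no earlier than current clock + cost.
            earliest[int(m.group(1))] = clock + int(m.group(2))
    return issued, earliest


if __name__ == "__main__":
    issued, earliest = analyse(TRACE)
    for insn in sorted(issued):
        if insn in earliest:
            print(f"insn {insn}: earliest ready {earliest[insn]}, issued {issued[insn]}")
```

Under that assumption, each of insns 22, 23 and 24 is issued in the first cycle in which the queue releases it, and those cycles differ because the three loads (insns 10, 11, 12) are themselves issued in consecutive cycles. That may only restate the question, since it does not say whether the cost of 2 or the one-load-per-cycle issue is what I should be looking at.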