Subject: Re: PATCH: PR target/44588: Very inefficient 8bit mod/div
From: Uros Bizjak
To: "H.J. Lu"
Cc: gcc-patches@gcc.gnu.org
Date: Tue, 22 Jun 2010 19:12:00 -0000
Message-ID: <1277232299.2613.13.camel@localhost>
References: <20100621193321.GA13780@intel.com> <1277229955.2613.1.camel@localhost>

On Tue, 2010-06-22 at 11:27 -0700, H.J. Lu wrote:

> >> This patch adds an 8-bit divmod pattern for x86.  The x86 8-bit
> >> divide instructions return the result in AX, with
> >>
> >> AL <- Quotient
> >> AH <- Remainder
> >>
> >> This patch models that and properly extends the quotient.  Tested
> >> on Intel64 with -m64 and -m32.  There are no regressions.
> >> OK for trunk?
> >>
> >> BTW, there is only one divb used in subreg_get_info in the gcc
> >> compiler.  The old code is
> >>
> >>   movzbl  mode_size(%r13), %edi
> >>   movzbl  mode_size(%r14), %esi
> >>   xorl    %edx, %edx
> >>   movl    %edi, %eax
> >>   divw    %si
> >>   testw   %dx, %dx
> >>   jne     .L1194
> >>
> >> The new one is
> >>
> >>   movzbl  mode_size(%r13), %edi
> >>   movl    %edi, %eax
> >>   divb    mode_size(%r14)
> >>   movzbl  %ah, %eax
> >>   testb   %al, %al
> >>   jne     .L1194
> >>
> >
> > Hm, something is not combined correctly; I'd say "testb %ah, %ah" is
> > optimal in the second case.
> >
>
> Here is another update, adjusted for the mov pattern changes in i386.md.
>
> The 8-bit result is stored in
>
> AL <- Quotient
> AH <- Remainder
>
> If we use AX for the quotient in the 8-bit divmod pattern, we have to
> make sure that AX is valid as the quotient, so we would have to extend
> AL with an UNSPEC, since AH isn't part of the quotient.  Instead, I use
> AL for the quotient and use UNSPEC_MOVQI_EXTZH to extract the remainder
> from AH.  Quotient accesses can be optimized very nicely.  If the
> remainder is used, we may get an extra move for the UNSPEC_MOVQI_EXTZH
> extraction.  I think this is a reasonable compromise.

Why do we need to reinvent movqi_extzv_2?

I guess that divqi3 has to be implemented as a multiple-set divmod
pattern using strict_low_part subregs, to describe exactly in which
subregs the quotient and the remainder go.

Uros.
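
For readers following along, here is a minimal, self-contained C sketch
of the AL/AH split being discussed.  It is not part of H.J.'s patch, and
the helper name divb_example is made up for illustration; it uses GCC
extended inline asm and only builds on x86:

  #include <stdio.h>

  /* Divide a 16-bit dividend by an 8-bit divisor with DIVB.  The dividend
     is passed in AX; after "divb r/m8" the CPU leaves AL = quotient and
     AH = remainder.  The divisor must be nonzero and the quotient must
     fit in 8 bits, otherwise the instruction faults (#DE).  */
  static void divb_example (unsigned short dividend, unsigned char divisor,
                            unsigned char *quot, unsigned char *rem)
  {
    unsigned short ax = dividend;

    __asm__ ("divb %1"            /* AX / divisor -> AL (quot), AH (rem) */
             : "+a" (ax)
             : "qm" (divisor)
             : "cc");

    *quot = ax & 0xff;            /* AL */
    *rem  = ax >> 8;              /* AH */
  }

  int main (void)
  {
    unsigned char q, r;

    divb_example (103, 10, &q, &r);
    printf ("103 / 10 = %u, remainder %u\n", q, r);   /* prints 10, 3 */
    return 0;
  }

The "movzbl %ah, %eax" in the generated code quoted above is exactly this
AH extraction; when only the quotient is needed, AL can be used directly,
which is the improvement the patch is after.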