From: "Marcel Cox"
To: gcc@gcc.gnu.org
Subject: Re: Slow memcmp for aligned strings on Pentium 3
Date: Fri, 04 Apr 2003 18:58:00 -0000

"Kevin Atkinson" wrote in message
news:Pine.LNX.4.44.0304040923350.912-300000@kevin-pc.atkinson.dhs.org...
> Here you go.
> Still assume the memory is aligned:
>
> This is what I did:
>
> int cmpa2(const unsigned int * x, const unsigned int * y, size_t size)
> {
>   int i = 0;
>   size_t s = size / 4;
>   while (i < s && x[i] == y[i]) ++i;
>   size -= i * 4;
>   if (size == 0) return 0;
>   // hopefully if this is inline expanded when size is known
>   // the compiler can eliminate many of these conditionals
>   else if (size >= 4) { // if original size % 4 == 0 this should
>                         // always be the case
>     unsigned int xx = x[i], yy = y[i];
>     asm("bswap %0" : "+r"(xx));
>     asm("bswap %0" : "+r"(yy));
>     return xx - yy;
>   } else {
>     const unsigned char * xb = (const unsigned char *)(x + i);
>     const unsigned char * yb = (const unsigned char *)(y + i);
>     // if size is known at compile time then the compiler should be
>     // able to select the correct choice at compile time
>     switch (size) {
>     case 1:
>       return *xb - *yb;
>     case 2:
>       return ((xb[0] - yb[0]) << 8) + (xb[1] - yb[1]);
>     case 3:
>       return ((xb[0] - yb[0]) << 16) + ((xb[1] - yb[1]) << 8)
>         + xb[2] - yb[2];}
>   }
> }

There is still a flaw: you assume that the difference of two unsigned
integers gives the correct signed result. That does not hold in general.
Suppose that in your "return xx - yy" statement xx is 0xf0000000 and yy
is 0x10000000. Then xx is clearly greater than yy, but the difference is
0xe0000000, which becomes a negative number when converted to a signed
integer. The function would therefore report that xx is smaller than yy,
which is wrong.

Marcel
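
To make the point about the sign concrete, here is a minimal sketch in C.
It is not code from this thread: the names cmp_by_subtraction and
cmp_by_comparison are hypothetical, and it assumes a 32-bit int and a
compiler such as GCC, where converting an out-of-range unsigned value to
int wraps modulo 2^32. The second function shows one common way to get
the sign right, by comparing the operands instead of subtracting them.

```c
#include <stdint.h>

/* Hypothetical helper mirroring "return xx - yy": the unsigned
 * difference is converted to int.  For differences of 0x80000000 or
 * more the conversion is implementation-defined; on GCC the value
 * wraps and comes out negative. */
static int cmp_by_subtraction(uint32_t xx, uint32_t yy)
{
    return (int)(xx - yy);
}

/* Hypothetical alternative: derive the sign from two comparisons.
 * Each comparison yields 0 or 1, so the result is -1, 0 or 1 and
 * cannot overflow. */
static int cmp_by_comparison(uint32_t xx, uint32_t yy)
{
    return (xx > yy) - (xx < yy);
}
```

With xx = 0xf0000000 and yy = 0x10000000, cmp_by_subtraction returns a
negative value under the assumptions above, although xx is the larger
operand, while cmp_by_comparison returns 1. The same idea would apply to
the byte-wise cases in the switch, since their shifted sums can also
lose the sign of the first differing byte.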