From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 48256 invoked by alias); 10 Apr 2016 09:52:47 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Received: (qmail 48231 invoked by uid 89); 10 Apr 2016 09:52:45 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW,RP_MATCHES_RCVD,SPF_PASS autolearn=ham version=3.3.2 spammy=HContent-transfer-encoding:8BIT, sean, outline X-HELO: st11p00im-asmtp002.me.com X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-04-10_06:,, signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 clxscore=1015 suspectscore=5 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1510270003 definitions=main-1604100144 MIME-version: 1.0 Content-transfer-encoding: 8BIT Content-type: text/plain; charset=utf-8 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-04-10_06:,, signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 clxscore=1011 suspectscore=5 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1510270003 definitions=main-1604100144 To: libc-alpha@sourceware.org From: Faissal Bensefia Subject: Faster string to integer conversion X-Enigmail-Draft-Status: N1110 Message-id: <570A225E.20409@me.com> Date: Sun, 10 Apr 2016 09:52:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 X-SW-Source: 2016-04/txt/msg00163.txt.bz2 Good day, The way glibc does strlen got me thinking, and I believe I've devised a way to do something similar for string to integer conversion. Using Sean Anderson's hasless, Juha Järvi's hasmore (both of which can be found here: https://graphics.stanford.edu/~seander/bithacks.html#HasLessInWord) and some things I came up with. I'll outline the algorithm here, I'll be happy to explain more if necessary. In theory, this should be significantly faster than what is being done in things such as strtol etc in glibc: 1. We handle the first 2 characters normally to check the base. 2. If the base is 10 we don't handle it number by number as normal, instead we continue on... 3. Instead we take a double word, we'll call this 'snippet' when we do calculations with it 4. Increment the string pointer by double word size (so, 4). 5. Check hasless(snippet,0x30), if true, start checking for a 0 terminator and if a zero terminator is not found, nor whitespace (which we skip) assume there was an invalid character. 6. Check hasmore(snippet,0x39), if true, assume there was an invalid character. 7. If neither hasless nor hasmore have returned true then xor 0x30303030 from the value. 8. We then multiply the current total for the number by 10000 9. And add: snippet&0xFF ((snippet>>8)&0xFF)*10 ((snippet>>16)&0xFF)*100 ((snippet>>24)&0xFF)*1000 10. Loop back to step 3 Thank you for your time, ~Faissal Bensefia