From: Thomas Wolff
Subject: Unicode update of width and other character properties
To: newlib@sourceware.org
Date: Sun, 06 Aug 2017 05:36:00 -0000

Hi,

this is a proposal to update wcwidth and the character properties
functions isw*/towupper/towlower to Unicode 10.0, as discussed in the
mail thread https://cygwin.com/ml/cygwin/2017-07/msg00366.html, and to
simplify automatic generation of the respective tables so that future
updates become easier.

Table size is moderate (using ranges for character properties), but
there is still an option to reduce the size of the two big tables.

The patch can be retrieved from http://towo.net/cygwin/charprops10.zip .
The Makefile.widthdata does not yet distinguish the two subdirectories
(libc/string, libc/ctype) as it comes from a common development
directory.

There is a test program that compares the results of the isw*/tow*
functions between the current and the patched implementation. I also
provide a log of deviations of the new approach from the current
implementation, which is based on Unicode 5.2 data, to compare and
check. If there are any disputable cases, I would of course reconsider
them. My main aim was actually to get the wcwidth data updated, for
which the change is more clear-cut.

Thanks
Thomas
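
To illustrate what "using ranges for character properties" can look like, here is a minimal sketch in C of a sorted range table with a binary-search lookup. It is not taken from the patch: the names cp_range, wide_ranges, in_ranges and sketch_is_wide are hypothetical, and the table is a small hand-picked excerpt of well-known wide ranges rather than generated Unicode 10.0 data.

```c
#include <stddef.h>
#include <stdint.h>

/* One inclusive range of code points that share a property.
   (Hypothetical type, for illustration only.) */
struct cp_range {
    uint32_t first;
    uint32_t last;
};

/* Illustrative excerpt only: a few well-known double-width ranges,
   sorted by code point and non-overlapping.  A real table would be
   generated from the Unicode data files. */
static const struct cp_range wide_ranges[] = {
    { 0x1100,  0x115F  },   /* Hangul Jamo initial consonants */
    { 0x3041,  0x3096  },   /* Hiragana */
    { 0x4E00,  0x9FFF  },   /* CJK Unified Ideographs */
    { 0xAC00,  0xD7A3  },   /* Hangul Syllables */
    { 0xFF01,  0xFF60  },   /* Fullwidth forms */
    { 0x20000, 0x2FFFD }    /* Supplementary Ideographic Plane */
};

/* Binary search over a sorted table of inclusive ranges.
   Returns 1 if cp lies inside any range, 0 otherwise. */
static int in_ranges(uint32_t cp, const struct cp_range *tab, size_t n)
{
    size_t lo = 0, hi = n;          /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (cp < tab[mid].first)
            hi = mid;               /* cp is below this range */
        else if (cp > tab[mid].last)
            lo = mid + 1;           /* cp is above this range */
        else
            return 1;               /* first <= cp <= last */
    }
    return 0;
}

/* Hypothetical helper: does cp fall into one of the sample wide ranges? */
int sketch_is_wide(uint32_t cp)
{
    return in_ranges(cp, wide_ranges,
                     sizeof wide_ranges / sizeof wide_ranges[0]);
}
```

With this layout each property costs two integers per contiguous range instead of one entry per character, and a lookup takes a number of steps logarithmic in the count of ranges.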