From: Brian Inglis
Organization: Systematic Software
Reply-To: cygwin@cygwin.com
To: cygwin@cygwin.com
Subject: Re: getclip and putclip garble unicode characters
Date: Fri, 25 Jun 2021 12:54:32 -0600
Message-ID: <2b37488d-83be-898f-15df-bc6f3bc0fa52@SystematicSw.ab.ca>
In-Reply-To: <29705e0f-f6b4-eca8-f350-b4100d2c7244@towo.net>
References: <1442655532.20210624093554@yandex.ru> <29705e0f-f6b4-eca8-f350-b4100d2c7244@towo.net>

On 2021-06-25 12:01, Thomas Wolff wrote:
> On 2021-06-24 at 08:35, Andrey Repin via Cygwin wrote:
>> Greetings, Миронов Леонид Владимирович!
>>> getclip and putclip from cygutils-extra garble unicode characters:
>>> non-latin characters copied to clipboard in windows are replaced with
>>> question marks when retrieved with getclip in cygwin, and non-latin
>>> characters copied to clipboard using putclip are pasted in windows
>>> looking like utf-8 displayed in cp1252 but can be retrieved with getclip
>>> exactly as pasted, so it looks like the problem is not in the way the
>>> data is copied but in the way cygwin and windows communicate text
>>> encoding to each other. LC_CTYPE=en_US.UTF-8, windows ANSI codepage is
>>> set to cp1251 - 1251, not 1252.
>> This looks like you are using a program incapable of dealing with unicode
>> clipboard. To achieve better results, switch your input language/keyboard
>> to matching language before copying text from application. I.e. switch to
>> Russian then copy text, then check what is returned by getclip.
>> But then, why LC_CTYPE is en_US?
> getclip and putclip are just broken, they don't even work in a pure
> UTF-8 environment.
> Already noticed 9 years ago...
> https://sourceware.org/legacy-ml/cygwin/2012-03/msg00648.html
> including a script-based replacement.

Just use cat with /dev/clipboard: cat < /dev/clipboard to read the
clipboard, cat > /dev/clipboard to write it. Recent Windows changes may
have affected Windows<->X copy and paste transparency.

-- 
Take care.
Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
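
For anyone who wants the cat and /dev/clipboard suggestion above as drop-in helpers, here is a minimal sketch. The names clip_get, clip_put and the CLIP override variable are hypothetical, made up for illustration; only /dev/clipboard itself comes from the message. Whether non-Latin text round-trips correctly still depends on your locale and Cygwin version, so treat this as something to try, not a confirmed fix.

```shell
#!/bin/sh
# Sketch only: /dev/clipboard is the Cygwin clipboard device named in the
# message above. CLIP, clip_get and clip_put are hypothetical names.
# Setting CLIP to an ordinary file lets you try the functions off Cygwin.

# Print the current clipboard text to stdout.
clip_get() {
    cat < "${CLIP:-/dev/clipboard}"
}

# Replace the clipboard text with whatever arrives on stdin.
clip_put() {
    cat > "${CLIP:-/dev/clipboard}"
}
```

A quick round trip would be something like: printf '%s' 'Привет' | clip_put, then clip_get, and compare the output with what was written.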