From: Миронов Леонид Владимирович
To: "cygwin@cygwin.com"
Subject: RE: getclip and putclip garble unicode characters
Date: Fri, 25 Jun 2021 09:00:09 +0000
In-Reply-To: <1442655532.20210624093554@yandex.ru>
List-Id: General Cygwin discussions and problem reports

As far as copying from Cygwin to Windows is concerned, it happens in exactly the same way in all Windows programs I tried pasting data into: Word, Outlook, Chrome, console, you name it. Changing the Windows keyboard language has no effect either; Windows still stubbornly treats the clipboard contents as cp1252. (I don't quite see how that is supposed to help anyway: data on the clipboard is not limited to one single-byte codepage.)

At first I missed that when copying from Windows to Cygwin, getclip actually gets the data in cp1251 (the Windows ANSI codepage). Cyrillic characters can therefore at least be recovered with iconv, but non-Cyrillic non-Latin characters, e.g. Greek, are replaced with question marks and are lost, although in Windows everything can be pasted back without issues, again regardless of the program and keyboard language.

So in a nutshell: when copy-pasting from Cygwin putclip to Windows, Unicode is treated as cp1252, while when copy-pasting from Windows to Cygwin getclip, Unicode is treated as cp1251.

Sorry for top-posting.
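To make the two symptoms concrete, here is a minimal Python sketch of what the behaviour described above looks like at the byte level. It is only a model of the observed results, not the actual getclip/putclip code: the sample string and the function names `putclip_then_paste` and `copy_then_getclip` are made up for illustration, and the assumption is simply that one direction reads UTF-8 bytes as cp1252 while the other squeezes the text into cp1251.

```python
# Hypothetical sample: Cyrillic plus Greek, the two cases mentioned above.
text = "привет αβγ"


def putclip_then_paste(s: str) -> str:
    """Model of Cygwin -> Windows: UTF-8 bytes are interpreted as cp1252."""
    return s.encode("utf-8").decode("cp1252")


def copy_then_getclip(s: str) -> bytes:
    """Model of Windows -> Cygwin: text is converted to cp1251.

    Characters that cp1251 cannot represent (e.g. Greek) become '?'.
    """
    return s.encode("cp1251", errors="replace")


# Direction 1: the pasted text looks garbled, but no information is lost,
# because the cp1252 misreading can be undone.
garbled = putclip_then_paste(text)
restored = garbled.encode("cp1252").decode("utf-8")

# Direction 2: decoding the bytes as cp1251 (what an iconv call from
# CP1251 to UTF-8 would do) brings back Cyrillic, but Greek is gone.
recovered = copy_then_getclip(text).decode("cp1251")

print(garbled)
print(restored)
print(recovered)
```

The first direction is reversible, which would explain why text sent through putclip comes back intact through getclip; the second is lossy for anything outside cp1251.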
-----Original Message-----
From: Andrey Repin
Sent: Thursday, June 24, 2021 9:36 AM
To: Миронов Леонид Владимирович; cygwin@cygwin.com
Subject: Re: getclip and putclip garble unicode characters

Greetings, Миронов Леонид Владимирович!

> getclip and putclip from cygutils-extra garble unicode characters:
> non-latin characters copied to clipboard in windows are replaced with
> question marks when retrieved with getclip in cygwin, and non-latin
> characters copied to clipboard using putclip are pasted in windows
> looking like utf-8 displayed in cp1252 but can be retrieved with
> getclip exactly as pasted, so it looks like the problem is not in the
> way the data is copied but in the way cygwin and windows communicate
> text encoding to each other. LC_CTYPE=en_US.UTF-8, windows ANSI codepage
> is set to cp1251 - 1251, not 1252.

This looks like you are using a program incapable of dealing with the unicode clipboard. To achieve better results, switch your input language/keyboard to the matching language before copying text from the application. I.e. switch to Russian, then copy the text, then check what is returned by getclip.

But then, why is LC_CTYPE en_US?

--
With best regards,
Andrey Repin
Thursday, June 24, 2021 9:33:54

Sorry for my terrible english...