From: Andrey Repin
Date: Fri, 22 Jun 2018 17:30:00 -0000
To: Lee, cygwin@cygwin.com
Reply-To: cygwin@cygwin.com
Subject: Re: UTF-8 character encoding
Message-ID: <59130091.20180622141728@yandex.ru>
References: <1183751257.20180621042620@yandex.ru>

Greetings, Lee!

> On 6/20/18, Andrey Repin wrote:
>> Greetings, Lee!
>>
>>> I'm looking at
>>>   https://cygwin.com/packaging-hint-files.html#pvr.hint
>>> and it starts off with
>>>   Use UTF-8 character encoding.
>>
>>> How do I do that and how do I check that I actually did use UTF-8
>>> character encoding _without_ using file?
>>
>> https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/

> I think I don't know enough to ask the right question.  A quick search
> yesterday on byte order markers turned up
>
> https://msdn.microsoft.com/en-us/library/windows/desktop/dd374101(v=vs.85).aspx
> with this bit
>   Note  Microsoft uses UTF-16, little endian byte order.

Yes, the default Unicode (wide-character) encoding on Windows is UTF-16LE.
But in general, this is application-specific.

> So... keep it simple, set
>   LANG=en_US.UTF-8
> and use vi or something else that comes with cygwin to create the file
> and I'll have a file with UTF-8 character encoding - correct?

I'm not familiar with vi, but this is true for the other *NIX editors I know:
they use the current locale settings by default, unless something else is
specified in their configuration or prompted by other factors (like a byte
order mark).
IMO, the best option is to use an editor that explicitly supports saving text
in the desired encoding.
And please, no BOM for UTF-8 files.

-- 
With best regards,
Andrey Repin
Friday, June 22, 2018 14:13:14

Sorry for my terrible English...
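
On the question of checking the encoding without using file: one possible approach is to attempt a strict UTF-8 decode of the raw bytes and to look for a leading BOM. The sketch below is only an illustration, not something from the thread. The function names check_utf8 and check_file are made up for this example, and it assumes Python 3 is available. Note the limits of such a check: plain ASCII is valid UTF-8 by definition, and a successful decode shows only that the bytes are well-formed UTF-8, not that the editor intended UTF-8.

```python
import codecs


def check_utf8(data):
    """Return (is_valid_utf8, has_bom) for a bytes object.

    Hypothetical helper for illustration; not part of Cygwin or any tool
    mentioned in the thread.
    """
    # The UTF-8 BOM is the three bytes EF BB BF at the very start.
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        # bytes.decode() is strict by default and raises on malformed input.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False, has_bom
    return True, has_bom


def check_file(path):
    """Read a file in binary mode and run check_utf8 on its contents."""
    with open(path, "rb") as fh:
        return check_utf8(fh.read())
```

For a hint file, something like check_file("pvr.hint") (a placeholder path) would ideally return (True, False): well-formed UTF-8 and no BOM. A file saved as UTF-16 with a BOM fails the decode, because the bytes FF and FE never occur in UTF-8.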