From: Achim Gratz
To: cygwin@cygwin.com
Subject: Re: Perl Unidecode modules - which to use (if not Text::Unidecode)?
Date: Mon, 05 Apr 2021 08:43:19 +0200
In-Reply-To: (Mark Aitchison's message of "Fri, 2 Apr 2021 09:35:31 +1300")
Message-ID: <87sg45tens.fsf@Rainer.invalid>

Mark Aitchison writes:
> I am writing perl programs that I'd like to know will work under both
> Linux and Cygwin, and have to deal with Unicode now.

Why not do it properly, i.e. actually work in Unicode?

> I had used Text::Unidecode happily in Linux but find no cygwin
> version. Possibly I am not looking in the right places for it, but
> possibly there are different Unicode-related modules that are
> well-supported under both cygwin and linux that I should be using
> instead, and I guess Unicode might be one of those things where it
> depends on the underlying o/s so it probably pays to go with whatever
> is the standard set of modules.

Text::Unidecode _strips_ Unicode characters down to ASCII so that
programs that are not Unicode-aware will not balk.
This may have been useful in the past, but I no longer see the point
when the standard environment almost everywhere is either UTF-8 or
UTF-16 these days.

[…]

See "perldoc perlunicode" for starters.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Q+, Q and microQ:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds
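
As a rough illustration of what "actually working in Unicode" could look
like in Perl, here is a minimal sketch that uses only core modules
(Encode and Unicode::Normalize), so it should behave the same under
Linux and Cygwin. It is not taken from the thread: the helper names
bytes_to_text and text_to_bytes and the sample string are made up for
illustration. The idea is to decode bytes into characters at the input
boundary, work on characters, and encode again on output.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);
use Unicode::Normalize qw(NFC);

# Hypothetical helper: turn raw UTF-8 bytes into a normalized Perl
# character string. Dies on malformed input instead of guessing.
sub bytes_to_text {
    my ($bytes) = @_;
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return NFC(decode('UTF-8', $bytes, $check));
}

# Hypothetical helper: turn a Perl character string back into UTF-8
# bytes for output to a file, socket, or non-layered handle.
sub text_to_bytes {
    my ($text) = @_;
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return encode('UTF-8', $text, $check);
}

# Sample input: the UTF-8 byte sequence for a five-letter German word
# containing U+00FC and U+00DF, written with escapes so this source
# file stays pure ASCII.
my $bytes = "Gr\xC3\xBC\xC3\x9Fe";
my $text  = bytes_to_text($bytes);

# Let the output layer do the encoding for anything printed to STDOUT.
binmode(STDOUT, ':encoding(UTF-8)');
printf "%d bytes, %d characters: %s\n",
    length($bytes), length($text), $text;
```

For the sample string the byte length is 7 and the character length is
5, which is the distinction that matters: once the data is decoded,
length, regular expressions and case mapping operate on characters
rather than on bytes, and nothing has to be flattened to ASCII.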