public inbox for cygwin@cygwin.com
From: Joel Rees <joel.rees@gmail.com>
To: cygwin@cygwin.com
Subject: Re: Perl Unidecode modules - which to use (if not Text::Unidecode)?
Date: Mon, 5 Apr 2021 06:26:18 +0900
Message-ID: <CAAr43iOdVea3YYThgdYpJxRCaVtFVhyHz_FwMTQhqTw8+YT-zg@mail.gmail.com>
In-Reply-To: <d3342ff4-f717-f882-5c41-b27ab272dc03@cyberXpress.co.nz>

Erk.

Sorry for the feint, Mark.

CPAN is the perl way to get perl modules and such, but see below.

On Fri, 2 Apr 2021 at 5:38, Mark Aitchison <M.Aitchison@cyberxpress.co.nz> wrote:

> I am writing perl programs that I'd like to know will work under both
> Linux and Cygwin, and have to deal with Unicode now.
>
> I had used Text::Unidecode happily in Linux but find no cygwin version.
> Possibly I am not looking in the right places for it, but possibly there
> are different Unicode-related modules that are well-supported under both
> cygwin and linux that I should be using instead, and I guess Unicode might
> be one of those things where it depends on the underlying o/s so it
> probably pays to go with whatever is the standard set of modules.
>
> 1. What perl Unicode modules should I consider, if not Text::Unidecode?
> The present need is to be able to convert those few "foreign" characters
> (like ÇĆĈĊçĉċĜĞĠĢĝģğġËÌÍÎÏÒÓÔÕ) that are basically ASCII with accent marks
> to their closest ASCII equivalents, but I'd like to do more with Unicode
> in the future, without going down any dead-ends as far as being able to
> run under cygwin is concerned.
>

"Stripping those few foreign accent characters" is probably not really what
you want to do.

Those "accent characters" are characters from a foreign encoding (likely
not Unicode) that have been misinterpreted. Simply "stripping" them will
basically convert them to truly meaningless junk. I suppose the reader can
then interpret the junk as "there used to be a foreign word here", but why
bother contributing further to information entropy?
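
If the input really is correctly decoded Unicode text, the conversion Mark
describes can be sketched with core modules only, which sidesteps the
question of whether Text::Unidecode is packaged for Cygwin. This is only a
rough sketch, not a replacement for Text::Unidecode: it assumes the text
has already been decoded into Perl characters, and fold_accents is a
made-up helper name. Unicode::Normalize has shipped with perl since 5.8.

```perl
use strict;
use warnings;
use utf8;                        # this source file contains UTF-8 literals
use Unicode::Normalize qw(NFD);  # core module, no CPAN install needed

# fold_accents is a hypothetical helper name, not part of any module.
sub fold_accents {
    my ($text) = @_;
    my $decomposed = NFD($text);   # "Ç" becomes "C" + U+0327 COMBINING CEDILLA
    $decomposed =~ s/\p{Mn}+//g;   # drop the nonspacing combining marks
    return $decomposed;
}

binmode(STDOUT, ':encoding(UTF-8)');
print fold_accents("ÇĆĈĊçĉċĜĞĠĢĝģğġËÌÍÎÏÒÓÔÕ"), "\n";
```

For the sample characters in the question this prints
CCCCcccGGGGggggEIIIIOOOO. Letters that have no canonical decomposition,
such as ø or ß, pass through unchanged, and that is the kind of case where
a transliteration table like the one in Text::Unidecode does more.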

> 2. I see some talk of Internationalization in Chapter 2 of "Setting up
> Cygwin", but cannot see anything relating to perl modules, and I don't see
> any easy way to search many months of the mailing list for a keyword... is
> there any information I should know about?


Have you read the perldoc on internationalization?
