Message-ID: <214212b2-270b-ad62-837b-fb34697a2f33@ucar.edu>
Date: Wed, 2 Feb 2022 21:12:56 -0700
Subject: Re: Removing ^X in paths
To: L A Walsh
Cc: cygwin@cygwin.com
References: <0255429a-409d-c17a-7b4d-8cbbfbea7255@ucar.edu> <61FB3CA1.8000001@tlinx.org>
From: Dennis Heimbigner
In-Reply-To: <61FB3CA1.8000001@tlinx.org>
List-Id: General Cygwin discussions and problem reports

I am using the 64-bit version, and this has nothing to do with
misreading characters. The ^X is described in this document:
https://www.cygwin.com/cygwin-ug-net/using-specialnames.html

There you will see this text:

"If you don't want or can't use UTF-8 as character set for whatever
reason, you will nevertheless be able to access the file. How does that work?
When Cygwin converts the filename from UTF-16 to your character set, it
recognizes characters which can't be converted. If that occurs, Cygwin
replaces the non-convertible character with a special character
sequence. The sequence starts with an ASCII CAN character (hex code
0x18, equivalent Control-X), followed by the UTF-8 representation of the
character. The result is a filename containing some ugly looking
characters. While it doesn't look nice, it is nice, because Cygwin knows
how to convert this filename back to UTF-16. The filename will be
converted using your usual character set. However, when Cygwin
recognizes an ASCII CAN character, it skips over the ASCII CAN and
handles the following bytes as a UTF-8 character. Thus, the filename is
symmetrically converted back to UTF-16 and you can access the file."

There is no obvious good reason to continue this convention.

On 2/2/2022 7:23 PM, L A Walsh wrote:
> On 2022/02/02 12:40, Dennis Heimbigner wrote:
>> It appears that Windows now supports the UTF-8 codepage.
> It has since the early 2000s.
>> In light of this, it seems time to change Cygwin so it no longer adds
>> those control-x (^X) characters in e.g. path names.
> ^x is ASCII. Cygwin doesn't insert ^X characters in paths.
>
> Perhaps you are thinking of '\' which looks like ¥ (a capital 'Y' with
> 2 horizontal lines, Fullwidth Yen Sign U+FFE5)...if that's the case,
> some 8-bit font displayed that sign instead of a backslash in
> non-Unicode locales.
>
> Are you using a 32-bit or 64-bit version of Cygwin? On what version
> of Windows?
>
> If you still use a 32-bit version, you might need to move to a 64-bit
> version.
> I know the 32-bit version sometimes had the problem because it supported
> fewer fonts and fewer characters at the same time.
>
> You might check out your locale (if in English, try setting
> LC_CTYPE="en_US.UTF-8"
> in your shell) and also check that the font you use has a backslash in
> the 0x7f position.
>
> But in shell, ^x is usually a character to erase the whole line -- so
> it really wouldn't do to have it in a PATH.
>
> Hope this helps, and sorry if this is completely off base.
>
> Linda
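
For illustration, the conversion scheme described in the User's Guide text quoted above can be sketched in Python. This is only a rough sketch under stated assumptions, not Cygwin's actual implementation: the names encode_filename and decode_filename are hypothetical, the target character set is assumed to be a single-byte one (ASCII by default), and the filename is assumed not to contain a literal CAN character of its own.

```python
CAN = 0x18  # ASCII CAN, i.e. Control-X


def encode_filename(name: str, charset: str = "ascii") -> bytes:
    """Encode name in charset; a character that cannot be converted is
    written as CAN followed by its UTF-8 bytes (hypothetical helper)."""
    out = bytearray()
    for ch in name:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            out.append(CAN)
            out += ch.encode("utf-8")
    return bytes(out)


def _utf8_length(lead: int) -> int:
    """Return the length of a UTF-8 sequence given its first byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("invalid UTF-8 lead byte after CAN")


def decode_filename(data: bytes, charset: str = "ascii") -> str:
    """Reverse encode_filename. Assumes charset is single-byte, so every
    byte that is not part of a CAN sequence is exactly one character."""
    out = []
    i = 0
    while i < len(data):
        if data[i] == CAN and i + 1 < len(data):
            # Skip the CAN and treat the following bytes as one UTF-8 character.
            n = _utf8_length(data[i + 1])
            out.append(data[i + 1:i + 1 + n].decode("utf-8"))
            i += 1 + n
        else:
            out.append(data[i:i + 1].decode(charset))
            i += 1
    return "".join(out)


if __name__ == "__main__":
    encoded = encode_filename("caf\u00e9.txt")
    print(encoded)  # b'caf\x18\xc3\xa9.txt'
    print(decode_filename(encoded) == "caf\u00e9.txt")  # True
```

With a UTF-8 character set nothing is unconvertible, so no CAN byte would ever be emitted in this sketch; the ^X only shows up when the character set cannot represent a character in the name.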