Git Bash Is Displaying Strange Characters on Windows 7

Getting weird characters in Git for Windows bash when running cypress from command line

The issue is described in this GitHub issue.

The problem is that Cypress sends UTF-8 encoded text through its stdout, and Windows mangles that text before Mintty receives it (Mintty is what hosts bash and runs git on Windows).

As I understand it, Mintty doesn't yet instruct Windows not to mangle the stdout it processes (cmd.exe does, which is why it works there), but we can do that ourselves by changing the Windows OEM code page setting with the chcp program (located at C:\Windows\System32\chcp.com; yes, that's a .com, not an .exe). Code page 65001 is UTF-8. You can add the command to your .bashrc file so that it runs every time you start Mintty:

  1. Open Mintty on Windows (this should start a bash shell).

  2. Go to your home directory (i.e. cd ~)

  3. Open or create a .bashrc file.

  4. Put this in the file (update the path to your chcp.com program as appropriate):

    /c/Windows/System32/chcp.com 65001

  5. Then restart the terminal window and it should work.
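A slightly more defensive variant of the .bashrc line can be sketched as follows. The function name set_utf8_codepage and its optional path argument are assumptions made for illustration, not part of the original answer. The sketch only runs chcp.com when the file exists, so the same .bashrc stays harmless on machines without it, and it discards the program's output so that new shells start quietly.

```shell
# ~/.bashrc fragment (a sketch; set_utf8_codepage is a hypothetical helper name)
set_utf8_codepage() {
  # Default to the path given in the answer; pass another path as $1 if yours differs.
  chcp_bin="${1:-/c/Windows/System32/chcp.com}"
  if [ -f "$chcp_bin" ]; then
    # 65001 is the Windows code page number for UTF-8.
    # Discard chcp's output so it does not clutter every new shell.
    "$chcp_bin" 65001 >/dev/null 2>&1
  fi
  return 0
}

set_utf8_codepage
```

If the output still looks wrong after restarting the terminal, running the chcp.com command by hand without the redirection shows whether the code page change was accepted.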

Unicode (utf-8) with git-bash

As CharlesB said in a comment, msysgit 1.7.10 handles Unicode correctly. There are still a few issues, but I can confirm that updating did solve the issue I was having.

See: https://github.com/msysgit/msysgit/wiki/Git-for-Windows-Unicode-Support

Git bash on Windows different result than terminal on CentOS for regex

From the bash manual:

A pair of characters separated by a hyphen denotes a range expression; any character that falls between those two characters, inclusive, using the current locale’s collating sequence and character set, is matched. If the first character following the ‘[’ is a ‘!’ or a ‘^’ then any character not enclosed is matched. A ‘-’ may be matched by including it as the first or last character in the set.

Your Git Bash locale uses collation rules under which ranges like a-z don't match accented characters; your CentOS locale's rules do. This can be addressed by using a consistent locale such as C for collation. In addition, your - is in the wrong spot: it needs to be first or last in the set. The backslash also needs to be escaped with another backslash to match a literal one.

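#!/bin/bash
# Use C collation so that a-z, A-Z and 0-9 cover only ASCII characters.
LC_COLLATE=C
customer=Reportçós
# Delete every character outside the set; "-" comes last and "\" is escaped.
cleanedCustomer=${customer//[^a-zA-Z0-9 \\_.-]/}
printf "%s\n" "$cleanedCustomer"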
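The same cleanup can also be sketched with tr instead of bash pattern substitution, which is a different technique from the one in the answer. Here the C locale is pinned for the single tr command rather than for the whole script, so the result should not depend on the locale of the calling shell. The function name clean_name is hypothetical.

```shell
#!/bin/sh
# clean_name is a hypothetical helper name, not taken from the answer above.
clean_name() {
  # LC_ALL=C applies to tr only, so its ranges are plain ASCII ranges.
  # -c complements the set and -d deletes every byte that falls outside it.
  printf '%s' "$1" | LC_ALL=C tr -cd 'a-zA-Z0-9 \\_.-'
  printf '\n'
}

clean_name 'Reportçós'   # prints: Reports
```

Because the locale assignment is attached to one command, the rest of the script keeps whatever locale the user has configured.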

Bash: Strange characters even after setting locale to UTF-8: • prints as ΓÇó

This example queries the character map of the current locale with locale charmap, then filters the output through recode so that it is rendered with characters that the character map supports:

#!/usr/bin/env sh

cat <<EOF | recode -qf "UTF-8...$(locale charmap)"
• These are
• UTF-8 bullets in source
• But it can gracefully degrade with recode
EOF

With the charmap ISO-8859-1, it renders as:

o These are
o UTF-8 bullets in source
o But it can gracefully degrade with recode

An alternative method uses iconv instead of recode, and the results may be even better:

#!/usr/bin/env sh

cat <<EOF | iconv -f 'UTF-8' -t "$(locale charmap)//TRANSLIT"
• These are
• UTF-8 bullets followed by a non-breaking space in source
• But it can gracefully degrade with iconv
• Europe's currency sign is € for Euro.
EOF

iconv output with an fr_FR.iso-8859-15@Euro locale:

o These are
o UTF-8 bullets followed by a non-breaking space in source
o But it can gracefully degrade with iconv
o Europe's currency sign is € for Euro.
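The iconv approach can be wrapped in a small filter that skips the conversion when the locale already uses UTF-8. This is only a sketch: the function name to_locale_charset and its optional charmap argument are assumptions added for illustration, and //TRANSLIT support depends on the iconv implementation that is installed.

```shell
#!/usr/bin/env sh
# to_locale_charset is a hypothetical helper name.
# Optional argument: a charmap name that overrides the detected one.
to_locale_charset() {
  charmap="${1:-$(locale charmap)}"
  case "$charmap" in
    # The locale already uses UTF-8: pass the text through untouched.
    UTF-8|utf-8|utf8) cat ;;
    # Otherwise convert, approximating characters the target charmap lacks.
    *) iconv -f UTF-8 -t "$charmap//TRANSLIT" ;;
  esac
}
```

Any command that emits UTF-8 can then be piped through to_locale_charset, in the same way the here-documents above are piped through recode and iconv.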

Weird control characters from Gradle in Windows 10

I'm guessing you could pass --console plain on the gradle command line to disable the rich console, which is likely the cause of the "funky" characters.

https://docs.gradle.org/current/userguide/gradle_command_line.html
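To avoid typing the option on every build, the setting can be made persistent through the org.gradle.console property in the user-level gradle.properties file. The sketch below appends that property only if it is not already present. The function name use_plain_gradle_console and its optional directory argument are assumptions added for illustration. Without an argument, it uses GRADLE_USER_HOME if set, and otherwise the .gradle directory in your home directory.

```shell
# use_plain_gradle_console is a hypothetical helper name.
# Optional argument: the Gradle user home directory to write to.
use_plain_gradle_console() {
  props="${1:-${GRADLE_USER_HOME:-$HOME/.gradle}}/gradle.properties"
  mkdir -p "$(dirname "$props")"
  # Append the property only when no org.gradle.console line exists yet.
  grep -q '^org\.gradle\.console=' "$props" 2>/dev/null ||
    echo 'org.gradle.console=plain' >> "$props"
}
```

Calling use_plain_gradle_console once is enough. For a single build, passing --console plain on the command line, as suggested above, has the same effect without touching any file.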


