Linux to Windows bad encoding response
UTF-8 is designed to encode the Unicode character set. It can't in general be used to encode arbitrary binary data, because (depending on the implementation) it may misbehave when the binary data contains byte sequences that are not valid UTF-8 or that decode to illegal Unicode characters.
You need to pass the request from PHP to your C# program in unencoded binary form, or in an encoding such as Base64 which is designed for arbitrary binary data.
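The difference can be seen in a minimal Java sketch. The original setup uses PHP and C#, but the behaviour is the same in any language; the class name BinaryTransport and the sample payload are made up for illustration. It sends four arbitrary bytes through a UTF-8 text round trip and through a Base64 round trip, and reports whether they come back unchanged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class BinaryTransport {
    // Hypothetical payload: the bytes 0xFF and 0xFE never occur in valid UTF-8.
    public static final byte[] PAYLOAD = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};

    // Treats the bytes as UTF-8 text and converts back.
    // Java replaces each invalid sequence with U+FFFD, so the data is altered.
    public static byte[] viaUtf8(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // Encodes the bytes as Base64 text and decodes them again.
    public static byte[] viaBase64(byte[] data) {
        String asText = Base64.getEncoder().encodeToString(data);
        return Base64.getDecoder().decode(asText);
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 round trip intact:  "
                + Arrays.equals(PAYLOAD, viaUtf8(PAYLOAD)));
        System.out.println("Base64 round trip intact: "
                + Arrays.equals(PAYLOAD, viaBase64(PAYLOAD)));
    }
}
```

Base64 output uses only ASCII characters, so it can travel through any text channel regardless of the charset each side assumes.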
HTTP response decoding behaves differently from Windows to Linux
The problem came from Linux having its default Charset set to UTF-8.
Adding the argument -Dfile.encoding=ISO-8859-1 to $CATALINA_OPTS in Tomcat's config solved my problem.
Linux command file -i returns wrong value charset=unknown-8bit for a Windows-1252 encoded file
It's important to understand what a character encoding is and isn't.
A text file is actually just a stream of bits; or, since we've mostly agreed that there are 8 bits in a byte, a stream of bytes. A character encoding is a lookup table (and sometimes a more complicated algorithm) for deciding what characters to show to a human for that stream of bytes.
For instance, the character "€" encoded in Windows-1252 is the string of bits 10000000. That same string of bits will mean other things in other encodings - most encodings assign some meaning to all 256 possible bytes.
If a piece of software knows that the file is supposed to be read as Windows-1252, it can look up a mapping for that encoding and show you a "€". This is how browsers are displaying the right thing: you've told them in the Content-Type header to use the Windows-1252 lookup table.
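As a small illustration of the "lookup table" idea, the following Java sketch decodes the single byte 10000000 (0x80) with three different charsets. The class and method names are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SameByteDifferentText {
    // Decodes the given bytes using the given charset's mapping.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0x80}; // bits 10000000

        // Windows-1252 maps 0x80 to the euro sign (U+20AC).
        String cp1252 = decode(data, Charset.forName("windows-1252"));
        // ISO-8859-1 maps 0x80 to an invisible control character (U+0080).
        String latin1 = decode(data, StandardCharsets.ISO_8859_1);
        // In UTF-8 a lone 0x80 is invalid, so Java substitutes U+FFFD.
        String utf8 = decode(data, StandardCharsets.UTF_8);

        // Print the code points rather than the characters, so the result
        // does not depend on the console's own encoding.
        System.out.println(Integer.toHexString(cp1252.charAt(0)));
        System.out.println(Integer.toHexString(latin1.charAt(0)));
        System.out.println(Integer.toHexString(utf8.charAt(0)));
    }
}
```

The bytes never change between the three calls; only the table used to interpret them does.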
Once you save the file to disk, that "Windows-1252" label from the Content-Type header isn't stored anywhere. So any program looking at that file can see that it contains the string of bits 10000000, but it doesn't know what mapping table to look that up in. Nothing you do in the HTTP headers is going to change that - none of those are going to affect how it's saved on disk.
In this particular case the "file" command could look at the "encoding" marker inside the XML document, and find the "windows-1252" there. My guess is that it simply doesn't have that functionality. So instead it uses its general logic for guessing an encoding: it's probably something ASCII-compatible, because it starts with the bytes that spell <?xml in ASCII; but it's not ASCII itself, because it has bytes outside the range 00000000 to 01111111; anything beyond that is hard to guess, so it outputs "unknown-8bit".
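The guessing logic described above can be sketched roughly as follows. This is a deliberately simplified illustration in Java, not the actual implementation of the file command, which knows about more encodings; the class name CharsetGuess and its guess method are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class CharsetGuess {
    // Returns "us-ascii" when every byte lies in 00000000..01111111,
    // otherwise "unknown-8bit". A simplification of what file does.
    public static String guess(byte[] content) {
        for (byte b : content) {
            if ((b & 0x80) != 0) { // high bit set: outside the ASCII range
                return "unknown-8bit";
            }
        }
        return "us-ascii";
    }

    public static void main(String[] args) {
        // An XML prolog made of plain ASCII bytes.
        byte[] prolog = "<?xml version=\"1.0\" encoding=\"windows-1252\"?>"
                .getBytes(StandardCharsets.US_ASCII);

        // The same prolog followed by 0x80, the euro sign in Windows-1252.
        byte[] withEuro = new byte[prolog.length + 1];
        System.arraycopy(prolog, 0, withEuro, 0, prolog.length);
        withEuro[prolog.length] = (byte) 0x80;

        System.out.println(guess(prolog));
        System.out.println(guess(withEuro));
    }
}
```

Note that the sketch never reads the encoding declaration inside the prolog, which matches the guess that file does not do so either.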
mysql console (windows-linux), wrong character set?
Set PuTTY to interpret received data as UTF-8 in Window -> Translation "Character set on received data".
Java String encoding - Linux different than on Windows
Both machines have the same Locale in Java (Locale.getDefault()) -> I tried that already.
It is the default charset, not the default locale that determines what character set is used when decoding / encoding a string without a specified charset.
Check what Charset.defaultCharset().name() returns on your Windows and Linux machines. I expect that they will be different, based on the symptoms that you are reporting.
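A quick way to run that check is a small program like the one below, executed on both machines. It is only a sketch: the class name and the sample string are made up, and the first two lines of output will depend on the machine it runs on.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DefaultCharsetCheck {
    // Encodes with an explicit charset, so the result is the same everywhere.
    public static byte[] utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + Charset.defaultCharset().name());

        String text = "\u00e9"; // the letter e with an acute accent

        // No charset argument: the platform default is used,
        // so these bytes can differ between Windows and Linux.
        System.out.println("Default bytes:   " + Arrays.toString(text.getBytes()));

        // Explicit charset: identical bytes on every machine.
        System.out.println("UTF-8 bytes:     " + Arrays.toString(utf8Bytes(text)));
    }
}
```

If the first line differs between the two machines, passing an explicit charset wherever a String is encoded or decoded removes the dependency on the platform default.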
Wrong text encoding when parsing json data
You're reading the data as ISO 8859-1 but the file is actually UTF-8. Java's readers accept a charset argument (InputStreamReader, for example), and passing UTF-8 there should solve that.
Also: curl isn't going to care about the encodings. It's really something in your Java code that's wrong.
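To show the effect, here is a hedged Java sketch that reads the same UTF-8 bytes twice, once with the wrong charset and once with the right one. The asker's actual code is not shown in the question, so the class name, the readLine helper and the sample JSON are all assumptions, and an in-memory byte array stands in for the file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderCharsetDemo {
    // Reads the first line of the data, decoding the bytes with the given charset.
    public static String readLine(byte[] fileBytes, Charset charset) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(fileBytes), charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the JSON file on disk, stored as UTF-8.
        byte[] fileBytes = "{\"name\": \"Zo\u00eb\"}".getBytes(StandardCharsets.UTF_8);

        // Wrong charset: the two UTF-8 bytes of the accented letter
        // are decoded as two separate characters.
        System.out.println(readLine(fileBytes, StandardCharsets.ISO_8859_1));

        // Right charset: the text comes back intact.
        System.out.println(readLine(fileBytes, StandardCharsets.UTF_8));
    }
}
```

For a real file, the same charset argument can be given to Files.newBufferedReader(path, StandardCharsets.UTF_8).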