In C#, the Difference Between ToUpper() and ToUpperInvariant()

In C# what is the difference between ToUpper() and ToUpperInvariant()?

ToUpper uses the current culture. ToUpperInvariant uses the invariant culture.

The canonical example is Turkish (tr-TR), where the upper case of "i" isn't "I" but "İ" (capital I with a dot above, U+0130).

Sample code showing the difference:

using System;
using System.Drawing;
using System.Globalization;
using System.Threading;
using System.Windows.Forms;

public class Test
{
    [STAThread]
    static void Main()
    {
        // Upper-case with the invariant culture, before the current culture is changed.
        string invariant = "iii".ToUpperInvariant();

        // Switch the current thread to Turkish, then upper-case with the current culture.
        CultureInfo turkey = new CultureInfo("tr-TR");
        Thread.CurrentThread.CurrentCulture = turkey;
        string cultured = "iii".ToUpper();

        // Show both results in large labels so the difference is easy to see.
        Font bigFont = new Font("Arial", 40);
        Form f = new Form
        {
            Controls =
            {
                new Label { Text = invariant, Location = new Point(20, 20),
                            Font = bigFont, AutoSize = true },
                new Label { Text = cultured, Location = new Point(20, 100),
                            Font = bigFont, AutoSize = true }
            }
        };
        Application.Run(f);
    }
}

For more on Turkish, see this Turkey Test blog post.

I wouldn't be surprised to hear that there are various other capitalization issues around elided characters etc. This is just one example I know off the top of my head... partly because it bit me years ago in Java, where I was upper-casing a string and comparing it with "MAIL". That didn't work so well in Turkey...

What is the difference between ToUpperInvariant() and ToUpper(new CultureInfo("en-US", false))?

ToUpper() is the same as ToUpper(CultureInfo.CurrentCulture), whereas ToUpperInvariant() is the same as ToUpper(CultureInfo.InvariantCulture). The comments hint that you have already figured that out.
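As a minimal sketch of the equivalence described above, the following compares each shorthand with its explicit-culture form. The class and method names (CasingEquivalence, CurrentCultureMatches, InvariantMatches) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Globalization;

public static class CasingEquivalence
{
    // True when ToUpper() agrees with the explicit current-culture overload.
    public static bool CurrentCultureMatches(string s) =>
        s.ToUpper() == s.ToUpper(CultureInfo.CurrentCulture);

    // True when ToUpperInvariant() agrees with the explicit invariant-culture overload.
    public static bool InvariantMatches(string s) =>
        s.ToUpperInvariant() == s.ToUpper(CultureInfo.InvariantCulture);

    public static void Main()
    {
        Console.WriteLine(CurrentCultureMatches("iii"));
        Console.WriteLine(InvariantMatches("iii"));
    }
}
```

Both checks should hold whatever the current culture is, because each shorthand simply supplies the culture argument for you.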

So of course there is a difference here. CultureInfo.InvariantCulture should only be used when not interacting with humans (parsers, etc.), because it gives a consistent result everywhere. ToUpper(CultureInfo.CurrentCulture), by contrast, can give different results on different computers, servers, and user accounts.

CultureInfo.InvariantCulture is an English-inspired culture that is similar to, but not the same as, en-US. It is not bound to any country or region and cannot be customized by users (as explicitly stated in the documentation).

To answer your question about ToUpper explicitly: yes, there are differences. In every case listed below, ToUpperInvariant() returns the same character as the lowercase source, while ToUpper with en-US changes it:

lc    en-US     Invariant
== ===== =========

µ Μ µ
ı I ı
ſ S ſ
Dž DŽ Dž
Lj LJ Lj
Nj NJ Nj
Dz DZ Dz
ͅ Ι ͅ // ͅͅͅͅͅͅͅthis one lives in the 4th dimension.
ς Σ ς
ϐ Β ϐ
ϑ Θ ϑ
ϕ Φ ϕ
ϖ Π ϖ
ϰ Κ ϰ
ϱ Ρ ϱ
ϵ Ε ϵ
ẛ Ṡ ẛ
ι Ι ι
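A table like the one above could be produced by a brute-force scan such as the following sketch. It is not the original author's code; the names CasingDiff and FindDifferences are hypothetical, and the output depends on the runtime and its globalization data, so a newer .NET version may print a different list or nothing at all.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CasingDiff
{
    // Returns every UTF-16 code unit (excluding surrogates) whose uppercase
    // form differs between the two given cultures.
    public static List<char> FindDifferences(CultureInfo a, CultureInfo b)
    {
        var result = new List<char>();
        for (int i = 0; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (char.IsSurrogate(c)) continue; // a lone surrogate is not a character
            if (char.ToUpper(c, a) != char.ToUpper(c, b))
                result.Add(c);
        }
        return result;
    }

    public static void Main()
    {
        // Assumes the en-US culture is available on this machine.
        CultureInfo enUs = CultureInfo.GetCultureInfo("en-US");
        foreach (char c in FindDifferences(enUs, CultureInfo.InvariantCulture))
        {
            // Columns: code point, source, en-US uppercase, invariant uppercase.
            Console.WriteLine($"U+{(int)c:X4} {c} {char.ToUpper(c, enUs)} {char.ToUpperInvariant(c)}");
        }
    }
}
```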

Is there a good tool for MySQL that will help me optimise my queries and index settings?

Here's some info about EXPLAIN (referenced from the High Performance MySQL book from O'Reilly):

When you run an EXPLAIN on a query, it tells you everything MySQL knows about that query in the form of reports for each table involved in the query.

Each of these reports will tell you...

  • id: the identifier of the SELECT that the table belongs to
  • select_type: the table's role in a larger selection (just SIMPLE if the query has no subqueries or unions)
  • table: the name of the table (duh)
  • type: the join type, for example const, ref, range, or ALL for a full table scan
  • possible_keys: the indexes MySQL could use for this table (or NULL if none)
  • key: the name of the index that MySQL decided to use
  • key_len: the length of the key that is used (in bytes)
  • ref: the columns or constant values matched against the key
  • rows: the number of rows that MySQL thinks it needs to examine in order to satisfy the query. This should be kept as close to your calculated minimum as possible!
  • Extra: any additional information MySQL wishes to convey

The book is completely awesome at providing information like this, so if you haven't already, get your boss to sign off on a purchase.

Otherwise, I hope some more knowledgeable SO user can help :)

ToUpper() sometimes changes the character (µ becomes Μ)

You can use the method String.ToUpperInvariant().

In this method, the invariant culture is used.

This method is exactly the same as calling myString.ToUpper(CultureInfo.InvariantCulture);

Can I use: TextEntered.ToUpperInvariant().Contains(a) to count chars in a string?

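Contains only tells you whether the value is present; it returns a bool, not a count. You could use Count instead (a LINQ extension method, so add using System.Linq;):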

int countOfA = textEntered.Count(x => x == 'a');

EDIT:

int countOfA = textEntered.ToUpper().Count(x => x == 'A');
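One possible variation, shown here only as a sketch, compares character by character with char.ToUpperInvariant, which avoids building an upper-cased copy of the string and does not depend on the current culture. The helper name CountIgnoreCase is hypothetical.

```csharp
using System;
using System.Linq;

public static class CharCounter
{
    // Counts occurrences of target in text, ignoring case under the invariant culture.
    public static int CountIgnoreCase(string text, char target)
    {
        char upperTarget = char.ToUpperInvariant(target);
        return text.Count(c => char.ToUpperInvariant(c) == upperTarget);
    }

    public static void Main()
    {
        Console.WriteLine(CountIgnoreCase("Banana bAr", 'a'));
    }
}
```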

What is wrong with ToLowerInvariant()?

Google gives a hint pointing to CA1308: Normalize strings to uppercase

It says:

Strings should be normalized to uppercase. A small group of characters, when they are converted to lowercase, cannot make a round trip. To make a round trip means to convert the characters from one locale to another locale that represents character data differently, and then to accurately retrieve the original characters from the converted characters.

So, yes - ToUpper is more reliable than ToLower.

In the future I suggest googling first - I do that for all those FxCop warnings I get thrown around ;) Helps a lot to read the corresponding documentation ;)

C#: better way than to combine StartsWith and two ToUpperInvariant calls

You can use the overloaded StartsWith method taking a StringComparison enum value:

keyAttributeValue.StartsWith(STR_ConnectionString, StringComparison.OrdinalIgnoreCase) // or use StringComparison.InvariantCultureIgnoreCase here
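For context, here is a small self-contained sketch built around that call. The value given to STR_ConnectionString and the helper name IsConnectionStringKey are assumptions for illustration; the original question does not show them.

```csharp
using System;

public static class PrefixCheck
{
    // Assumed value; the real constant is not shown in the question.
    private const string STR_ConnectionString = "ConnectionString";

    // Case-insensitive prefix test with no ToUpperInvariant calls and no temporary strings.
    public static bool IsConnectionStringKey(string keyAttributeValue) =>
        keyAttributeValue != null &&
        keyAttributeValue.StartsWith(STR_ConnectionString, StringComparison.OrdinalIgnoreCase);

    public static void Main()
    {
        Console.WriteLine(IsConnectionStringKey("connectionstring.Main"));
        Console.WriteLine(IsConnectionStringKey("Timeout"));
    }
}
```

OrdinalIgnoreCase compares without consulting any culture, which is usually what you want for identifiers such as configuration keys.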

string.ToLower() and string.ToLowerInvariant()

Depending on the current culture, ToLower might produce a culture-specific lowercase letter that you aren't expecting, such as ınfo (without the dot on the i) instead of info, which then breaks string comparisons. For that reason, ToLowerInvariant should be used on any non-language-specific data. Generally, the only time to use ToLower is when the text is user input that may be in the user's native language or character set.

See this question for an example of this issue:
C#- ToLower() is sometimes removing dot from the letter "I"
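The effect can be reproduced with a short sketch like the one below. It assumes the tr-TR culture is available, which is not the case when the runtime uses invariant globalization mode. LowerDemo and LowerIn are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class LowerDemo
{
    // Lower-cases s using the rules of an explicitly chosen culture.
    public static string LowerIn(string s, CultureInfo culture) => s.ToLower(culture);

    public static void Main()
    {
        // Assumption: Turkish culture data is installed on this machine.
        CultureInfo turkish = CultureInfo.GetCultureInfo("tr-TR");

        // Under Turkish rules, 'I' lower-cases to the dotless 'ı'.
        Console.WriteLine(LowerIn("INFO", turkish));

        // The invariant culture is unaffected by the user's settings.
        Console.WriteLine("INFO".ToLowerInvariant());

        // For comparisons, skipping case conversion altogether is often simpler.
        Console.WriteLine(string.Equals("INFO", "info", StringComparison.OrdinalIgnoreCase));
    }
}
```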

Convert string to upper case using string interpolation

Given the DateTime.ToString method documentation, no. This makes sense, because what you want to do is change the case of a string, not format a DateTime.

For a format specifier to work in string interpolation, the object being formatted has to implement the IFormattable interface, which the String type does not.
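A common workaround, sketched here under the assumption that changing case inside the interpolation hole is acceptable, is to call the method on the value itself. The names InterpolationDemo and Greeting are hypothetical.

```csharp
using System;

public static class InterpolationDemo
{
    // No format specifier changes case, so the conversion happens
    // in the expression inside the braces.
    public static string Greeting(string name) => $"Hello, {name.ToUpperInvariant()}!";

    public static void Main()
    {
        Console.WriteLine(Greeting("world"));
    }
}
```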


