How to Read Unicode Input and Compare Unicode Strings in Python

How can I compare a unicode type to a string in Python?

You are probably looping over the wrong data set. Loop directly over the JSON-loaded dictionary; there is no need to call .keys() first:

data = json.loads(response)
myList = [item for item in data if item == "number1"]

You may want to use u"number1" to avoid implicit conversions between Unicode and byte strings:

data = json.loads(response)
myList = [item for item in data if item == u"number1"]

Both versions work fine:

>>> import json
>>> data = json.loads('{"number1":"first", "number2":"second"}')
>>> [item for item in data if item == "number1"]
[u'number1']
>>> [item for item in data if item == u"number1"]
[u'number1']

Note that in your first example, us is not a UTF-8 string; it is unicode data that the json library has already decoded for you. A UTF-8 string, on the other hand, is a sequence of encoded bytes. You may want to read up on Unicode and Python to understand the difference:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky
The Python Unicode HOWTO
Pragmatic Unicode by Ned Batchelder
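
To make the text-versus-bytes distinction concrete, here is a small Python 3 sketch. The sample string is an illustrative assumption, not taken from the question. It shows that encoding produces a different type with a different length, and that decoding with the same codec reverses it.

```python
# Python 3: str holds Unicode text, bytes holds encoded data.
text = "caf\u00e9"               # 4 characters: c, a, f, é
encoded = text.encode("utf-8")   # é becomes two bytes in UTF-8

print(type(text).__name__, len(text))        # str 4
print(type(encoded).__name__, len(encoded))  # bytes 5

# Decoding with the same codec recovers the original text.
decoded = encoded.decode("utf-8")
print(decoded == text)                       # True
```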

On Python 2, your expectation that the test returns True is correct, so you must be doing something else wrong:

>>> us = u'MyString'
>>> us
u'MyString'
>>> type(us)
<type 'unicode'>
>>> us.encode('utf8') == 'MyString'
True
>>> type(us.encode('utf8'))
<type 'str'>

There is no need to encode the strings to UTF-8 to make comparisons; use unicode literals instead:

myComp = [elem for elem in json_data if elem == u"MyString"]
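
On Python 3 the same comparison needs no u prefix, because json.loads returns str keys. The following is a minimal sketch, assuming a hypothetical byte-string target (raw_target, not part of the original question) that has to be decoded first, since str and bytes never compare equal there:

```python
import json

# json.loads always returns text (str) keys on Python 3.
json_data = json.loads('{"MyString": 1, "Other": 2}')

# Hypothetical byte-string value, e.g. read from a socket or a binary file.
raw_target = b"MyString"

# Comparing str to bytes is always False on Python 3, so nothing matches.
no_match = [elem for elem in json_data if elem == raw_target]

# Decode the bytes once, then compare text with text.
target = raw_target.decode("utf-8")
my_comp = [elem for elem in json_data if elem == target]

print(no_match)  # []
print(my_comp)   # ['MyString']
```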

Compare unicode string with byte string

This is probably much easier in Python 3 due to a change in how strings are handled.

Try opening your file with the encoding specified and pass the file-like object to the csv library. See the Examples section of the csv module documentation.

import csv
with open('some.csv', newline='', encoding='UTF-16LE') as fh:
    reader = csv.reader(fh)
    for row in reader:  # reader is iterable
        print(row)  # work with row
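
To try this without an existing file, the round trip below may help. It is only a sketch: the rows and the temporary file are made up for illustration. It writes a small UTF-16LE CSV and reads it back with the same encoding.

```python
import csv
import os
import tempfile

# Illustrative data, including a non-ASCII character.
rows = [["id", "name"], ["1", "Zo\u00eb"]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "some.csv")

    # Write the rows as UTF-16LE; newline="" leaves line endings to csv.
    with open(path, "w", newline="", encoding="UTF-16LE") as fh:
        csv.writer(fh).writerows(rows)

    # Read them back; open() decodes the bytes before csv sees them.
    with open(path, newline="", encoding="UTF-16LE") as fh:
        read_back = list(csv.reader(fh))

print(read_back == rows)  # True
```

If a real file starts with a byte order mark, the "UTF-16LE" codec leaves it in the first field as U+FEFF; the plain "utf-16" codec detects and consumes it instead.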

After some comments, it turns out the file is being read from an FTP server.

Switching the FTP transfer to binary mode and reading the result through an io.TextIOWrapper() may work:

Out now with even more context managers!:

import io
import csv
from ftplib import FTP

with FTP("ftp.example.org") as ftp:
    with io.BytesIO() as binary_buffer:
        # read all of products.csv into a binary buffer
        ftp.retrbinary("RETR products.csv", binary_buffer.write)
        binary_buffer.seek(0)  # rewind file pointer
        # create a text wrapper to associate an encoding with the file-like for reading
        with io.TextIOWrapper(binary_buffer, encoding="UTF-16LE") as csv_string:
            for row in csv.reader(csv_string):
                print(row)  # work with row
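
The buffering and decoding steps can be tested without an FTP server. In this sketch the payload is made-up data standing in for whatever ftp.retrbinary() would deliver; everything after the write mirrors the code above.

```python
import csv
import io

# Stand-in for the bytes an FTP download would deliver (made-up data).
payload = "sku,price\r\nA1,9.99\r\n".encode("UTF-16LE")

binary_buffer = io.BytesIO()
binary_buffer.write(payload)   # retrbinary would call write() once per chunk
binary_buffer.seek(0)          # rewind before reading

# newline="" lets the csv module handle line endings itself.
with io.TextIOWrapper(binary_buffer, encoding="UTF-16LE", newline="") as csv_text:
    rows = list(csv.reader(csv_text))

print(rows)  # [['sku', 'price'], ['A1', '9.99']]
```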

How do I compare a Unicode string that has different bytes, but the same value?

Unicode normalization will get you there for this one:

>>> import unicodedata
>>> unicodedata.normalize("NFC", "\uf9fb") == "\u7099"
True

Use unicodedata.normalize on both of your strings before comparing them with == to check for canonical Unicode equivalence.

Character U+F9FB is a "CJK Compatibility Ideograph". It has a canonical decomposition to the regular CJK character U+7099, so normalization replaces it with that character.
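
The comparison can be wrapped in a small helper. This is a sketch only; nfc_equal is a hypothetical name, and NFC is one reasonable choice of form for canonical equivalence.

```python
import unicodedata

def nfc_equal(a, b):
    """Return True if a and b are equal after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

composed = "\u00e9"      # é as a single code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

print(composed == decomposed)            # False
print(nfc_equal(composed, decomposed))   # True
print(nfc_equal("\uf9fb", "\u7099"))     # True
```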

How do I check if a string is unicode or ascii?

In Python 3, all strings are sequences of Unicode characters. There is a bytes type that holds raw bytes.

In Python 2, a string may be of type str or of type unicode. You can tell which using code something like this:

def whatisthis(s):
    if isinstance(s, str):
        print "ordinary string"
    elif isinstance(s, unicode):
        print "unicode string"
    else:
        print "not a string"

This does not distinguish "Unicode or ASCII"; it only distinguishes Python types. A Unicode string may consist of purely characters in the ASCII range, and a bytestring may contain ASCII, encoded Unicode, or even non-textual data.
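
For Python 3, a rough counterpart might look like the sketch below. The function name describe is hypothetical, and it relies on isascii(), which is available on str, bytes and bytearray from Python 3.7 onward. It reports the type and, separately, whether the content stays within the ASCII range.

```python
def describe(value):
    """Classify a value by type and, for text or bytes, by ASCII-only content."""
    if isinstance(value, str):
        kind = "text"
    elif isinstance(value, (bytes, bytearray)):
        kind = "bytes"
    else:
        return "not a string"
    # isascii() is True when every character or byte is below 128.
    return kind + (", ASCII only" if value.isascii() else ", contains non-ASCII")

print(describe("abc"))         # text, ASCII only
print(describe("caf\u00e9"))   # text, contains non-ASCII
print(describe(b"\xc3\xa9"))   # bytes, contains non-ASCII
print(describe(42))            # not a string
```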

How to get Unicode input from user in Python?

\u is an escape sequence recognized in string literals:

Escape sequences only recognized in string literals are:

Escape Sequence  Meaning                                        Notes
\N{name}         Character named name in the Unicode database   (4)
\uxxxx           Character with 16-bit hex value xxxx           (5)
\Uxxxxxxxx       Character with 32-bit hex value xxxxxxxx       (6)

Notes:

(4) Changed in version 3.3: Support for name aliases has been added.
(5) Exactly four hex digits are required.
(6) Any Unicode character can be encoded this way. Exactly eight hex digits are required.

Use

varUnicode = input('\tEnter your Unicode\n\t>')
print('\\u{}'.format(varUnicode.zfill(4)).encode('raw_unicode_escape').decode('unicode_escape'))

or (maybe better, since eight hex digits also cover code points above U+FFFF)

varUnicode = input('\tEnter your Unicode\n\t>')
print('\\U{}'.format(varUnicode.zfill(8)).encode('raw_unicode_escape').decode('unicode_escape'))
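
An alternative that avoids the escape-sequence round trip is to parse the hex digits and call chr(). This is a sketch of a different technique from the one above; char_from_hex is a hypothetical helper name, and accepting an optional "U+" prefix is an added assumption.

```python
def char_from_hex(hex_digits):
    """Convert user-typed hex digits such as '1F600' or 'U+00E9' to a character."""
    cleaned = hex_digits.strip()
    if cleaned.upper().startswith("U+"):
        cleaned = cleaned[2:]
    # int() raises ValueError for non-hex text; chr() raises it above 0x10FFFF.
    return chr(int(cleaned, 16))

print(char_from_hex("e9") == "\u00e9")              # True
print(char_from_hex("U+1F600") == "\U0001F600")     # True
```

With input(), this would be used as char_from_hex(input('\tEnter your Unicode\n\t>')).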