How to read a single character at a time from a file in Python?
with open(filename) as f:
    while True:
        c = f.read(1)
        if not c:
            print("End of file")
            break
        print("Read a character:", c)
Character reading from file in Python
Ref: http://docs.python.org/howto/unicode
Reading Unicode from a file is therefore simple:
import codecs
with codecs.open('unicode.rst', encoding='utf-8') as f:
    for line in f:
        print(repr(line))
It's also possible to open files in update mode, allowing both reading and writing:
with codecs.open('test', encoding='utf-8', mode='w+') as f:
    f.write(u'\u4500 blah blah blah\n')
    f.seek(0)
    print(repr(f.readline()[:1]))
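On Python 3 the built-in open() accepts an encoding argument directly, so codecs.open() is not strictly needed. A minimal sketch of the same update-mode round trip, assuming Python 3; the scratch path is made up for illustration:

```python
import os
import tempfile

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

# 'w+' opens the file for writing and reading, truncating it first.
with open(path, mode="w+", encoding="utf-8") as f:
    f.write("\u4500 blah blah blah\n")
    f.seek(0)               # rewind to the start before reading
    first_char = f.read(1)  # one decoded character, not one byte

print(ascii(first_char))
```

In text mode, read(1) returns one decoded character even when that character takes several bytes on disk.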
EDIT: I'm assuming that your intended goal is just to be able to read the file properly into a string in Python. If you're trying to convert to an ASCII string from Unicode, then there's really no direct way to do so, since the Unicode characters won't necessarily exist in ASCII.
If you're trying to convert to an ASCII string, try one of the following:
- Replace the specific Unicode characters with ASCII equivalents, if you only need to handle a few special cases such as this particular example.
- Use the unicodedata module's normalize() function and the string's encode() method to convert, as best you can, to the closest ASCII equivalent (Ref https://web.archive.org/web/20090228203858/http://techxplorer.com/2006/07/18/converting-unicode-to-ascii-using-python):
    >>> import unicodedata
    >>> teststr
    u'I don\xe2\x80\x98t like this'
    >>> unicodedata.normalize('NFKD', teststr).encode('ascii', 'ignore')
    'I donat like this'
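The same normalize-then-encode approach can be wrapped in a small helper. This is only a sketch, assuming Python 3; to_ascii is a hypothetical name, and the conversion is lossy because characters with no ASCII base letter are simply dropped:

```python
import unicodedata

def to_ascii(text):
    """Best-effort ASCII approximation of a Unicode string (lossy)."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # 'ignore' drops whatever cannot be encoded as ASCII, including the marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(to_ascii("caf\u00e9 na\u00efve"))  # accented 'e' and 'i' lose their marks
```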
Python : read text file character by character in loop
I'd approach this differently and write a function that takes a filename and returns a generator:
def reader(filename):
    with open(filename) as f:
        while True:
            # read next character
            char = f.read(1)
            # if not EOF, then at least 1 character was read, and
            # this is not empty
            if char:
                yield char
            else:
                return
Then you need to give the filename only once:
    r = reader('filename')
The file stays open between reads, which is much faster than reopening it for every character. To fetch the next character, use the built-in next() function:
    print(next(r))  # 0
    print(next(r))  # 1
    ...
You can also use itertools functions such as islice on this object to slice off characters, or use it in a for loop:
    # skip characters until newline
    for c in r:
        if c == '\n':
            break
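To make the islice and for-loop usage concrete, here is a self-contained sketch. It assumes Python 3 and uses io.StringIO as a stand-in for a real file; chars is a hypothetical helper, not the reader() function from the answer:

```python
import io
from itertools import islice

def chars(stream):
    """Return an iterator over the characters of an open text stream."""
    # iter() with a sentinel stops once read(1) returns '' at end of file.
    return iter(lambda: stream.read(1), "")

stream = io.StringIO("hello\nworld\n")  # stands in for a real file
r = chars(stream)

first_three = "".join(islice(r, 3))  # consume the first three characters

# Skip the rest of the current line, newline included.
for c in r:
    if c == "\n":
        break

rest = "".join(r)  # whatever the iterator has left
print(repr(first_three), repr(rest))
```

Because the iterator is consumed as it goes, each step picks up where the previous one stopped.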
Python not able to read – character from text file
In open(), the default encoding is platform dependent. You can find out the default for your system by checking what locale.getpreferredencoding() returns; this is from the documentation.
For the second part of your question: since you do not get an error when you do not specify utf-8 as the encoding, you could just use the value returned by locale.getpreferredencoding() as the encoding.
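A short sketch of checking the platform default and of passing an explicit encoding instead of relying on it. It assumes Python 3; the scratch file and its contents are made up for illustration:

```python
import locale
import os
import tempfile

# What open() falls back to when no encoding argument is given.
default_encoding = locale.getpreferredencoding(False)
print("Default text encoding:", default_encoding)

# Hypothetical scratch file containing an en dash (U+2013).
path = os.path.join(tempfile.mkdtemp(), "dash.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("a \u2013 b\n")

# Reading with the same explicit encoding decodes the dash correctly.
with open(path, encoding="utf-8") as f:
    text = f.read()

print(ascii(text))
```

Naming the encoding on both the write and the read keeps the result independent of the machine's locale.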
Read file up to a character
This is still far from optimal, but it would be a pure-Python implementation of a very simple buffer:
def my_open(filename, char):
    with open(filename) as f:
        old_fb = ""
        for file_buffer in iter(lambda: f.read(1024), ''):
            if old_fb:
                file_buffer = old_fb + file_buffer
            pos = file_buffer.find(char)
            while pos != -1 and file_buffer:
                yield file_buffer[:pos]
                file_buffer = file_buffer[pos+1:]
                pos = file_buffer.find(char)
            old_fb = file_buffer
        yield old_fb

# Usage:
for line in my_open("weirdfile", "~"):
    print(line)
Read a specific line or character from a text file, not recognizing the text
Your code runs, but it never finds a match because readlines() keeps the trailing newline on every line, so it reads 'x\n' rather than 'x'. Replace .readlines() with .read().splitlines() to solve this.
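The difference is easy to see side by side. A minimal sketch, assuming Python 3 and using io.StringIO in place of the asker's file:

```python
import io

data = io.StringIO("x\ny\nz\n")  # stands in for an open text file

with_newlines = data.readlines()  # each item keeps its trailing '\n'

data.seek(0)
without_newlines = data.read().splitlines()  # line endings stripped

print("x" in with_newlines)     # False: the list holds 'x\n', not 'x'
print("x" in without_newlines)  # True
```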