Setting smaller buffer size for sys.stdin?
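You can completely remove buffering from stdin/stdout by using Python's -u flag (the help text and man page quoted here are from Python 2; in Python 3, -u only affects stdout and stderr and has no effect on stdin):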
-u : unbuffered binary stdout and stderr (also PYTHONUNBUFFERED=x)
see man page for details on internal buffering relating to '-u'
and the man page clarifies:
-u Force stdin, stdout and stderr to be totally unbuffered. On
systems where it matters, also put stdin, stdout and stderr in
binary mode. Note that there is internal buffering in xread-
lines(), readlines() and file-object iterators ("for line in
sys.stdin") which is not influenced by this option. To work
around this, you will want to use "sys.stdin.readline()" inside
a "while 1:" loop.
Beyond this, altering the buffering for an existing file is not supported, but you can make a new file object with the same underlying file descriptor as an existing one, and possibly different buffering, using os.fdopen. I.e.,
import os
import sys
newin = os.fdopen(sys.stdin.fileno(), 'r', 100)
should bind the name newin to a file object that reads the same FD as standard input, but buffered by only about 100 bytes at a time (and you could continue with sys.stdin = newin to use the new file object as standard input from there onwards). I say "should" because this area used to have a number of bugs and issues on some platforms (it's pretty hard functionality to provide cross-platform with full generality); I'm not sure what its state is now, but I'd definitely recommend thorough testing on all platforms of interest to ensure that everything goes smoothly. (-u, removing buffering entirely, should work with fewer problems across all platforms, if that might meet your requirements.)
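A minimal sketch of the os.fdopen approach, written for Python 3 and exercised on a pipe rather than the real standard input so that it runs anywhere. The helper name reopen_with_buffer is made up for illustration, and closefd=False is an addition of mine so that closing the new object leaves the original descriptor open.

```python
import os


def reopen_with_buffer(fd, size):
    """Return a new binary reader on fd with roughly `size` bytes of buffering.

    Hypothetical helper; closefd=False keeps fd open when the wrapper is closed.
    """
    return os.fdopen(fd, "rb", size, closefd=False)


# Demonstrate on a pipe instead of sys.stdin.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"first\nsecond\n")
os.close(write_fd)

reader = reopen_with_buffer(read_fd, 100)
# The third readline() hits end of input and returns an empty bytes object.
lines = [reader.readline(), reader.readline(), reader.readline()]
reader.close()
os.close(read_fd)
print(lines)
```

For standard input the call would be reopen_with_buffer(sys.stdin.fileno(), 100); if you need text rather than bytes, the result can be wrapped in io.TextIOWrapper.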
Disable buffering of sys.stdin in Python 3
The trick is to use tty.setcbreak(sys.stdin.fileno(), termios.TCSANOW) and, before that, to store the terminal attributes via termios.tcgetattr in a variable so that the default behavior can be restored afterwards. With cbreak set, sys.stdin.read(1) is unbuffered. This also suppresses the ANSI control code response from the terminal.
import re
import sys
import termios
import tty


def getpos():
    buf = ""
    stdin = sys.stdin.fileno()
    # save the current terminal attributes so they can be restored below
    tattr = termios.tcgetattr(stdin)
    try:
        tty.setcbreak(stdin, termios.TCSANOW)
        # ask the terminal to report the cursor position
        sys.stdout.write("\x1b[6n")
        sys.stdout.flush()
        # the reply has the form ESC [ row ; col R, so read up to the "R"
        while True:
            buf += sys.stdin.read(1)
            if buf[-1] == "R":
                break
    finally:
        termios.tcsetattr(stdin, termios.TCSANOW, tattr)
    # reading the actual values, but what if a keystroke appears while reading
    # from stdin? As a dirty workaround, getpos() returns None if this fails
    try:
        matches = re.match(r"^\x1b\[(\d*);(\d*)R", buf)
        groups = matches.groups()
    except AttributeError:
        return None
    return (int(groups[0]), int(groups[1]))
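The parsing step can be pulled out and tried without a terminal. The sketch below is one possible variation, not the answer's own code: the names CPR_PATTERN and parse_cursor_report are invented here, re.search is used instead of re.match so that a stray keystroke in front of the reply is tolerated, and \d+ is used so that int() never sees an empty string.

```python
import re

# Matches a cursor position report of the form ESC [ row ; col R.
CPR_PATTERN = re.compile(r"\x1b\[(\d+);(\d+)R")


def parse_cursor_report(buf):
    """Return (row, col) from the first report found in buf, else None."""
    match = CPR_PATTERN.search(buf)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


clean = parse_cursor_report("\x1b[12;40R")
with_keystroke = parse_cursor_report("x\x1b[3;7R")
missing = parse_cursor_report("no report here")
print(clean, with_keystroke, missing)
```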
Python 3 on Windows: extend stdin.readline() line buffer size
This is a bug: https://bugs.python.org/issue41849
sys.stdin.readline() has a 512-character buffer, whereas input() has a 16K-character buffer. So currently input() can be used as a workaround.
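A small sketch of that workaround, assuming you want readline()-style looping on top of input(). The wrapper name read_line is hypothetical, and the in-memory stream only stands in for a console so the example runs anywhere; it does not reproduce the Windows limit itself.

```python
import io
import sys


def read_line():
    """Read one line with input(); return None at end of input."""
    try:
        return input()
    except EOFError:
        return None


# Stand-in for a real console so the sketch is self-contained.
original_stdin = sys.stdin
sys.stdin = io.StringIO("a long line\nlast\n")
try:
    # iter(callable, sentinel) stops as soon as read_line() returns None.
    collected = list(iter(read_line, None))
finally:
    sys.stdin = original_stdin
print(collected)
```

Note that input() strips the trailing newline, unlike readline().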
Win32 buffer size for read from stdin
You don't say which version of the runtime and OS you use, but I cannot reproduce this problem with MSVC 19.16.27031.1 on Windows 10. There are a few documented reasons it might fail. From the MSDN documentation of ReadFile:
Characters can be read from the console input buffer by using ReadFile with a handle to console input. The console mode determines the exact behavior of the ReadFile function. By default, the console mode is ENABLE_LINE_INPUT, which indicates that ReadFile should read until it reaches a carriage return. If you press Ctrl+C, the call succeeds, but GetLastError returns ERROR_OPERATION_ABORTED. For more information, see CreateFile.
There's another way you could be getting this error, relating to asynchronous I/O, but that does not seem to be the problem here. You probably want to turn off the ENABLE_LINE_INPUT flag with SetConsoleMode. The documentation also says the call could fail with ERROR_NOT_ENOUGH_QUOTA if the memory pages of the buffer cannot be locked. However, you use a static buffer that should not have this problem.
If you’re reading a file on disk, and not a console stream, you might map it to memory, which eliminates any intermediate buffering and loads the sections of files as needed, by the same mechanism as virtual memory.
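To illustrate the memory-mapping idea, here is a rough sketch using Python's mmap module as a stand-in for the Win32 file-mapping calls (the question itself is about C on Windows, so treat this as an analogy rather than the code you would drop in). The temporary file and its contents are made up for the demonstration.

```python
import mmap
import os
import tempfile

# Create a throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"header\n" + b"x" * 10000 + b"\ntail\n")
    path = tmp.name

with open(path, "rb") as f:
    # Length 0 maps the whole file; the OS pages data in on access.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
        first_line = view[: view.find(b"\n")]
        last_bytes = view[len(view) - 5 :]

os.remove(path)
print(first_line, last_bytes)
```

Slicing the mapping reads straight from the mapped pages, with no user-level read buffer in between.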
How often does sys.stdin generate data?
OK, so here's what worked for me:
import sys

while True:
    # readline() returns as soon as one full line is available
    print(sys.stdin.readline())
And start the script with python -u ...
I'll admit that Thomas' link to the other thread helped me find out that .readline() should be used directly in order for -u to have any effect.
Explanation: -u disables process-level buffering of stdin (as in "the standard input" and not the sys.stdin object specifically), and using .readline() instead of for line in sys.stdin avoids the internal buffering of sys.stdin.
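The readline-based loop can also be written with the two-argument form of iter(), which stops cleanly at end of input. This is only a sketch written for Python 3; echo_lines is a name I made up, and the in-memory streams stand in for sys.stdin and sys.stdout.

```python
import io


def echo_lines(stream, out):
    """Copy stream to out one line at a time, flushing after each line."""
    count = 0
    # iter(callable, sentinel) keeps calling readline() until it returns "".
    for line in iter(stream.readline, ""):
        out.write(line)
        out.flush()
        count += 1
    return count


# In-memory stand-ins; a real script would pass sys.stdin and sys.stdout.
sink = io.StringIO()
n = echo_lines(io.StringIO("one\ntwo\n"), sink)
print(n, repr(sink.getvalue()))
```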
UPDATE: As to your question about this one-liner: "How is it assumed that interpreter will cross this line if t > e: every one second?"... the "one liner" under observation is:
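import sys, time
l = 0                      # lines seen in the current second
e = int(time.time())       # the second we are currently counting in
for line in sys.stdin:
    t = int(time.time())
    l += 1
    if t > e:              # the clock has moved on to a new second
        e = t
        print(l)
        l = 0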
time.time() returns the current time in seconds as a float; converting it to int basically just rounds it down to full seconds; and the first moment int(time.time()) is greater than e, which was also set to be int(time.time()), is when the clock has rolled over to the next whole second (up to one second later the first time, and one second apart after that).
But the snippet still suffers from the exact same input buffering issue as your original snippet; also, it's invoked without the -u flag, so I cannot imagine why it would ever work reliably on any system, unless the buffering semantics on that system were different at both the Python process STDIN level as well as in the implementation of sys.stdin.
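To see the t > e logic in isolation, here is a sketch of the same counter as a function with an injectable clock, so it can be run against fake timestamps. The function name lines_per_second and the clock parameter are my own additions; with a real stream you would pass time.time.

```python
import io


def lines_per_second(stream, clock):
    """Return one line count per whole-second rollover; clock() gives seconds."""
    counts = []
    seen = 0
    last = int(clock())
    for line in iter(stream.readline, ""):
        now = int(clock())
        seen += 1
        if now > last:
            # A new whole second has started: report and reset.
            last = now
            counts.append(seen)
            seen = 0
    return counts


# Fake clock: the first value sets the start, then one reading per line.
ticks = iter([10.0, 10.2, 10.9, 11.1, 11.5, 12.0])
data = io.StringIO("a\nb\nc\nd\ne\n")
result = lines_per_second(data, lambda: next(ticks))
print(result)  # [3, 2]
```

As in the original, the line that triggers the rollover is counted in the interval that just ended.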
Why scanf can read more than 1024 character while stdin stream buffer is 1024 bytes only?
As noted in a comment, when scanf() gets to the end of the first buffer full, if it still needs more data, it goes back to the system to get more, possibly many times. The buffer is merely a convenience and optimization measure.
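The same refill behaviour can be observed with Python's io.BufferedReader. This is an analogy, not the C stdio implementation: CountingRaw is a made-up class that records how often the buffered layer goes back to the underlying source while 5000 bytes are consumed through a 1024-byte buffer.

```python
import io


class CountingRaw(io.RawIOBase):
    """Raw byte source that records how many times it is asked for data."""

    def __init__(self, data):
        super().__init__()
        self._src = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Called by the buffered layer each time its buffer runs dry.
        self.calls += 1
        chunk = self._src.read(len(b))
        b[: len(chunk)] = chunk
        return len(chunk)


raw = CountingRaw(b"x" * 5000)
reader = io.BufferedReader(raw, buffer_size=1024)
# Consume one byte at a time, the way a scanner walks through a long token.
token = b"".join(iter(lambda: reader.read(1), b""))
print(len(token), raw.calls)
```

All 5000 bytes arrive even though the buffer never holds more than 1024 of them at once; the buffer is simply refilled several times.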
Taking multiline input with sys.stdin
So, I took your code out of the function and ran some tests.
import sys

buffer = []
run = True
while run:
    line = sys.stdin.readline().rstrip('\n')
    if line == 'quit':
        run = False
    else:
        buffer.append(line)
print(buffer)
Changes:
- Removed the 'for' loop
- Using 'readline' instead of 'readlines'
- Stripped out the '\n' after input, so all processing afterwards is much easier.
Another way:
import sys

buffer = []
while True:
    line = sys.stdin.readline().rstrip('\n')
    if line == 'quit':
        break
    else:
        buffer.append(line)
print(buffer)
Takes out the 'run' variable, as it is not really needed.
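One thing both loops leave open is end of input: if the stream closes before 'quit' arrives, readline() keeps returning an empty string and the loop never exits. A possible variant that stops on either condition is sketched below for Python 3; read_until_quit and its sentinel parameter are names invented for this example.

```python
import io


def read_until_quit(stream, sentinel="quit"):
    """Collect lines from stream until the sentinel line or end of input."""
    collected = []
    # readline() returns "" only at end of input, which ends the loop.
    for raw_line in iter(stream.readline, ""):
        line = raw_line.rstrip("\n")
        if line == sentinel:
            break
        collected.append(line)
    return collected


with_sentinel = read_until_quit(io.StringIO("alpha\nbeta\nquit\nignored\n"))
without_sentinel = read_until_quit(io.StringIO("no sentinel\n"))
print(with_sentinel, without_sentinel)
```

With real input you would call read_until_quit(sys.stdin).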