How to Read a Large CSV File With Pandas

How to read a few lines in a large CSV file with pandas?

Try the chunksize parameter. With chunksize set, read_csv returns an iterable TextFileReader instead of a DataFrame (passing iterator=True as well is redundant):

train = pd.read_csv('file.csv', chunksize=150000)
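With chunksize set, read_csv yields one DataFrame per chunk rather than loading the whole file. A minimal runnable sketch, using a small in-memory CSV as a stand-in for the large file on disk:

```python
import io
import pandas as pd

# A small in-memory CSV stands in for the large file on disk.
csv_data = io.StringIO("a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10)))

total = 0
for chunk in pd.read_csv(csv_data, chunksize=4):
    # Each chunk is an ordinary DataFrame of up to 4 rows.
    total += len(chunk)

print(total)  # 10 rows seen across 3 chunks (4 + 4 + 2)
```

Each chunk can be filtered, aggregated, or written out before the next one is read, so peak memory stays bounded by the chunk size.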

If you only want to read the first n rows:

train = pd.read_csv(..., nrows=n)

If you only want to read rows n through n+100, combine skiprows with nrows. Note that nrows is a row count, not an end index:

train = pd.read_csv(..., skiprows=n, nrows=100)
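One caveat: skiprows=n also skips the header line when the file has one. Passing a range to skiprows skips only the data rows while keeping the header. A sketch with a synthetic single-column file:

```python
import io
import pandas as pd

# Synthetic file: a header line "x" followed by the values 0..999.
data = io.StringIO("x\n" + "\n".join(str(i) for i in range(1000)))

n = 200
# Skip data rows 0..n-1 but keep the header (line 0 of the file),
# then read the next 100 rows.
df = pd.read_csv(data, skiprows=range(1, n + 1), nrows=100)

print(df['x'].iloc[0], len(df))  # 200 100
```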

How to read a large CSV with pandas?

Use the chunksize parameter: filter each chunk as it is read, then concatenate the surviving rows:

t_min, t_max, n_min, n_max, c_min, c_max = map(float, input('t_min, t_max, n_min, n_max, c_min, c_max: ').split())

num_of_rows = 1024
TextFileReader = pd.read_csv(path, header=None, chunksize=num_of_rows)

dfs = []
for chunk_df in TextFileReader:
    # Keep only the rows of this chunk that fall inside all three ranges.
    dfs.append(chunk_df.loc[(chunk_df[0] >= t_min) & (chunk_df[0] <= t_max)
                            & (chunk_df[1] >= n_min) & (chunk_df[1] <= n_max)
                            & (chunk_df[2] >= c_min) & (chunk_df[2] <= c_max)])

df = pd.concat(dfs, sort=False)
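The filter-then-concat pattern above can be run end to end on synthetic data; this sketch filters on a single column range (the assumed bounds t_min and t_max replace the interactive input):

```python
import io
import pandas as pd

# Synthetic three-column data stands in for the real file on disk.
csv = io.StringIO("\n".join(f"{i},{i % 5},{i % 3}" for i in range(100)))

t_min, t_max = 10.0, 60.0  # assumed bounds for illustration
dfs = []
for chunk in pd.read_csv(csv, header=None, chunksize=25):
    # Keep only rows whose first column falls in [t_min, t_max].
    dfs.append(chunk.loc[(chunk[0] >= t_min) & (chunk[0] <= t_max)])

df = pd.concat(dfs, sort=False)
print(len(df))  # rows 10..60 inclusive -> 51
```

Because the rejected rows are dropped before concat, only the matching subset is ever held in memory at once.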

