Only Reading First N Rows of CSV File With CSV Reader in Python


The shortest and most idiomatic way is probably to use itertools.islice:

import itertools
...
for row in itertools.islice(reader1, 200):
    ...
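Filling in the elided parts, a complete, runnable version of that sketch might look like this (the in-memory `StringIO` data is a hypothetical stand-in for a real file on disk):

```python
import csv
import io
import itertools

# Hypothetical sample data standing in for an open CSV file
data = io.StringIO("a,1\nb,2\nc,3\nd,4\ne,5\n")

reader1 = csv.reader(data)
# islice stops after 2 rows without reading the rest of the file
first_two = list(itertools.islice(reader1, 2))
print(first_two)  # [['a', '1'], ['b', '2']]
```

Because `islice` is lazy, the reader never touches the remaining rows, which is exactly what you want for large files.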

Python Pandas: How to read only the first n rows of a CSV file?

If you only want to read the first 999,999 (non-header) rows:

read_csv(..., nrows=999999)

If you only want to read rows 1,000,000 ... 1,999,999:

read_csv(..., skiprows=1000000, nrows=999999)

nrows : int, default None
Number of rows of file to read. Useful for reading pieces of large files.

skiprows : list-like or integer
Row numbers to skip (0-indexed) or number of rows to skip (int) at the start of the file

and for large files, you'll probably also want to use chunksize:

chunksize : int, default None
Return TextFileReader object for iteration

pandas.io.parsers.read_csv documentation
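The chunksize pattern isn't pandas-specific. A rough stdlib equivalent that yields fixed-size batches of rows from a csv.reader (the in-memory data is a hypothetical stand-in for a large file):

```python
import csv
import io
import itertools

def iter_chunks(reader, chunksize):
    """Yield lists of up to `chunksize` rows from a csv.reader."""
    while True:
        chunk = list(itertools.islice(reader, chunksize))
        if not chunk:
            return
        yield chunk

# Hypothetical stand-in for a large CSV on disk: 7 rows
data = io.StringIO("\n".join(f"row{i},{i}" for i in range(7)) + "\n")
sizes = [len(chunk) for chunk in iter_chunks(csv.reader(data), 3)]
print(sizes)  # chunks of 3, 3, and 1 rows
```

Each chunk can then be processed and discarded, keeping memory use bounded regardless of file size.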

Read the first n rows from a .csv and store a column into a list

You can use pandas to do this:

import pandas as pd

df = pd.read_csv("test.csv", nrows=2000, header=None)  # header=None keeps the first row from being read as column names
df_list = df.values.tolist()
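If pulling in pandas just for this feels heavy, the same result is available with the stdlib csv module. A sketch using hypothetical in-memory data in place of test.csv, extracting the first column of the first few rows into a list:

```python
import csv
import io
import itertools

# Hypothetical stand-in for test.csv
data = io.StringIO("10,a\n20,b\n30,c\n40,d\n")

reader = csv.reader(data)
# First column of the first 3 rows (note: csv values are strings)
first_col = [row[0] for row in itertools.islice(reader, 3)]
print(first_col)  # ['10', '20', '30']
```

Unlike pandas, the csv module does no type inference, so convert with int() or float() if numeric values are needed.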

How do I read different sections of a CSV file when the first 5 lines sometimes have more than one column?

In both cases, I'd generalize your CSV as follows:

  • lines 1-4: special "lines" of text
  • line 5: garbage (discard)
  • lines 6-...: meaningful "rows"

Here's that general approach in code. The parse_special_csv function takes a filename as input and returns two lists:

  • the first is a list of "lines" (1-4); they're technically rows, but it's more about how you treat them/what you do with them
  • the second is a list of rows (lines 6-...)

My thinking is that once the data is split out and the file is completely parsed, you'll know what to do with the lines and what to do with the rows:

import csv

def parse_special_csv(fname):
    lines = []
    rows = []
    with open(fname, 'r', newline='') as f:
        reader = csv.reader(f)

        # Treat lines 1-4 as just "lines"
        for i in range(4):
            row = next(reader)    # manually advance the reader
            lines.append(row[0])  # safe to index the first column, because *you know* these lines have column-like data

        # Discard line 5
        next(reader)

        # Treat the remaining lines as CSV rows
        for row in reader:
            rows.append(row)

    return lines, rows

lines, rows = parse_special_csv('sample1.csv')
print('sample1')
print('lines:')
print(lines)
print('rows:')
print(rows)
print()

lines, rows = parse_special_csv('sample2.csv')
print('sample2')
print('lines:')
print(lines)
print('rows:')
print(rows)
print()

And I get, based on your samples:

sample1
lines:
[
'File For EMS Team Downloaded By Bob Mortimer At 17:22:36 09/11/2021',
'line two content',
'line 3 content.',
'line 4 content.'
]
rows:
[
['1', 'TEAM', 'Bob Jones', 'Sar a require transport', 'A', '', '18:34:04hrs on 17/10/21'],
['2', 'TEAM', 'Peter Smith', 'Sar h', 'H', '', '20:43:49hrs on 17/10/21'],
['3', 'TEAM', 'Neil Barnes', 'SAR H', 'H', '', '20:15:12hrs on 17/10/21']
]

sample2
lines:
[
'File For EMS Team Downloaded By Bob Mortimer At 17:22:36 09/11/2021',
'line two content',
'line 3 content.',
'line 4 content.'
]
rows:
[
['1', 'TEAM', 'Bob Jones', 'Sar a require transport', 'A', '', '18:34:04hrs on 17/10/21'],
['2', 'TEAM', 'Peter Smith', 'Sar h', 'H', '', '20:43:49hrs on 17/10/21'],
['3', 'TEAM', 'Neil Barnes', 'SAR H', 'H', '', '20:15:12hrs on 17/10/21']
]

Also, next(reader) may look a little foreign, but it's the correct way to manually advance a csv reader (and any iterator in Python, in general).
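One caveat worth knowing: next() raises StopIteration if the file has fewer lines than expected. Passing a default as the second argument avoids that, which may be preferable if the header section isn't guaranteed to be present:

```python
import csv
import io

# Hypothetical one-row file
reader = csv.reader(io.StringIO("only,one,row\n"))
first = next(reader, None)   # a normal row
second = next(reader, None)  # reader exhausted: returns None instead of raising
print(first, second)
```

You can then check the result for None and handle the short file explicitly instead of catching StopIteration.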

How to read first 100 rows of a csv file in python appending comma, serial number and full stop marks?

Assuming that the crucial fields are separated by multiple spaces:

import re

with open('test.csv', 'r') as f:
    next(f)  # skip the header line
    pat = re.compile(r'\s{2,}')

    for i, row in enumerate(f, 1):
        print('{}. {}.'.format(i, pat.sub(', ', row.strip(), 1)))
        if i == 100:
            break

Regex \s{2,} details:

  • \s - whitespace character
  • {2,} - the previous item repeated two or more times (greedy, so longer runs are matched whole). So \s{2,} matches any run of two or more whitespace characters

Sample output:

1. what is your name, i am maxi.
2. are you happy, yes i am.
3. what you do, i am a student.
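The count argument of 1 passed to pat.sub also matters here: only the first multi-space run becomes the comma separator, and any later runs are left alone. A minimal check of that behavior:

```python
import re

pat = re.compile(r'\s{2,}')
line = "a  b  c"
# count=1: only the first run of 2+ spaces is replaced
result = pat.sub(', ', line, count=1)
print(result)  # 'a, b  c'
```

Dropping count would turn every multi-space run into ', ', which would split the answer portion of each line as well.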

How to read desired rows from large CSV files in python

You can use islice from itertools.

Here is a sample CSV file:

   X      Y
0  21  test3
1   8  test1
2  75  test1
3  26  test2
4  98  test3
5  63  test3
6  65  test3
7  39  test3
8  74  test1
9  26  test2

And suppose I want only rows 3 and 4

>>> import csv
>>> from itertools import islice
>>> with open('test.csv') as f:
...     rows = csv.reader(f)
...     rowiter = islice(rows, 3, 5)
...     for item in rowiter:
...         print(item)

which gives me the following output:

['2', '75', 'test1']
['3', '26', 'test2']

Update

import csv
from itertools import islice

input_file = 'trusted.csv'
start = 10
stop = start + 10
users = []

with open(input_file, encoding='UTF-8') as f:
    rows = csv.reader(f, delimiter=",", lineterminator="\n")
    rowiter = islice(rows, start, stop)
    for row in rowiter:
        user = {}
        user['username'] = row[0]
        user['id'] = int(row[1])
        user['access_hash'] = int(row[2])
        user['name'] = row[3]
        users.append(user)
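The loop above addresses columns by numeric index; when the column order is known, csv.DictReader with explicit fieldnames can express the same mapping by name. A sketch with hypothetical data — the real trusted.csv columns are assumed to be username, id, access_hash, name:

```python
import csv
import io
from itertools import islice

# Hypothetical stand-in for trusted.csv (no header row)
data = io.StringIO("alice,1,111,Alice\nbob,2,222,Bob\ncarol,3,333,Carol\n")

# fieldnames replaces row[0]..row[3] indexing with named access
reader = csv.DictReader(data, fieldnames=['username', 'id', 'access_hash', 'name'])
users = []
for row in islice(reader, 0, 2):  # take only rows 0-1
    row['id'] = int(row['id'])
    row['access_hash'] = int(row['access_hash'])
    users.append(dict(row))
print(users)
```

This avoids silent mix-ups if the index-to-field mapping ever drifts from the file layout.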

How to read first 1000 entries in a csv file

As you've discovered, a csv.reader does not support slicing. You can use itertools.islice() to accomplish this with any iterable. E.g.,

import csv
import itertools

entries = []
with open('mnist_train.csv', 'r') as f:
    mycsv = csv.reader(f)
    for row in itertools.islice(mycsv, 1000):
        entries.append(row)

