Count the length (number of lines) of a CSV file?
Another way to get the number of lines in Ruby is file.readlines.size, which reads every line into an array and returns its length.
How to obtain the total number of rows from a CSV file in Python?
You need to count the number of rows:
row_count = sum(1 for row in fileObject) # fileObject is your csv.reader
Using sum() with a generator expression makes for an efficient counter, avoiding storing the whole file in memory.
If you already read 2 rows to start with, then you need to add those 2 rows to your total; rows that have already been read are not being counted.
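As a minimal sketch of the point above, the following counts the remaining rows with sum() and then adds back the rows that were consumed before counting. The in-memory sample data and the names rows_already_read and remaining are assumptions made for illustration, not part of the original answer.

```python
import csv
import io

# Hypothetical sample data standing in for an open CSV file.
sample = io.StringIO("id,name\n1,Ada\n2,Grace\n3,Linus\n")

reader = csv.reader(sample)
header = next(reader)        # one row is consumed before counting starts
rows_already_read = 1

# The generator yields one row at a time, so no list of rows is built.
remaining = sum(1 for _ in reader)

# Rows read earlier are not seen by the counter, so add them back.
row_count = rows_already_read + remaining

print(header, remaining, row_count)
```

Here the counter only sees the three data rows; the header has to be added separately if it should be part of the total.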
How can I count the lines of multiple csv that are in one folder?
The advantage of using count.fields is that it doesn't load the file into memory. Thus, it should be faster than using read.csv or another function.
Get the list of files:
files <- list.files(path, full.names=TRUE)
Get the number of rows in each file:
lapply(X = files, FUN = function(x) {
  length(count.fields(x, skip = 1))
})
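For readers working in Python rather than R, a rough analogue of the per-file count above might look like the sketch below. The function name count_csv_rows and its skip_header parameter are hypothetical, and the sketch counts parsed CSV records with the csv module rather than reproducing count.fields exactly.

```python
import csv
from pathlib import Path


def count_csv_rows(folder, skip_header=True):
    """Return {file name: number of rows} for every *.csv file in folder."""
    counts = {}
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as f:
            reader = csv.reader(f)
            if skip_header:
                next(reader, None)  # drop the header row, similar to skip = 1
            # Count rows one at a time without keeping them in memory.
            counts[path.name] = sum(1 for _ in reader)
    return counts
```

Calling count_csv_rows("some/folder") would return a dictionary keyed by file name; the folder path here is only a placeholder.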
Benchmark
library(rbenchmark)
benchmark("count.fields" = {
            lapply(X = files, FUN = function(x) {
              length(count.fields(x, skip = 1))
            })
          },
          "read.csv" = {
            lapply(X = files, FUN = function(x) {
              nrow(read.csv(x, skip = 1))
            })
          },
          "fread" = {
            lapply(X = files, FUN = function(x) {
              nrow(data.table::fread(x, skip = 1))
            })
          },
          replications = 1000,
          columns = c("test", "replications", "elapsed",
                      "relative", "user.self", "sys.self"))
          test replications elapsed relative user.self sys.self
1 count.fields         1000    0.81    1.000      0.28     0.50
3        fread         1000    6.24    7.704      4.57     1.66
2     read.csv         1000    2.93    3.617      2.16     0.76
Row count in a csv file
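import csv

with open(adresse, "r", newline="") as f:
    reader = csv.reader(f, delimiter=",")
    data = list(reader)      # reads every row; the file pointer is now at end of file
    row_count = len(data)    # count the rows from the saved list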
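You are trying to read the file twice, but the file pointer has already reached the end of the file after the rows were saved in the data list. Take the row count from that list instead of iterating over the reader a second time.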
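The end-of-file behavior described above can be seen in a small sketch. It uses an in-memory io.StringIO object as a hypothetical stand-in for the opened file, and the names second_pass and rewound are illustrative only.

```python
import csv
import io

# Hypothetical stand-in for open(adresse, "r").
f = io.StringIO("a,b\n1,2\n3,4\n")

reader = csv.reader(f, delimiter=",")
data = list(reader)          # consumes the whole file
second_pass = list(reader)   # nothing left: the pointer is at end of file

f.seek(0)                    # rewind before reading the file again
rewound = list(csv.reader(f, delimiter=","))

print(len(data), len(second_pass), len(rewound))
```

Reusing the saved list is usually simpler than rewinding, since the rows are already in memory at that point.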
Python 3 Count number of rows in a CSV
If you are using pandas, you can do this with very little code.
import pandas as pd
df = pd.read_csv('filename.csv')
## Fastest would be using length of index
print("Number of rows ", len(df.index))
## If you want the column and row count then
row_count, column_count = df.shape
print("Number of rows ", row_count)
print("Number of columns ", column_count)
Rails how to get the row count from CSV fast
Eventually I found a solution: CSV.read(file.path).length is faster than CSV.parse(f.read).length.