Number of Rows Each Data Frame in a List

Count the number of row in each data.frame in a list

Like this?

> table(df$Func)

FUNC1 FUNC2 FUNC3
2 1 1

Count rows of dataframes within a list of dataframes

The solution you described works for me:

Make a reproducible example:

datalist <- list(
data.frame(V1 = 1:2, topics = I(list(mtcars, mtcars))),
data.frame(V1 = 1:2, topics = I(list(mtcars, mtcars)))
)
str(datalist)
# List of 2
# $ :'data.frame': 2 obs. of 2 variables:
# ..$ V1 : int [1:2] 1 2
# ..$ topics:List of 2
# .. ..$ :'data.frame': 32 obs. of 11 variables:
# .. .. ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# .. .. ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
# .. .. ..$ disp: num [1:32] 160 160 108 258 360 ...
# .. .. ..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
# .. .. ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# .. .. ..$ wt : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
# .. .. ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
# .. .. ..$ vs : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
# .. .. ..$ am : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
# .. .. ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
# .. .. ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
# .. ..$ :'data.frame': 32 obs. of 11 variables:
# .. .. ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# .. .. ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
# .. .. ..$ disp: num [1:32] 160 160 108 258 360 ...
# .. .. ..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
# .. .. ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# .. .. ..$ wt : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
# .. .. ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
# .. .. ..$ vs : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
# .. .. ..$ am : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
# .. .. ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
# .. .. ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
# .. ..- attr(*, "class")= chr "AsIs"
# $ :'data.frame': 2 obs. of 2 variables:
# ..$ V1 : int [1:2] 1 2
# ..$ topics:List of 2
# .. ..$ :'data.frame': 32 obs. of 11 variables:
# .. .. ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# .. .. ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
# .. .. ..$ disp: num [1:32] 160 160 108 258 360 ...
# .. .. ..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
# .. .. ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# .. .. ..$ wt : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
# .. .. ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
# .. .. ..$ vs : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
# .. .. ..$ am : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
# .. .. ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
# .. .. ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
# .. ..$ :'data.frame': 32 obs. of 11 variables:
# .. .. ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# .. .. ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
# .. .. ..$ disp: num [1:32] 160 160 108 258 360 ...
# .. .. ..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
# .. .. ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# .. .. ..$ wt : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
# .. .. ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
# .. .. ..$ vs : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
# .. .. ..$ am : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
# .. .. ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
# .. .. ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
# .. ..- attr(*, "class")= chr "AsIs"

Your solution:

library(purrr)
map(datalist, ~ sapply(.x[["topics"]], NROW))
# [[1]]
# [1] 32 32
#
# [[2]]
# [1] 32 32

Reorder a list of dataframes by the number of rows in each dataframe

Assuming your list is called 'lst'

lst= lst[order(sapply(lst,nrow),decreasing = T)]

list the number of rows in dataframe in pandas

You can just assign a numpy array using np.arange:

In[4]:
df['dummy'] = np.arange(1, len(df) + 1)
df

Out[4]:
slNo name dummy
0 1 a 1
1 2 b 2
2 3 c 3

How to show certain list elements by number of rows

sapply(rr, nrow) > 5 produces a logical vector (FALSE, FALSE, FALSE, TRUE, ... specifying which elements in rr has more than five rows. This vector can then be use to extract tose elements from the list.

set.seed(1)
rr <- replicate(6,
as.data.frame(matrix(1:(sample(2:6, 1)*2), ncol=2)),
simplify=FALSE)

# Find and extract the dataframes with more than 5 rows
rr[sapply(rr, nrow) > 5]

sum and average of rows from list of data frames

Instead of the while loop, an option in R would be to get the sum of corresponding elements of list with Reduce and divide by the length of the 'sample_list'

Reduce(`+`, sample_list)/length(sample_list)
# x
#1 6
#2 7
#3 8
#4 9
#5 10

Or a concise approach is rowMeans after converting it to a single data.frame

rowMeans(do.call(cbind, sample_list))

How do I get the row count of a Pandas DataFrame?

For a dataframe df, one can use any of the following:

  • len(df.index)
  • df.shape[0]
  • df[df.columns[0]].count() (== number of non-NaN values in first column)

Performance plot


Code to reproduce the plot:

import numpy as np
import pandas as pd
import perfplot

perfplot.save(
"out.png",
setup=lambda n: pd.DataFrame(np.arange(n * 3).reshape(n, 3)),
n_range=[2**k for k in range(25)],
kernels=[
lambda df: len(df.index),
lambda df: df.shape[0],
lambda df: df[df.columns[0]].count(),
],
labels=["len(df.index)", "df.shape[0]", "df[df.columns[0]].count()"],
xlabel="Number of rows",
)


Related Topics



Leave a reply



Submit