Select Rows of a Data.Frame That Contain Only Numbers in a Certain Column

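You could use grep. The pattern [[:digit:]] matches any value that contains at least one digit, so a mixed value such as "a1" would also be kept; use an anchored pattern such as ^-?[[:digit:]]+$ if the whole value must be an integer.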

df[grep("[[:digit:]]", df$b), ]
# a b
#1 1 4
#2 5 -2
#3 3 1
#4 1 0
#6 6 2

Filtering Dataframe by keeping numeric values of a specific column only in R

You could use a regular expression to filter the relevant rows of your dataframe.
The regular expression ^\\d+(\\.\\d+)?$ matches character values that consist only of digits, optionally followed by a . decimal separator and more digits (e.g. 2 or 2.3); it does not match negative numbers or scientific notation. You can then convert the Cost column to numeric with as.numeric() if needed.

See the example below:

Group = c("A", "A", "A", "B", "B", "C", "C", "C")
Cost = c(21,22,"closed", 12, 11,"ended", "closing", 13)
Year = c(2017,2016,2015,2017,2016,2017,2016,2015)
df = data.frame(Group, Cost, Year)

df[grep(pattern = "^\\d+(\\.\\d+)?$", df[,"Cost"]), ]
#> Group Cost Year
#> 1 A 21 2017
#> 2 A 22 2016
#> 4 B 12 2017
#> 5 B 11 2016
#> 8 C 13 2015

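Note that this technique works even if the Cost column is a factor, whereas df[!is.na(as.numeric(df$Cost)), ] does not, because as.numeric() on a factor returns the internal integer codes rather than the labels. For the latter you need to add as.character() first: df[!is.na(as.numeric(as.character(df$Cost))), ]. Both techniques keep the unused factor levels; call droplevels() on the result if you want to remove them.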
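To see exactly which strings the pattern above accepts, here is a minimal sketch in Python (used only for illustration, not R). The sample values and the helper name is_plain_number are hypothetical, and re.fullmatch stands in for the ^ and $ anchors.

```python
import re

# Same pattern as above, written as a Python raw string (single backslashes).
# fullmatch() requires the whole string to match, like ^...$ in the R version.
PATTERN = re.compile(r"\d+(\.\d+)?")


def is_plain_number(value):
    """Return True if str(value) is digits, optionally followed by '.' and digits."""
    return PATTERN.fullmatch(str(value)) is not None


# Hypothetical sample values, loosely modelled on the Cost column.
samples = ["21", "2.3", "closed", "-2", "1e3", "2.", ""]
kept = [s for s in samples if is_plain_number(s)]
print(kept)
```

Negative numbers and scientific notation are rejected by this pattern, so it would need extending (for example with an optional leading -) if such values should count as numeric.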

Select all dataframe rows containing a specific integer

df[df.x == 1]  # rows where column x equals 1
df.loc[df.x == 1, 'x'].count()  # number of such rows

How to check if a pandas dataframe contains only numeric values column-wise?

You can check that using to_numeric and coercing errors:

pd.to_numeric(df['column'], errors='coerce').notnull().all()

To check every column, you can iterate through the columns or just use apply:

df.apply(lambda s: pd.to_numeric(s, errors='coerce').notnull().all())

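E.g.

import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [1, 2, 10, np.nan, 'a'],
                   'col2': ['a', 10, 30, 40, 50],
                   'col3': [1, 2, 3, 4, 5.0]})

Outputs

col     False
col2    False
col3     True
dtype: bool

Note that an existing NaN is also treated as non-numeric by this check.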
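The same to_numeric mask can also be used to keep only the rows whose value in one column is numeric. The following is a sketch under the assumption that the frame looks like the example above; the names mask and numeric_rows are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the example above.
df = pd.DataFrame({'col': [1, 2, 10, np.nan, 'a'],
                   'col2': ['a', 10, 30, 40, 50],
                   'col3': [1, 2, 3, 4, 5.0]})

# True where the value in 'col' parses as a number; NaN and 'a' become False.
mask = pd.to_numeric(df['col'], errors='coerce').notna()

# Keep only those rows, then store the column with a numeric dtype.
numeric_rows = df[mask].copy()
numeric_rows['col'] = pd.to_numeric(numeric_rows['col'])
print(numeric_rows)
```

The copy() call avoids modifying a view of the original frame when the column is reassigned.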

Select only a number of rows from a pandas Dataframe based on a condition

I'm not sure what your dataframe looks like, but you could group by club and then use head(16) to keep only the first 16 rows of each group.

df.groupby('club').head(16)
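A small sketch of what groupby followed by head does, using made-up data (the player column and the limit of 2 are assumptions chosen to keep the example short):

```python
import pandas as pd

# Hypothetical data: three clubs with different numbers of players.
df = pd.DataFrame({
    'club': ['A', 'A', 'A', 'B', 'B', 'C'],
    'player': ['a1', 'a2', 'a3', 'b1', 'b2', 'c1'],
})

# Keep at most the first 2 rows of each club (the answer above uses 16).
first_two = df.groupby('club').head(2)
print(first_two)
```

Groups with fewer rows than the limit are returned whole, and the rows keep their original order and index.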

How to select only numbers from a dataframe in R using which()

You can use the built-in as.numeric() converter to do something like this:

x <- my_data_frame$Column.Title
xn <- as.numeric(x)
which(!is.na(xn))

This won't distinguish between NAs created by failed coercion and pre-existing (numeric) NA values.

If there's a small enough variety of "missing" values, you could read the data in with read.csv(..., na.strings=c("NA","missing","no input")) so that they are treated as NA from the start.
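For comparison, the limitation noted above (failed coercion versus values that were already missing) can be worked around by comparing the column before and after conversion. This is a sketch in pandas rather than R, with a hypothetical series x; positions are 0-based, unlike which().

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing numbers, text, and a pre-existing missing value.
x = pd.Series([3, 'missing', np.nan, '4.5', 'no input'])

# Unparseable values become NaN instead of raising an error.
xn = pd.to_numeric(x, errors='coerce')

# Positions that hold a usable number.
numeric_pos = np.flatnonzero(xn.notna().to_numpy())

# Positions where coercion failed: NaN after conversion but not missing before.
failed_pos = np.flatnonzero((xn.isna() & x.notna()).to_numpy())

print(numeric_pos.tolist())
print(failed_pos.tolist())
```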

Selecting only numeric columns from a data frame

EDIT: updated to avoid use of ill-advised sapply.

Since a data frame is a list we can use the list-apply functions:

nums <- unlist(lapply(x, is.numeric), use.names = FALSE)  

Then use standard subsetting (add drop = FALSE to keep a data frame when only one column is numeric):

x[ , nums]

## don't use sapply, even though it's less code
## nums <- sapply(x, is.numeric)

For more idiomatic modern R, I'd now recommend:

x[ , purrr::map_lgl(x, is.numeric)]

The following is shorter, relies less on R's particular quirks, and also works on database-backed tibbles:

dplyr::select_if(x, is.numeric)

Newer versions of dplyr also support the following syntax:

x %>% dplyr::select(where(is.numeric))
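For readers working in pandas instead of R, a comparable selection can be sketched with select_dtypes. The frame below is hypothetical; include='number' selects integer and floating-point columns.

```python
import pandas as pd

# Hypothetical frame with one text column and two numeric columns.
df = pd.DataFrame({
    'name': ['a', 'b'],
    'n': [1, 2],
    'ratio': [0.5, 1.5],
})

# Keep only the columns whose dtype is numeric.
numeric_only = df.select_dtypes(include='number')
print(numeric_only.columns.tolist())
```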

