How to convert certain columns only to numeric?
1) All your columns are character:
columns <- sapply(weatherDF, is.character)
airport xdate ws wd humidity
TRUE TRUE TRUE TRUE TRUE
2) Why not simply:
weatherDF[, 3:ncol(weatherDF)] <- lapply(3:ncol(weatherDF), function(x) as.numeric(weatherDF[[x]]))
or
columns <-c("ws", "wd", "humidity")
weatherDF[, columns] <- lapply(columns, function(x) as.numeric(weatherDF[[x]]))
If you don't know which columns are numeric, you can try to detect them using tryCatch, like this:
weatherDF[, 1:ncol(weatherDF)] <- lapply(1:ncol(weatherDF), function(x) {
  tryCatch({
    as.numeric(weatherDF[[x]])
  }, warning = function(w) {
    # conversion produced a warning (NAs by coercion): keep the original column
    weatherDF[[x]]
  })
})
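For readers doing the same thing in pandas, the "convert a column only if it converts cleanly" idea can be sketched roughly as below. The sample values and the helper name to_numeric_if_possible are made up for illustration; pd.to_numeric raises by default when a value cannot be parsed, which plays the role of the warning handler above.

```python
import pandas as pd

# Hypothetical sample data; the column names echo the question, the values are invented.
df = pd.DataFrame({
    "airport": ["FRA", "MUC"],
    "ws": ["3.5", "4.1"],
    "wd": ["180", "270"],
})

def to_numeric_if_possible(col):
    """Return the column converted to a numeric dtype, or unchanged if parsing fails."""
    try:
        return pd.to_numeric(col)  # raises on unparseable values by default
    except (ValueError, TypeError):
        return col

# apply() calls the helper once per column and reassembles the results.
converted = df.apply(to_numeric_if_possible)
print(converted.dtypes)
```

Unlike errors='coerce', this leaves a column completely untouched if even one value fails to parse, so no NaN values are introduced silently.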
Converting only specific columns in dataframe to numeric
Have you tried Series.astype()?
df['ID'] = df['ID'].astype(float)
df['Number'] = df['Number'].astype(float)
or for all columns besides date:
for col in [x for x in df.columns if x != 'Date']:
    df[col] = df[col].astype(float)
or (transform returns a new DataFrame, so assign the result back to keep it):
cols = [x for x in df.columns if x != 'Date']
df[cols] = df[cols].transform(lambda x: x.astype(float))
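A small self-contained sketch of the astype approach, assuming a frame whose non-date columns hold numbers stored as strings. The sample values are invented; only the column names Date, ID and Number come from the answer.

```python
import pandas as pd

# Hypothetical data: numbers stored as strings, plus a Date column to leave alone.
df = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02"],
    "ID": ["1", "2"],
    "Number": ["10.5", "20"],
})

# Every column except Date.
cols = [c for c in df.columns if c != "Date"]

# astype(float) raises ValueError if any value is not parseable as a float.
df[cols] = df[cols].astype(float)
print(df.dtypes)
```

Because astype fails loudly on bad input, it suits columns that are known to be clean; for messy data, pd.to_numeric with errors='coerce' is the more forgiving option.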
converting multiple columns from character to numeric format in r
You could try
DF <- data.frame("a" = as.character(0:5),
"b" = paste(0:5, ".1", sep = ""),
"c" = letters[1:6],
stringsAsFactors = FALSE)
# Check columns classes
sapply(DF, class)
# a b c
# "character" "character" "character"
cols.num <- c("a","b")
DF[cols.num] <- sapply(DF[cols.num],as.numeric)
sapply(DF, class)
# a b c
# "numeric" "numeric" "character"
Use to_numeric on certain columns only in PANDAS
You can use:
Tracker_sample[['product1','product2','product3','product4','Total']].apply(pd.to_numeric, errors='coerce').fillna(0)
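To make the effect visible, here is a minimal sketch on invented data (the frame name tracker and its values are assumptions, not the asker's data). Note that the expression above returns a new frame, so the sketch assigns the result back to the selected columns.

```python
import pandas as pd

# Hypothetical data with one unparseable string and one missing value.
tracker = pd.DataFrame({
    "name": ["a", "b", "c"],
    "product1": ["1", "x", "3"],
    "Total": ["10", "20", None],
})

cols = ["product1", "Total"]

# errors='coerce' turns unparseable values into NaN; fillna(0) then replaces them with 0.
tracker[cols] = tracker[cols].apply(pd.to_numeric, errors="coerce").fillna(0)
print(tracker)
```

Replacing failures with 0 hides which values were bad, so it may be worth inspecting the NaN positions before calling fillna.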
How to convert a data frame column to numeric type?
Since (still) nobody got a check-mark, I assume that you have some practical issue in mind, mostly because you haven't specified what type of vector you want to convert to numeric. I suggest that you apply the transform function in order to complete your task.
Now I'm about to demonstrate a certain "conversion anomaly":
# create dummy data.frame
d <- data.frame(char = letters[1:5],
fake_char = as.character(1:5),
fac = factor(1:5),
char_fac = factor(letters[1:5]),
num = 1:5, stringsAsFactors = FALSE)
Let us have a glance at data.frame
> d
char fake_char fac char_fac num
1 a 1 1 a 1
2 b 2 2 b 2
3 c 3 3 c 3
4 d 4 4 d 4
5 e 5 5 e 5
and let us run:
> sapply(d, mode)
char fake_char fac char_fac num
"character" "character" "numeric" "numeric" "numeric"
> sapply(d, class)
char fake_char fac char_fac num
"character" "character" "factor" "factor" "integer"
Now you probably ask yourself "Where's an anomaly?" Well, I've bumped into quite peculiar things in R, and this is not the most confounding thing, but it can confuse you, especially if you read this before rolling into bed.
Here goes: the first two columns are character. I've deliberately called the 2nd one fake_char. Spot the similarity of this character variable with the one that Dirk created in his reply. It's actually a numerical vector converted to character. The 3rd and 4th columns are factor, and the last one is "purely" numeric.
If you utilize the transform function, you can convert fake_char into numeric, but not the char variable itself.
> transform(d, char = as.numeric(char))
char fake_char fac char_fac num
1 NA 1 1 a 1
2 NA 2 2 b 2
3 NA 3 3 c 3
4 NA 4 4 d 4
5 NA 5 5 e 5
Warning message:
In eval(expr, envir, enclos) : NAs introduced by coercion
but if you do the same thing on fake_char and char_fac, you'll be lucky, and get away with no NA's:
> transform(d, fake_char = as.numeric(fake_char),
char_fac = as.numeric(char_fac))
char fake_char fac char_fac num
1 a 1 1 1 1
2 b 2 2 2 2
3 c 3 3 3 3
4 d 4 4 4 4
5 e 5 5 5 5
If you save the transformed data.frame and check for mode and class, you'll get:
> D <- transform(d, fake_char = as.numeric(fake_char),
char_fac = as.numeric(char_fac))
> sapply(D, mode)
char fake_char fac char_fac num
"character" "numeric" "numeric" "numeric" "numeric"
> sapply(D, class)
char fake_char fac char_fac num
"character" "numeric" "factor" "numeric" "integer"
So, the conclusion is: Yes, you can convert a character vector into a numeric one, but only if its elements are "convertible" to numeric. If there's just one non-numeric character element in the vector, you'll get a warning and an NA in its place when trying to convert that vector to a numerical one.
And just to prove my point:
> err <- c(1, "b", 3, 4, "e")
> mode(err)
[1] "character"
> class(err)
[1] "character"
> char <- as.numeric(err)
Warning message:
NAs introduced by coercion
> char
[1] 1 NA 3 4 NA
And now, just for fun (or practice), try to guess the output of these commands:
> fac <- as.factor(err)
> fac
???
> num <- as.numeric(fac)
> num
???
Kind regards to Patrick Burns! =)
pandas: to_numeric for multiple columns
UPDATE: you don't need to convert your values afterwards, you can do it on-the-fly when reading your CSV:
In [165]: df=pd.read_csv(url, index_col=0, na_values=['(NA)']).fillna(0)
In [166]: df.dtypes
Out[166]:
GeoName object
ComponentName object
IndustryId int64
IndustryClassification object
Description object
2004 int64
2005 int64
2006 int64
2007 int64
2008 int64
2009 int64
2010 int64
2011 int64
2012 int64
2013 int64
2014 float64
dtype: object
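A minimal sketch of the same read-time conversion, using an in-memory CSV in place of the URL. The CSV text is invented for illustration; na_values tells read_csv which extra strings to treat as missing, so the affected column is parsed as numeric straight away.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the downloaded file.
csv_text = "GeoName,2004,2005\nAlabama,1,(NA)\nAlaska,3,4\n"

# '(NA)' is read as NaN, then replaced by 0; no post-hoc conversion is needed.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, na_values=["(NA)"]).fillna(0)
print(df.dtypes)
```

A column that contained a missing marker comes back as float64 here, because NaN forces a floating-point dtype before fillna runs.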
If you need to convert multiple columns to numeric dtypes - use the following technique:
Sample source DF:
In [271]: df
Out[271]:
id a b c d e f
0 id_3 AAA 6 3 5 8 1
1 id_9 3 7 5 7 3 BBB
2 id_7 4 2 3 5 4 2
3 id_0 7 3 5 7 9 4
4 id_0 2 4 6 4 0 2
In [272]: df.dtypes
Out[272]:
id object
a object
b int64
c int64
d int64
e int64
f object
dtype: object
Converting selected columns to numeric dtypes:
In [273]: cols = df.columns.drop('id')
In [274]: df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')
In [275]: df
Out[275]:
id a b c d e f
0 id_3 NaN 6 3 5 8 1.0
1 id_9 3.0 7 5 7 3 NaN
2 id_7 4.0 2 3 5 4 2.0
3 id_0 7.0 3 5 7 9 4.0
4 id_0 2.0 4 6 4 0 2.0
In [276]: df.dtypes
Out[276]:
id object
a float64
b int64
c int64
d int64
e int64
f float64
dtype: object
PS if you want to select all string (object) columns use the following simple trick:
cols = df.columns[df.dtypes.eq('object')]
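Put together with the conversion step, the trick might look like the sketch below. The data is a cut-down, invented version of the sample frame, and the string columns are created with dtype=object explicitly as an assumption, since newer pandas versions may infer a dedicated string dtype that would not match 'object'.

```python
import pandas as pd

# Hypothetical frame; string columns are forced to object dtype on purpose.
df = pd.DataFrame({
    "a": pd.Series(["AAA", "3", "4"], dtype=object),
    "b": [6, 7, 2],
    "f": pd.Series(["1", "BBB", "2"], dtype=object),
})

# Pick every column whose dtype is object.
obj_cols = df.columns[df.dtypes.eq("object")]

# Convert only those columns; unparseable strings become NaN.
df[obj_cols] = df[obj_cols].apply(pd.to_numeric, errors="coerce")
print(df.dtypes)
```

Be aware that this selects genuine text columns too (such as an id column), which would be turned entirely into NaN, so drop those from obj_cols first.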
Map as.numeric to only specific columns of a dataframe
You can use map_at
df[] <- purrr::map_at(df, numeric_variables, as.numeric)
df
# A tibble: 3 x 4
# id age name income
# <dbl> <dbl> <chr> <dbl>
#1 1 18 jim 100
#2 2 21 bob 200
#3 3 16 alice 300