How to remove trailing zeros in R dataframe
If you want to view your Score data as text, with no trailing zeros, then use:
df_view <- df[c("Gene", "Score")]
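# as.character() already drops trailing zeros; this pattern only strips zeros
# after a decimal point, so a whole number such as 10 is left intact
df_view$Score <- sub("(\\.[0-9]*[1-9])0+$", "\\1", as.character(df_view$Score))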
df_view
Gene Score
1 AAT2 15.401
2 ACB1 5.1188
3 ACF2 15.045
4 ADE16 3.0408
5 ADE17 0.28143
6 ADE4 19.792
Data:
df <- data.frame(Gene=c("AAT2", "ACB1", "ACF2", "ADE16", "ADE17", "ADE4"),
Score=c(15.40100, 5.11880, 15.04500, 3.04080, 0.28143, 19.79200),
stringsAsFactors=FALSE)
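As a rough sketch of why the choice of pattern matters: as.character() on its own already drops trailing zeros, and a pattern that removes any run of zeros at the end of the string, such as "0+$", would also shorten whole numbers. The score vector below is made up for illustration and is not part of the original data.

```r
# Hypothetical scores, including whole numbers that end in zero
score <- c(15.40100, 5.11880, 10, 200)

# as.character() already drops trailing zeros after the decimal point
plain <- as.character(score)

# A bare "0+$" also eats the zeros of whole numbers
naive <- sub("0+$", "", plain)

print(plain)
print(naive)
```

Here naive turns 10 into "1" and 200 into "2", so a bare "0+$" is only safe when every value has a nonzero fractional part.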
Removing trailing zeros and decimal point in R
One way:
library(dplyr)
data.frame(val = c("4.20", "4.00")) %>%
  type.convert(as.is = TRUE) %>%
  as_tibble() %>%
  mutate(val = as.character(val))
# A tibble: 2 x 1
val
<chr>
1 4.2
2 4
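Using str_remove (note that this pattern also turns a whole number such as "100" into "1"):
library(dplyr)
library(stringr)
data.frame(val = c("4.20", "4.00")) %>%
  mutate(val = str_remove(val, '\\.?0+$'))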
val
1 4.2
2 4
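If the column can also contain whole numbers stored as text, a slightly stricter pair of patterns may be safer. This is a sketch in base R rather than stringr, and the input vector val is invented for illustration.

```r
# Hypothetical input: numbers stored as text, including a whole number
val <- c("4.20", "4.00", "100", "0.50")

# Step 1: drop zeros that follow a nonzero digit after the decimal point
step1 <- sub("(\\.[0-9]*[1-9])0+$", "\\1", val)

# Step 2: drop a fraction that consists only of zeros, together with the point
clean <- sub("\\.0+$", "", step1)

print(clean)
```

Because both patterns require a decimal point, "100" passes through unchanged.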
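Any of the following can work. Note that zero.print only controls how values that are exactly zero are printed; in the first two calls the trailing zeros disappear because of the default formatting, while drop0trailing = TRUE removes them explicitly: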
formatC(c(1,2.40,5.06), zero.print = "")
[1] "1" "2.4" "5.06"
prettyNum(c(1,2.40,5.06), zero.print = "")
[1] "1" "2.4" "5.06"
prettyNum(c(1,2.40,5.06), drop0trailing = TRUE)
[1] "1" "2.4" "5.06"
formatC(c(1,2.40,5.06), drop0trailing = TRUE)
[1] "1" "2.4" "5.06"
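To see the effect of drop0trailing in isolation, the following sketch formats a made-up vector x with a fixed two decimals first, and then again with the trailing zeros dropped.

```r
# Hypothetical values, including an exact zero and a whole number
x <- c(0, 1, 2.40, 5.06)

# Fixed-point formatting pads every value to two decimals
fixed <- formatC(x, format = "f", digits = 2)

# The same call with drop0trailing = TRUE removes the padding zeros again
trimmed <- formatC(x, format = "f", digits = 2, drop0trailing = TRUE)

# prettyNum() accepts the same argument
pretty <- prettyNum(x, drop0trailing = TRUE)

print(fixed)
print(trimmed)
print(pretty)
```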
Remove Unwanted 0's from numeric element - R
It is true that the trailing "000"s disappear with sub or gsub using that pattern, but not because the pattern matches any characters. Rather, it is entirely because of the initial conversion to "character" class:
> df <- c(1.560, 1.790, 3456.000, 1.0700, 0.16000, 1.347, 4.200)
>
> sub("\\.000","",df)
[1] "1.56" "1.79" "3456" "1.07" "0.16" "1.347" "4.2"
> as.character(df) #no `sub(` at all
[1] "1.56" "1.79" "3456" "1.07" "0.16" "1.347" "4.2"
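And if you wanted 2 digits to the right of the decimal point, you could do the following. Note that digits sets the number of significant digits, so this gives two decimals only because the smallest value here, 0.16, needs two decimals to show two significant digits: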
format(as.vector(df), digits=2)
[1] " 1.56" " 1.79" "3456.00" " 1.07" " 0.16" " 1.35" " 4.20"
And to get rid of the quotes, use print with quote=FALSE (although the values remain character, so you cannot use arithmetic operators on the result):
print(format(as.vector(df), digits=2) , quote=FALSE)
[1] 1.56 1.79 3456.00 1.07 0.16 1.35 4.20
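If the goal is always exactly two decimals, regardless of the values, sprintf() with a fixed-point format is one option. A minimal sketch using the same numbers (named x here, which is my choice, to avoid confusion with a data frame):

```r
x <- c(1.560, 1.790, 3456.000, 1.0700, 0.16000, 1.347, 4.200)

# "%.2f" always prints two digits after the decimal point
two_dp <- sprintf("%.2f", x)
print(two_dp, quote = FALSE)

# The result is character; convert back before doing arithmetic
total <- sum(as.numeric(two_dp))
print(total)
```

Unlike format(), sprintf() does not pad the elements to a common width.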
How do I remove trailing zeros from a character function in R?
You could replace all the 0's that come at the end of the string along with the % sign with just the % sign.
gsub('0*%$', '%', data)
#[1] "65.45%" "75.65%" "-34.55%" "-2.04%"
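The input is not shown above, so the vector below is an assumed example. It also includes two edge cases: the pattern shortens whole-number percentages ("120%" becomes "12%") and leaves a dangling point on "50.00%". If that matters, one alternative sketch is to go through numeric instead:

```r
# Assumed input; the original data is not shown
data <- c("65.4500%", "75.6500%", "-34.5500%", "-2.0400%", "50.00%", "120%")

# The pattern from the answer
by_regex <- gsub("0*%$", "%", data)

# Alternative: strip the sign, let as.numeric() drop the zeros, re-attach the sign
by_number <- paste0(as.numeric(sub("%$", "", data)), "%")

print(by_regex)
print(by_number)
```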
Remove leading zeros in numbers *within a data frame*
I interpret the intention of your question as converting each numeric cell in the data.frame into a "pretty-printed" string, which is possible using string substitution and a simple regular expression (a good question, BTW, since I do not know of any way to configure the output of numeric data to suppress leading zeros without converting the numeric data into a string!):
df2 <- data.frame(lapply(df,
function(x) gsub("^0\\.", "\\.", gsub("^-0\\.", "-\\.", as.character(x)))),
stringsAsFactors = FALSE)
df2
# est low2.5 up2.5
# 1 .05 .01 .09
# 2 -.16 -.2 -.12
# 3 -.02 -.05 0
# 4 0 -.03 .04
# 5 -.11 -.2 -.01
# 6 .15 .1 .2
# 7 -.26 -.3 -.22
# 8 -.23 -.28 -.17
str(df2)
# 'data.frame': 8 obs. of 3 variables:
# $ est : chr ".05" "-.16" "-.02" "0" ...
# $ low2.5: chr ".01" "-.2" "-.05" "-.03" ...
# $ up2.5 : chr ".09" "-.12" "0" ".04" ...
If you want to get a fixed number of digits after the decimal point (as shown in the expected output but not asked for explicitly), you could use sprintf or format:
df3 <- data.frame(lapply(df,
  function(x) gsub("^0\\.", "\\.", gsub("^-0\\.", "-\\.", sprintf("%.2f", x)))),
  stringsAsFactors = FALSE)
df3
# est low2.5 up2.5
# 1 .05 .01 .09
# 2 -.16 -.20 -.12
# 3 -.02 -.05 .00
# 4 .00 -.03 .04
# 5 -.11 -.20 -.01
# 6 .15 .10 .20
# 7 -.26 -.30 -.22
# 8 -.23 -.28 -.17
Note: This solution is not robust against a different decimal separator (for example, in other locales); it always expects a decimal point.
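The two nested gsub() calls can also be folded into one pattern with an optional minus sign. A small sketch; the function name drop_leading_zero, its digits argument, and the sample values are made up for illustration:

```r
drop_leading_zero <- function(x, digits = 2) {
  # Fixed number of decimals first, then remove the "0" in front of the point,
  # keeping an optional leading minus sign via the capture group
  sub("^(-?)0\\.", "\\1.", sprintf(paste0("%.", digits, "f"), x))
}

# Hypothetical estimates
est <- c(0.05, -0.16, 0, 1.5)
print(drop_leading_zero(est))
```

Values with a nonzero integer part, such as 1.5, are only formatted and otherwise left alone.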
R: as.character removes trailing zeros of numbers. How to avoid
You are looking for format() with the nsmall argument set to the number of digits.
This number can be estimated from the largest absolute base-10 logarithm of your numeric vector (but you can obviously enter any value you like).
Here is the code:
kappa = c(0.10, NA, 0.0740)
n_digits = ceiling(max(abs(log10(kappa)), na.rm=TRUE))  # log10 as described above, rounded up to a whole number
format(kappa, nsmall=n_digits)
#> [1] "0.100" " NA" "0.074"
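Two details of format() are easy to miss: it pads all elements to a common width, and it turns NA into the string "NA". A minimal sketch of working around both, with nsmall fixed at 3 as an assumption:

```r
kappa <- c(0.10, NA, 0.0740)

# as.character() drops the trailing zero of 0.10
dropped <- as.character(kappa)

# format() keeps at least nsmall decimals, but pads to a common width
padded <- format(kappa, nsmall = 3)

# Strip the padding and restore real missing values
clean <- ifelse(is.na(kappa), NA_character_, trimws(padded))

print(dropped)
print(clean)
```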
Created on 2022-12-13 with reprex v2.0.2
Removing trailing values per group in data.table
For the sake of completeness, here is a data.table solution which uses last() and .I:
df[!df[, last(.I[last(Value) == 0]), by = Country]$V1]
Country Value Value2
1: NL 1 100
2: NL 2 200
3: NL 3 400
4: DE 3 200
5: DE 0 200
6: DE 1 100
7: GB 2 800
df[, last(.I[last(Value) == 0]), by = Country] returns the indices .I into the original dataset df of the rows to be removed:
Country V1
1: NL 4
2: GB 9
Caveat
This approach, like the other answers posted so far, removes only one trailing zero per group, not multiple trailing zeros.
Removing multiple trailing zeros
In case of multiple trailing zeros at the end of a country's sequence, the rle() function can be used:
library(data.table)
df2[, {
  r <- rle(Value)
  if (last(r$values) == 0)
    head(.SD, -last(r$lengths))
  else
    .SD
}, Country]
Country Value Value2
1: NL 1 100
2: NL 2 200
3: NL 3 400
4: DE 3 200
5: DE 0 200
6: DE 1 100
7: GB 2 800
8: FR 1 100
9: FR 0 200
10: FR 3 300
Data
df2 <- fread("Country Value Value2
NL 1 100
NL 2 200
NL 3 400
NL 0 500
DE 3 200
DE 0 200
DE 1 100
GB 2 800
GB 0 600
FR 1 100
FR 0 200
FR 3 300
FR 0 400
FR 0 500")
Note that there are two trailing zeros for country group FR.
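For comparison, here is a base R sketch of the same idea without data.table. It is not from the answer above: it rebuilds the sample data as a plain data.frame (named df_base here, my choice) and keeps, per country, every row up to the last nonzero Value.

```r
# Same sample data as above, as a plain data.frame
df_base <- data.frame(
  Country = c("NL", "NL", "NL", "NL", "DE", "DE", "DE",
              "GB", "GB", "FR", "FR", "FR", "FR", "FR"),
  Value   = c(1, 2, 3, 0, 3, 0, 1, 2, 0, 1, 0, 3, 0, 0),
  Value2  = c(100, 200, 400, 500, 200, 200, 100,
              800, 600, 100, 200, 300, 400, 500)
)

# Within each country, count nonzero values from the end backwards;
# a row is kept if at least one nonzero value sits at or after it
keep <- ave(df_base$Value != 0, df_base$Country,
            FUN = function(nonzero) rev(cumsum(rev(nonzero)) > 0))

trimmed <- df_base[keep, ]
print(trimmed)
```

A country whose values are all zero would be dropped entirely with this approach.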
Remove trailing .0 from strings of entire DataFrame
Let's try DataFrame.replace:
import pandas as pd
df = pd.DataFrame({
'a': ['20', '34.0'],
'b': ['39.0', '.016.0'],
'c': ['17-50', '001-6784532']
})
df = df.replace(r'\.0$', '', regex=True)
print(df)
Optional DataFrame.astype if the columns are not already str:
df = df.astype(str).replace(r'\.0$', '', regex=True)
Before:
a b c
0 20 39.0 17-50
1 34.0 .016.0 001-6784532
After:
a b c
0 20 39 17-50
1 34 .016 001-6784532
rtrim/rstrip will not work here as they don't parse regex but rather take a set of characters to remove. For this reason, they will remove every trailing 0 (and every trailing .), because both characters are in the "set" to remove; for example, '20'.rstrip('.0') gives '2'.