How to Remove Trailing Zeros in R Dataframe

If you want to view your Score data as text, with no trailing zeroes, then use:

df_view <- df[c("Gene", "Score")]
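# as.character() already drops trailing zeros of a numeric column; sub() only
# strips zeros that follow a decimal point, so whole numbers such as 100 stay intact
df_view$Score <- sub("(\\.[0-9]*[1-9])0+$", "\\1", as.character(df_view$Score))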

df_view

   Gene   Score
1  AAT2  15.401
2  ACB1  5.1188
3  ACF2  15.045
4 ADE16  3.0408
5 ADE17 0.28143
6  ADE4  19.792

Data:

df <- data.frame(Gene=c("AAT2", "ACB1", "ACF2", "ADE16", "ADE17", "ADE4"),
                 Score=c(15.40100, 5.11880, 15.04500, 3.04080, 0.28143, 19.79200),
                 stringsAsFactors=FALSE)

Removing trailing zeros and decimal point in R

One way:

library(dplyr)  # provides %>%, mutate() and as_tibble()

data.frame(val = c("4.20", "4.00")) %>%
  type.convert(as.is = TRUE) %>%
  as_tibble() %>%
  mutate(val = as.character(val))

# A tibble: 2 x 1
  val  
  <chr>
1 4.2  
2 4

Using str_remove:

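library(stringr)  # str_remove(); dplyr is assumed to be loaded as well
data.frame(val = c("4.20", "4.00")) %>%
  mutate(val = str_remove(val, '\\.?0+$'))  # caution: "100" would become "1"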

  val
1 4.2
2   4
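
A base-R sketch that leaves whole numbers such as "100" untouched, assuming the values are plain decimal strings without exponents. The helper name strip_trailing_zeros is hypothetical and not part of any package:

```r
# Hypothetical helper: removes trailing zeros only when they come after a
# decimal point, and drops a fractional part that consists only of zeros.
strip_trailing_zeros <- function(x) {
  x <- as.character(x)
  # "15.40100" -> "15.401": zeros after the last non-zero decimal digit
  x <- sub("(\\.[0-9]*[1-9])0+$", "\\1", x)
  # "4.00" -> "4": a fractional part made up of zeros only
  sub("\\.0+$", "", x)
}

trimmed <- strip_trailing_zeros(c("4.20", "4.00", "100", "15.40100"))
print(trimmed)
```

The two-step substitution is a design choice: a single pattern that handles both "4.20" and "4.00" tends to be harder to read than two short ones.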

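Any of the following works for this input. Note that drop0trailing = TRUE is the argument that actually drops trailing zeros; zero.print only controls how values that are exactly zero are printed: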

formatC(c(1,2.40,5.06), zero.print = "")
[1] "1"    "2.4"  "5.06"
prettyNum(c(1,2.40,5.06), zero.print = "")
[1] "1"    "2.4"  "5.06"
prettyNum(c(1,2.40,5.06), drop0trailing = TRUE)
[1] "1"    "2.4"  "5.06"
formatC(c(1,2.40,5.06), drop0trailing = TRUE)
[1] "1"    "2.4"  "5.06"
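
As a small sketch of drop0trailing doing the actual work, assuming fixed-point formatting with four decimals as the starting point (the variable names are illustrative):

```r
x <- c(1, 2.40, 5.06)

# Fixed-point formatting pads every value to four decimals
padded <- formatC(x, format = "f", digits = 4)

# drop0trailing = TRUE removes the padding zeros and a bare decimal point
trimmed <- formatC(x, format = "f", digits = 4, drop0trailing = TRUE)

print(padded)
print(trimmed)
```

Both results are character vectors, so they are suited to display rather than further arithmetic.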

Remove Unwanted 0's from numeric element - R

It is true that the trailing zeros disappear when you call sub or gsub with that pattern, but not because the pattern matches anything. They disappear entirely because of the initial conversion to "character" class:

>  df <- c(1.560, 1.790, 3456.000, 1.0700, 0.16000, 1.347, 4.200)
> 
> sub("\\.000","",df)
[1] "1.56"  "1.79"  "3456"  "1.07"  "0.16"  "1.347" "4.2"  
> as.character(df)  #no `sub(` at all
[1] "1.56"  "1.79"  "3456"  "1.07"  "0.16"  "1.347" "4.2"

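If you want two digits to the right of the decimal point, format() with digits = 2 works for this vector. Note that digits sets the minimum number of significant digits, so the two decimals here come from 0.16, the element that needs them: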

format(as.vector(df), digits=2)
[1] "   1.56" "   1.79" "3456.00" "   1.07" "   0.16" "   1.35" "   4.20"

To get rid of the quotes, use print() with quote = FALSE. The values remain character strings, so you cannot use arithmetic operators on the result:

print(format(as.vector(df), digits=2) , quote=FALSE)
[1]    1.56    1.79 3456.00    1.07    0.16    1.35    4.20
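
If the goal is a fixed number of decimals regardless of the other values in the vector, sprintf() is a possible alternative. The sketch below reuses the values from above under the assumed name x and shows that converting back to numeric restores arithmetic but loses the padding:

```r
x <- c(1.560, 1.790, 3456.000, 1.0700, 0.16000, 1.347, 4.200)

# Always two digits after the decimal point, returned as character
fixed <- sprintf("%.2f", x)
print(fixed, quote = FALSE)

# Arithmetic needs numeric values again; the padding zeros are then gone
back <- as.numeric(fixed)
print(back + 1)
```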

How do I remove trailing zeros from a character function in R?

You could replace all the zeros at the end of the string, together with the % sign that follows them, with just the % sign.

gsub('0*%$', '%', data)
#[1] "65.45%"  "75.65%"  "-34.55%" "-2.04%"
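
The input vector data is not shown above, so the sketch below assumes percentage strings with padded decimals. It also illustrates a limitation: the pattern does not look for a decimal point, so a whole percentage such as "50%" loses its zero.

```r
# Assumed input; the original `data` vector is not shown
data <- c("65.4500%", "75.6500%", "-34.5500%", "-2.0400%")
res <- gsub("0*%$", "%", data)
print(res)

# Caveat: zeros that belong to a whole number are stripped as well
whole <- gsub("0*%$", "%", "50%")
print(whole)
```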

Remove leading zeros in numbers within a data frame

I interpret your question as asking how to convert each numeric cell of the data.frame into a "pretty-printed" string. That is possible with string substitution and a simple regular expression. (A good question, by the way: I do not know of any way to configure the output of numeric data to suppress leading zeros without converting the numbers to strings.)

df2 <- data.frame(lapply(df,
                         function(x) gsub("^0\\.", "\\.", gsub("^-0\\.", "-\\.", as.character(x)))),
                  stringsAsFactors = FALSE)
df2
#    est low2.5 up2.5
# 1  .05    .01   .09
# 2 -.16    -.2  -.12
# 3 -.02   -.05     0
# 4    0   -.03   .04
# 5 -.11    -.2  -.01
# 6  .15     .1    .2
# 7 -.26    -.3  -.22
# 8 -.23   -.28  -.17

str(df2)
# 'data.frame': 8 obs. of  3 variables:
# $ est   : chr  ".05" "-.16" "-.02" "0" ...
# $ low2.5: chr  ".01" "-.2" "-.05" "-.03" ...
# $ up2.5 : chr  ".09" "-.12" "0" ".04" ...
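
The two nested gsub() calls can also be folded into one substitution with an optional captured sign. A minimal sketch, using a made-up vector because the original df is not shown:

```r
# Made-up values for illustration
x <- c(0.05, -0.16, 0, -0.2, 10.5)

# "^(-?)0\\." matches an optional minus sign followed by "0." at the start;
# "\\1." puts the sign back and keeps only the decimal point
no_lead <- sub("^(-?)0\\.", "\\1.", as.character(x))
print(no_lead)
```

Values such as 10.5 are left alone because the pattern requires the string to start with "0." or "-0.".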

If you want to get a fixed number of digits after the decimal point (as shown in the expected output but not asked for explicitly) you could use sprintf or format:

df3 <- data.frame(lapply(df,
                         function(x) gsub("^0\\.", "\\.", gsub("^-0\\.", "-\\.", sprintf("%.2f", x)))),
                  stringsAsFactors = FALSE)
df3
#    est low2.5 up2.5
# 1  .05    .01   .09
# 2 -.16   -.20  -.12
# 3 -.02   -.05   .00
# 4  .00   -.03   .04
# 5 -.11   -.20  -.01
# 6  .15    .10   .20
# 7 -.26   -.30  -.22
# 8 -.23   -.28  -.17

Note: This solution is not robust against a different decimal mark (as used in some locales); it always expects "." as the decimal point.

R: as.character removes trailing zeros of numbers. How to avoid

You are looking for format() with the nsmall argument, which sets the minimum number of digits after the decimal point.

One heuristic is to derive this number from the maximum absolute base-10 logarithm of your numeric vector (but you can of course enter any value you like).

Here is the code:

kappa = c(0.10, NA, 0.0740)
n_digits = max(abs(log10(kappa)), na.rm=TRUE)  # log10(), not log(): base 10 as described
format(kappa, nsmall=n_digits)
#> [1] "0.100" "   NA" "0.074"

Created on 2022-12-13 with reprex v2.0.2
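
As a minimal sketch of what nsmall does on its own, with the number of decimals chosen by hand rather than computed (the variable names are illustrative):

```r
# nsmall is the minimum number of digits after the decimal point
a <- format(4.2, nsmall = 2)
b <- format(c(0.10, 0.0740), nsmall = 3)

# sprintf() is another way to get a fixed number of decimals
s <- sprintf("%.3f", c(0.10, 0.0740))

print(a)
print(b)
print(s)
```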

Removing trailing values per group in data.table

For the sake of completeness, here is a data.table solution which uses last() and .I:

df[!df[, last(.I[last(Value) == 0]), by = Country]$V1]

   Country Value Value2
1:      NL     1    100
2:      NL     2    200
3:      NL     3    400
4:      DE     3    200
5:      DE     0    200
6:      DE     1    100
7:      GB     2    800

df[, last(.I[last(Value) == 0]), by = Country] returns the indices .I into the original dataset df of the rows to be removed:

   Country V1
1:      NL  4
2:      GB  9

Caveat

This approach, like the other answers posted so far, removes only a single trailing zero per group, not a run of several trailing zeros.

Removing multiple trailing zeros

In case of multiple trailing zeros at the end of a country's sequence the rle() function can be used:

library(data.table)
df2[, {
  r <- rle(Value)
  if (last(r$values) == 0)
    head(.SD, -last(r$lengths))
  else
    .SD
}, Country]

    Country Value Value2
 1:      NL     1    100
 2:      NL     2    200
 3:      NL     3    400
 4:      DE     3    200
 5:      DE     0    200
 6:      DE     1    100
 7:      GB     2    800
 8:      FR     1    100
 9:      FR     0    200
10:      FR     3    300

Data

df2 <- fread("Country Value Value2
NL     1    100
NL     2    200
NL     3    400
NL     0    500
DE     3    200
DE     0    200
DE     1    100
GB     2    800
GB     0    600
FR     1    100
FR     0    200
FR     3    300
FR     0    400
FR     0    500")

Note that there are two trailing zeros for country group FR.
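
For readers without data.table, the same run-length idea can be sketched in base R. The helper name drop_trailing_zero_rows is hypothetical, and the data frame is a plain data.frame copy of the df2 values listed above:

```r
df2 <- data.frame(
  Country = rep(c("NL", "DE", "GB", "FR"), times = c(4, 3, 2, 5)),
  Value   = c(1, 2, 3, 0,  3, 0, 1,  2, 0,  1, 0, 3, 0, 0),
  Value2  = c(100, 200, 400, 500, 200, 200, 100, 800, 600,
              100, 200, 300, 400, 500)
)

# Hypothetical helper: drop the final run of rows whose Value is 0
drop_trailing_zero_rows <- function(d) {
  r <- rle(d$Value)
  n <- length(r$values)
  if (n > 0 && r$values[n] == 0) head(d, -r$lengths[n]) else d
}

# Split by country in order of first appearance, trim each group, recombine
groups <- split(df2, factor(df2$Country, levels = unique(df2$Country)))
res <- do.call(rbind, lapply(groups, drop_trailing_zero_rows))
rownames(res) <- NULL
print(res)
```

Zeros in the middle of a group (such as the second DE row) are kept, because only the last run reported by rle() is inspected.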

Remove trailing .0 from strings of entire DataFrame

Let's try DataFrame.replace:

import pandas as pd

df = pd.DataFrame({
    'a': ['20', '34.0'],
    'b': ['39.0', '.016.0'],
    'c': ['17-50', '001-6784532']
})

df = df.replace(r'\.0$', '', regex=True)

print(df)

Optionally, call DataFrame.astype(str) first if the columns are not already strings:

df = df.astype(str).replace(r'\.0$', '', regex=True)

Before:

      a       b            c
0    20    39.0        17-50
1  34.0  .016.0  001-6784532

After:

    a     b            c
0  20    39        17-50
1  34  .016  001-6784532

rtrim/rstrip will not work here because they do not parse a regex; they take a set of characters and strip every trailing occurrence of any of them. With '.0' as the argument, every trailing 0 and . is removed, so '20' would become '2'.
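
Since the rest of this page is about R, a roughly equivalent base-R sketch follows. It assumes a data frame of character columns with the same values as the pandas example:

```r
df <- data.frame(
  a = c("20", "34.0"),
  b = c("39.0", ".016.0"),
  c = c("17-50", "001-6784532"),
  stringsAsFactors = FALSE
)

# Strip one trailing ".0" from every column; df[] keeps the data.frame shape
df[] <- lapply(df, function(col) sub("\\.0$", "", as.character(col)))
print(df)
```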
