Retain Numerical Precision in an R Data Frame

Retain numerical precision in an R data frame?

If you really want to set up R to print its results with utterly unreasonable precision, then use: options(digits=16).

Note that this does nothing for the accuracy of functions using these results. It merely changes how values appear when they are printed to the console. There is no rounding of the values as they are being stored or accessed unless you put in more significant digits than the mantissa can handle. The 'digits' option has no effect on the maximal precision of floating point numbers.
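A minimal sketch of that point, using only base R. The variable names are just for illustration; format() is used here because, when called without a digits argument, it follows the same 'digits' option as printing does.

```r
x <- 1 / 3

# Temporarily lower the display precision; options() returns the old settings
old <- options(digits = 4)
shown_short <- format(x)   # formatted with 4 significant digits

# Raise the display precision
options(digits = 16)
shown_long <- format(x)    # formatted with 16 significant digits

# Restore the previous settings
options(old)

# The stored value never changed, only its textual representation did
unchanged <- identical(x, 1 / 3)
print(c(shown_short, shown_long))
print(unchanged)
```

Saving the return value of options() and restoring it afterwards keeps the change from leaking into the rest of the session.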

Limiting the number of decimals in a dataframe (R)

Here is.num is TRUE for numeric columns and FALSE otherwise. We then apply round to the numeric columns:

is.num <- sapply(DF, is.numeric)
DF[is.num] <- lapply(DF[is.num], round, 8)

If what you meant was not that you need to change the data frame but just that you want to display the data frame to 8 digits then it's just:

print(DF, digits = 8)

In dplyr 1.0.0 and later one can use across within mutate like this:

library(dplyr)
DF %>% mutate(across(where(is.numeric), ~ round(., 8)))
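The difference between rounding the stored values and rounding only the display can be sketched as follows. The data frame and its column names are made up for illustration, and the example sticks to base R.

```r
# Hypothetical example data: one character column, one numeric column
DF <- data.frame(id = c("a", "b"),
                 x  = c(1.123456789012, 2.987654321098),
                 stringsAsFactors = FALSE)

# Rounding changes the stored values (done on a copy here)
rounded <- DF
is.num <- sapply(rounded, is.numeric)
rounded[is.num] <- lapply(rounded[is.num], round, 3)

# Printing with fewer digits only affects what is shown
print(DF, digits = 4)

# The original column still holds the full value
original_kept <- DF$x[1] == 1.123456789012
print(original_kept)
```

If later calculations need the full values, limiting the digits at print time is the safer choice, since rounding in place discards information.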

Reduced Precision Numeric Data

I started writing a package, pack, to help with a problem like this. I was using it to support another package that was an API to a now-defunct service.

If you just want a 1-byte integer (<256) you can use as.raw and send the result; then use as.integer on the machine receiving the data.

> as.raw(255)
[1] ff
> as.integer(as.raw(255))
[1] 255

For a 2-byte integer, you can use pack and send the result; then use unpack on the machine receiving the data.

> library(pack)
> pack("v", 255)
[1] ff 00
> pack("v", 256)
[1] 00 01
> unpack("v", as.raw(255))
[[1]]
[1] 255
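If installing an extra package is not an option, a similar 2-byte encoding can be sketched with base R's writeBin and readBin. This is an alternative to the pack package, not a description of how it works internally; the little-endian byte order is chosen here to match the "v" examples above.

```r
# Encode an integer as 2 little-endian bytes (returns a raw vector)
bytes_255 <- writeBin(255L, raw(), size = 2, endian = "little")
bytes_256 <- writeBin(256L, raw(), size = 2, endian = "little")
print(bytes_255)
print(bytes_256)

# Decode on the receiving side; signed = FALSE allows values up to 65535
value_255 <- readBin(bytes_255, what = "integer", size = 2,
                     signed = FALSE, endian = "little")
value_256 <- readBin(bytes_256, what = "integer", size = 2,
                     signed = FALSE, endian = "little")
print(c(value_255, value_256))
```

Both sides need to agree on the size and byte order, so it is worth stating them explicitly rather than relying on the platform default.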

I've never used it, but I've heard good things about RProtoBuf.

R: Possible loss of precision when saving data frame to CSV?

If you want to increase the precision of the values written by write.csv, you can format them with sprintf first. "%.20f" writes 20 digits after the decimal point; for values of roughly this magnitude that is more than the 15 to 17 significant digits a double can hold, so the values read back compare equal to the originals.

set.seed(1)
df <- data.frame(ID = 1:10, X = rnorm(10))

write.csv(data.frame(df$ID, newX = sprintf("%.20f", df$X)), "test.csv",
          row.names = FALSE)

x <- read.csv("test.csv")
x == df

#df.ID newX
[1,] TRUE TRUE
[2,] TRUE TRUE
[3,] TRUE TRUE
[4,] TRUE TRUE
[5,] TRUE TRUE
[6,] TRUE TRUE
[7,] TRUE TRUE
[8,] TRUE TRUE
[9,] TRUE TRUE
[10,] TRUE TRUE
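For comparison, here is a hedged sketch of what a default write.csv round trip does and of one way to sidestep decimal conversion entirely. It assumes the same df as above; the temporary file names are only for illustration, and saveRDS is swapped in as a binary alternative rather than a CSV fix.

```r
set.seed(1)
df <- data.frame(ID = 1:10, X = rnorm(10))

# Default CSV round trip: numbers are written with up to 15 significant digits
csv_file <- tempfile(fileext = ".csv")
write.csv(df, csv_file, row.names = FALSE)
from_csv <- read.csv(csv_file)
max_abs_diff <- max(abs(from_csv$X - df$X))  # zero or a tiny rounding difference
print(max_abs_diff)

# Binary round trip: no conversion to decimal text, so values come back exactly
rds_file <- tempfile(fileext = ".rds")
saveRDS(df, rds_file)
from_rds <- readRDS(rds_file)
exact <- identical(from_rds$X, df$X)
print(exact)
```

When the file only needs to be read back by R, the binary format avoids the question of how many digits to write.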

How to control over data precision when saving dataframe to dbf in R

If you want more insight on why this is happening, check the first circle of the R Inferno (a quick and easy read). Basically R is storing your floating points with a numerical error, but when showing you head(coverage_adj$VAR3) it is hiding that.

In your case the easiest thing to do is use round(..., digits=2) on your data before saving again, e.g.:

write.dbf(round(coverage_adj, digits=2), "coverage_adj.dbf")

That should be all you need.
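The hidden numerical error can be seen with a small base R sketch. The value 0.1 + 0.2 is a standard illustration and is not taken from the coverage_adj data.

```r
x <- 0.1 + 0.2

# Default printing uses 7 significant digits and hides the error
print(x)

# Asking for 17 significant digits reveals the stored value
print(format(x, digits = 17))

# Exact comparison fails, tolerance-based comparison succeeds
exactly_equal <- x == 0.3
nearly_equal  <- isTRUE(all.equal(x, 0.3))
print(c(exactly_equal, nearly_equal))
```

Rounding before saving, as suggested above, writes the value you see rather than the slightly different value that is stored.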

Decimal places deleted in R

Try running this option first:

options(digits = 15)

How to display numeric columns in an R dataframe without scientific notation ('e+07')

As Joshua said, it is a printing issue, not a storage issue. You can change the way all numbers are printed by adjusting the scipen option (see getOption("scipen")).

x <- c(1, 2, 509703045845, 0.0001)
print(x)
options(scipen = 50)
print(x)

Alternatively, you may wish to change the way just those numbers are formatted. (This converts them to character.) It is worth getting to know format and formatC. To get you started, compare

format(x)
format(x, digits = 10)
format(x, digits = 3)
format(x, digits = 3, scientific = 5)
format(x, trim = TRUE, digits = 3, scientific = 5)
formatC(x)
formatC(x, format = "fg")
formatC(x, format = "fg", flag = "+")
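If changing a global option is undesirable, one possible approach is to turn off scientific notation for a single call, or to restore the option afterwards. A minimal sketch, assuming the same x as above:

```r
x <- c(1, 2, 509703045845, 0.0001)

# Per-call: format() with scientific = FALSE returns character strings
fixed <- format(x, scientific = FALSE)
print(fixed)

# Temporary global change: keep the old settings and restore them afterwards
old <- options(scipen = 50)
print(x)
options(old)

# None of the formatted strings use exponent notation
no_exponent <- !any(grepl("e", fixed, fixed = TRUE))
print(no_exponent)
```

The per-call form is usually easier to reason about in scripts, because it does not affect how anything else in the session prints.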

Formatting Decimal places in R

Background: Some answers suggested on this page (e.g., signif, options(digits=...)) do not guarantee that a certain number of decimals are displayed for an arbitrary number. I presume this is a design feature in R whereby good scientific practice involves showing a certain number of digits based on principles of "significant figures". However, in many domains (e.g., APA style, business reports) formatting requirements dictate that a certain number of decimal places are displayed. This is often done for consistency and standardisation purposes rather than being concerned with significant figures.

Solution:

The following code shows exactly two decimal places for the number x.

format(round(x, 2), nsmall = 2)

For example:

format(round(1.20, 2), nsmall = 2)
# [1] "1.20"
format(round(1, 2), nsmall = 2)
# [1] "1.00"
format(round(1.1234, 2), nsmall = 2)
# [1] "1.12"

A more general function is as follows where x is the number and k is the number of decimals to show. trimws removes any leading white space which can be useful if you have a vector of numbers.

specify_decimal <- function(x, k) trimws(format(round(x, k), nsmall=k))

E.g.,

specify_decimal(1234, 5)
# [1] "1234.00000"
specify_decimal(0.1234, 5)
# [1] "0.12340"

Discussion of alternatives:

The formatC answers and sprintf answers work fairly well. But they will show negative zeros in some cases which may be unwanted. I.e.,

formatC(c(-0.001), digits = 2, format = "f")
# [1] "-0.00"
sprintf(-0.001, fmt = '%#.2f')
# [1] "-0.00"

One possible workaround to this is as follows:

formatC(as.numeric(as.character(round(-.001, 2))), digits = 2, format = "f")
# [1] "0.00"
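To see the two behaviours side by side on a vector, the specify_decimal helper defined above can be compared with sprintf. This is a small sketch; the input values are chosen only to include a case that rounds to zero from below.

```r
# Same helper as defined above
specify_decimal <- function(x, k) trimws(format(round(x, k), nsmall = k))

vals <- c(1, 10.5, -0.001)

# round() first, then format(): the tiny negative value shows as plain zero
via_format <- specify_decimal(vals, 2)
print(via_format)

# sprintf() formats the unrounded value, so the sign survives
via_sprintf <- sprintf("%.2f", vals)
print(via_sprintf)
```

Both return character vectors, so they are best applied at the reporting stage rather than to data that still needs arithmetic.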

How to control number of decimal digits in write.table() output?

You can use the function format() as in:

write.table(format(ttf.all, digits=2), 'clipboard', sep='\t', row.names=FALSE)

format() is a generic function that has methods for many classes, including data.frames. Unlike round(), it won't throw an error if your dataframe is not all numeric. For more details on the formatting options, see the help file via ?format
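A small self-contained sketch of the same idea, writing to a temporary file instead of the clipboard so it runs on any platform. The data frame ttf and its columns are hypothetical stand-ins for ttf.all. Note that digits in format() counts significant digits, not decimal places.

```r
# Hypothetical mixed-type data frame
ttf <- data.frame(name  = c("a", "b"),
                  value = c(3.14159, 2.71828),
                  stringsAsFactors = FALSE)

# format() converts every column to character, so non-numeric columns are fine
formatted <- format(ttf, digits = 2)

out_file <- tempfile(fileext = ".txt")
write.table(formatted, out_file, sep = "\t",
            row.names = FALSE, quote = FALSE)

written <- readLines(out_file)
print(written)
```

Passing quote = FALSE keeps the already-formatted strings from being wrapped in quotation marks in the output file.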


