R Function prcomp Fails with NA Values Even Though NAs Are Allowed

Omit NA and data imputation before doing PCA analysis using R

For na.action to have an effect, you need to explicitly supply a formula argument:

princomp(formula = ~., data = mydf, cor = TRUE, na.action=na.exclude)

# Call:
# princomp(formula = ~., data = mydf, na.action = na.exclude, cor = TRUE)
#
# Standard deviations:
# Comp.1 Comp.2 Comp.3
# 1.3748310 0.8887105 0.5657149

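The formula is needed because it triggers dispatch to princomp.formula, the only princomp method that accepts na.action and does anything useful with it. princomp.default has no na.action argument, as the listings of formal arguments show.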

methods('princomp')
# [1] princomp.default* princomp.formula*

names(formals(stats:::princomp.formula))
# [1] "formula" "data" "subset" "na.action" "..."

names(formals(stats:::princomp.default))
# [1] "x" "cor" "scores" "covmat" "subset" "..."
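
To make the difference between the two methods concrete, here is a minimal sketch. The data frame below is made up for illustration (the original mydf is not shown in the post), and the exact error message from the default method may differ between R versions.

```r
# Hypothetical data: 10 rows, 3 numeric columns, two rows containing an NA
set.seed(1)
mydf <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
mydf[2, "a"] <- NA
mydf[5, "c"] <- NA

# Passing the data frame directly dispatches to princomp.default,
# which has no NA handling, so the call is expected to fail.
res_default <- try(princomp(mydf, cor = TRUE), silent = TRUE)
default_failed <- inherits(res_default, "try-error")

# Supplying a formula dispatches to princomp.formula, which honours na.action.
fit <- princomp(formula = ~ ., data = mydf, cor = TRUE, na.action = na.exclude)

# Rows with NAs are dropped before fitting ...
n_used <- fit$n.obs
# ... but with na.exclude the scores are padded back to the original row count,
# with NA in the rows that were excluded.
n_score_rows <- nrow(fit$scores)
```

With na.omit instead of na.exclude, the fit should be the same but the scores matrix would contain only the complete rows, so na.exclude is the more convenient choice when the scores need to line up with the original data.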

How to get rid of NA's without erasing the values named NA

If your problem is that actual "NA" strings are being read as NA values, note that the read.csv2 function has an na.strings argument whose default value is "NA". Change it to something different, maybe even "". I've also seen "<NA>" used in some cases.
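
As a small illustration of the na.strings advice, the sketch below reads the same text twice. The input is invented for the example (a country-code column in which "NA" is a legitimate value), and text = is used instead of a file only to keep the snippet self-contained.

```r
# Hypothetical semicolon-separated input; the third row has an empty country field
csv_text <- "country;value\nNA;1\nDE;2\n;3"

# Default na.strings = "NA": the literal string "NA" is turned into a missing value
default_read <- read.csv2(text = csv_text, stringsAsFactors = FALSE)

# na.strings = "": only empty fields count as missing, so "NA" survives as text
fixed_read <- read.csv2(text = csv_text, na.strings = "", stringsAsFactors = FALSE)

is.na(default_read$country)  # first entry is missing under the default
fixed_read$country           # first entry is kept as the string "NA"
```

Whether "" or some other marker such as "<NA>" is the right choice depends on how missing values are actually encoded in your file.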


