Duplicate 'row.names' Are Not Allowed Error

To avoid the error, tell read.table not to take the row names from the first column by passing row.names=NULL:

systems <- read.table("http://getfile.pl?test.csv", 
header=TRUE, sep=",", row.names=NULL)

and now your rows will simply be numbered.

Also look at read.csv, a wrapper for read.table that already sets the sep=',' and header=TRUE arguments, so your call simplifies to

systems <- read.csv("http://getfile.pl?test.csv", row.names=NULL)
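Here is a small self-contained sketch of that fix. The inline CSV text is made up for illustration (it is not the asker's file): the header names two columns, but each data row ends with a trailing comma and therefore has three fields, which is one common way to end up with this error.

```r
# Hypothetical data: two header fields, three fields per data row
# because of the trailing comma.
csv_text <- "name,value\nalpha,1,\nalpha,2,\n"

# Default call: the first column is taken as the row names, and the
# repeated "alpha" raises the error. The message is captured as a string.
err <- tryCatch(
  read.csv(text = csv_text),
  error = function(e) conditionMessage(e)
)
print(err)

# With row.names = NULL the rows are simply numbered.
fixed <- read.csv(text = csv_text, row.names = NULL)
print(fixed)
```

When the header really is one field short, the column names can end up shifted after this fix, so it is worth checking names(fixed) before using the data.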

R duplicate 'row.names' are not allowed

According to the R documentation for read.table,

If there is a header and the first row contains one fewer field 
than the number of columns, the first column in the input is used
for the row names. Otherwise if row.names is missing, the rows are numbered.

Therefore, I'd suggest that the first row may have one fewer field than the number of columns, so read.table() is using the first column (which contains more than one copy of molecular_function) as the row names.
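One way to check this suspicion is to count the fields on each line with count.fields(). The sketch below uses made-up inline text (the column names and GO ids are assumptions, not the asker's data); for a real file you would pass the file name instead of a text connection.

```r
# Hypothetical input: the header has 2 fields, each data row has 3.
txt <- "term,count\nmolecular_function,GO:0001,3\nmolecular_function,GO:0002,5\n"

# Count the comma-separated fields on every line.
con <- textConnection(txt)
n_fields <- count.fields(con, sep = ",")
close(con)
print(n_fields)

# TRUE when the header is exactly one field short of the widest line,
# which is the case where read.table() uses column 1 as row names.
header_is_short <- n_fields[1] == max(n_fields) - 1
print(header_is_short)
```

If header_is_short is TRUE, either fix the header line in the file or pass row.names=NULL when reading it.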

Desperate to solve duplicate 'row.names' are not allowed in R plm package: There are no duplicates

This looks like a bug in the plm function. Your du column in gust has named values; that is causing plm to crash.

You can work around the bug by removing those names:

gust$du <- unname(gust$du)

After I do that, I get successful results:

> summary(plm(du ~ g, data = gust, index = c("ETreg", "year"), model = "pooling"))
Pooling Model

Call:
plm(formula = du ~ g, data = gust, model = "pooling", index = c("ETreg",
"year"))

Balanced Panel: n = 3, T = 11, N = 33

Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-2.53175 -1.02819 0.27557 0.77953 3.84676

Coefficients:
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 0.851158 0.263640 3.2285 0.002939 **
g -0.365347 0.053228 -6.8639 1.079e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares: 157.46
Residual Sum of Squares: 62.488
R-Squared: 0.60314
Adj. R-Squared: 0.59034
F-statistic: 47.1127 on 1 and 31 DF, p-value: 1.0794e-07

duplicate 'row.names' are not allowed -- still killing me

I believe the problem is caused by the fact that the row names are numbers, e.g., 199990901000, which are larger than the greatest integer value, .Machine$integer.max (2147483647). Although the row names of a data.frame are of type character, this might cause a problem in later processing steps.

Therefore, I suggest treating the first column as a regular data column and not as row.names.
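To illustrate the size issue in base R only, here is a minimal sketch. The two-row CSV text is invented for the example and merely mimics the S007 ids; the colClasses call is one possible way to keep such ids as text, not necessarily what the asker needs.

```r
# An id of the same shape as those in the S007 column.
big_id <- "199905600001"

# The largest value an R integer can hold.
print(.Machine$integer.max)

# The id is larger than that, so coercing it to integer gives NA.
print(as.numeric(big_id) > .Machine$integer.max)
print(suppressWarnings(as.integer(big_id)))

# Hypothetical two-row file: read the id column as character and keep
# it as an ordinary data column rather than as row names.
csv_text <- "S007,year\n199905600001,1999\n199905600002,1999\n"
d <- read.csv(text = csv_text, colClasses = c(S007 = "character"))
str(d)
```

Keeping the ids as character avoids any numeric conversion, and the rows get the default numbering.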

The code below worked for me to read the file and to coerce many columns to factor:

library(data.table)
url <- sprintf("https://docs.google.com/uc?id=%s&export=download",
"1NwcvwwaPLWaSmKOuQiVrWAK4iKn9f10S")
d5_17cou <- fread(url, dec = ",", colClasses = list(character = 1L))
cols <- names(d5_17cou)[8:37]
d5_17cou[, (cols) := lapply(.SD, as.factor), .SDcols = cols]
str(d5_17cou)
Classes ‘data.table’ and 'data.frame':    22431 obs. of  39 variables:
$ S007 : chr "199905600001" "199905600002" "199905600003" "199905600004" ...
$ S003A : int 56 56 56 56 56 56 56 56 56 56 ...
$ cou.year : int 561999 561999 561999 561999 561999 561999 561999 561999 561999 561999 ...
$ year : int 1999 1999 1999 1999 1999 1999 1999 1999 1999 1999 ...
$ s017ay : num 0.692 1.051 1.051 0.752 0.752 ...
$ uitem : int 1 2 3 4 5 6 7 8 9 10 ...
$ item : int 1 2 3 4 5 6 7 8 9 10 ...
$ a025r : Factor w/ 2 levels "1","3": 2 2 2 2 1 1 1 2 1 2 ...
$ a034r : Factor w/ 2 levels "1","3": 1 2 1 1 1 2 1 2 1 1 ...
$ a038r : Factor w/ 2 levels "1","3": 1 2 2 2 2 1 2 1 2 1 ...
$ a040r : Factor w/ 2 levels "1","3": 1 1 1 1 1 1 1 1 1 1 ...
$ a041r : Factor w/ 2 levels "1","3": 1 1 1 1 1 1 1 1 1 1 ...
$ a042r : Factor w/ 2 levels "1","3": 2 1 2 2 1 1 1 1 1 1 ...
$ c001r : Factor w/ 2 levels "1","3": 1 1 2 1 1 1 1 1 1 1 ...
$ c024r : Factor w/ 2 levels "1","3": 2 2 2 2 1 2 2 1 2 2 ...
$ c037r : Factor w/ 2 levels "1","3": 1 1 1 1 2 2 1 2 2 1 ...
$ charity : Factor w/ 2 levels "1","3": 1 1 2 1 1 1 1 2 2 2 ...
$ clz.outgr4: Factor w/ 2 levels "1","3": 2 1 1 1 1 1 1 2 1 1 ...
$ d019r : Factor w/ 2 levels "1","3": 2 2 2 2 2 2 2 2 2 2 ...
$ d023r : Factor w/ 2 levels "1","3": 2 2 1 1 2 2 2 2 2 2 ...
$ e014r : Factor w/ 2 levels "1","3": 1 2 1 1 2 2 2 2 2 2 ...
$ e018r : Factor w/ 2 levels "1","3": 2 2 2 2 1 1 1 2 2 1 ...
$ e035r : Factor w/ 2 levels "1","3": 2 2 1 2 2 1 1 2 2 2 ...
$ e114r : Factor w/ 2 levels "1","3": 2 1 1 1 1 1 1 1 1 1 ...
$ e143r : Factor w/ 2 levels "1","3": 2 1 1 2 2 1 1 1 1 2 ...
$ e146r : Factor w/ 2 levels "1","3": 1 1 2 1 1 2 1 1 2 1 ...
$ e190rr : Factor w/ 2 levels "1","3": 2 1 2 2 2 1 2 2 2 2 ...
$ f022r : Factor w/ 2 levels "1","3": 1 1 1 2 1 1 1 1 1 1 ...
$ f028r : Factor w/ 2 levels "1","3": 1 2 2 2 2 1 2 1 1 1 ...
$ f051r : Factor w/ 2 levels "1","3": 1 2 2 2 2 1 1 2 1 1 ...
$ f064r : Factor w/ 2 levels "1","3": 1 2 2 2 1 1 2 2 2 1 ...
$ f066r : Factor w/ 2 levels "1","3": 1 2 2 2 2 1 2 2 2 2 ...
$ f121r : Factor w/ 2 levels "1","3": 1 2 2 1 1 2 2 1 2 2 ...
$ helpef : Factor w/ 2 levels "1","3": 1 2 2 2 2 2 2 2 2 2 ...
$ jpay : Factor w/ 2 levels "1","3": 1 1 1 1 1 1 1 1 1 1 ...
$ prices1 : Factor w/ 2 levels "1","3": 1 1 1 1 1 1 1 1 1 1 ...
$ psub.all : Factor w/ 2 levels "1","3": 2 1 1 1 1 2 1 2 2 1 ...
$ oriend : int 1 1 1 1 1 1 1 1 1 1 ...
$ dupl : int 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, ".internal.selfref")=<externalptr>

Note that the first column, S007, is explicitly read in as a character column (otherwise fread() would read it as integer64) and is now part of the dataset. Consequently, the numbering of all subsequent columns shifts by one.

BTW, fread() is much faster than read.table().

Error in `.rowNamesDF<-`(x, value = value) : duplicate 'row.names' are not allowed. In addition: Warning message: non-unique values

We don't need a for loop here. Just index the data.frame to subset the columns, unlist them, and construct the data.frame directly:

out <-  data.frame(country = unlist(total_authority[c(1,3)]), 
score = unlist(total_authority[c(2,4)]),
year = rep(names(total_authority)[c(2,4)], each = nrow(total_authority)))
row.names(out) <- NULL

-output

> out
country score year
1 Albania 0.00000000000000003122502 1994
2 Algeria 0.00000000000000003122502 1994
3 American Somoa 0.00000000000000003122502 1994
4 Angola 0.00000000000000003122502 1994
5 Anguilla 0.00000000000000003122502 1994
6 Antigua 0.00000000000000003122502 1994
7 Argentina 0.00289122132708816018468 1994
8 Armenia 0.00000000000000003122502 1994
9 Aruba 0.00000528966979389429013 1994
10 Australia 0.00622391681538347982944 1994
11 Albania 0.00000320558770721281009 1995
12 Algeria 0.00000000000000002775558 1995
13 American Somoa 0.00000000000000002775558 1995
14 Angola 0.00000000000000002775558 1995
15 Anguilla 0.00000000000000002775558 1995
16 Antigua 0.00000000000000002775558 1995
17 Argentina 0.02245380108584869860433 1995
18 Armenia 0.00000000000000002775558 1995
19 Aruba 0.00000000000000002775558 1995
20 Australia 0.40763348337921900821357 1995

Regarding the duplicate row.names error: it occurs because authority was created with [, which returns a data.frame with a single column. Instead, we need a plain vector, which [[ returns by extracting the column:

final_output<-data.frame()
for (count in 1:2) {
df <- data.frame(country=actors)
df$year=rep(names(total_authority)[2*count],nrow(df))
df$authority<-total_authority[[2*count]]
final_output <- rbind(final_output, df)
}

-output

> final_output
country year authority
1 Albania 1994 0.00000000000000003122502
2 Algeria 1994 0.00000000000000003122502
3 American Somoa 1994 0.00000000000000003122502
4 Angola 1994 0.00000000000000003122502
5 Anguilla 1994 0.00000000000000003122502
6 Antigua 1994 0.00000000000000003122502
7 Argentina 1994 0.00289122132708816018468
8 Armenia 1994 0.00000000000000003122502
9 Aruba 1994 0.00000528966979389429013
10 Australia 1994 0.00622391681538347982944
11 Albania 1995 0.00000320558770721281009
12 Algeria 1995 0.00000000000000002775558
13 American Somoa 1995 0.00000000000000002775558
14 Angola 1995 0.00000000000000002775558
15 Anguilla 1995 0.00000000000000002775558
16 Antigua 1995 0.00000000000000002775558
17 Argentina 1995 0.02245380108584869860433
18 Armenia 1995 0.00000000000000002775558
19 Aruba 1995 0.00000000000000002775558
20 Australia 1995 0.40763348337921900821357
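The difference between [ and [[ can be seen on a tiny example. The data below is invented (two countries, one year column, arbitrary scores) and only stands in for total_authority.

```r
# Hypothetical stand-in for total_authority.
total_authority <- data.frame(
  actors = c("Albania", "Algeria"),
  `1994` = c(0.1, 0.2),
  check.names = FALSE
)

# Single bracket: still a data.frame, with its own row names.
as_frame <- total_authority[2]
print(class(as_frame))

# Double bracket: the column itself, a plain numeric vector.
as_vector <- total_authority[[2]]
print(class(as_vector))

# Assigning the single-bracket result nests a data.frame inside df,
# which is the shape that later caused trouble in the loop.
df <- data.frame(country = total_authority$actors)
df$authority <- total_authority[2]
print(is.data.frame(df$authority))
```

With [[ the new column is an ordinary vector, so rbind() has nothing nested to reconcile.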

