duplicate 'row.names' are not allowed error
Then tell read.table not to use row.names:
systems <- read.table("http://getfile.pl?test.csv",
header=TRUE, sep=",", row.names=NULL)
and now your rows will simply be numbered.
Also look at read.csv, which is a wrapper for read.table that already sets the sep=',' and header=TRUE arguments, so your call simplifies to:
systems <- read.csv("http://getfile.pl?test.csv", row.names=NULL)
R duplicate 'row.names' are not allowed
According to the R documentation here,
If there is a header and the first row contains one fewer field
than the number of columns, the first column in the input is used
for the row names. Otherwise if row.names is missing, the rows are numbered.
... therefore I'd suggest that the first row may have one fewer field than the number of columns, so read.table() is selecting the first column (which contains more than one copy of molecular_function) as the row names.
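The rule quoted from the documentation can be tried out on a small inline example. This is only a sketch: the CSV text below is made up (a header with two fields, data rows with three, and a repeated value in the first column), not the file from the question.

```r
# Hypothetical data: the header has 2 fields, each data row has 3,
# so the first column is taken as row names, and "x" occurs twice.
csv_text <- "a,b\nx,1,2\nx,3,4\n"

# Default behaviour: capture the error message instead of stopping.
msg <- tryCatch(
  {
    read.csv(text = csv_text)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With row.names = NULL the first column is kept as data
# and the rows are simply numbered.
fixed <- read.csv(text = csv_text, row.names = NULL)
print(fixed)
```

Note that in a case like this the header names may end up shifted by one column after the fix, so it is worth checking names(fixed) before going further.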
Desperate to solve duplicate 'row.names' are not allowed in R plm package: There are no duplicates
This looks like a bug in the plm function. Your du column in gust has named values; that is causing plm to crash.
You can work around the bug by removing those names:
gust$du <- unname(gust$du)
After I do that, I get successful results:
> summary(plm(du ~ g, data = gust, index = c("ETreg", "year"), model = "pooling"))
Pooling Model
Call:
plm(formula = du ~ g, data = gust, model = "pooling", index = c("ETreg",
"year"))
Balanced Panel: n = 3, T = 11, N = 33
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-2.53175 -1.02819 0.27557 0.77953 3.84676
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 0.851158 0.263640 3.2285 0.002939 **
g -0.365347 0.053228 -6.8639 1.079e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Total Sum of Squares: 157.46
Residual Sum of Squares: 62.488
R-Squared: 0.60314
Adj. R-Squared: 0.59034
F-statistic: 47.1127 on 1 and 31 DF, p-value: 1.0794e-07
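To see what "named values" in a column means, here is a minimal sketch on a made-up data frame (gust_demo and its values are hypothetical stand-ins, not the data from the question). It only shows how names can end up on a column and how unname() drops them; it does not call plm.

```r
# Hypothetical stand-in for the gust data.
gust_demo <- data.frame(g = c(0.5, 1.0, 1.5))

# Assigning a named vector with $<- can leave the names attached to the column.
gust_demo$du <- c(a = 1.2, b = 0.7, c = 0.3)
print(names(gust_demo$du))   # inspect whether the column carries names

# The workaround from the answer: strip the names, keep the values.
gust_demo$du <- unname(gust_demo$du)
print(names(gust_demo$du))   # NULL once the names are removed
```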
duplicate 'row.names' are not allowed -- still killing me
I believe the problem is caused by the fact that the row names are numbers, e.g., 199990901000, which are larger than the greatest integer value, .Machine$integer.max, which is 2147483647. While row names of a data.frame are of type character, this might cause a problem in later processing steps.
Therefore, I suggest treating the first column as a regular data column and not as row.names.
The code below worked for me to read the file and to coerce many columns to factor:
library(data.table)
url <- sprintf("https://docs.google.com/uc?id=%s&export=download",
"1NwcvwwaPLWaSmKOuQiVrWAK4iKn9f10S")
d5_17cou <- fread(url, dec = ",", colClasses = list(character = 1L))
cols <- names(d5_17cou)[8:37]
d5_17cou[, (cols) := lapply(.SD, as.factor), .SDcols = cols]
str(d5_17cou)
Classes ‘data.table’ and 'data.frame': 22431 obs. of 39 variables:
$ S007 : chr "199905600001" "199905600002" "199905600003" "199905600004" ...
$ S003A : int 56 56 56 56 56 56 56 56 56 56 ...
$ cou.year : int 561999 561999 561999 561999 561999 561999 561999 561999 561999 561999 ...
$ year : int 1999 1999 1999 1999 1999 1999 1999 1999 1999 1999 ...
$ s017ay : num 0.692 1.051 1.051 0.752 0.752 ...
$ uitem : int 1 2 3 4 5 6 7 8 9 10 ...
$ item : int 1 2 3 4 5 6 7 8 9 10 ...
$ a025r : Factor w/ 2 levels "1","3": 2 2 2 2 1 1 1 2 1 2 ...
$ a034r : Factor w/ 2 levels "1","3": 1 2 1 1 1 2 1 2 1 1 ...
$ a038r : Factor w/ 2 levels "1","3": 1 2 2 2 2 1 2 1 2 1 ...
$ a040r : Factor w/ 2 levels "1","3": 1 1 1 1 1 1 1 1 1 1 ...
$ a041r : Factor w/ 2 levels "1","3": 1 1 1 1 1 1 1 1 1 1 ...
$ a042r : Factor w/ 2 levels "1","3": 2 1 2 2 1 1 1 1 1 1 ...
$ c001r : Factor w/ 2 levels "1","3": 1 1 2 1 1 1 1 1 1 1 ...
$ c024r : Factor w/ 2 levels "1","3": 2 2 2 2 1 2 2 1 2 2 ...
$ c037r : Factor w/ 2 levels "1","3": 1 1 1 1 2 2 1 2 2 1 ...
$ charity : Factor w/ 2 levels "1","3": 1 1 2 1 1 1 1 2 2 2 ...
$ clz.outgr4: Factor w/ 2 levels "1","3": 2 1 1 1 1 1 1 2 1 1 ...
$ d019r : Factor w/ 2 levels "1","3": 2 2 2 2 2 2 2 2 2 2 ...
$ d023r : Factor w/ 2 levels "1","3": 2 2 1 1 2 2 2 2 2 2 ...
$ e014r : Factor w/ 2 levels "1","3": 1 2 1 1 2 2 2 2 2 2 ...
$ e018r : Factor w/ 2 levels "1","3": 2 2 2 2 1 1 1 2 2 1 ...
$ e035r : Factor w/ 2 levels "1","3": 2 2 1 2 2 1 1 2 2 2 ...
$ e114r : Factor w/ 2 levels "1","3": 2 1 1 1 1 1 1 1 1 1 ...
$ e143r : Factor w/ 2 levels "1","3": 2 1 1 2 2 1 1 1 1 2 ...
$ e146r : Factor w/ 2 levels "1","3": 1 1 2 1 1 2 1 1 2 1 ...
$ e190rr : Factor w/ 2 levels "1","3": 2 1 2 2 2 1 2 2 2 2 ...
$ f022r : Factor w/ 2 levels "1","3": 1 1 1 2 1 1 1 1 1 1 ...
$ f028r : Factor w/ 2 levels "1","3": 1 2 2 2 2 1 2 1 1 1 ...
$ f051r : Factor w/ 2 levels "1","3": 1 2 2 2 2 1 1 2 1 1 ...
$ f064r : Factor w/ 2 levels "1","3": 1 2 2 2 1 1 2 2 2 1 ...
$ f066r : Factor w/ 2 levels "1","3": 1 2 2 2 2 1 2 2 2 2 ...
$ f121r : Factor w/ 2 levels "1","3": 1 2 2 1 1 2 2 1 2 2 ...
$ helpef : Factor w/ 2 levels "1","3": 1 2 2 2 2 2 2 2 2 2 ...
$ jpay : Factor w/ 2 levels "1","3": 1 1 1 1 1 1 1 1 1 1 ...
$ prices1 : Factor w/ 2 levels "1","3": 1 1 1 1 1 1 1 1 1 1 ...
$ psub.all : Factor w/ 2 levels "1","3": 2 1 1 1 1 2 1 2 2 1 ...
$ oriend : int 1 1 1 1 1 1 1 1 1 1 ...
$ dupl : int 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, ".internal.selfref")=<externalptr>
Note that the first column, S007, is explicitly read in as a character column (otherwise fread() uses integer64) and is now part of the dataset. Consequently, the numbering of all subsequent columns is shifted by one.
BTW, fread() is much faster than read.table().
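The integer-range point made earlier in this answer can be checked directly in base R. This is a small sketch using the example ID from the text; it is not a diagnosis of the original file.

```r
# Example ID from the answer, kept as a character string.
id <- "199990901000"

print(.Machine$integer.max)                    # largest value an R integer can hold
print(as.numeric(id) > .Machine$integer.max)   # the ID exceeds that range

# Coercing to integer fails (NA with a warning), which is why the
# column is safer to read as character.
as_int <- suppressWarnings(as.integer(id))
print(is.na(as_int))
```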
Error in `.rowNamesDF<-`(x, value = value) : 'row.names' duplicate are not allowed. In addition: Warning message: non-unique values
We don't need a for loop here. Just index the data.frame to subset the columns, unlist them, and construct the data.frame directly:
out <- data.frame(country = unlist(total_authority[c(1,3)]),
score = unlist(total_authority[c(2,4)]),
year = rep(names(total_authority)[c(2,4)], each = nrow(total_authority)))
row.names(out) <- NULL
-output
> out
country score year
1 Albania 0.00000000000000003122502 1994
2 Algeria 0.00000000000000003122502 1994
3 American Somoa 0.00000000000000003122502 1994
4 Angola 0.00000000000000003122502 1994
5 Anguilla 0.00000000000000003122502 1994
6 Antigua 0.00000000000000003122502 1994
7 Argentina 0.00289122132708816018468 1994
8 Armenia 0.00000000000000003122502 1994
9 Aruba 0.00000528966979389429013 1994
10 Australia 0.00622391681538347982944 1994
11 Albania 0.00000320558770721281009 1995
12 Algeria 0.00000000000000002775558 1995
13 American Somoa 0.00000000000000002775558 1995
14 Angola 0.00000000000000002775558 1995
15 Anguilla 0.00000000000000002775558 1995
16 Antigua 0.00000000000000002775558 1995
17 Argentina 0.02245380108584869860433 1995
18 Armenia 0.00000000000000002775558 1995
19 Aruba 0.00000000000000002775558 1995
20 Australia 0.40763348337921900821357 1995
Regarding the error with duplicate row.names: it occurs because the authority object created is a data.frame with a single column (selected with [). Instead, we need a vector, obtained by extracting the column with [[:
final_output<-data.frame()
for (count in 1:2) {
df <- data.frame(country=actors)
df$year=rep(names(total_authority)[2*count],nrow(df))
df$authority<-total_authority[[2*count]]
final_output <- rbind(final_output, df)
}
-output
> final_output
country year authority
1 Albania 1994 0.00000000000000003122502
2 Algeria 1994 0.00000000000000003122502
3 American Somoa 1994 0.00000000000000003122502
4 Angola 1994 0.00000000000000003122502
5 Anguilla 1994 0.00000000000000003122502
6 Antigua 1994 0.00000000000000003122502
7 Argentina 1994 0.00289122132708816018468
8 Armenia 1994 0.00000000000000003122502
9 Aruba 1994 0.00000528966979389429013
10 Australia 1994 0.00622391681538347982944
11 Albania 1995 0.00000320558770721281009
12 Algeria 1995 0.00000000000000002775558
13 American Somoa 1995 0.00000000000000002775558
14 Angola 1995 0.00000000000000002775558
15 Anguilla 1995 0.00000000000000002775558
16 Antigua 1995 0.00000000000000002775558
17 Argentina 1995 0.02245380108584869860433
18 Armenia 1995 0.00000000000000002775558
19 Aruba 1995 0.00000000000000002775558
20 Australia 1995 0.40763348337921900821357
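The difference between [ and [[ mentioned above can be seen on a tiny made-up data frame (demo and its values are hypothetical, chosen only to mirror the shape of total_authority):

```r
# Hypothetical two-row example with a year-named column.
demo <- data.frame(country = c("Albania", "Algeria"),
                   `1994` = c(0.1, 0.2),
                   check.names = FALSE)

one_col <- demo["1994"]    # single bracket: still a data.frame with one column
vec     <- demo[["1994"]]  # double bracket: the underlying numeric vector

print(class(one_col))
print(class(vec))
```

Assigning vec to a new column gives an ordinary column, whereas assigning one_col nests a data.frame inside the data.frame, which is the kind of structure that can cause trouble in a later rbind().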