Creating a Data.Frame Using R.Net

How to create a data.frame from an expression?

Here is one solution:

library(rlang)
library(dplyr)

as_expression_vec <- parse_expr("c(a = \"lion\", b = \"zebra\")")

bind_rows(eval(as_expression_vec))

# A tibble: 1 x 2
  a     b
  <chr> <chr>
1 lion  zebra
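If pulling in rlang and dplyr is not desirable, the same idea can be sketched in base R. This is only an illustration: the names expr_text, vec and df are made up here, and evaluating arbitrary text with eval is only reasonable when the text comes from a trusted source.

```r
# Text holding an R expression that builds a named character vector.
expr_text <- "c(a = \"lion\", b = \"zebra\")"

# parse() returns an expression vector; take its first element and evaluate it.
vec <- eval(parse(text = expr_text)[[1]])

# A named list becomes a one-row data.frame with one column per name.
df <- as.data.frame(as.list(vec), stringsAsFactors = FALSE)
str(df)
```

The result is a plain data.frame rather than a tibble, with columns a and b.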

How can I convert a DataTable in C# to a data.frame in R using R.NET?

This worked in R.NET 1.6.5.

[Screenshot: output from console application]

I did not find a nice way of converting a DataTable to the string[,] that CreateCharacterMatrix expects, so the code below copies the cells in a nested loop. CreateDataFrame is an alternative, but it expects an IEnumerable[] and is column-oriented (as a data.frame in R is).

Going through the Tests in the source code might help further:
https://github.com/jmp75/rdotnet/tree/master/RDotNet.Tests

There is also this: https://github.com/jmp75/rdotnet-onboarding

Which is in some ways more helpful because the official documentation is sparse:
https://jmp75.github.io/rdotnet/getting_started/

using System;
using System.Data;
using RDotNet;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Build a small DataTable with two string columns and two rows.
            DataTable dtb = new DataTable();
            dtb.Columns.Add("Column1", typeof(string));
            dtb.Columns.Add("Column2", typeof(string));
            DataRow dtr1 = dtb.NewRow();
            dtr1[0] = "abc";
            dtr1[1] = "cdf";
            dtb.Rows.Add(dtr1);
            DataRow dtr2 = dtb.NewRow();
            dtr2[0] = "asdasd";
            dtr2[1] = "cdasdasf";
            dtb.Rows.Add(dtr2);

            using (var engine = REngine.GetInstance())
            {
                // Copy every cell into a string[,] because that is what
                // CreateCharacterMatrix expects.
                string[,] stringData = new string[dtb.Rows.Count, dtb.Columns.Count];
                for (int row = 0; row < dtb.Rows.Count; row++)
                {
                    for (int col = 0; col < dtb.Columns.Count; col++)
                    {
                        stringData[row, col] = dtb.Rows[row][col].ToString();
                    }
                }

                // Pass the matrix to R, then convert it to a data.frame there.
                // The DataTable column names are not carried over, so R names
                // the columns V1 and V2.
                CharacterMatrix matrix = engine.CreateCharacterMatrix(stringData);
                engine.SetSymbol("myRDataFrame", matrix);
                engine.Evaluate("myRDataFrame <- as.data.frame(myRDataFrame, stringsAsFactors = FALSE)");
                engine.Evaluate("str(myRDataFrame)");
            }
            Console.ReadKey();
        }
    }
}

How to make a data frame from two vectors in R?

The simplest approach is read.table with whitespace as the delimiter. The catch is that the country name can span several words, yet it must be read as a single column. To handle that, wrap the country name in double quotes with sub, then call read.table with col.names set to v2:

df1 <- read.table(text = sub("^([^0-9]+)\\s", ' "\\1"', v1), 
header = FALSE, col.names = v2, fill = TRUE, check.names = FALSE)

-output

df1
      Country(ordependency) Population(2020) YearlyChange NetChange Density(P/Km²) LandArea(Km²) Migrants(net) Fert.Rate Med.Age UrbanPop%
 1                    China       1439323776         0.39   5540090            153       9388211       -348399       1.7      38        61
 2                    India       1380004385         0.99  13586631            464       2973190       -532687       2.2      28        35
 3            United States        331002651         0.59   1937734             36       9147420        954806       1.8      38        83
 4                Indonesia        273523615         1.07   2898047            151       1811570        -98955       2.3      30        56
 5                 Pakistan        220892340         2.00   4327022            287        770880       -233379       3.6      23        35
 6                   Brazil        212559417         0.72   1509890             25       8358140         21200       1.7      33        88
 7                  Nigeria        206139589         2.58   5175990            226        910770        -60000       5.4      18        52
 8               Bangladesh        164689383         1.01   1643222           1265        130170       -369501       2.1      28        39
 9                   Russia        145934462         0.04     62206              9      16376870        182456       1.8      40        74
10                  Tokelau             1357         1.27        17            136            10          N.A.      N.A.       0         0
11                 Holy See              801         0.25         2           2003             0          N.A.      N.A.    N.A.         0
12            Côte d'Ivoire         26378274         2.57    661730             83        318000         -8000       4.7      19        51
13 Czech Republic (Czechia)         10708981         0.18     19772            139         77240         22011       1.6      43        74
14     United Arab Emirates          9890402         1.23    119873            118         83600         40000       1.4      33        86
15         Papua New Guinea          8947024         1.95    170915             20        452860          -800       3.6      22        13
16   Bosnia and Herzegovina          3280819        -0.61    -20181             64         51000        -21585       1.3      43        52
17  Saint Pierre & Miquelon             5794        -0.48       -28             25           230          N.A.      N.A.     100         0
   WorldShare
 1      18.47
 2      17.70
 3       4.25
 4       3.51
 5       2.83
 6       2.73
 7       2.64
 8       2.11
 9       1.87
10         NA
11         NA
12       0.34
13       0.14
14       0.13
15       0.11
16       0.04
17         NA
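To see the quoting step in isolation, here is a small sketch on two made-up sample strings. The names sample_lines, quoted and small are hypothetical, and the replacement pattern is a slight variation on the one above: it also writes a space after the closing quote so the quoted name is clearly separated from the next field.

```r
# Two short lines: one multi-word country name, one single-word name.
sample_lines <- c("United Arab Emirates 9890402 1.23",
                  "China 1439323776 0.39")

# Everything before the first digit (minus the separating space) is the name;
# wrap it in double quotes so read.table treats it as one field.
quoted <- sub("^([^0-9]+)\\s", '"\\1" ', sample_lines)

small <- read.table(text = quoted, header = FALSE,
                    col.names = c("Country", "Population", "YearlyChange"),
                    stringsAsFactors = FALSE)
print(small)
```

Each row parses into exactly three fields, with the full country name in the first column.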

Some rows have only 10 fields instead of 11, so their trailing values land one column too far to the left. For those rows, use row/column indexing to shift the values in columns 9:10 into columns 10:11, then set column 9 to NA:

library(stringr)
cnt <- str_count(sub("^([^0-9]+)\\s", '', v1), "\\s+") + 2
i1 <- cnt == 10
df1[i1, 10:11] <- df1[i1, 9:10]
df1[i1, 9] <- NA
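The same shifting idiom can be tried on a toy data frame. This is a sketch with invented data: toy, short and the column names are hypothetical, and the short rows are detected here by a missing last column rather than by counting fields.

```r
# Row 2 was one field short, so its values sit one column too far left
# and the last column was filled with NA.
toy <- data.frame(name = c("A", "B"),
                  x = c("1", "2"),
                  y = c("10", "20"),
                  z = c("100", NA),
                  stringsAsFactors = FALSE)

short <- is.na(toy$z)                # rows that came up one field short
toy[short, 3:4] <- toy[short, 2:3]   # move columns 2:3 one position right
toy[short, 2] <- NA                  # the vacated column is the missing one
print(toy)
```

The right-hand side is read before anything is assigned, so the values are moved without overwriting each other.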

data

v1 <- c("China 1439323776 0.39 5540090 153 9388211 -348399 1.7 38 61 18.47", 
"India 1380004385 0.99 13586631 464 2973190 -532687 2.2 28 35 17.70",
"United States 331002651 0.59 1937734 36 9147420 954806 1.8 38 83 4.25",
"Indonesia 273523615 1.07 2898047 151 1811570 -98955 2.3 30 56 3.51",
"Pakistan 220892340 2.00 4327022 287 770880 -233379 3.6 23 35 2.83",
"Brazil 212559417 0.72 1509890 25 8358140 21200 1.7 33 88 2.73",
"Nigeria 206139589 2.58 5175990 226 910770 -60000 5.4 18 52 2.64",
"Bangladesh 164689383 1.01 1643222 1265 130170 -369501 2.1 28 39 2.11",
"Russia 145934462 0.04 62206 9 16376870 182456 1.8 40 74 1.87 ",
"Tokelau 1357 1.27 17 136 10 N.A. N.A. 0 0.00", "Holy See 801 0.25 2 2003 0 N.A. N.A. N.A. 0.00",
"Côte d'Ivoire 26378274 2.57 661730 83 318000 -8000 4.7 19 51 0.34",
"Czech Republic (Czechia) 10708981 0.18 19772 139 77240 22011 1.6 43 74 0.14",
"United Arab Emirates 9890402 1.23 119873 118 83600 40000 1.4 33 86 0.13",
"Papua New Guinea 8947024 1.95 170915 20 452860 -800 3.6 22 13 0.11",
"Bosnia and Herzegovina 3280819 -0.61 -20181 64 51000 -21585 1.3 43 52 0.04",
"Saint Pierre & Miquelon 5794 -0.48 -28 25 230 N.A. N.A. 100 0.00"
)

v2 <- c("Country(ordependency)", "Population(2020)", "YearlyChange",
"NetChange", "Density(P/Km²)", "LandArea(Km²)", "Migrants(net)",
"Fert.Rate", "Med.Age", "UrbanPop%", "WorldShare")

How to add a row to a data frame in R?

As @Khashaa and @Richard Scriven point out in the comments, all the data frames you want to append must have consistent column names.

You only set column names for the first data frame, df. You also need to set the column names of the second data frame, de, explicitly, and then use rbind():

df<-data.frame("hi","bye")
names(df)<-c("hello","goodbye")

de<-data.frame("hola","ciao")
names(de)<-c("hello","goodbye")

newdf <- rbind(df, de)
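To see why the names matter, here is a short sketch that first attempts rbind() with mismatched names and captures the error, then aligns the names and retries. The variable result is introduced only for this illustration.

```r
df <- data.frame(hello = "hi", goodbye = "bye")
de <- data.frame("hola", "ciao")   # gets auto-generated column names

# rbind() on data frames matches columns by name, so this attempt fails;
# tryCatch keeps the error message instead of stopping the script.
result <- tryCatch(rbind(df, de), error = function(e) conditionMessage(e))
print(result)

# Reuse the names of df so both data frames agree, then bind again.
names(de) <- names(df)
newdf <- rbind(df, de)
print(newdf)
```

Copying names(df) onto de avoids retyping the column names and keeps the two in sync.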

