How to change node and link colors in R googleVis sankey chart
Because the links originate from 2 source nodes, you need 2 colors for the links.
You also have 5 nodes in total, so you need 5 colors for them.
Let's create 2 arrays in JSON-like format with the colors for links and nodes:
colors_link <- c('green', 'blue')
colors_link_array <- paste0("[", paste0("'", colors_link,"'", collapse = ','), "]")
colors_node <- c('yellow', 'lightblue', 'red', 'black', 'brown')
colors_node_array <- paste0("[", paste0("'", colors_node,"'", collapse = ','), "]")
Next, insert those arrays into the options:
opts <- paste0("{
link: { colorMode: 'source',
colors: ", colors_link_array ," },
node: { colors: ", colors_node_array ," }
}" )
Finally, plot the graph:
plot( gvisSankey(datSK, from="From", to="To", weight="Weight",
options=list(
sankey=opts)))
Note that colorMode is set to 'source' in the options, which colors each link according to the node it originates from. Alternatively, set it to 'target' to color each link according to its destination node.
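The string building above can be wrapped in two small helpers. This is only a sketch: `to_js_array` and `sankey_opts` are hypothetical names, not part of googleVis, and the result is simply the text you would pass as `options=list(sankey=opts)`.

```r
# Hypothetical helper: turn an R character vector into a JavaScript array literal
to_js_array <- function(x) {
  paste0("[", paste0("'", x, "'", collapse = ","), "]")
}

# Hypothetical helper: assemble the sankey option string shown above.
# color_mode is passed through as-is ('source' or 'target' as in the answer).
sankey_opts <- function(link_colors, node_colors, color_mode = "source") {
  paste0(
    "{ link: { colorMode: '", color_mode, "', colors: ",
    to_js_array(link_colors), " }, ",
    "node: { colors: ", to_js_array(node_colors), " } }"
  )
}

opts <- sankey_opts(
  link_colors = c("green", "blue"),
  node_colors = c("yellow", "lightblue", "red", "black", "brown")
)
cat(opts, "\n")
```

Switching to target-based coloring would then be `sankey_opts(..., color_mode = "target")`.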
EDIT: added a description for multilevel sankeys
It is a bit tricky to work out how colors are assigned in multilevel sankeys.
We need to create another data frame:
datSK <- data.frame(From=c(rep("A",3), rep("B", 3), rep(c("X", "Y", "Z"), 2 )),
To=c(rep(c("X", "Y", "Z"),2), rep("M", 3), rep("N", 3)),
Weight=c(5,7,6,2,9,4,3,4,5,6, 4,8))
Here we only have to change the color arrays. The command to build the plot is the same.
Let's assume we want these colors for the nodes and links:
colors_link <- c('green', 'blue', 'yellow', 'brown', 'red')
colors_link_array <- paste0("[", paste0("'", colors_link,"'", collapse = ','), "]")
colors_node <- c('yellow', 'lightblue', 'red', 'black', 'brown', 'green', 'brown')
colors_node_array <- paste0("[", paste0("'", colors_node,"'", collapse = ','), "]")
The result would be:
The trickiest part is understanding how these colors are assigned:
- Link colors are assigned in the order the links appear in the dataset (row-wise).
- Node colors are assigned in the order the plot is built.
With the colors above, the links end up colored like this:
- From A to X, Y, Z - green
- From X to M, N - blue
- From Y to M, N - yellow
- From Z to M, N - brown
- From B to X, Y, Z - red
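The mapping listed above can be reproduced in plain R. This sketch assumes that the chart orders nodes by first appearance when reading the rows left to right, and that with colorMode 'source' the link colors are handed out to source nodes in that order. Both are inferences from the result shown here, not documented guarantees.

```r
datSK <- data.frame(
  From = c(rep("A", 3), rep("B", 3), rep(c("X", "Y", "Z"), 2)),
  To = c(rep(c("X", "Y", "Z"), 2), rep("M", 3), rep("N", 3)),
  Weight = c(5, 7, 6, 2, 9, 4, 3, 4, 5, 6, 4, 8)
)

# Interleave From and To row by row, then keep the first appearance of each node
node_order <- unique(as.vector(rbind(as.character(datSK$From),
                                     as.character(datSK$To))))

# Only nodes that act as a source receive a link color
source_nodes <- node_order[node_order %in% datSK$From]

colors_link <- c("green", "blue", "yellow", "brown", "red")
link_map <- setNames(colors_link, source_nodes)
print(link_map)
```

Printing `node_order` also tells you which position in `colors_node` belongs to which node.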
More detailed information on how to format a sankey diagram: https://developers.google.com/chart/interactive/docs/gallery/sankey
Assigning node and link colors in R googleVis sankey chart
It has been a year, and I do not know whether you still need the answer, but this is what I found:
plot(gvisSankey(datSK, from="From",
to="To", weight="Weight",
options=list(sankey="{
link: { colorMode: 'gradient'},
node: { colors: ['blue', 'blue', 'orange',
'green','orange', 'green',
'blue','orange','green']}
}")))
Google's Sankey Chart assigns colors based on the order in which the nodes appear. Here is how I determine the appearance order of the nodes: I build one string from all the node pairs, split it, extract the unique nodes, and then assign the colors.
# Create a stringlist of node pairs
nodestringlist <- paste(datSK$From,datSK$To, collapse=' ')
# Split them up
nodestringvector <- strsplit(nodestringlist, split =' ')
# Find the unique nodes in order they appear
node_order <- unique(nodestringvector[[1]])
#output: "A1" "A2" "B2" "C2" "B1" "C1" "A3" "B3" "C3"
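Once the node order is known, the color vector does not have to be typed by hand. The sketch below assumes the intent of the colors above is one color per leading letter (A blue, B orange, C green); the `palette` lookup is a hypothetical name introduced for illustration.

```r
# Node order as printed above
node_order <- c("A1", "A2", "B2", "C2", "B1", "C1", "A3", "B3", "C3")

# Assumed scheme: one color per group letter
palette <- c(A = "blue", B = "orange", C = "green")

# Look up each node's color by its first character
node_colors <- unname(palette[substr(node_order, 1, 1)])

# Build the array text that goes into node: { colors: ... }
colors_node_array <- paste0("[", paste0("'", node_colors, "'", collapse = ","), "]")
cat(colors_node_array, "\n")
```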
Is this what you want?
Sankey Diagram in R using GoogleVis
Hard to believe, but you need to change the order of columns in your data frame.
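As a minimal sketch of such a reordering, assume a hypothetical data frame `raw` whose columns arrive as weight, target, source. The name `raw` and its two rows are illustrative only; the column names match the data below.

```r
# Hypothetical input with the columns in an unhelpful order
raw <- data.frame(
  record_count = c(25000, 5555),
  Second_Area = c("group2", "group13"),
  First_Area = c("group1", "group2")
)

# Put the columns in from, to, weight order before plotting
df_ordered <- raw[, c("First_Area", "Second_Area", "record_count")]
names(df_ordered)
```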
Data
df <- data.frame(First_Area=c('group1', 'group1', 'group1', 'group2', 'group2', 'group2', 'group2', 'group3', 'group3', 'group3', 'group3', 'group3', 'group4', 'group4', 'group4', 'group4', 'group5', 'group5', 'group5', 'group5', 'group6', 'group6', 'group6', 'group7', 'group7', 'group8', 'group8', 'group8', 'group8', 'group8', 'group9', 'group9', 'group9', 'group10', 'group10', 'group10', 'group11', 'group11', 'group11'), Second_Area=c('group2', 'group3', 'group12', 'group13', 'group5', 'group6', 'group7', 'group8', 'group9', 'group10', 'group11', 'group14', 'group15', 'group16', 'group17', 'group18', 'group19', 'group20', 'group21', 'group22', 'group23', 'group24', 'group25', 'group26', 'group27', 'group28', 'group29', 'group30', 'group31', 'group32', 'group33', 'group34', 'group35', 'group36', 'group37', 'group38', 'group39', 'group40', 'group41'), record_count=c(25000, 25000, 25000, 5555, 5555, 5555, 8335, 5556, 5556, 5556, 5556, 2776, 5555, 5555, 5555, 5555, 2500, 2500, 500, 55, 1851, 1851, 1853, 5000, 555, 1100, 500, 1500, 1500, 956, 1852, 1852, 1852, 1852, 1852, 1852, 1852, 1852, 1852))
Code
plot(
googleVis::gvisSankey(df, from="First_Area",
to="Second_Area", weight="record_count",
options=list(
height=1250,
sankey="{link:{color:{fill:'lightblue'}}}"
))
)
Result
googleVis sankey-diagrams not displaying correctly
Excluding the rows with a value of zero yields this result:
library(reshape2)
library(googleVis)
library(xlsx)
download.file("https://ben.epe.gov.br/downloads/Matriz%20Energ%c3%a9tica%20Nacional%20ab2014.xlsx", tf <- tempfile(fileext = ".xlsx"), mode = "wb")
a <- xlsx::read.xlsx(tf, sheetName = 'consolidada tep', rowIndex = 24:49 , colIndex=2:10, header=FALSE) # startRow=4
b <- as.matrix(xlsx::read.xlsx(tf, sheetName = 'consolidada tep', rowIndex = 2:3, colIndex=2:10, header=FALSE, stringsAsFactors=FALSE) )
b <- paste0(b[1,],b[2,])
colnames(a) <- b
c <- xlsx::read.xlsx(tf, sheetName = 'consolidada tep', rowIndex = 24:49, colIndex=1:1, header=FALSE, stringsAsFactors=FALSE) # startRow=4
a <- cbind(c,a)
a2 <- melt(a,id='X1')[,c(2,1,3)]
colnames(a2) <- c('source','target','value')
plot(gvisSankey(subset(a2, value > 0), from="source", to="target", weight="value", options=list(height=500, sankey="{link:{color:{fill:'lightblue'}}}")))
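The filtering step on its own looks like this on a toy edge list. The values are made up for illustration; only the column names follow the code above.

```r
# Toy edge list with one zero-weight row and one missing weight
a2 <- data.frame(
  source = c("Oil", "Oil", "Gas", "Gas"),
  target = c("Transport", "Industry", "Industry", "Homes"),
  value = c(10, 0, 4, NA)
)

# subset() keeps rows where the condition is TRUE, so zero and NA weights are dropped
links <- subset(a2, value > 0)
print(links)
```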
Making a Sankey Diagram with googleVis in R
If I understand correctly you have 3 states: type, organization and team. Type is always the origin, team is the final destination and organization is at first a destination and then an origin.
In the second SQL statement you use "Type" again as the origin, when the origin should be "Organization".
Your SQL has to be modified to look like this:
BrewersDraft <- sqldf("SELECT Type, Organization, COUNT(Name) AS PLAYERS
FROM df
GROUP BY 1,2
UNION ALL
SELECT Organization, (Tm) AS MLB_TEAM, COUNT(Name) AS PLAYERS
FROM df
GROUP BY 1,2")
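For comparison, the same two-level edge list can be sketched in base R without sqldf. The toy `df` below is invented for illustration (only the column names Type, Organization, Tm and Name come from the query above), and the helper column `n` is an assumption used for counting.

```r
# Toy stand-in for the player data
df <- data.frame(
  Name = c("p1", "p2", "p3", "p4"),
  Type = c("Draft", "Draft", "Trade", "Draft"),
  Organization = c("OrgA", "OrgA", "OrgB", "OrgB"),
  Tm = c("MIL", "MIL", "MIL", "CHC")
)
df$n <- 1  # one unit per player, summed to get counts

# Level 1: Type -> Organization; level 2: Organization -> Tm
lvl1 <- aggregate(n ~ Type + Organization, data = df, FUN = sum)
lvl2 <- aggregate(n ~ Organization + Tm, data = df, FUN = sum)

# Same column names so the two levels can be stacked (the UNION ALL step)
names(lvl1) <- names(lvl2) <- c("From", "To", "PLAYERS")
edges <- rbind(lvl1, lvl2)
print(edges)
```

Every player is counted once per level, so the PLAYERS column sums to twice the number of rows in `df`.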
How to make a googleVis multiple Sankey from a data.frame?
Function gvisSankey does accept mid-levels directly. These levels have to be coded in the underlying data.
source <- sample(c("NorthSrc", "SouthSrc", "EastSrc", "WestSrc"), 100, replace=TRUE)
mid <- sample(c("NorthMid", "SouthMid", "EastMid", "WestMid"), 100, replace=TRUE)
destination <- sample(c("NorthDes", "SouthDes", "EastDes", "WestDes"), 100, replace=TRUE)
dummy <- rep(1,100) # For aggregation
dat <- data.frame(source, mid, destination, dummy) # Combine into the data frame used below
Now, we'll reshape the original data:
library(dplyr)
datSM <- dat %>%
group_by(source, mid) %>%
summarise(toMid = sum(dummy) ) %>%
ungroup()
Data frame datSM summarises the number of units from Source to Mid.
datMD <- dat %>%
group_by(mid, destination) %>%
summarise(toDes = sum(dummy) ) %>%
ungroup()
Data frame datMD summarises the number of units from Mid to Destination. This data frame will be added to the final data frame. Both data frames need to be ungrouped and have the same column names.
colnames(datSM) <- colnames(datMD) <- c("From", "To", "Dummy")
As datMD is appended last, gvisSankey will recognise the middle step automatically.
datVis <- rbind(datSM, datMD)
p <- gvisSankey(datVis, from="From", to="To", weight="Dummy")
plot(p)
Here is the plot:
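The same idea extends to any number of levels. Below is a sketch in base R; `levels_to_edges` is a hypothetical helper, not a googleVis function, and the sample data is shortened to two values per level for readability.

```r
# Hypothetical helper: count flows between each pair of adjacent level columns
levels_to_edges <- function(dat, cols) {
  pairs <- lapply(seq_len(length(cols) - 1), function(i) {
    counts <- as.data.frame(
      table(From = dat[[cols[i]]], To = dat[[cols[i + 1]]]),
      stringsAsFactors = FALSE
    )
    counts[counts$Freq > 0, ]  # drop combinations that never occur
  })
  edges <- do.call(rbind, pairs)
  names(edges)[3] <- "Weight"
  rownames(edges) <- NULL
  edges
}

set.seed(42)
dat <- data.frame(
  source = sample(c("NorthSrc", "SouthSrc"), 20, replace = TRUE),
  mid = sample(c("NorthMid", "SouthMid"), 20, replace = TRUE),
  destination = sample(c("NorthDes", "SouthDes"), 20, replace = TRUE)
)

edges <- levels_to_edges(dat, c("source", "mid", "destination"))
print(edges)
```

The resulting `edges` data frame has the From, To, Weight layout that can be passed to gvisSankey with `from="From", to="To", weight="Weight"`. As in the answer above, the level values must be distinct across columns (NorthSrc vs NorthMid) so that a node never links back to itself.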