How to Change Node and Link Colors in R googleVis Sankey Chart

Because the links originate from 2 source nodes, you need 2 link colors.
You also have 5 nodes in total, so you need 5 node colors.

Let's create 2 arrays in JSON format, one with the link colors and one with the node colors:

colors_link <- c('green', 'blue')
colors_link_array <- paste0("[", paste0("'", colors_link,"'", collapse = ','), "]")

colors_node <- c('yellow', 'lightblue', 'red', 'black', 'brown')
colors_node_array <- paste0("[", paste0("'", colors_node,"'", collapse = ','), "]")
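The two paste0 calls above follow the same pattern, so they can be wrapped in a small helper. This is only a sketch: `to_js_array` is a hypothetical name, not part of googleVis, and it assumes the color names contain no single quotes.

```r
# Hypothetical helper (not part of googleVis): turn an R character vector
# into a JavaScript-style array literal such as ['green','blue'].
to_js_array <- function(x) {
  paste0("[", paste0("'", x, "'", collapse = ","), "]")
}

colors_link_array <- to_js_array(c("green", "blue"))
colors_node_array <- to_js_array(c("yellow", "lightblue", "red", "black", "brown"))

print(colors_link_array)
# [1] "['green','blue']"
```

The resulting strings can be used exactly like the hand-built ones.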

Next, insert those arrays into the options string:

opts <- paste0("{
link: { colorMode: 'source',
colors: ", colors_link_array ," },
node: { colors: ", colors_node_array ," }
}" )

Finally, plot the graph:

plot( gvisSankey(datSK, from="From", to="To", weight="Weight",
options=list(
sankey=opts)))

Sample Image

Note that colorMode is set to 'source' in the options, which colors each link according to the node it originates from. Alternatively, set it to 'target' to color each link according to its destination node.
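As a rough illustration of that alternative, the same options string can be built with colorMode set to 'target'. The sketch below only builds and prints the string; it would be passed to gvisSankey through options=list(sankey=...) as in the plot call above. The variable name `opts_target` is made up for this example.

```r
colors_link_array <- "['green','blue']"
colors_node_array <- "['yellow','lightblue','red','black','brown']"

# Same structure as the options above, but with colorMode 'target',
# so each link takes its color from the node it points to.
opts_target <- paste0("{
link: { colorMode: 'target',
colors: ", colors_link_array, " },
node: { colors: ", colors_node_array, " }
}")

cat(opts_target)
```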

EDIT: add description for multilevel sankeys

It is a bit tricky to work out how colors are assigned in multilevel sankeys.

We need to create another data frame:

datSK <- data.frame(From=c(rep("A",3), rep("B", 3), rep(c("X", "Y", "Z"), 2 )),
To=c(rep(c("X", "Y", "Z"),2), rep("M", 3), rep("N", 3)),
Weight=c(5,7,6,2,9,4,3,4,5,6, 4,8))

Here we only have to change the color arrays; the command to build the plot stays the same.
Let's assume we want these colors for the nodes and links:

colors_link <- c('green', 'blue', 'yellow', 'brown', 'red')
colors_link_array <- paste0("[", paste0("'", colors_link,"'", collapse = ','), "]")

colors_node <- c('yellow', 'lightblue', 'red', 'black', 'brown', 'green', 'brown')
colors_node_array <- paste0("[", paste0("'", colors_node,"'", collapse = ','), "]")

The result would be:

Sample Image

The trickiest part is understanding how these colors are assigned:

  1. Link colors are assigned in the order the links appear in the dataset (row-wise).

Sample Image


  2. Node colors are assigned in the order in which the plot is built. With the colors above, the links come out as follows:

    • From A to X, Y, Z - green
    • From X to M, N - blue
    • From Y to M, N - yellow
    • From Z to M, N - brown
    • From B to X, Y, Z - red
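To see which color lands where, it can help to list the nodes in the order they first appear when the data is read row by row, and then pair that order with the color vectors. The sketch below assumes that this first-appearance order is what the chart uses. That is consistent with the list above, but it is an assumption rather than documented behaviour. `node_order` and `link_sources` are names made up here.

```r
datSK <- data.frame(
  From   = c(rep("A", 3), rep("B", 3), rep(c("X", "Y", "Z"), 2)),
  To     = c(rep(c("X", "Y", "Z"), 2), rep("M", 3), rep("N", 3)),
  Weight = c(5, 7, 6, 2, 9, 4, 3, 4, 5, 6, 4, 8),
  stringsAsFactors = FALSE
)
colors_link <- c("green", "blue", "yellow", "brown", "red")
colors_node <- c("yellow", "lightblue", "red", "black", "brown", "green", "brown")

# Interleave From and To row by row, then keep the first occurrence of each node.
node_order <- unique(as.vector(t(as.matrix(datSK[, c("From", "To")]))))
print(node_order)
# [1] "A" "X" "Y" "Z" "B" "M" "N"

# Nodes that act as a source, in that same order (assumption: with
# colorMode 'source' the link colors are matched against these nodes).
link_sources <- node_order[node_order %in% datSK$From]

# Pair each node, and each source node, with its color.
print(setNames(colors_node, node_order))
print(setNames(colors_link, link_sources))
```

Under that assumption, the second mapping reproduces the list above: A is green, X is blue, Y is yellow, Z is brown and B is red.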

More detailed information on how to format a sankey diagram: https://developers.google.com/chart/interactive/docs/gallery/sankey

Assigning node and link colors in R googleVis sankey chart

It has been a year, and I do not know whether you still need an answer, but this is what I found:

plot(gvisSankey(datSK, from="From", 
to="To", weight="Weight",
options=list(sankey="{
link: { colorMode: 'gradient'},
node: { colors: ['blue', 'blue', 'orange',
'green','orange', 'green',
'blue','orange','green']}
}")))

Google's Sankey chart assigns colors based on the order in which the nodes appear. Here is how I determine that order: I paste all node pairs into one string, split it up, extract the unique nodes, and then assign the colors.

# Create a stringlist of node pairs
nodestringlist <- paste(datSK$From,datSK$To, collapse=' ')

# Split them up
nodestringvector <- strsplit(nodestringlist, split =' ')

# Find the unique nodes in order they appear
node_order <- unique(nodestringvector[[1]])
#output: "A1" "A2" "B2" "C2" "B1" "C1" "A3" "B3" "C3"

Is this what you want?
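Once the node order is known, the color vector does not have to be typed by hand. As a sketch, assuming the node names from the output comment above and assuming the first letter of each name decides its color (which is what the hard-coded colors in the plot call amount to), a lookup table can generate it. `palette_by_letter` and `sankey_opts` are hypothetical names.

```r
# Node order as listed in the output comment above.
node_order <- c("A1", "A2", "B2", "C2", "B1", "C1", "A3", "B3", "C3")

# Assumed rule: one color per leading letter.
palette_by_letter <- c(A = "blue", B = "orange", C = "green")
node_colors <- unname(palette_by_letter[substr(node_order, 1, 1)])

# Build the JavaScript array and the options string for gvisSankey.
node_colors_array <- paste0("[", paste0("'", node_colors, "'", collapse = ","), "]")
sankey_opts <- paste0("{ link: { colorMode: 'gradient' }, node: { colors: ",
                      node_colors_array, " } }")
cat(sankey_opts)
```

The string in `sankey_opts` can then be passed as options=list(sankey=sankey_opts).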

Sankey Diagram in R using GoogleVis

Hard to believe, but you need to change the order of columns in your data frame.

Data

df <- data.frame(First_Area=c('group1', 'group1', 'group1', 'group2', 'group2', 'group2', 'group2', 'group3', 'group3', 'group3', 'group3', 'group3', 'group4', 'group4', 'group4', 'group4', 'group5', 'group5', 'group5', 'group5', 'group6', 'group6', 'group6', 'group7', 'group7', 'group8', 'group8', 'group8', 'group8', 'group8', 'group9', 'group9', 'group9', 'group10', 'group10', 'group10', 'group11', 'group11', 'group11'), Second_Area=c('group2', 'group3', 'group12', 'group13', 'group5', 'group6', 'group7', 'group8', 'group9', 'group10', 'group11', 'group14', 'group15', 'group16', 'group17', 'group18', 'group19', 'group20', 'group21', 'group22', 'group23', 'group24', 'group25', 'group26', 'group27', 'group28', 'group29', 'group30', 'group31', 'group32', 'group33', 'group34', 'group35', 'group36', 'group37', 'group38', 'group39', 'group40', 'group41'), record_count=c(25000, 25000, 25000, 5555, 5555, 5555, 8335, 5556, 5556, 5556, 5556, 2776, 5555, 5555, 5555, 5555, 2500, 2500, 500, 55, 1851, 1851, 1853, 5000, 555, 1100, 500, 1500, 1500, 956, 1852, 1852, 1852, 1852, 1852, 1852, 1852, 1852, 1852))

Code

plot(
googleVis::gvisSankey(df, from="First_Area",
to="Second_Area", weight="record_count",
options=list(
height=1250,
sankey="{link:{color:{fill:'lightblue'}}}"
))
)
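If the columns of your own data frame are in a different order, they can be rearranged by name before plotting. The sketch below uses a made-up data frame `df_wrong` whose weight column comes first; the column order in the original question is not shown here, so this is an assumption for illustration only.

```r
# Invented example: weight column first, origin and destination after it.
df_wrong <- data.frame(
  record_count = c(25000, 25000, 5555),
  First_Area   = c("group1", "group1", "group2"),
  Second_Area  = c("group2", "group3", "group5"),
  stringsAsFactors = FALSE
)

# Reorder by column name so that origin, destination and weight come in that order.
df_fixed <- df_wrong[, c("First_Area", "Second_Area", "record_count")]
print(names(df_fixed))
```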


googleVis sankey-diagrams not displaying correctly

Excluding the rows with a value of zero yields this result:

library(reshape2)
library(googleVis)
library(xlsx)

download.file("https://ben.epe.gov.br/downloads/Matriz%20Energ%c3%a9tica%20Nacional%20ab2014.xlsx", tf <- tempfile(fileext = ".xlsx"), mode = "wb")

a <- xlsx::read.xlsx(tf, sheetName = 'consolidada tep', rowIndex = 24:49 , colIndex=2:10, header=FALSE) # startRow=4
b <- as.matrix(xlsx::read.xlsx(tf, sheetName = 'consolidada tep', rowIndex = 2:3, colIndex=2:10, header=FALSE, stringsAsFactors=FALSE) )
b <- paste0(b[1,],b[2,])
colnames(a) <- b
c <- xlsx::read.xlsx(tf, sheetName = 'consolidada tep', rowIndex = 24:49, colIndex=1:1, header=FALSE, stringsAsFactors=FALSE) # startRow=4
a <- cbind(c,a)
a2 <- melt(a,id='X1')[,c(2,1,3)]
colnames(a2) <- c('source','target','value')

plot(gvisSankey(subset(a2, value > 0), from="source", to="target", weight="value", options=list(height=500, sankey="{link:{color:{fill:'lightblue'}}}")))

Sample Image
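The filtering step on its own, shown on a tiny made-up data frame (the values below are invented, not taken from the spreadsheet):

```r
# Invented flows; the second row has a weight of zero.
flows <- data.frame(
  source = c("oil", "oil", "gas"),
  target = c("transport", "industry", "industry"),
  value  = c(3, 0, 2),
  stringsAsFactors = FALSE
)

# Keep only the rows with a positive weight before handing the data to gvisSankey.
flows_nonzero <- subset(flows, value > 0)
print(flows_nonzero)
```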

Making a Sankey Diagram with googleVis in R

If I understand correctly, you have 3 states: type, organization and team. Type is always the origin, team is the final destination, and organization is first a destination and then an origin.

In the second SQL statement you use "Type" again as the origin, when the origin should be "Organization".

Your SQL has to be modified to look like this:

BrewersDraft <- sqldf("SELECT Type, Organization, COUNT(Name) AS PLAYERS 
FROM df
GROUP BY 1,2
UNION ALL
SELECT Organization, (Tm) AS MLB_TEAM, COUNT(Name) AS PLAYERS
FROM df
GROUP BY 1,2")
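For readers without sqldf, roughly the same two-stage table can be built in base R with aggregate. This is a swapped-in technique, not the method from the answer, and the sample rows below are invented; only the column names Type, Organization, Tm and Name come from the SQL above.

```r
# Invented sample rows with the column names used in the SQL above.
df <- data.frame(
  Name         = c("p1", "p2", "p3", "p4"),
  Type         = c("TypeA", "TypeA", "TypeB", "TypeA"),
  Organization = c("OrgA", "OrgA", "OrgA", "OrgB"),
  Tm           = c("TeamX", "TeamX", "TeamY", "TeamX"),
  stringsAsFactors = FALSE
)

# Stage 1: Type -> Organization; stage 2: Organization -> Tm.
# FUN = length counts the names in each group, like COUNT(Name).
stage1 <- aggregate(Name ~ Type + Organization, data = df, FUN = length)
stage2 <- aggregate(Name ~ Organization + Tm, data = df, FUN = length)

# Give both stages the same column names, then stack them like UNION ALL.
names(stage1) <- names(stage2) <- c("From", "To", "Players")
BrewersDraft <- rbind(stage1, stage2)
print(BrewersDraft)
```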

How to make a googleVis multiple Sankey from a data.frame?

The gvisSankey function does accept mid-levels directly; these levels have to be encoded in the underlying data.

source <- sample(c("NorthSrc", "SouthSrc", "EastSrc", "WestSrc"), 100, replace=TRUE)
mid <- sample(c("NorthMid", "SouthMid", "EastMid", "WestMid"), 100, replace=TRUE)
destination <- sample(c("NorthDes", "SouthDes", "EastDes", "WestDes"), 100, replace=TRUE)
dummy <- rep(1,100) # For aggregation

# Combine the vectors into the data frame used below
dat <- data.frame(source, mid, destination, dummy)

Now we'll reshape the original data:

library(dplyr)

datSM <- dat %>%
group_by(source, mid) %>%
summarise(toMid = sum(dummy) ) %>%
ungroup()

Data frame datSM summarises the number of units going from Source to Mid.

datMD <- dat %>%
group_by(mid, destination) %>%
summarise(toDes = sum(dummy) ) %>%
ungroup()

Data frame datMD summarises the number of units going from Mid to Destination. It will be appended to datSM to form the final data frame. Both data frames need to be ungrouped and have the same column names.

colnames(datSM) <- colnames(datMD) <- c("From", "To", "Dummy")

Because datMD is appended last, gvisSankey will recognise the middle step automatically.

datVis <- rbind(datSM, datMD)

# The weight column is named "Dummy" (capital D) after the renaming above
p <- gvisSankey(datVis, from="From", to="To", weight="Dummy")
plot(p)

Here is the plot:
Multilevel Sankey
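A quick way to check that the combined table really encodes a middle step is to look for nodes that occur both as an origin and as a destination. The sketch below rebuilds the data with base R's aggregate instead of dplyr (a swapped-in technique, so the example has no package dependencies); `mid_nodes` and the column name Weight are introduced here for illustration.

```r
set.seed(42)
dat <- data.frame(
  source      = sample(c("NorthSrc", "SouthSrc", "EastSrc", "WestSrc"), 100, replace = TRUE),
  mid         = sample(c("NorthMid", "SouthMid", "EastMid", "WestMid"), 100, replace = TRUE),
  destination = sample(c("NorthDes", "SouthDes", "EastDes", "WestDes"), 100, replace = TRUE),
  dummy       = 1,
  stringsAsFactors = FALSE
)

# Sum the dummy column per Source -> Mid pair and per Mid -> Destination pair.
datSM <- aggregate(dummy ~ source + mid, data = dat, FUN = sum)
datMD <- aggregate(dummy ~ mid + destination, data = dat, FUN = sum)
names(datSM) <- names(datMD) <- c("From", "To", "Weight")
datVis <- rbind(datSM, datMD)

# Nodes that appear in both columns are the middle level.
mid_nodes <- intersect(datVis$From, datVis$To)
print(mid_nodes)
```

If `mid_nodes` comes back empty, the two halves of the table do not share any node names and the chart cannot link them into one flow.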


