Ggplot/Mapping US Counties - Problems with Visualization Shapes in R

Using ggplot2 to Fill in Counties Based on FIPS Code

I'd call this a straightforward example of data wrangling: you need to find where all the data you need lives. Here, map_data("county") has no FIPS column, so you have to find one elsewhere (maps::county.fips), check its format, turn it into a data frame, join it in, and then use it.

library(tidyverse)

ExampleFIPS <- c(19097, 17155, 50009, 27055, 39143, 55113, 44003, 55011, 19105, 46109, 19179, 55099, 51057)

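# Split "state,county" in polyname into the region/subregion columns used by map_data()
maps::county.fips %>%
  as_tibble() %>%
  extract(polyname, c("region", "subregion"), "^([^,]+),([^,]+)$") ->
  dfips

# Attach the FIPS code to every polygon vertex
map_data("county") %>%
  left_join(dfips, by = c("region", "subregion")) ->
  dall

# Flag the counties of interest and fill them in red
dall %>%
  mutate(is_example = fips %in% ExampleFIPS) %>%
  ggplot(aes(long, lat, group = group)) +
  geom_polygon(aes(fill = is_example), color = "gray70") +
  coord_map() +
  scale_fill_manual(values = c("TRUE" = "red", "FALSE" = "gray90"))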

This produces a map of US counties with a few highlighted.
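
One thing worth checking before trusting the join: as far as I recall, a few polyname entries in maps::county.fips carry a ":suffix" for counties drawn as several polygons (something like "florida,okaloosa:main"), while map_data("county") reports only the bare county name, so those rows may fail to match. A minimal base-R sketch of stripping the suffix, run on a small made-up table rather than the real data:

```r
# Toy rows in the same "state,county" shape as maps::county.fips.
# These rows are illustrative only; inspect the real table before relying on this.
toy_fips <- data.frame(
  fips = c(19097L, 12091L, 12091L),
  polyname = c("iowa,jackson", "florida,okaloosa:main", "florida,okaloosa:spit"),
  stringsAsFactors = FALSE
)

# Drop anything after ":" and then split on the comma
parts <- strsplit(sub(":.*$", "", toy_fips$polyname), ",", fixed = TRUE)
toy_fips$region <- vapply(parts, `[`, character(1), 1)
toy_fips$subregion <- vapply(parts, `[`, character(1), 2)

# One row per county, ready to join on region + subregion
toy_keys <- unique(toy_fips[, c("fips", "region", "subregion")])
toy_keys
```

After the join, something like sum(is.na(dall$fips)) is a quick way to see whether any polygons were left without a code.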

In R, How to manipulate (using manipulate pkg) the ggplot fill variable

For this to work you need aes_string, because picker hands back the chosen column name as a string and aes_string accepts strings where aes expects bare names.

Look at this example:

# example data.table (every column has 10 rows)
library(data.table)
set.seed(5)
dt <- data.table(a = runif(10),
                 b = runif(10),
                 fillvar1 = factor(sample.int(2, 10, replace = TRUE)),
                 fillvar2 = factor(sample.int(2, 10, replace = TRUE)),
                 fillvar3 = factor(sample.int(2, 10, replace = TRUE)))

Solution:

library(ggplot2)
library(manipulate)
manipulate({
  ggplot(dt, aes_string('a', 'b', colour = choose_var)) + geom_point()},
  choose_var = picker('fillvar1', 'fillvar2', 'fillvar3'))

It is hard to upload a picture to show the interactivity, but with the code above the colours of the points change every time you select a different value.
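
In newer ggplot2 releases (3.0.0 onward) aes_string is soft-deprecated in favour of the .data pronoun. A sketch of the same string-to-aesthetic idea, using a hypothetical helper plot_by and a toy data frame that are both made up for illustration:

```r
library(ggplot2)

# Hypothetical helper: colour points by a column whose name arrives as a string
plot_by <- function(data, colour_var) {
  ggplot(data, aes(a, b, colour = .data[[colour_var]])) +
    geom_point() +
    labs(colour = colour_var)
}

set.seed(5)
toy <- data.frame(
  a = runif(10),
  b = runif(10),
  fillvar1 = factor(sample.int(2, 10, replace = TRUE)),
  fillvar2 = factor(sample.int(2, 10, replace = TRUE))
)

p <- plot_by(toy, "fillvar2")
p
```

Inside manipulate, the body would then presumably become plot_by(dt, choose_var).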

Create a map with the adapted size of states

Here's a very ugly first try to get you started, using the outlines from the maps package and some data manipulation from dplyr.

library(maps)
library(dplyr)
library(ggplot2)

# Generate the base outlines
mapbase <- map_data("state.vbm")

# Load the centroids
data(state.vbm.center)

# Coerce the list to a dataframe, then add in state names
# Then generate some random value (or your variable of interest, like population)
# Then rescale that value to the range 0.25 to 0.95

df <- state.vbm.center %>% as.data.frame() %>%
  mutate(region = unique(mapbase$region),
         somevalue = rnorm(50),
         scaling = scales::rescale(somevalue, to = c(0.25, 0.95)))
df

# Join your centers and data to the full state outlines
df2 <- df %>%
  full_join(mapbase)
df2

# Within each state, scale the long and lat points to be closer
# to the centroid by the scaling factor

df3 <- df2 %>%
  group_by(region) %>%
  mutate(longscale = scaling*(long - x) + x,
         latscale = scaling*(lat - y) + y)
df3

# Plot both the outlines for reference and the rescaled polygons

ggplot(df3, aes(long, lat, group = region, fill = somevalue)) +
  geom_path() +
  geom_polygon(aes(longscale, latscale)) +
  coord_fixed() +
  theme_void() +
  scale_fill_viridis_c() # built into ggplot2; scale_fill_viridis() would need the viridis package


These outlines aren't the best, and the centroid positions they shrink toward cause the polygons to sometimes overlap the original state outline. But it's a start; you can find better shapes for US states and various centroid algorithms.
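
To see what the scaling step does, here is a small base-R sketch on a made-up square instead of a state outline. The helper poly_area is hypothetical (a plain shoelace formula), added only to show how the area responds to the scaling factor:

```r
# A 2 x 2 square centred on (1, 1), standing in for one state's outline
square <- data.frame(long = c(0, 2, 2, 0), lat = c(0, 0, 2, 2))
centre <- c(x = 1, y = 1)
scaling <- 0.5

# Same formula as in df3: pull each vertex toward the centre
square$longscale <- scaling * (square$long - centre[["x"]]) + centre[["x"]]
square$latscale  <- scaling * (square$lat  - centre[["y"]]) + centre[["y"]]

# Hypothetical helper: shoelace formula for the area of a simple polygon
poly_area <- function(x, y) {
  j <- c(2:length(x), 1)
  abs(sum(x * y[j] - x[j] * y)) / 2
}

area_ratio <- poly_area(square$longscale, square$latscale) /
  poly_area(square$long, square$lat)
area_ratio  # 0.25, i.e. scaling^2
```

Because the area shrinks with the square of the factor, one option if you want area (rather than width) to track your variable is to take the square root of the rescaled value before using it as the scaling factor.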

Spatial Plot in R : how to plot the polygon and color as per the data to be visualized

I can't speak to why your code is not generating output - there are too many possible reasons - but is this what you are trying to achieve?

Sample Image

Code

library(rgdal)
library(ggplot2)
library(plyr)
library(RColorBrewer)
setwd("< directory with all your files >")

map <- readOGR(dsn=".",layer="ALRIS_tigcounty")
marriages <- read.csv("marriages.2012.csv",header=T,skip=3)
marriages <- marriages[2:16,]
marriages$County <- tolower(gsub(" ","",marriages$County))
marriages$Total <- as.numeric(as.character(marriages$Total))

data <- data.frame(id=rownames(map@data), NAME=map@data$NAME, stringsAsFactors=F)
data <- merge(data,marriages,by.x="NAME",by.y="County",all.x=T)
map.df <- fortify(map)
map.df <- join(map.df,data, by="id")

ggplot(map.df, aes(x=long, y=lat, group=group))+
  geom_polygon(aes(fill=Total))+
  geom_path(colour="grey50")+
  scale_fill_gradientn("2012 Marriages",
                       colours=rev(brewer.pal(8,"Spectral")),
                       trans="log",
                       breaks=c(100,300,1000,3000,10000))+
  theme(axis.text=element_blank(),
        axis.ticks=element_blank(),
        axis.title=element_blank())+
  coord_fixed()

Explanation

To generate a choropleth map, ultimately we need to associate polygons with your datum of interest (total marriages by county). This is a three-step process. First we associate polygon ID with county name:

data <- data.frame(id=rownames(map@data), NAME=map@data$NAME, stringsAsFactors=F)

Then we associate county name with total marriages:

data <- merge(data,marriages,by.x="NAME",by.y="County",all.x=T)

Then we associate the result with the polygon coordinate data:

map.df <- join(map.df,data, by="id")
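
Note that this last step uses plyr's join rather than merge. join keeps the rows of its first argument in their original order, whereas merge sorts on the key by default, and the vertex order matters when the polygons are drawn. A base-R sketch of an order-preserving lookup with match, on made-up toy data (the column names mimic fortify output, the values are invented):

```r
# Two tiny "polygons"; the rows are in drawing order
coords <- data.frame(
  id    = c("1", "1", "1", "0", "0", "0"),
  order = 1:6,
  long  = c(0, 1, 0, 2, 3, 2),
  lat   = c(0, 0, 1, 0, 0, 1),
  stringsAsFactors = FALSE
)
attrs <- data.frame(id = c("0", "1"), Total = c(23600, 50),
                    stringsAsFactors = FALSE)

# Look up each row's Total without touching the row order
coords$Total <- attrs$Total[match(coords$id, attrs$id)]

# For comparison: merge() sorts on the key, so the id = "0" rows move to the top
merged <- merge(coords[, c("id", "order")], attrs, by = "id")
merged$id
```

If you do end up with reordered rows, sorting by the order column that fortify provides before plotting should restore the drawing sequence.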

Your specific case has a lot of potential traps:

  1. The link you provided was to a pdf - utterly useless. But poking around a bit revealed an Excel file with the same data. Even this file needs cleaning: the data has "," separators, which need to be turned off, and some of the cells have footnotes, which have to be removed. Finally, we have to save as a csv file.
  2. Since we are matching on county name, the names have to match! In the shapefile attributes table, the county names are all lower case, and spaces have been removed (e.g., "Santa Cruz" is "santacruz"). So we need to lowercase the county names and remove spaces:

    marriages$County <- tolower(gsub(" ","",marriages$County))

  3. The totals column comes in as a factor, which has to be converted to numeric:

    marriages$Total <- as.numeric(as.character(marriages$Total))

  4. Your actual data is highly skewed: Maricopa County had 23,600 marriages, Greenlee had 50. So a linear color scale is not very informative. Consequently, we use a logarithmic scale:

    scale_fill_gradientn("2012 Marriages",
                         colours=rev(brewer.pal(8,"Spectral")),
                         trans="log",
                         breaks=c(100,300,1000,3000,10000))+
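
The spreadsheet clean-up described in points 1 to 3 could probably also be done in R. A rough sketch on made-up rows: only the Maricopa and Greenlee totals come from the text above, while the Santa Cruz value and the "*" footnote marker are invented for illustration. It assumes footnote markers are non-digit characters, which you would need to check against the real file:

```r
# Toy stand-in for the raw export: thousands separators and a footnote marker
raw <- data.frame(
  County = c("Maricopa", "Santa Cruz", "Greenlee"),
  Total  = c("23,600", "1,234*", "50"),
  stringsAsFactors = FALSE
)

clean <- raw
# Match the shapefile's naming: lower case, no spaces
clean$County <- tolower(gsub(" ", "", clean$County))
# Keep digits only, then convert (this would mangle numeric footnote markers)
clean$Total <- as.numeric(gsub("[^0-9]", "", clean$Total))

clean
```

Reading the file with stringsAsFactors = FALSE, as here, also sidesteps the factor conversion in point 3.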

Fixing problems with ggplot palette - how to create a gradient boxplot?

I'm not familiar with Tableau, but it looks like that's not a gradient per se, but rather points that are colored based on which quartile of the boxplot they're in. That can be done!

library(dplyr)
library(ggplot2)

tibble(st = rep(state.name[1:5], each = 20),
       inc = abs(rnorm(5*20))) %>%
  group_by(st) %>%
  # Bin each point by where it falls among its state's boxplot statistics
  mutate(bp = cut(inc, c(-Inf, boxplot.stats(inc)$stats, Inf), labels = FALSE)) %>%
  ggplot(aes(st, inc)) +
  stat_boxplot(outlier.shape = NA, width = .5) +
  geom_point(aes(fill = factor(bp)), shape = 21, size = 4, alpha = 1) +
  scale_fill_brewer(type = "div",
                    labels = c("1" = "Lowest outliers", "2" = "1st quartile",
                               "3" = "2nd quartile", "4" = "3rd quartile",
                               "5" = "4th quartile", "6" = "Highest outliers"))

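
To make the binning step concrete, here is the same cut call on a small made-up vector, so you can see which bin each value lands in:

```r
x <- c(1:9, 50)

# Five numbers: lower whisker, lower hinge, median, upper hinge, upper whisker
stats <- boxplot.stats(x)$stats
stats  # 1.0 3.0 5.5 8.0 9.0

# Six bins: below the whisker, four "quartile" bands, above the whisker
bins <- cut(x, c(-Inf, stats, Inf), labels = FALSE)
data.frame(x, bins)  # bins: 1 2 2 3 3 4 4 4 5 6
```

Note one quirk: x = 1 is the lower whisker itself, not an outlier, yet it lands in bin 1 because cut closes its intervals on the right by default. Depending on how much that matters for your plot, you may want to adjust the lowest break.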

ggmap map style repository? Now that CloudMade no longer gives out APIs

You can get a simple land-water contrast using the maps package:

Set the boundaries of the map with xlim and ylim.

library(maps)
library(ggplot2)

map <- fortify(map(fill = TRUE, plot = FALSE))

ggplot(data = map, aes(x=long, y=lat, group = group)) +
  geom_polygon(fill = "ivory2") +
  geom_path(colour = "black") +
  coord_cartesian(xlim = c(137, 164), ylim = c(-14, 3.6)) +
  theme(panel.background = element_rect(fill = "#F3FFFF"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

The map is a bit clunky, but high resolution maps are available in the mapdata package:

library(mapdata)
map <- fortify(map("worldHires", fill = TRUE, plot = FALSE))

ggplot(data = map, aes(x=long, y=lat, group = group)) +
  geom_polygon(fill = "ivory2") +
  geom_path(colour = "black") +
  coord_cartesian(xlim = c(135, 165), ylim = c(-15, 0)) + # Papua New Guinea
  theme(panel.background = element_rect(fill = "#F3FFFF"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) # Be patient

Or a single country can be selected.

map <- fortify(map("worldHires", fill = TRUE, plot = FALSE))

ggplot(data = subset(map, region == "Papua New Guinea"), aes(x=long, y=lat, group = group)) +
  geom_polygon(fill = "ivory2") +
  geom_path(colour = "black") +
  theme(panel.background = element_rect(fill = "#F3FFFF"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())
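
ggplot2 also ships map_data, which wraps the maps package and returns a plain data frame, so the fortify step can be skipped. A sketch of the single-country plot using it, assuming the low resolution "world" database names the region "Papua New Guinea" (check unique(world$region) if the subset comes back empty):

```r
library(ggplot2)  # map_data() needs the maps package to be installed

world <- map_data("world")

# Assumed region name; see the note above
png_outline <- subset(world, region == "Papua New Guinea")

p <- ggplot(png_outline, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "ivory2", colour = "black") +
  coord_quickmap() +  # approximate aspect ratio for small areas
  theme(panel.background = element_rect(fill = "#F3FFFF"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())
p
```

The earlier examples zoom with coord_cartesian rather than by filtering rows, which is the safer choice when you want a bounding box: filtering coordinates can drop vertices and leave polygons misshapen, while coord_cartesian only changes the visible window.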

