Regression line for the entire data set together with regression lines based on groups
Try placing the colour, shape, and linetype aesthetics in the individual geoms rather than in the original call to ggplot().
You can then add the overall line as a separate geom_smooth() layer with a different colour.
set.seed(1)
library(plyr)
library(ggplot2)
# five groups of ten points, each group with its own random slope
alldata <- ddply(data.frame(group = letters[1:5], x = rnorm(50)), 'group',
                 mutate, y = runif(1, -1, 1) * x + rnorm(10))
ggplot(alldata, aes(y = y, x = x)) +
  geom_point(aes(colour = group, shape = group), size = 3, alpha = .8) +
  # one regression line per group
  geom_smooth(method = "lm", se = FALSE, size = 1,
              aes(linetype = group, group = group)) +
  # overall regression line across all groups
  geom_smooth(method = "lm", size = 1, colour = 'black', se = FALSE) +
  theme_bw()
100 samples of 20 from the dataset and drawing regression lines along with population regression line
Using a loop:
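# `g` is assumed to be an existing ggplot object built from `grades`
# (for example, the scatter plot with the population regression line)
n <- 100
for (i in 1:n) {
  # draw 20 rows at random and add their regression line as a new layer
  df <- grades[sample(1:nrow(grades), 20), ]
  g <- g + geom_smooth(method = lm, data = df, color = "red", size = 0.5, alpha = 0)
}
plot(g)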
I encourage you to experiment with the aesthetics, for example by drawing the sample lines dashed.
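The loop above depends on a `grades` data frame and a base plot `g` that are not shown in the answer. As a minimal, self-contained sketch, the version below assumes the built-in `mtcars` data set (columns `wt` and `mpg`) as a stand-in, draws the 100 sample lines dashed, and adds the population line last so that it sits on top. The names `pop`, `wt`, and `mpg` are illustrative assumptions, not part of the original question.

```r
library(ggplot2)

set.seed(1)

# Assumption: mtcars stands in for `grades`; wt and mpg stand in for its columns
pop <- mtcars

# Base plot with the raw points; regression layers are added below
g <- ggplot(pop, aes(x = wt, y = mpg)) +
  geom_point()

n <- 100
for (i in seq_len(n)) {
  # sample 20 rows without replacement and add that sample's fitted line
  df <- pop[sample(nrow(pop), 20), ]
  g <- g + geom_smooth(data = df, method = "lm", formula = y ~ x, se = FALSE,
                       colour = "red", linetype = "dashed")
}

# population regression line, added last so it is drawn over the sample lines
g <- g + geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black")

print(g)
```

Each pass through the loop adds one layer, so the plot ends up with one point layer, 100 sample lines, and one population line.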
How to make regression based on grouped rows and loop over columns?
1) Use lmList in nlme (which comes with R so you don't have to install it).
library(nlme)
regs <- lmList(cbind(y1, y2, y3) ~ x | group, dat)
giving an lmList object having a component for each group. We show the component for group a; the other groups are similar.
> regs$a
Call:
lm(formula = object, data = dat, na.action = na.action)
Coefficients:
                 y1       y2       y3
(Intercept)  0.2943   0.1395   0.4539
x            0.3721  -0.2206  -0.2255
2) Another approach is to perform one overall lm, giving an lm object having the same coefficients as above.
lm(cbind(y1, y2, y3) ~ group + x:group + 0, dat)
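One way to convince yourself that the single overall model reproduces the per-group fits is to compare its group a rows with a separate fit on group a only. The sketch below uses base R only and rebuilds `dat` as in the Note at the end of this answer; the names `overall`, `fit_a`, `co_a`, and `same` are illustrative.

```r
set.seed(123)
dat <- data.frame(group = rep(c("a", "b", "c"), each = 10),
                  x = rnorm(30), y1 = rnorm(30), y2 = rnorm(30), y3 = rnorm(30))

# one overall model: a separate intercept and slope for every group
overall <- lm(cbind(y1, y2, y3) ~ group + x:group + 0, dat)

# separate fit restricted to group a, for comparison
fit_a <- lm(cbind(y1, y2, y3) ~ x, dat, subset = group == "a")

# pick out the intercept and slope rows that belong to group a
co <- coef(overall)
rows_a <- grepl("groupa", rownames(co), fixed = TRUE)
co_a <- co[rows_a, ]

# compare the numbers only, since the row names differ between the two fits
same <- isTRUE(all.equal(unname(co_a), unname(coef(fit_a))))
print(co_a)
print(coef(fit_a))
print(same)
```

The coefficients agree because the `group + x:group + 0` formula gives every group its own intercept and slope; only the residual variance estimate is pooled across groups in the overall fit.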
3) We could also use one of several list comprehension packages. This gives a list of 9 components. The names of the components identify the combination used, as does the call component (shown in the Call: line of the output) within each main component. Note that the current CRAN version is 0.1.0 but the code below relies on listcompr 0.1.1, which can be obtained from GitHub until it is put on CRAN.
# remotes::install_github("patrickroocks/listcompr")
library(listcompr)
packageVersion("listcompr") # need version 0.1.1 or later
regs <- gen.named.list("{y}.{g}",
  do.call("lm",
    list(reformulate("x", y), quote(dat), subset = bquote(dat$group == .(g)))
  ), y = c("y1", "y2", "y3"), g = unique(dat$group)
)
If you don't mind that the Call: line in the output is less descriptive, it can be simplified to:
gen.named.list("{y}.{g}", lm(reformulate("x", y), dat, subset = group == g),
y = c("y1", "y2", "y3"), g = unique(dat$group))
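If installing listcompr is not an option, a similar named list of 9 fits can be built in base R. This is a sketch of an alternative, not the listcompr approach itself: `combos` and `fit_one` are hypothetical helper names, and the Call: line of each fit will show the generic helper call rather than the specific response and group.

```r
set.seed(123)
dat <- data.frame(group = rep(c("a", "b", "c"), each = 10),
                  x = rnorm(30), y1 = rnorm(30), y2 = rnorm(30), y3 = rnorm(30))

# every combination of response column and group (3 x 3 = 9 rows)
combos <- expand.grid(y = c("y1", "y2", "y3"), g = unique(dat$group),
                      stringsAsFactors = FALSE)

# fit one response against x using only the rows of one group
fit_one <- function(y, g) {
  lm(reformulate("x", y), data = dat[dat$group == g, ])
}

regs <- Map(fit_one, combos$y, combos$g)

# name each fit as "<response>.<group>", mirroring the "{y}.{g}" pattern
names(regs) <- paste(combos$y, combos$g, sep = ".")

print(names(regs))
print(coef(regs[["y1.a"]]))
```

Individual fits are then available by name, for example `regs[["y2.b"]]`.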
Note
The input, corrected from the question, which had two y2's:
set.seed(123)
dat <- data.frame(group=c(rep("a",10), rep("b",10), rep("c",10)),
x=rnorm(30), y1=rnorm(30), y2=rnorm(30), y3=rnorm(30))
ggplot2; single regression line when colour is coded for by a variable?
data.male <- read.table(header=TRUE,text="
mid_year mean_tc survey_type
2000 4 Community
2001 5 National
2002 5.1 Subnational
2003 4.3 National
2004 4.5 Community
2005 5.2 Subnational
2006 4.4 National")
- Use aes(group=1) in the geom_smooth() specification to ignore the grouping by survey type induced by assigning the colour mapping to survey type. (Alternatively, you can put the colour mapping into geom_point() rather than the overall ggplot() specification.)
- If you want to specify colour you need to give it as the name of a variable in your data frame (i.e., survey_type); if you want to change the name in the legend to condition you can do that in the colour scale specification (example below).
library(ggplot2); theme_set(theme_bw())
ggplot(data=data.male,aes(x=mid_year, y=mean_tc, colour=survey_type)) +
  geom_point(shape=1) +
  ## use aes(group=1) for single regression line across groups;
  ## don't need to re-specify data argument
  ## set colour to black (from default blue) to avoid confusion
  ## with national (blue) points
  geom_smooth(method=lm, na.rm = TRUE, fullrange= TRUE,
              aes(group=1),colour="black")+
  scale_colour_manual(name="condition",
                      values=c("red","blue","green"))
  ## in factor level order; probably better to
  ## specify 'breaks' explicitly ...
- Out of courtesy to colour-blind people I would suggest not using primary red/green/blue as your colour specifications (try scale_colour_brewer(palette="Dark2") instead).
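The alternative mentioned in the first point, mapping colour inside geom_point() instead of the overall ggplot() call, can be sketched as follows. With the colour mapping local to the points, geom_smooth() sees no grouping and fits a single line without aes(group=1). The plot object name `p` is illustrative, and the ColorBrewer "Dark2" palette is used for the legend colours.

```r
library(ggplot2)

data.male <- read.table(header = TRUE, text = "
mid_year mean_tc survey_type
2000 4 Community
2001 5 National
2002 5.1 Subnational
2003 4.3 National
2004 4.5 Community
2005 5.2 Subnational
2006 4.4 National")

p <- ggplot(data.male, aes(x = mid_year, y = mean_tc)) +
  # colour is mapped only for the points, so it does not split the smoother
  geom_point(aes(colour = survey_type), shape = 1) +
  # single regression line over all survey types
  geom_smooth(method = "lm", formula = y ~ x, colour = "black") +
  # rename the legend and use a colour-blind-friendlier palette
  scale_colour_brewer(name = "condition", palette = "Dark2") +
  theme_bw()

print(p)
```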