Using R to Analyze Balance Sheets and Income Statements
You are making the common mistake of confusing 'access to Yahoo or Google data' with 'everything I see on Yahoo or Google Finance can be downloaded'.
When R functions download historical stock price data, they almost always access an interface explicitly designed for this purpose, e.g. a CGI handler that provides CSV files given a stock symbol and start and end dates. So this is easy: all we need to do is form the appropriate query, hit the web server, fetch the CSV file, and parse it.
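As a rough sketch of that "form the query, fetch the CSV, parse it" workflow: the endpoint `http://example.com/prices.csv`, its parameter names, and the helper `build_query` are all hypothetical, and the sample values are made up so that the snippet runs without network access.

```r
# Hypothetical endpoint and parameter names, for illustration only
build_query <- function(symbol, from, to,
                        base = "http://example.com/prices.csv") {
  sprintf("%s?s=%s&from=%s&to=%s", base, symbol,
          format(as.Date(from), "%Y-%m-%d"),
          format(as.Date(to), "%Y-%m-%d"))
}

url <- build_query("IBM", "2015-01-01", "2015-01-31")

# Against a real service this would be:
#   prices <- read.csv(url, stringsAsFactors = FALSE)
# Here a small inline sample (made-up values) stands in for the response
sample_csv <- "Date,Close
2015-01-02,100.5
2015-01-05,101.25"

prices <- read.csv(text = sample_csv, stringsAsFactors = FALSE)
prices$Date <- as.Date(prices$Date)
```

The point is only that the whole job reduces to string formatting plus `read.csv`; the real query format depends on the data provider.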
Now balance sheet information is (as far as I know) not available in such an interface. So you will need to 'screen scrape' and parse the html directly.
It is not clear that R is the best tool for this. I am aware of some Perl modules for the purpose of getting non-time-series data off Yahoo Finance but have not used them.
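For what it is worth, the screen-scraping step can be sketched in R with the XML package. The HTML fragment below is invented for illustration (it is not taken from any real finance page), and a real page would need its own inspection to find the right table.

```r
library(XML)

# Made-up HTML standing in for a downloaded balance sheet page
html_text <- paste0(
  "<html><body><table>",
  "<tr><th>Item</th><th>2015</th></tr>",
  "<tr><td>Total Assets</td><td>100</td></tr>",
  "<tr><td>Total Liabilities</td><td>60</td></tr>",
  "</table></body></html>"
)

# Parse the text into a document tree, then pull the first table
doc <- htmlParse(html_text, asText = TRUE)
bs <- readHTMLTable(doc, which = 1, stringsAsFactors = FALSE)

# Scraped cells arrive as text, so numeric columns must be converted by hand
bs[[2]] <- as.numeric(bs[[2]])
```

Unlike the CSV case, nothing guarantees the page layout stays the same, so code like this tends to break whenever the site is redesigned.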
Importing Financial Statements from getFin() to data.frame or data.table?
This should give you what you want.
require(quantmod)
setwd("C:/Users/your_path_here/downloads")
stocks <- c("AXP","BA","CAT","CSCO","CVX","DD","DIS","GE","GS","HD","IBM","INTC","JNJ","JPM","KO","MCD","MMM","MRK","MSFT","NKE","PFE","PG","T","TRV","UNH","UTX","V","VZ","WMT","XOM")
# equityList <- read.csv("EquityList.csv", header = FALSE, stringsAsFactors = FALSE)
# names(equityList) <- c ("Ticker")
# Download the statements for each ticker and write them to CSV files
for (i in seq_along(stocks)) {
  temp <- getFinancials(stocks[i], src = "google", auto.assign = FALSE)
  # Annual statements (element A)
  write.csv(temp$IS$A, paste(stocks[i], "_Income_Statement(Annual).csv", sep = ""))
  write.csv(temp$BS$A, paste(stocks[i], "_Balance_Sheet(Annual).csv", sep = ""))
  write.csv(temp$CF$A, paste(stocks[i], "_Cash_Flow(Annual).csv", sep = ""))
  # Quarterly statements (element Q, not A)
  write.csv(temp$IS$Q, paste(stocks[i], "_Income_Statement(Quarterly).csv", sep = ""))
  write.csv(temp$BS$Q, paste(stocks[i], "_Balance_Sheet(Quarterly).csv", sep = ""))
  write.csv(temp$CF$Q, paste(stocks[i], "_Cash_Flow(Quarterly).csv", sep = ""))
}
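If the goal is a data.frame in memory rather than CSV files, one possible conversion is sketched below. It assumes each statement (such as `temp$IS$A`) is a numeric matrix with line items as row names and period end dates as column names; the matrix `is_annual`, its placeholder values, and the helper `statement_to_df` are invented for illustration.

```r
# Stand-in for temp$IS$A: line items in rows, period end dates in columns
# (placeholder values, not real financials)
is_annual <- matrix(
  c(100, 40, 90, 35), nrow = 2,
  dimnames = list(c("Revenue", "Net Income"),
                  c("2015-12-31", "2014-12-31"))
)

# Transpose so each row is a period and each line item becomes a column
statement_to_df <- function(m) {
  df <- data.frame(period = as.Date(colnames(m)), t(m), check.names = FALSE)
  rownames(df) <- NULL
  df
}

is_df <- statement_to_df(is_annual)
```

With `check.names = FALSE` the line item names are kept verbatim, so columns are accessed as `is_df[["Net Income"]]`. A data.table could then be built from the result with `data.table::as.data.table(is_df)`.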
Importing 10-year history of company financials
This will get you 10 years worth of data, where it exists.
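stocks <- c("AXP", "BA", "CAT", "CSCO")
# Build one key-ratios export URL per ticker and name each entry by its ticker
urls <- sprintf("http://financials.morningstar.com/ajax/exportKR2CSV.html?&t=%s", stocks)
names(urls) <- stocks
# Read each CSV, skipping the two title lines at the top of the export
lst <- lapply(urls, read.csv, header = TRUE, stringsAsFactors = FALSE, skip = 2)
# Drop the 12th column from every data frame
lst1 <- lapply(lst, `[`, -12)
# Write one file per ticker so that later tickers do not overwrite earlier ones
for (s in names(lst1)) {
  write.csv(lst1[[s]], file = paste0("C:/Users/your_path/Desktop/files/", s, ".csv"), row.names = FALSE, na = "")
}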
Web scraping of key stats in Yahoo! Finance with R
I gave up on Excel a long time ago. R is definitely the way to go for things like this. Note that the code below scrapes the stats table from finviz.com rather than from Yahoo! Finance.
library(XML)
stocks <- c("AXP","BA","CAT","CSCO")
for (s in stocks) {
  url <- paste0("http://finviz.com/quote.ashx?t=", s)
  webpage <- readLines(url)
  html <- htmlTreeParse(webpage, useInternalNodes = TRUE, asText = TRUE)
  tableNodes <- getNodeSet(html, "//table")
  # ASSIGN TO STOCK NAMED DFS
  assign(s, readHTMLTable(tableNodes[[9]],
                          header = c("data1", "data2", "data3", "data4", "data5", "data6",
                                     "data7", "data8", "data9", "data10", "data11", "data12")))
  # ADD COLUMN TO IDENTIFY STOCK
  df <- get(s)
  df['stock'] <- s
  assign(s, df)
}
# COMBINE ALL STOCK DATA
stockdatalist <- mget(stocks)
stockdata <- do.call(rbind, stockdatalist)
# MOVE STOCK ID TO FIRST COLUMN
stockdata <- stockdata[, c(ncol(stockdata), 1:(ncol(stockdata) - 1))]
# SAVE TO CSV
write.table(stockdata, "C:/Users/your_path_here/Desktop/MyData.csv", sep=",",
row.names=FALSE, col.names=FALSE)
# REMOVE TEMP OBJECTS
rm(df, stockdatalist)