Read a text file in R line by line
Here is a solution with a for loop. Importantly, readLines is called once, before the loop, so that it is not called again on every iteration:
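fileName <- "up_down.txt"
conn <- file(fileName, open = "r")
# Read all lines once, outside the loop
linn <- readLines(conn)
# seq_along() is safe for an empty file, where 1:length(linn) would give c(1, 0)
for (i in seq_along(linn)) {
  print(linn[i])
}
close(conn)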
What is a good way to read line-by-line in R?
The example Josh linked to is one that I use all the time.
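inputFile <- "/home/jal/myFile.txt"
con <- file(inputFile, open = "r")
dataList <- list()
ecdfList <- list()
# readLines() returns a zero-length vector at end of file, which ends the loop
while (length(oneLine <- readLines(con, n = 1, warn = FALSE)) > 0) {
  # Split the line on spaces and convert the pieces to numbers
  myVector <- strsplit(oneLine, " ")
  myVector <- list(as.numeric(myVector[[1]]))
  dataList <- c(dataList, myVector)
  # Build the empirical CDF for this line and append it
  myEcdf <- ecdf(myVector[[1]])
  ecdfList <- c(ecdfList, myEcdf)
}
close(con)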
I edited the example to create two lists from your example data. dataList is a list where each item in the list is a vector of numeric values from each line in your text file. ecdfList is a list where each element is an ecdf for each line in your text file.
You should probably add some try() or tryCatch() logic in there to properly handle situations where the ecdf can't be created because of empty or missing values or some such. But the above example should get you pretty close. Good luck!
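As a minimal sketch of that tryCatch() suggestion: the helper name safe_ecdf is hypothetical and not part of the original answer. It returns NULL instead of stopping when ecdf() fails, which happens when a line yields no non-missing numeric values.

```r
# Hypothetical helper: returns an ecdf for x, or NULL if ecdf() raises an error.
safe_ecdf <- function(x) {
  tryCatch(
    ecdf(x),
    error = function(e) {
      message("Skipping line: ", conditionMessage(e))
      NULL
    }
  )
}

good <- safe_ecdf(c(1, 2, 3))               # a usable ecdf function
bad  <- safe_ecdf(c(NA_real_, NA_real_))    # ecdf() errors on all-NA input, so NULL
```

Note that c(ecdfList, NULL) appends nothing, so if ecdfList must stay aligned with dataList, one option is to store the result with ecdfList[length(ecdfList) + 1] <- list(result) instead.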
How to read a txt file line by line in R/Rstudio?
You can use the readLines function.
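A small sketch of what that looks like, using a temporary file created only for illustration (the file contents are made up):

```r
# Create a small temporary file so the example is self-contained.
path <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line", "third line"), path)

# Read the whole file: one character string per line.
all_lines <- readLines(path)

# Read only the first two lines.
first_two <- readLines(path, n = 2)

for (line in all_lines) {
  print(line)
}
```

Passing a file name reads from the start of the file on every call. To continue from where the previous call stopped, open a connection with file() and pass that instead.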
How to Read Certain Lines of A Data File Into R
Check this out:
con <- file("test1.txt", "r")
lines <- c()
while (TRUE) {
  line <- readLines(con, 1)
  # A zero-length result means end of file
  if (length(line) == 0) break
  else if (grepl("^\\s*F{1}", line) && grepl("(0,0)", line, fixed = TRUE)) lines <- c(lines, line)
}
close(con)
lines
# [1] "F 20160602 14:25:11.321 F7982D50 GET 156.145.15.85:37525 xqixh8sl AES \"/pcgc/public/Other/exome/fastq/PCGC0077248_HS_EX__1-06808__v3_FCC49HJACXX_L7_p1of1_P1.fastq.gz\" \"\" 3322771022 (0,0) \"1499.61 seconds (17.7 megabits/sec)\""
Pass the file connection to readLines so that it can read the file line by line. The regular expression ^\\s*F{1} matches lines that start with the letter F, optionally preceded by whitespace; ^ denotes the beginning of the string. Use fixed = TRUE to match (0,0) literally rather than as a regular expression. If both checks are TRUE, the line is appended to lines.
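To see the two checks in isolation, here is a sketch on a few shortened, made-up log lines (not the real data). It applies grepl to a whole character vector at once, which is an alternative to the loop when the file fits in memory. F{1} is equivalent to a plain F, so the pattern is written without the quantifier here.

```r
# Hypothetical, shortened log lines for illustration only
sample_lines <- c(
  "F 20160602 14:25:11 GET 100 (0,0) \"ok\"",
  "  F 20160602 14:15:48 GET 200 (4,32) \"aborted\"",
  "M 20160602 14:15:48 New download request (0,0)"
)

# Check 1: line starts with F, allowing leading whitespace
starts_with_f <- grepl("^\\s*F", sample_lines)

# Check 2: line contains the literal text (0,0); fixed = TRUE disables regex parsing
has_zero_pair <- grepl("(0,0)", sample_lines, fixed = TRUE)

# Keep only the lines that pass both checks
matched <- sample_lines[starts_with_f & has_zero_pair]
```

Without fixed = TRUE, the parentheses would be treated as a regex group and the pattern would match any line containing 0,0.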
Data:
D 20160602 14:15:43.559 F7982D62 Req Agr:131 Mra:0 Exp:0 Mxr:0 Mnr:0 Mxd:0 Mnd:0 Nro:0
D 20160602 14:15:43.559 F7982D62 Set Agr:130 Mra:0 Exp:0 Mxr:0 Mnr:0 Mxd:0 Mnd:0 Nro:0
I 20160602 14:15:43.559 F7982D62 GET 156.145.15.85:36773 xqixh8sl AES "/pcgc/public/Other/exome/fastq/PCGC0065109_HS_EX__1-04692__v3_FCAD2HMUACXX_L4_p1of1_P2.fastq.gz" ""
M 20160602 14:15:43.595 DOC1: F7982D62 Request for unencrypted meta data on encrypted transaction
M 20160602 14:15:48.353 DOC1: F7982D62 Transaction has been acknowledged at 722875647
F 20160602 14:15:48.398 F7982D62 GET 156.145.15.85:36773 xqixh8sl AES "/pcgc/public/Other/exome/fastq/PCGC0065109_HS_EX__1-04692__v3_FCAD2HMUACXX_L4_p1of1_P2.fastq.gz" "" 50725464 (4,32) "Remote Application: Session Aborted: Aborted by user interrupt"
M 20160602 14:15:48.780 DOC1: F7982D63 New download request
D 20160602 14:15:48.780 F7982D63 META: 134 Path: /pcgc/public/CTD/exome/fastq/PCGC0033175_HS_EX__1-00304-01__v1_FCBC0RE4ACXX_L3_p32of96_P2.fastq.gz user: xqixh8sl pack: arg: feat: cE,s
F 20160602 14:25:11.321 F7982D50 GET 156.145.15.85:37525 xqixh8sl AES "/pcgc/public/Other/exome/fastq/PCGC0077248_HS_EX__1-06808__v3_FCC49HJACXX_L7_p1of1_P1.fastq.gz" "" 3322771022 (0,0) "1499.61 seconds (17.7 megabits/sec)"
How to read a table line by line - using R?
You can use the fast fread() from data.table. With skip=, you set where the read segment begins, and with nrows=, the number of rows to read.
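A minimal sketch of reading one segment this way, assuming the data.table package is installed. The sample file and its column names (id, value) are made up for illustration.

```r
library(data.table)

# Hypothetical sample file: a header line followed by five data rows.
path <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,10", "2,20", "3,30", "4,40", "5,50"), path)

# Skip the header plus the first two data rows, then read the next two rows.
# header = FALSE because the first line read is data, not column names.
chunk <- fread(path, skip = 3, nrows = 2, header = FALSE,
               col.names = c("id", "value"))
print(chunk)
```

Increasing skip by nrows on each call walks through the file in segments, although every call scans from the top of the file again, so a connection with readLines may suit very long files better.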
Read Large File line by line in R without header
Maybe something like this can help you:
inputFile <- "foo.txt"
con <- file(inputFile, open = "r")
# Read one line at a time until end of file
while (length(oneLine <- readLines(con, n = 1)) > 0) {
  # Split the line on commas into a character vector
  myLine <- unlist(strsplit(oneLine, ","))
  print(myLine)
}
close(con)
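Or use scan to avoid the splitting step, as @MatthewPlourde suggested. Here I skip the header once, before the loop, and set quiet = TRUE so that scan does not print a message saying how many items have been read:
con <- file(inputFile, open = "r")
# Skip the header once; skip = 1 inside scan() would drop a line on every iteration
invisible(readLines(con, n = 1))
while (length(myLine <- scan(con, what = numeric(), nlines = 1, sep = ",", quiet = TRUE)) > 0) {
  ## here I print, but you should process your line here
  print(myLine)
}
close(con)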