How to read data with different separators?
I'd probably do this.
read.table(text = gsub(",", "\t", readLines("file.txt")))
V1 V2 V3 V4 V5
1 a 1 2 3 5
2 b 4 5 6 7
3 c 5 6 7 8
Unpacking that just a bit:
- readLines() reads the file into R as a character vector with one element for each line.
- gsub(",", "\t", ...) replaces every comma with a tab, so that now we've got lines with just one kind of separating character.
- The text = argument to read.table() lets it know you are passing it a character vector to be read directly (rather than the name of a file containing your text data).
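For comparison, here is a minimal Python sketch of the same normalize-then-parse idea, using only the standard library. The sample lines are an assumption (the original file.txt is not shown); they are simply chosen to mix tabs and commas.

```python
import csv

# Hypothetical input: each line mixes tabs and commas as separators.
lines = ["a\t1,2\t3,5", "b\t4,5\t6,7", "c\t5,6\t7,8"]

# Step 1: replace every comma with a tab so only one separator remains.
normalized = [line.replace(",", "\t") for line in lines]

# Step 2: parse the normalized lines as ordinary tab-separated data.
rows = list(csv.reader(normalized, delimiter="\t"))
print(rows)
```

This only works if neither separator can occur inside a value, which is the same assumption the gsub() approach makes.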
How to read a CSV file into R which uses two types of separators in the file?
A double-tap:
x1 <- read.csv("quux.csv", check.names = FALSE)
x2 <- read.csv2(text = x1[[1]], header = FALSE)
names(x2) <- unlist(read.csv2(text = names(x1)[1], header = FALSE))
cbind(x2, x1[,-1,drop=FALSE])
# car_brand car_model total
# 1 Toyota 9289 29781
# 2 Seat 20981 1610
# 3 Volkswagen 11140 904
# 4 Suzuki 11640 658
# 5 Renault 13075 647
# 6 Ford 15855 553
The use of check.names=FALSE is required because otherwise names(x1)[1] looks like "car_brand..car_model". While it can be parsed like this, I thought it better to parse the original text.
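The two-pass idea can also be sketched in Python with the standard library. The file contents below are an assumption reconstructed from the output above (quux.csv itself is not shown): commas separate the outer fields, and the first outer field packs two values separated by a semicolon.

```python
import csv

# Hypothetical file contents, guessed from the printed result above.
text = "car_brand;car_model,total\nToyota;9289,29781\nSeat;20981,1610\n"

# First pass: split on the outer separator (comma).
outer = list(csv.reader(text.splitlines(), delimiter=","))

# Second pass: split the first field of every row on the inner separator
# (semicolon) and append the remaining outer fields unchanged.
table = [row[0].split(";") + row[1:] for row in outer]
header, records = table[0], table[1:]
print(header)
print(records)
```

Because the header line goes through the same two passes as the data, there is no mangled column name to repair afterwards.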
How do I read a .txt file into R with different separators, and run on lines?
You can solve this in a few different ways. One approach would be to import the data into a single column and then use tidyr::separate or data.table::tstrsplit to split the column at the appropriate places. Here's an example with tidyr:
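# Use a separator symbol that is unlikely to appear in the file,
# to read the data into a single column. quote = "" and comment.char = ""
# keep apostrophes and #-signs in the text from being treated specially:
data <- read.table("filename.txt", sep = "^", quote = "", comment.char = "")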
# First split the column at the @-sign, and then at the ": "-part:
library(tidyr)
data %>% separate(V1,
                  into = c("Date", "User"),
                  sep = " @") %>%
  separate(User,
           into = c("User", "Review"),
           sep = ": ") -> data
# If you want to add back the @-sign to the usernames:
data$User <- paste("@", data$User, sep = "")
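For readers who want to see the two splits spelled out, here is a rough Python equivalent. The sample lines and the "date @user: review" layout are assumptions inferred from the separators used above, not the asker's real data.

```python
# Hypothetical lines in the assumed "date @user: review" layout.
lines = [
    "2021-03-01 @alice: Great product: works as described",
    "2021-03-02 @bob: Too expensive",
]

records = []
for line in lines:
    # Split once at " @" to separate the date from the rest.
    date, _, rest = line.partition(" @")
    # Split once at ": " so colons inside the review text are preserved.
    user, _, review = rest.partition(": ")
    # Put the @-sign back in front of the username.
    records.append({"Date": date, "User": "@" + user, "Review": review})

print(records)
```

Splitting only at the first occurrence of each separator matters here, since a review can easily contain ": " itself.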
Python - Reading a data text file with different delimiters
Solution using pandas:
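import pandas as pd

# sep is a regular expression: split on ";", ":" or ",".
# Regex separators require the Python parsing engine.
data = pd.read_csv('data.txt',
                   sep=";|:|,",
                   header=None,
                   engine='python')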
This will put every value in its own column. Hope this helps.
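To check what a separator pattern like this matches before pointing it at a real file, a quick standard-library test can help. The sample line is made up for illustration.

```python
import re

# Hypothetical line mixing the three delimiters from the pattern above.
line = "a;b:c,d"

# The alternation ";|:|," and the character class "[;:,]" match the same
# single characters, so both calls split the line identically.
fields = re.split(r";|:|,", line)
same = re.split(r"[;:,]", line)
print(fields, same)
```

The character-class form is a bit easier to extend when more delimiters turn up.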
Read txt file with multiple separators
Replace each [ with a newline and each ] and comma with a space and then read it in:
txt <- '["201801",111],["201802",222],["201803",333]'
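# chartr() maps characters one-to-one, so the replacement string needs
# three characters: a newline for "[", and a space each for "]" and ","
read.table(text = chartr("[],", "\n  ", txt))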
giving:
V1 V2
1 201801 111
2 201802 222
3 201803 333
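The same character-for-character translation can be sketched in Python with str.translate, reusing the sample string from above. The final step of stripping the double quotes by hand is my own addition for this sketch.

```python
txt = '["201801",111],["201802",222],["201803",333]'

# Map "[" to a newline and both "]" and "," to a space, one character each.
table = str.maketrans("[],", "\n  ")
cleaned = txt.translate(table)

# Split into lines, then into whitespace-separated fields, dropping the
# surrounding double quotes and skipping the empty first line.
rows = [[field.strip('"') for field in line.split()]
        for line in cleaned.splitlines() if line.strip()]
print(rows)
```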
Multiple Separators for the same file input R
Try this:
# dummy data
df <- read.table(text="
Name Name1 *XYZ_Name3_KB_MobApp_M-18-25_AU_PI ANDROID 2013-09-32 14:39:55.0 2013-10-16 13:58:00.0 0 218 4 93 1377907200000
Name Name2 *CCC_Name3_KB_MobApp_M-18-25_AU_PI ANDROID 2013-09-32 14:39:55.0 2013-10-16 13:58:00.0 0 218 4 93 1377907200000
", as.is = TRUE)
# replace "_" with "-"
df_V3 <- gsub(pattern="_", replacement="-", df$V3, fixed = TRUE)
# strsplit, make dataframe
df_V3 <- do.call(rbind.data.frame, strsplit(df_V3, split = "-"))
# output, merge columns
output <- cbind(df[, c(1:2)],
df_V3,
df[, c(4:ncol(df))])
Building on the comments below, here is another related option, but one which uses read.table instead of strsplit.
splitCol <- "V3"
temp <- read.table(text = gsub("-", "_", df[, splitCol]), sep = "_")
names(temp) <- paste(splitCol, seq_along(temp), sep = "_")
cbind(df[setdiff(names(df), splitCol)], temp)
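Both R versions first rewrite one separator into the other and then split. As a point of comparison, a regex character class can split on both separators in one step; here is a small Python sketch using a value from the dummy data above.

```python
import re

# The third column of the first row in the dummy data above.
value = "*XYZ_Name3_KB_MobApp_M-18-25_AU_PI"

# The character class [_-] matches either separator, so no prior
# replacement step is needed.
parts = re.split(r"[_-]", value)
print(parts)
```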
Read csv in pandas with different separator (commas)
Use the regex separator [,]+ (one or more commas):
import pandas as pd
from io import StringIO  # pandas.compat.StringIO is gone in current pandas

temp = u"""iBG,6141.6,6141.6,,3.0,,,ic"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep="[,]+", header=None, engine='python')
print(df)
0 1 2 3 4
0 iBG 6141.6 6141.6 3.0 ic
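One thing worth knowing about this pattern: treating a run of commas as a single separator drops the empty fields, so values no longer keep their original positions. A standard-library sketch on the same sample line shows the difference.

```python
import re

line = "iBG,6141.6,6141.6,,3.0,,,ic"

# ",+" treats a run of commas as one separator, so empty fields vanish.
collapsed = re.split(r",+", line)

# A plain split keeps the empty fields and therefore the positions.
positional = line.split(",")

print(len(collapsed), collapsed)
print(len(positional), positional)
```

If rows have empty fields in different places, the collapsed version will shift values into different columns from row to row, so this is only safe when the empty fields carry no meaning.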