splitting comma separated mixed text and numeric string with strsplit in R
You have two main options:
(1) grep for the numbers, and extract those.
(2) split on the comma, then coerce to numeric and check for NAs.
I prefer the second:
splat <- strsplit(x, ",")[[1]]
numbs <- !is.na(suppressWarnings(as.numeric(splat)))
c(paste(splat[!numbs], collapse=","), splat[numbs])
# [1] "name1, name2 and name3" " 0" " 1" " 2"
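For comparison, here is a rough sketch of the first option, matching the numbers with a regular expression instead of coercing. The input string x is an assumption inferred from the output above, and the names parts, is_num and result are made up for illustration.

```r
# Assumed input, reconstructed from the output shown above
x <- "name1, name2 and name3, 0, 1, 2"

# Split on the comma and trim the surrounding whitespace
parts <- trimws(strsplit(x, ",")[[1]])

# Flag the pieces that look like plain (optionally signed, optionally decimal) numbers
is_num <- grepl("^-?[0-9]+(\\.[0-9]+)?$", parts)

# Glue the text pieces back together and keep the numbers separate
result <- c(paste(parts[!is_num], collapse = ", "), parts[is_num])
result
```

Because the pieces are trimmed first, the numbers come back without the leading spaces that the as.numeric version leaves in place.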
R strsplit problem (easy fix?)
Not sure what you're doing wrong because it works as advertised.
> x <- unlist(strsplit("1,2,5,6,10", ","))
> x
[1] "1" "2" "5" "6" "10"
> x[1]
[1] "1"
Keep in mind that strsplit returns a list.
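A small illustration of what that means in practice: you get one list element per input string, so a single piece is reached with [[ ]] followed by [ ]. The input vector here is made up for the example.

```r
# Two input strings give a list of length two
parts <- strsplit(c("1,2,5", "6,10"), ",")

is.list(parts)   # TRUE
length(parts)    # one element per input string
parts[[2]]       # the pieces of the second string
parts[[2]][1]    # a single piece

# unlist() flattens everything into one character vector,
# which loses the information about which string each piece came from
flat <- unlist(parts)
flat
```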
Split comma delimited string
strsplit gives you back a list of the character vectors, so if you want it in a single vector, use unlist as well.
So,
unlist(strsplit(string, ","))
Split unequally occurring comma-separated strings to columns and fill with missing values
Use read.table:
read.table(text = as.character(df$x), sep = ",", as.is = TRUE, fill = TRUE,
na.strings = "")
giving:
V1 V2 V3
1 a b c
2 a <NA> <NA>
3 a b <NA>
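One caveat worth knowing: read.table decides how many columns there are by looking at the first five lines of input only, so a longer row further down can be wrapped onto extra rows. A possible workaround, sketched below, is to count the fields yourself and pass explicit col.names. The data frame df is an assumed stand-in for the one in the question, and n_cols and out are illustrative names.

```r
# Assumed example data, matching the output shown above
df <- data.frame(x = c("a,b,c", "a", "a,b"), stringsAsFactors = FALSE)

# Count the comma-separated fields in every row and take the maximum
tc <- textConnection(df$x)
n_cols <- max(count.fields(tc, sep = ","))
close(tc)

# Supplying col.names fixes the number of columns up front
out <- read.table(text = df$x, sep = ",", as.is = TRUE, fill = TRUE,
                  na.strings = "", col.names = paste0("V", seq_len(n_cols)))
out
```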
Split on first comma in string
Here's what I'd probably do. It may seem hacky, but since sub() and strsplit() are both vectorized, it will also work smoothly when handed multiple strings.
XX <- "SoMeThInGrIdIcUlOuS"
strsplit(sub(",\\s*", XX, x), XX)
# [[1]]
# [1] "I want to split here"
# [2] "though I don't want to split elsewhere, even here."
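If the placeholder string feels too hacky, an alternative sketch is to let regexpr() find the first match only and ask regmatches() for everything except that match. The input x is assumed from the output above, and pieces is an illustrative name.

```r
# Assumed input, reconstructed from the output shown above
x <- "I want to split here, though I don't want to split elsewhere, even here."

# regexpr() reports only the first match in each string;
# invert = TRUE returns the text on either side of that match
pieces <- regmatches(x, regexpr(",\\s*", x), invert = TRUE)
pieces
```

This is vectorized over x as well, and it does not depend on picking a placeholder that never occurs in the data.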
strsplit from a using a space instead of a period
The error comes from the fact that data.frame coerces your character vector into a factor, and strsplit throws an error when it is given a factor instead of a character vector, as stated in its documentation.
You can either do
student.exam.data$Student <- strsplit(as.character(student.exam.data$Student), " ", fixed = TRUE)
Or
student.exam.data <- data.frame(Student,Math,Science,English, stringsAsFactors = FALSE)
student.exam.data$Student <- strsplit(student.exam.data$Student, " ", fixed = TRUE)
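To see the problem in isolation, here is a minimal sketch with a made-up Student vector (the names are hypothetical, not the data from the question). Note that since R 4.0.0 data.frame() defaults to stringsAsFactors = FALSE, so the factor has to be created explicitly to reproduce the error on a current installation.

```r
# Hypothetical student names, stored as a factor on purpose
Student <- factor(c("Ann Lee", "Bob Ray"))

# strsplit() refuses a factor; capture the error message instead of stopping
res <- tryCatch(
  strsplit(Student, " ", fixed = TRUE),
  error = function(e) conditionMessage(e)
)
res

# Converting to character first makes the split work
parts <- strsplit(as.character(Student), " ", fixed = TRUE)
parts
```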
strsplit in R: How do I split one-column data separated by comma into multiple columns?
Maybe it was because of the quotes. Try:
raw_data <- read.csv("Raw_Data.csv", stringsAsFactors = FALSE, quote = "\"")
Split string on comma following a specific word
polishchuk's regex needs two modifications to make it work in R.
Firstly, the backslash needs escaping. Secondly, the call to strsplit needs the argument perl = TRUE to enable lookbehind.
strsplit(names, split = "\\.,|(?<=de)", perl = TRUE)
gives the answer Sacha asked for.
Notice though that this still includes a dot in de Jong's name, and it isn't extensible to alternatives like van, der, etc. I suggest the following alternative.
names <- "Jansen, A., Karel, A., Jong, A. de, Pietersen, K., Helsing, A. van"
#split on every comma
first_last <- strsplit(names, split = ",")[[1]]
#rearrange into a matrix with the first column representing last names,
#and the second column representing initials
first_last <- matrix(first_last, byrow = TRUE, ncol = 2)
#clean up: remove leading spaces and dots
first_last <- gsub("^ ", "", first_last)
first_last <- gsub("\\.", "", first_last)
#combine columns again
apply(first_last, 1, paste, collapse = ", ")