Error in Strsplit When Trying to Separate by a Comma

splitting comma separated mixed text and numeric string with strsplit in R

You have two main options:

(1) grep for the numbers, and extract those.

(2) split on the comma, then coerce to numeric and check for NAs

I prefer the second:

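# Example input (assumed here; inferred from the output below)
x <- "name1, name2 and name3, 0, 1, 2"
splat <- strsplit(x, ",")[[1]]
# TRUE where the piece can be parsed as a number
numbs <- !is.na(suppressWarnings(as.numeric(splat)))

c(paste(splat[!numbs], collapse=","), splat[numbs])
# [1] "name1, name2 and name3" " 0" " 1" " 2"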
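
For comparison, option (1) might look something like the sketch below. The input string is an assumption made for illustration, and the word-boundary pattern is just one way to avoid picking up the digits inside "name1", "name2" and "name3".

```r
# Assumed example input
x <- "name1, name2 and name3, 0, 1, 2"

# Option (1): extract standalone runs of digits.
# \\b keeps the digits embedded in "name1" etc. from matching.
nums <- regmatches(x, gregexpr("\\b[0-9]+\\b", x, perl = TRUE))[[1]]
nums_numeric <- as.numeric(nums)

# Option (2) again, with trimws() so the pieces carry no leading spaces
splat <- trimws(strsplit(x, ",")[[1]])
is_num <- !is.na(suppressWarnings(as.numeric(splat)))
numbers_from_split <- as.numeric(splat[is_num])
```

Both routes should end up with the same three numbers here. The regex route needs more care as soon as the text part can itself contain digits, which is one reason to prefer the split-and-coerce route.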

R strsplit problem (easy fix?)

Not sure what you're doing wrong because it works as advertised.

> x <- unlist(strsplit("1,2,5,6,10", ","))
> x
[1] "1" "2" "5" "6" "10"
> x[1]
[1] "1"

Keep in mind that strsplit returns a list, which is why the call above is wrapped in unlist.
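
To make the list behaviour concrete, here is a small sketch with two made-up input strings. strsplit returns one list element per input string, so you index with [[ ]] (or call unlist) before treating the pieces as a plain vector.

```r
# Two hypothetical input strings
parts <- strsplit(c("1,2,5", "6,10"), ",")

is_list   <- is.list(parts)    # the result is a list
n_inputs  <- length(parts)     # one element per input string
n_pieces  <- lengths(parts)    # number of pieces in each element
second    <- parts[[2]]        # character vector for the second string
first_num <- as.numeric(parts[[1]])  # convert one element to numbers
```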

Split comma delimited string

strsplit returns a list of character vectors, one per input string, so if you want a single vector, wrap the call in unlist:

    unlist(strsplit(string, ","))
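
If the string has spaces after the commas, a plain split keeps them. A rough sketch of two ways around that, using a made-up input string:

```r
# Hypothetical input; note the uneven spacing after the commas
string <- "apple, banana,cherry,  date"

# A plain split keeps the leading spaces on some pieces
raw <- unlist(strsplit(string, ","))

# Splitting on a comma plus any following whitespace drops them
clean <- unlist(strsplit(string, ",\\s*"))

# Alternatively, split on the literal comma and trim afterwards
trimmed <- trimws(unlist(strsplit(string, ",", fixed = TRUE)))
```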

Split unequally occurring comma-separated strings to columns and fill with missing values

Use read.table:

read.table(text = as.character(df$x), sep = ",", as.is = TRUE, fill = TRUE,
na.strings = "")

giving:

  V1   V2   V3
1  a    b    c
2  a <NA> <NA>
3  a    b <NA>
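
If you would rather stay with strsplit, one possible alternative is to split each row and pad the shorter results with NA. This is only a sketch: the data frame below is an assumed stand-in for df, reconstructed from the output shown above.

```r
# Assumed stand-in for the original df
df <- data.frame(x = c("a,b,c", "a", "a,b"), stringsAsFactors = FALSE)

parts <- strsplit(df$x, ",", fixed = TRUE)
n <- max(lengths(parts))

# Extending a vector with `length<-` pads it with NA
padded <- t(sapply(parts, function(p) { length(p) <- n; p }))

out <- as.data.frame(padded, stringsAsFactors = FALSE)
```

read.table is shorter and also does type conversion, so this is mainly useful when you already have the split pieces in hand.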

Split on first comma in string

Here's what I'd probably do. It may seem hacky, but sub() replaces only the first match, so only the first comma is turned into the split marker. Since sub() and strsplit() are both vectorized, it will also work smoothly when handed multiple strings.

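# Example input (assumed here; inferred from the output below)
x <- "I want to split here, though I don't want to split elsewhere, even here."
# Marker token; it must not occur anywhere in the real data
XX <- "SoMeThInGrIdIcUlOuS"
strsplit(sub(",\\s*", XX, x), XX)
# [[1]]
# [1] "I want to split here"
# [2] "though I don't want to split elsewhere, even here."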
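
An alternative that avoids the marker string is regmatches() with invert = TRUE. Paired with regexpr(), which finds only the first match, it returns the text before and after that match. A minimal sketch, with the input string assumed from the output above:

```r
# Assumed example input
x <- "I want to split here, though I don't want to split elsewhere, even here."

# regexpr() locates only the first comma (plus trailing whitespace);
# invert = TRUE returns the pieces around that match
pieces <- regmatches(x, regexpr(",\\s*", x), invert = TRUE)[[1]]
```

This is also vectorized over x, and there is no risk of the marker colliding with the data.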

strsplit using a space instead of a period

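The error comes from data.frame coercing your character vector into a factor (the default behaviour before R 4.0.0, where stringsAsFactors defaulted to TRUE). strsplit only accepts character input, as its documentation says, so it throws an error when given a factor.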

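You can either convert the column to character before splitting:

student.exam.data$Student <- strsplit(as.character(student.exam.data$Student), " ", fixed = TRUE)

Or build the data frame without factors in the first place: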

student.exam.data <- data.frame(Student,Math,Science,English, stringsAsFactors = FALSE)
student.exam.data$Student <- strsplit(student.exam.data$Student, " ", fixed = TRUE)
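
The failure is easy to reproduce on its own. In the sketch below the student names are invented for illustration, and the factor is built explicitly so the example behaves the same regardless of the stringsAsFactors default.

```r
# Hypothetical names; factor() mimics what data.frame did by default
# before R 4.0.0
Student <- factor(c("Ann Lee", "Bob Ray"))

# strsplit() on a factor fails; capture the message instead of stopping
res <- tryCatch(
  strsplit(Student, " ", fixed = TRUE),
  error = function(e) conditionMessage(e)
)

# Converting to character first works as expected
ok <- strsplit(as.character(Student), " ", fixed = TRUE)
```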

strsplit in R: How do I split one-column data separated by comma into multiple columns?

Maybe it was because of the quotes. Note that the argument is called quote, not quotes.

Try:

raw_data <- read.csv("Raw_Data.csv", stringsAsFactors=FALSE, quote="\"")
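
To see what the quote argument does without needing the original file, here is a small sketch that reads from an inline string instead. The CSV content is invented for illustration.

```r
# Hypothetical CSV content with commas inside quoted fields
csv_text <- 'id,name\n1,"Smith, J"\n2,"Doe, A"'

# With quote = "\"" the commas inside the quotes are not field separators
dat <- read.csv(text = csv_text, stringsAsFactors = FALSE, quote = "\"")

n_cols <- ncol(dat)
first_name <- dat$name[1]
```

Each row comes back as two columns, with the quoted names kept whole.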

Split string on comma following a specific word

polishchuk's regex needs two modifications to make it work in R.

Firstly, the backslash needs escaping. Secondly, the call to strsplit needs the argument perl = TRUE to enable lookbehind.

strsplit(names, split = "\\.,|(?<=de)", perl = TRUE)

gives the answer Sacha asked for.

Notice though that this still includes a dot in de Jong's name, and it isn't extensible to alternatives like van, der, etc. I suggest the following alternative.

names <- "Jansen, A., Karel, A., Jong, A. de, Pietersen, K., Helsing, A. van"
#split on every comma
first_last <- strsplit(names, split = ",")[[1]]
#rearrange into a matrix with the first column representing last names,
#and the second column representing initials
first_last <- matrix(first_last, byrow = TRUE, ncol = 2)
#clean up: remove leading spaces and dots
first_last <- gsub("^ ", "", first_last)
first_last <- gsub("\\.", "", first_last)
#combine columns again
apply(first_last, 1, paste, collapse = ", ")

