Replace String Pattern in sqldf
1) Assuming that only the forms shown in the question are allowed replace colons with minus signs and then replace minus followed by space with space, minus, space.
library(sqldf)
sqldf("select *, replace(replace([Col1], ':', '-'), '- ', ' - ') as New from df")
giving:
Col1 Col2 Col3 New
1 ten: end 5 10 ten - end
2 five: nb 7 11 five - nb
3 12:4 12 10 12-4
4 13:56 15 16 13-56
2) If we can assume that the only forms are number:number or character: character and that the second form contains no digits.
sqldf("select *,
case when strFilter(Col1, '0123456789') = ''
then replace(Col1, ':', ' -')
else replace(Col1, ':', '-')
end as New
from df")
giving:
Col1 Col2 Col3 New
1 ten: end 5 10 ten - end
2 five: nb 7 11 five - nb
3 12:4 12 10 12-4
4 13:56 15 16 13-56
3) This first checks for numbers:numbers and then checks for characters: characters where characters can only be numbers or lower case letters.
dig <- "0123456789"
diglet <- "0123456789abcdefghijklmnopqrstuvwxyz"
fn$sqldf("select *,
case when trim(Col1, '$dig') = ':'
then replace(Col1, ':', '-')
when trim(Col1, '$diglet') = ': '
then replace(Col1, ': ', ' - ')
else Col1 end as New
from df")
giving:
Col1 Col2 Col3 New
1 ten: end 5 10 ten - end
2 five: nb 7 11 five - nb
3 12:4 12 10 12-4
4 13:56 15 16 13-56
4) This one extracts the x:y and checks whether x and y are number and if so does the appropriate replacement and if no match it extracts x:yz where y is a space and if x and z are digits or lower case then it performs the appropriate replacement and otherwise returns Col1. dig
and diglet
are from above.
fn$sqldf("select *,
case when trim(substr(Col1, instr(Col1, ':')-1, 3), '$dig') = ':'
then replace(Col1, ':', '-')
when trim(substr(Col1, instr(Col1, ':') -1, 4), '$diglet') = ': '
then replace(Col1, ': ', ' - ')
else Col1 end as New
from df")
Note
The input in reproducible form is:
Lines <- "Col1,Col2,Col3
ten: end,5,10
five: nb,7,11
12:4,12,10
13:56,15,16"
df <- read.csv(text = Lines, as.is = TRUE, strip.white = TRUE)
If Column contains - at the end of a value, remove the - at the end - sqldf
I think you looking for rtrim
library(sqldf)
df_new<- sqldf("select ColB,rtrim(ColA,'-') as ColA from df")
ColB ColA
1 3 djdn
2 5 dn-dn
3 8 ndmc
4 9 nd-nd-md
Replace parts of a string in R
Using stringr
:
stringr::str_replace_all(input, "keyword(\\d)", "tiab LIKE '%keyword\\1%'")
# "((tiab LIKE '%keyword1%') OR (tiab LIKE '%keyword2%') AND (tiab LIKE '%keyword3%'))"
For your other example:
input <- "((ELISA) OR (antibody test) AND (blood))"
str_replace_all(input, "\\(([^\\(\\)]+)\\)", "(tiab LIKE '%\\1%')")
# "((tiab LIKE '%ELISA%') OR (tiab LIKE '%antibody test%') AND (tiab LIKE '%blood%'))"
How do I use regex in a SQLite query?
SQLite3 supports the REGEXP operator:
WHERE x REGEXP <regex>
http://www.sqlite.org/lang_expr.html#regexp
Applying Instr function in reverse order using sqldf
1) H2 backend If you use the H2 database backend to sqldf then the select statement in the question will work as written in the question:
library(sqldf)
library(RH2) # if H2 database is loaded sqldf will notice it and use it
sqldf("SELECT
substr(Quiz_answers, instr(Quiz_answers,'`',-1) + 1) AS HistoryAnswer
FROM data")
1a) H2 also supports the regexp_replace and regexp_like functions in a select statement.
2) SQLite backend If you are using the default SQLite backend to sqldf then trim off the first backtick from the left in which case the original second backtick becomes the first and we can use instr
. (Alternately we could use substr(Quiz_answers, 2)
in place of ltrim(...)
.)
library(sqldf) # if no other database is loaded sqldf uses the RSQLite backend
sqldf("SELECT
substr(Quiz_answers, instr(ltrim(Quiz_answers, '`'), '`') + 2) AS HistoryAnswers
from data")
2a) An alternative if the part between the backticks contains only a, digits, comma and backtick is the following
sqldf("select ltrim(Quiz_answers, '`a1234567890,') as HistoryAnswers from data")
(If in reality it could contain any letter but not an upper case letter it would still work if you replace a with all the letters of the alphabet in the ltrim
second argument.)
2b) Yet another approach with SQLite would be:
sqldf("select
case
when Quiz_answers like '%Positive' then 'Positive'
when Quiz_answers like '%Negative' then 'Negative'
else 'Neutral'
end as HistoryAnswers
from data")
This would also work under H2.
2c) or if it is OK to return +1, 0, -1 rather than Positive, Neutral, Negative then:
sqldf("select
(Quiz_answers like '%Positive') - (Quiz_answers like '%Negative') as HistoryAnswers
from data")
This would work in H2 if we cast the two booleans to int.
2d) Another approach is to reverse the string using reverse
like this:
sqldf("select
substr(Quiz_answers, length(Quiz_answers) - instr(reverse(Quiz_answers), '`') + 2)
as HistoryAnswers
from data")
Related Topics
R Column Check If Contains Value from Another Column
Aggregation Using Ffdfdply Function in R
Dplyr::Select One Column and Output as Vector
Hiding Personal Functions in R
Plotting the Average Values for Each Level in Ggplot2
Updating Column in One Dataframe with Value from Another Dataframe Based on Matching Values
Stl Decomposition of Time Series with Missing Values for Anomaly Detection
Differences Between %.% (Dplyr) and %>% (Magrittr)
How to Assign Output of Cat to an Object
Ordering Permutation in Rcpp I.E. Base::Order()
R Aggregate Data in One Column Based on 2 Other Columns
Using Ggplot for Scattering Dots
Speeding Up Julia's Poorly Written R Examples
Remove Spacing Around Plotting Area in R
Rmarkdown Directing Output File into a Directory
Get Selected Row from Datatable in Shiny App
Insert Portions of a Markdown Document Inside Another Markdown Document Using Knitr
How to Delete a Row from a Data.Frame Without Losing the Attributes