Agrep: Only Return Best Match(es)

agrep: only return best match(es)

The RecordLinkage package was removed from CRAN; use stringdist instead:

library(stringdist)

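# Return the element of stringVector closest to string.
# amatch() gives the index of the closest element (the first one on ties);
# maxDist = Inf means a match is returned however large the distance is.
ClosestMatch2 <- function(string, stringVector) {
  stringVector[amatch(string, stringVector, maxDist = Inf)]
}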
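
Since the question asks for the best match(es), here is a minimal sketch that returns every candidate tied for the smallest distance. It swaps in base R's adist (generalized Levenshtein distance) rather than stringdist, and the name closest_matches and the sample strings are made up for illustration:

```r
# Hypothetical helper: return all candidates tied for the smallest edit distance.
closest_matches <- function(string, stringVector) {
  # adist() returns a 1 x length(stringVector) matrix for a single string;
  # take its first row as a plain vector of distances.
  d <- adist(string, stringVector)[1, ]
  stringVector[d == min(d)]
}

# "height" and "light" are both one edit away from "hight"; "weight" is two.
closest_matches("hight", c("height", "weight", "light"))
# [1] "height" "light"
```

Comparing against min(d) instead of calling which.min is what keeps the ties.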

Why does agrep in R not find the best match?

You can try adist, which computes the generalized Levenshtein (edit) distance. In the result below, 'height' from example1 best matches 'height' from example2, and so on:

adist(example1, example2)
#      [,1] [,2]
# [1,]    0    1
# [2,]    1    0

example2[apply(adist(example1, example2), 1, which.min)]
# [1] "height" "weight"

agrep function of R is not working for text matching

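This is caused by your max.distance parameter; see ?agrep. Values below 1 are interpreted as a fraction of the pattern length, while integer values are interpreted as an absolute number of edits.

For instance: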

agrep("ms sharda stone crusher prop rupa devi", x, ignore.case = TRUE, value = TRUE, max.distance = 0.2, useBytes = FALSE)
# "sharda stone crusher prop rupa"
agrep("ms sharda stone crusher prop rupa devi", x, ignore.case = TRUE, value = TRUE, max.distance = 0.25, useBytes = FALSE)
# "sharda stone crusher prop roopa" "sharda stone crusher prop rupa"
agrep("ms sharda stone crusher prop rupa devi", x, ignore.case = TRUE, value = TRUE, max.distance = 9, useBytes = FALSE)
# "sharda stone crusher prop rupa"
agrep("ms sharda stone crusher prop rupa devi", x, ignore.case = TRUE, value = TRUE, max.distance = 10, useBytes = FALSE)
# "sharda stone crusher prop roopa" "sharda stone crusher prop rupa"

If you want only the closest match, see: best match
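
The fractional and integer settings in the calls above can be related with a little arithmetic. According to ?agrep, a fractional max.distance is a fraction of the pattern length times the maximal transformation cost, replaced by the smallest integer not less than that value. The sketch below assumes the default costs of 1, so only the pattern length matters:

```r
pattern <- "ms sharda stone crusher prop rupa devi"

n <- nchar(pattern)        # pattern length: 38 characters
n
# [1] 38

# Assuming default costs of 1, a fraction is scaled by the pattern
# length and rounded up to the next integer.
ceiling(0.2 * n)           # 0.2  * 38 = 7.6
# [1] 8
ceiling(0.25 * n)          # 0.25 * 38 = 9.5
# [1] 10
```

Under that assumption, max.distance = 0.25 corresponds to 10 edits, which fits the two calls returning the same pair of matches.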

Use agrep to return a different variable

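How about the following? Note that lapply returns a list, so DOBMatch holds a vector of matches (possibly empty) for each row:

personalfolders$DOBMatch <- lapply(
  personalfolders$DOB,
  function(y) allees2$PartPathMatch1[agrep(y, allees2$`Date Of Birth`, max.distance = 1)]
)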
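
To see what this pattern produces, here is a small self-contained sketch. The column names follow the answer, but every value in the two data frames is invented for illustration, and the result is kept in a separate variable called matches:

```r
# Toy data: all values below are made up for illustration only.
personalfolders <- data.frame(
  DOB = c("1990-01-01", "1985-05-05"),
  stringsAsFactors = FALSE
)
allees2 <- data.frame(
  `Date Of Birth` = c("1990-01-01", "1990-01-02", "1985-05-05"),
  PartPathMatch1  = c("a.pdf", "b.pdf", "c.pdf"),
  check.names = FALSE,        # keep the column name with spaces as-is
  stringsAsFactors = FALSE
)

# For each DOB, look up approximate matches in the other table and
# return the corresponding value from a different column.
matches <- lapply(personalfolders$DOB, function(y) {
  idx <- agrep(y, allees2$`Date Of Birth`, max.distance = 1)
  allees2$PartPathMatch1[idx]
})

str(matches)  # a list with one character vector per DOB
```

Because a row can have zero, one, or several matches, a list is the natural return type here; sapply would only simplify it when every row has exactly one match.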

agrep string matching in R

I have written a function for this. It is not the most optimized approach, but it does the job. The inputs are vectors, not lists. Hope this helps.

library(stringr)  # provides str_split

# Return the indices of inputstring that contain, as a whole word and ignoring
# case, either the full search string or its initials (first letter of each word).
stringMatch <- function(search.string, inputstring, pattern = " ") {
  stringsplit <- unlist(str_split(search.string, pattern))

  # Build the initials, e.g. "hello p" -> "hp"
  firstletter <- paste(substring(stringsplit, 1, 1), collapse = "")

  search.string.l <- tolower(search.string)
  firstletter.l <- tolower(firstletter)

  matchstring <- grep(paste("\\b", search.string.l, "\\b", "|",
                            "\\b", firstletter.l, "\\b", sep = ""),
                      tolower(inputstring))
  return(matchstring)
}

test1 <- c('hello p', 'helbbo', 'hello test', 'HP')
search.string <- 'HP'
stringMatch(search.string, test1)
# [1] 4

