Remove Strings Found in Vector 1, from Vector 2

Try this,

sample1 <- c(".aaa", ".aarp", ".abb", ".abbott", ".abogado")
sample2 <- c("try1.aarp", "www.tryagain.aaa", "255.255.255.255", "onemoretry.abb.abogado")
paste0("(",paste(sub("\\.", "\\\\.", sample1), collapse="|"),")\\b")
# [1] "(\\.aaa|\\.aarp|\\.abb|\\.abbott|\\.abogado)\\b"
gsub(paste0("(",paste(sub("\\.", "\\\\.", sample1), collapse="|"),")\\b"), "", sample2)
# [1] "try1" "www.tryagain" "255.255.255.255" "onemoretry"

Explanation:

  • sub("\\.", "\\\\.", sample1) escapes the first dot in each element, because a dot is a special character in regex. Use gsub instead if an element can contain more than one dot.

  • paste(sub("\\.", "\\\\.", sample1), collapse="|") combines all the elements with | as delimiter.

  • paste0("(",paste(sub("\\.", "\\\\.", sample1), collapse="|"),")\\b") creates a regex that places all the elements inside a capturing group followed by a word boundary. The \\b is essential here: it forces a whole-word match, so a pattern such as .abb does not match the first four characters of .abbott.
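For readers working in C++, the same three steps (escape, join with |, append a word boundary) can be sketched with std::regex. This is only an illustration under stated assumptions: escape_regex and strip_suffixes are hypothetical names that do not appear in the answer above, and the default ECMAScript grammar is assumed.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: backslash-escape regex metacharacters so that a
// literal such as ".aaa" matches only itself.
std::string escape_regex(const std::string& literal) {
    const std::string meta = ".^$|()[]{}*+?\\";
    std::string out;
    for (char c : literal) {
        if (meta.find(c) != std::string::npos) out += '\\';
        out += c;
    }
    return out;
}

// Hypothetical helper: remove every occurrence of any suffix that is
// followed by a word boundary, mirroring the gsub call above.
std::vector<std::string> strip_suffixes(const std::vector<std::string>& suffixes,
                                        std::vector<std::string> inputs) {
    if (suffixes.empty()) return inputs;  // nothing to remove
    std::string pattern = "(?:";
    for (std::size_t i = 0; i < suffixes.size(); ++i) {
        if (i != 0) pattern += '|';
        pattern += escape_regex(suffixes[i]);
    }
    pattern += ")\\b";
    const std::regex re(pattern);  // ECMAScript grammar by default
    for (auto& s : inputs) s = std::regex_replace(s, re, "");
    return inputs;
}
```

Calling strip_suffixes with the contents of sample1 and sample2 should give the same four strings as the R output above.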

Comparing 2 vectors and removing elements from 2nd vector that are not found in the 1st - c++

If you are not allowed to sort the vectors, you can call std::find inside the predicate passed to std::remove_if, as the demonstration program below shows. The generic lambda (const auto &) requires C++14.

#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>

int main()
{
    std::vector<std::string> animals =
    {
        "cat", "dog", "pig", "tiger", "monkey", "lion"
    };

    std::vector<std::string> someAnimals =
    {
        "dog", "mouse", "snake", "monkey", "cat"
    };

    // true when s does not occur anywhere in animals (linear search)
    auto not_present = [&animals]( const auto &s )
    {
        return std::find( std::begin( animals ), std::end( animals ), s ) == std::end( animals );
    };

    someAnimals.erase( std::remove_if( std::begin( someAnimals ),
                                       std::end( someAnimals ),
                                       not_present ),
                       std::end( someAnimals ) );

    for ( const auto &s : someAnimals )
    {
        std::cout << s << ' ';
    }
    std::cout << '\n';

    return 0;
}

The program output is

dog monkey cat 
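The std::find predicate above rescans animals for every element, so the cost grows with the product of the two sizes. A minimal sketch of an alternative that still avoids sorting, assuming extra memory for a hash set is acceptable: copy the reference vector into a std::unordered_set and test membership there. The helper name keep_only_present is hypothetical and is not part of the original answer.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: erase from `target` every element that does not
// occur in `allowed`, keeping the survivors in their original order.
void keep_only_present(std::vector<std::string>& target,
                       const std::vector<std::string>& allowed) {
    // Build the lookup table once; neither input vector is sorted.
    const std::unordered_set<std::string> lookup(allowed.begin(), allowed.end());
    target.erase(std::remove_if(target.begin(), target.end(),
                                [&lookup](const std::string& s) {
                                    return lookup.count(s) == 0;
                                }),
                 target.end());
}
```

Called as keep_only_present(someAnimals, animals), this should leave the same three elements in the same order as the program above.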

Otherwise, sort the vectors and use std::binary_search, as shown below.

#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>

int main()
{
    std::vector<std::string> animals =
    {
        "cat", "dog", "pig", "tiger", "monkey", "lion"
    };

    std::vector<std::string> someAnimals =
    {
        "dog", "mouse", "snake", "monkey", "cat"
    };

    std::sort( std::begin( animals ), std::end( animals ) );
    std::sort( std::begin( someAnimals ), std::end( someAnimals ) );

    // true when s does not occur in the sorted animals vector
    auto not_present = [&animals]( const auto &s )
    {
        return not std::binary_search( std::begin( animals ), std::end( animals ), s );
    };

    someAnimals.erase( std::remove_if( std::begin( someAnimals ),
                                       std::end( someAnimals ),
                                       not_present ),
                       std::end( someAnimals ) );

    for ( const auto &s : someAnimals )
    {
        std::cout << s << ' ';
    }
    std::cout << '\n';

    return 0;
}
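Once both vectors are sorted, the common elements can also be collected in a single pass with std::set_intersection. This is a different technique from the binary_search predicate above, sketched here under the assumption that neither vector contains duplicates (with duplicates the two approaches can return different counts). The function name sorted_common is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: return the elements of `b` that also occur in `a`.
// Both arguments are taken by value so the caller's vectors stay unsorted.
std::vector<std::string> sorted_common(std::vector<std::string> a,
                                       std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<std::string> out;
    // Both ranges must be sorted with the same ordering.
    std::set_intersection(b.begin(), b.end(), a.begin(), a.end(),
                          std::back_inserter(out));
    return out;
}
```

For the animals and someAnimals vectors above, the result is in sorted order: cat, dog, monkey.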

Remove element by position in a vector string in C++

[Note: this assumes that 1003;2021-03-09;False;0;0;1678721F is a single row (one element) of the std::vector<std::string>.]


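std::remove: moves every element in the range [first, last) that is not equal to the given value to the front of the range and returns an iterator to the new logical end. It does not change the size of the vector, so pair it with std::vector::erase to actually drop the elements.

If std::vector<std::string> plan contains the value False, the following removes it.

std::vector<std::string> plan =
{
    "1003", "2021-03-09", "False", "0;0", "1678721F"
};

// erase-remove idiom: std::remove alone leaves the vector size unchanged
plan.erase(std::remove(plan.begin(), plan.end(), "False"), plan.end());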

In your case, you need to remove a given substring from each row of plan. Iterate over all the rows and remove it from each one with std::string::erase.

  std::vector < std::string > plan =
{
"1003;2021-03-09;False;0;0;1678721F",
"1005;2021-03-05;False;0;0;1592221D",
"1005;2021-03-06;False;0;0;1592221D",
"1003;2021-03-07;False;0;0;1592221D",
"1003;2021-03-08;False;0;0;1592221D",
"1004;2021-03-09;False;0;0;1592221D",
"1004;2021-03-10;False;0;0;1592221D",
"1001;2021-03-11;False;0;0;1592221D"};

for (auto &e : plan)
{
    // "False;0;0;" sits at a fixed position: starting at index 16, 10 characters are removed
    e.erase(16, 10);
}
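The fixed offsets 16 and 10 only hold while every row has the same field widths. A hedged sketch of a more tolerant variant, assuming the rows are plain ';'-separated fields: split the row, skip the unwanted fields by index, and join the rest. The helper drop_fields and its parameters are hypothetical. Because it relies on std::getline, a trailing empty field (a row ending in ';') is not preserved.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: drop `count` delimiter-separated fields, starting
// at zero-based field index `first`, and rejoin the remaining fields.
std::string drop_fields(const std::string& row, std::size_t first,
                        std::size_t count, char delim = ';') {
    std::istringstream in(row);
    std::string field;
    std::string out;
    std::size_t index = 0;
    bool wrote_any = false;
    while (std::getline(in, field, delim)) {
        const bool dropped = index >= first && index < first + count;
        if (!dropped) {
            if (wrote_any) out += delim;  // re-insert the separator
            out += field;
            wrote_any = true;
        }
        ++index;
    }
    return out;
}
```

For the first row above, drop_fields(row, 2, 3) removes the fields False, 0 and 0 and returns 1003;2021-03-09;1678721F, the same result as e.erase(16, 10).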

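To generalize, you can use std::string::find to locate a substring and erase it.

#include <iostream>
#include <string>
#include <vector>

// Remove every occurrence of p from s.
void removeSubstrs(std::string& s, const std::string& p) {
    const std::string::size_type n = p.length();
    if (n == 0) return;  // an empty pattern would otherwise loop forever
    for (std::string::size_type i = s.find(p);
         i != std::string::npos;
         i = s.find(p))
        s.erase(i, n);
}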

int main()
{
    std::vector<std::string> plan =
    {
        "1003;2021-03-09;False;0;0;1678721F",
        "1005;2021-03-05;False;0;0;1592221D",
        "1005;2021-03-06;False;0;0;1592221D",
        "1003;2021-03-07;False;0;0;1592221D",
        "1003;2021-03-08;False;0;0;1592221D",
        "1004;2021-03-09;False;0;0;1592221D",
        "1004;2021-03-10;False;0;0;1592221D",
        "1001;2021-03-11;False;0;0;1592221D"
    };

    for (auto &e : plan)
    {
        removeSubstrs(e, ";False;0;0");
    }

    // iterate by const reference to avoid copying each row
    for (const auto &e : plan)
        std::cout << e << std::endl;

    return 0;
}
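Note that removeSubstrs restarts the search from the beginning of the string after every erase. A possible variant, sketched here with the hypothetical name remove_all, resumes the search at the position of the last erase instead. The two are not equivalent: restarting from the beginning also removes occurrences that only come into existence when the text on either side of a removed match is joined, while this variant does not.

```cpp
#include <string>

// Hypothetical variant of removeSubstrs: returns a copy of `s` with the
// occurrences of `p` removed, resuming each search at the erase position.
std::string remove_all(std::string s, const std::string& p) {
    if (p.empty()) return s;  // nothing to search for
    std::string::size_type i = 0;
    while ((i = s.find(p, i)) != std::string::npos) {
        s.erase(i, p.length());  // keep i: continue right after the gap
    }
    return s;
}
```

For the rows above both versions give the same result. They differ on an input such as "aabb" with the pattern "ab": remove_all returns "ab", whereas removeSubstrs keeps going until the string is empty.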

How may I remove elements in a string vector by elements in another string vector

setdiff is asymmetric, as the help page warns about (though subtly).

This works as you expect,

> setdiff(c("a","b","c","d"),c("a","c"))
[1] "b" "d"

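A simple wrapper returns whichever of the two one-sided differences is longer, so the argument order no longer matters when one vector is a subset of the other: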

setdiff2 <- function(x,y){
d1 <- setdiff(x,y)
d2 <- setdiff(y,x)
if(length(d2) > length(d1))
return(d2)
else
return(d1)
}

> setdiff2(c("a","c"), c("a","b","c","d"))
[1] "b" "d"
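The asymmetry is easy to see in a C++ analogue. The sketch below is an illustration only: set_diff is a hypothetical function, written on the assumption that it should behave like R's setdiff, that is, return the distinct elements of x that are absent from y, in order of first appearance.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical analogue of R's setdiff(x, y): the distinct elements of x
// that do not occur in y, in order of first appearance.
std::vector<std::string> set_diff(const std::vector<std::string>& x,
                                  const std::vector<std::string>& y) {
    // Start with everything in y; elements of x are added as they are emitted.
    std::unordered_set<std::string> seen(y.begin(), y.end());
    std::vector<std::string> out;
    for (const auto& s : x) {
        // insert() succeeds only for strings not in y and not yet emitted.
        if (seen.insert(s).second) out.push_back(s);
    }
    return out;
}
```

With x = {a, b, c, d} and y = {a, c} the result is {b, d}; with the arguments swapped it is empty, which is the behaviour that setdiff2 works around.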

Removing strings from a string vector, from a substring

Use std::remove_if and search for a 1 in the string:

clauses.erase(
    std::remove_if(clauses.begin(), clauses.end(),
        [](const std::string &s) {return s.find('1') != std::string::npos;}
    ),
    clauses.end()
);

If you don't have C++11 for the lambda, a normal function, or functor, or Boost lambda, or whatever floats your boat, will work as well.
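As one possible shape for the functor option, here is a minimal sketch that avoids lambdas entirely. The names ContainsChar and erase_containing are hypothetical and not from the original answer.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical functor: true when the string contains the stored character.
struct ContainsChar {
    explicit ContainsChar(char c) : c_(c) {}
    bool operator()(const std::string& s) const {
        return s.find(c_) != std::string::npos;
    }
private:
    char c_;
};

// Hypothetical helper: erase every string that contains `c`.
void erase_containing(std::vector<std::string>& v, char c) {
    v.erase(std::remove_if(v.begin(), v.end(), ContainsChar(c)), v.end());
}
```

Calling erase_containing(clauses, '1') is then equivalent to the lambda version above.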

Remove entries from string vector containing specific characters in R

We can use grep to find which values in y match the pattern built from x, then exclude them with ! and %in%:

y[!y %in% grep(paste0(x, collapse = "|"), y, value = TRUE)]

#[1] "kot" "kk" "y"

Or, better, use grepl, which returns a logical vector directly:

y[!grepl(paste0(x, collapse = "|"), y)]

A concise version uses grep with the invert and value parameters:

grep(paste0(x, collapse = "|"), y, invert = TRUE, value = TRUE)
#[1] "kot" "kk" "y"

