How to Insert the Value in a Sorted Vector

inserting element to a sorted vector and keeping elements sorted

To keep your vector sorted all the time, you should always insert new elements into proper position. As you want to pop elements in ascending order and vector provides only pop_back() method you should sort elements in descending order. so first you need to find proper position and then insert there:

typedef std::vector<int> ints;

void insert( ints &cont, int value ) {
ints::iterator it = std::lower_bound( cont.begin(), cont.end(), value, std::greater<int>() ); // find proper position in descending order
cont.insert( it, value ); // insert before iterator it
}

What's the most efficient way to insert an element into a sorted vector?

The task consists of two steps: finding the insert-position with binary_search and inserting with Vec::insert():

match v.binary_search(&new_elem) {
Ok(pos) => {} // element already in vector @ `pos`
Err(pos) => v.insert(pos, new_elem),
}

If you want to allow duplicate elements in your vector and thus want to insert already existing elements, you can write it even shorter:

let pos = v.binary_search(&new_elem).unwrap_or_else(|e| e);
v.insert(pos, new_elem);

But: be aware that this has a runtime complexity of O(n). To insert into the middle, the vector has to move every element right of your insert-position one to the right.

So you shouldn't use it to insert more than a few elements into a vector, which isn't tiny in size. Particularly, you shouldn't use this method to sort a vector, as this insertion sort algorithm runs in O(n²).

A BinaryHeap might be a better choice in such a situation. Each insert (push) has a runtime complexity of just O(log n) instead of O(n). You can even convert it into a sorted Vec with into_sorted_vec(), if you so desire. You can also continue to use the heap instead of converting it.

Updating one entry of a sorted vector

Use std::rotate to move the element to its new position. If you really think it won’t move far, it might be faster to search linearly for the new position (or apply a hybrid approach by checking at doubling distances from the old position to find a bound for the upper_bound).

How to insert a value into a sorted Vector in a single pass?

user5402's answer is quite a pretty way to do it, but it falls prey to an efficiency problem described in the Data.Vector documentation. Specifically, once it has found the insertion spot, and is copying blindly, it no longer forces the values to actually be read from the source vector. Instead, it fills up the destination vector with thunks that, when forced, read from the source vector.

Original solution

Note: This is the first solution I came up with. It is pretty easy to understand, but it does not play well with the stream fusion system in vector, which can lead to relatively poor performance. The solutions below are better in general.

One solution, as explained in the documentation, is to use the monadic indexM operation to perform these blind reads. This forces the read to be performed, but does not force the read value. Thus it copies a pointer (possibly a pointer to a thunk) from the source vector to the destination vector. For greatest efficiency, everything below should be replaced with its unsafe variant (unsafeIndexM, unsafeIndex, and unsafeWrite in particular).

{-# Language ScopedTypeVariables #-}
module Insert where
import qualified Data.Vector as V
import Data.Vector (Vector)
import qualified Data.Vector.Mutable as MV
import Data.Vector.Mutable (MVector)
import Control.Monad.ST

insertElem :: forall a . Ord a => a -> Vector a -> Vector a
insertElem e v = V.create act
where
act :: forall s . ST s (MVector s a)
act = do
mv <- MV.new (V.length v + 1)
let
start :: Int -> ST s (MVector s a)
start i
| i == V.length v ||
e <= v V.! i = MV.write mv i e >> finish i
| otherwise = MV.write mv i (v V.! i) >> start (i+1)
finish :: Int -> ST s (MVector s a)
finish i
| i == V.length v = return mv
| otherwise = do
V.indexM v i >>= MV.write mv (i+1)
finish (i+1)
in start 0

insertElemInt :: Int -> Vector Int -> Vector Int
insertElemInt = insertElem

Note that naming the act action and using ScopedTypeVariables should not actually be necessary, but I found them tremendously helpful in tracking down my mistakes.

Stream-based solutions

The above code won't work well with stream fusion, because indices fly all over the place. The following approach should fuse properly, and still avoid creating unnecessary thunks. This is the first time I've ever touched stream fusion code, so it may be that some things could be improved. The only problem with this first stream-based version is that if it does fuse, then the step function for the input stream will be run twice on one of the elements. This is normally not a problem, but if the step function is extremely expensive (rare), it could be. I therefore give an alternative that should work better in that case. I'm not sure just when which of these will be better in practice, so I'm including both.

With either of these stream-based solutions, the code

testEnum :: Word -> Word -> Word -> Word
testEnum ins low high = V.product $
insertElem ins $ V.enumFromStepN low 1 (fromIntegral high)

will compile into loops that run in constant space, never actually creating a vector at all.

Step one seed twice

{-# Language ScopedTypeVariables #-}
module Insert where

import Data.Vector (Vector)
import Data.Word (Word)
import qualified Data.Vector.Fusion.Stream.Monadic as S
import qualified Data.Vector.Generic as G
import Data.Vector.Fusion.Util (Id(..))

-- To check on unboxing and such
insertElemWord :: Word -> Vector Word -> Vector Word
insertElemWord = insertElem

{-# INLINE insertElem #-}
insertElem :: forall a . Ord a => a -> Vector a -> Vector a
insertElem a v = G.unstream (insertElemS a (G.stream v))

{-# INLINE [1] insertElemS #-}
insertElemS :: forall a . Ord a => a -> S.Stream Id a -> S.Stream Id a
insertElemS e (S.Stream step (state::s) size) = S.Stream step' (state, False) (size + 1)
where
{-# INLINE [0] step' #-}
step' :: (s, Bool) -> Id (S.Step (s, Bool) a)
step' (s, True) = Id $ case unId (step s) of
S.Yield a s' -> S.Yield a (s', True)
S.Skip s' -> S.Skip (s', True)
S.Done -> S.Done
step' (s, False) = Id $ case unId (step s) of
S.Yield a s' ->
if e <= a
then S.Yield e (s, True)
else S.Yield a (s', False)
S.Skip s' -> S.Skip (s', False)
S.Done -> S.Yield e (s, True)

One pass exactly, never repeating a lookup/step

{-# Language ScopedTypeVariables #-}
module Insert where

import Data.Vector (Vector)
import Data.Word (Word)
import qualified Data.Vector.Fusion.Stream.Monadic as S
import qualified Data.Vector.Generic as G
import Data.Vector.Fusion.Util (Id(..))

data Status s a = Pre s | During s a | Post s | End

insertElemWord :: Word -> Vector Word -> Vector Word
insertElemWord = insertElem

{-# INLINE insertElem #-}
insertElem :: forall a . Ord a => a -> Vector a -> Vector a
insertElem a v = G.unstream (insertElemS a (G.stream v))

{-# INLINE [1] insertElemS #-}
insertElemS :: forall a . Ord a => a -> S.Stream Id a -> S.Stream Id a
insertElemS e (S.Stream step (state::s) size) = S.Stream step' (Pre state) (size+1)
where
{-# INLINE [0] step' #-}
step' :: Status s a -> Id (S.Step (Status s a) a)
step' (Post s) = Id $ case unId (step s) of
S.Yield a s' -> S.Yield a (Post s')
S.Skip s' -> S.Skip (Post s')
S.Done -> S.Done
step' (Pre s) = Id $ case unId (step s) of
S.Yield a s'
| e <= a -> S.Yield e (During s' a)
| otherwise -> S.Yield a (Pre s')
S.Skip s' -> S.Skip (Pre s')
S.Done -> S.Yield e End
step' (During s a) = Id (S.Yield a (Post s))
step' End = Id S.Done


Related Topics



Leave a reply



Submit