OpenCV Tracking Using Optical Flow

OpenCV tracking using optical flow

As you write, cv::goodFeaturesToTrack takes an image as input and produces a vector of points which it deems "good to track". These are chosen for their ability to stand out from their surroundings, and are found using Harris corner detection on the image. A tracker would normally be initialised by passing the first image to goodFeaturesToTrack and obtaining a set of features to track. These features can then be passed to cv::calcOpticalFlowPyrLK as the previous points, along with the next image in the sequence; it will produce the next points as output, which then become the input points in the next iteration.

If you want to try to track a different set of pixels (rather than features generated by cv::goodFeaturesToTrack or a similar function), then simply provide these to cv::calcOpticalFlowPyrLK along with the next image.

Very simply, in code:

#include <opencv2/opencv.hpp>

// Obtain first image and set up two feature vectors.
// getImage() and stopTracking() are placeholders; getImage() must
// return an 8-bit grayscale frame.
cv::Mat image_prev, image_next;
std::vector<cv::Point2f> features_prev, features_next;
std::vector<uchar> status; // tracking success per feature
std::vector<float> err;    // tracking error per feature

// Example parameter values (tune these for your sequence)
const int max_count = 500;   // the maximum number of features
const double qlevel = 0.01;  // quality level
const double minDist = 10.0; // min distance between two features

image_next = getImage();

// Obtain initial set of features
cv::goodFeaturesToTrack(image_next,    // the image
                        features_next, // the output detected features
                        max_count,     // the maximum number of features
                        qlevel,        // quality level
                        minDist);      // min distance between two features

// Tracker is initialised and initial features are stored in features_next.
// Now iterate through the rest of the images.
for (;;)
{
    image_prev = image_next.clone();
    features_prev = features_next;
    image_next = getImage(); // Get next image

    // Find the position of each feature in the new image
    cv::calcOpticalFlowPyrLK(
        image_prev, image_next, // 2 consecutive images
        features_prev,          // input point positions in the first image
        features_next,          // output point positions in the 2nd image
        status,                 // tracking success
        err);                   // tracking error

    if (stopTracking()) break;
}

opencv how to track objects after optical flow?

I'm not sure LK is the best algorithm, since it computes the motion of a sparse set of corner-like points, and tracking usually behaves better with a dense optical flow result (such as Farneback or Horn-Schunck). After computing the flow, as a first step, you can threshold its norm (to retain the moving parts) and try to extract connected regions from the result. But be warned that your task is not going to be easy if you don't have a model of the object you want to track.
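
As a rough sketch of that first step (the grayscale frames prev and next, the Farneback parameters, and the 1-pixel threshold are assumptions, not values from the question):

#include <opencv2/opencv.hpp>

// Sketch: dense Farneback flow between two grayscale frames, then a
// threshold on the flow magnitude to keep only the moving pixels.
cv::Mat motionMask(const cv::Mat& prev, const cv::Mat& next)
{
    cv::Mat flow; // CV_32FC2: one (dx, dy) vector per pixel
    cv::calcOpticalFlowFarneback(prev, next, flow,
                                 0.5, 3, 15, 3, 5, 1.2, 0);

    // Split into x/y components and compute the per-pixel magnitude
    std::vector<cv::Mat> xy;
    cv::split(flow, xy);
    cv::Mat magnitude, angle;
    cv::cartToPolar(xy[0], xy[1], magnitude, angle);

    // Keep only pixels that moved by more than 1 pixel (assumed threshold)
    cv::Mat mask;
    cv::threshold(magnitude, mask, 1.0, 255.0, cv::THRESH_BINARY);
    mask.convertTo(mask, CV_8U);
    return mask;
}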

On the other hand, if you are primarily interested in tracking and a bit of interactivity is acceptable, you can have a look at the camshift sample code to see how to select and track an image region based on its appearance.

--- EDIT ---

If your camera is static, then use background subtraction instead. As of OpenCV 2.4 beta, look for the class BackgroundSubtractor and its subclasses in the video module documentation.
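
The answer above refers to OpenCV 2.4; as a minimal sketch with a current OpenCV (3.x/4.x, where the factory function createBackgroundSubtractorMOG2 replaced the old class names), this could look like:

#include <opencv2/opencv.hpp>

int main()
{
    cv::VideoCapture cap(0); // assumed static camera
    cv::Ptr<cv::BackgroundSubtractor> subtractor =
        cv::createBackgroundSubtractorMOG2();

    cv::Mat frame, foreground;
    while (cap.read(frame))
    {
        // Update the background model and obtain the foreground mask
        subtractor->apply(frame, foreground);
        cv::imshow("foreground", foreground);
        if (cv::waitKey(30) >= 0) break;
    }
    return 0;
}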

Note also that optical flow can be real-time (or close to it) with good parameter choices, and also with a GPU implementation. On Windows, you can use flowlib from the TU Graz/Gpu4Vision group. OpenCV also has some GPU dense optical flow, for example the class gpu::BroxOpticalFlow.

--- EDIT 2 ---

Joining single-pixel detections into big objects is a task called connected component labelling. There is a fast algorithm for that, implemented in OpenCV. So this gives you a pipeline which is:

  • motion detection (pixel level) ---> connected component labelling ---> object tracking (adding motion information, possible trajectories for Kalman filtering, ...); a sketch of the labelling step follows below.
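
A sketch of the labelling step (assuming a binary CV_8U motion mask such as the one produced by the flow-thresholding sketch above; cv::connectedComponentsWithStats is available from OpenCV 3.0, and the area threshold is an assumption):

#include <opencv2/opencv.hpp>

// Sketch: turn a binary motion mask into candidate objects.
std::vector<cv::Rect> movingBlobs(const cv::Mat& mask)
{
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(mask, labels, stats, centroids);

    std::vector<cv::Rect> blobs;
    for (int i = 1; i < n; ++i) // label 0 is the background
    {
        if (stats.at<int>(i, cv::CC_STAT_AREA) < 50) continue; // drop noise
        blobs.emplace_back(stats.at<int>(i, cv::CC_STAT_LEFT),
                           stats.at<int>(i, cv::CC_STAT_TOP),
                           stats.at<int>(i, cv::CC_STAT_WIDTH),
                           stats.at<int>(i, cv::CC_STAT_HEIGHT));
    }
    return blobs; // candidate objects for the tracking stage
}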

But we'll have to stop here, because we'll soon be far beyond the scope of your initial question ;-)

Does tracking mean nothing but linking optical flow vectors?

Tracking is a computer vision task whose goal is to follow a given object, region of interest, or point through consecutive images of a video sequence. A very simple method would be based on the motion vectors estimated by an optical flow estimation method.
However, this only produces good results under very cooperative environmental conditions. It would fail, e.g., if the object gets occluded. Recent state-of-the-art methods are more robust and based, e.g., on Kalman filter, particle filter, or PHD filter technologies. This Survey on Object Tracking, or Object Tracking: A Survey, gives you a better overview of the challenges and solutions of current object tracking methods.
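
To illustrate the filter-based idea (a generic constant-velocity cv::KalmanFilter for a single 2D point; this is a textbook setup, not a method from the cited surveys, and the noise values are assumptions):

#include <opencv2/opencv.hpp>

// Sketch: constant-velocity Kalman filter for one 2D point.
// State = (x, y, vx, vy), measurement = (x, y).
cv::KalmanFilter makePointFilter()
{
    cv::KalmanFilter kf(4, 2, 0);
    kf.transitionMatrix = (cv::Mat_<float>(4, 4) <<
        1, 0, 1, 0,
        0, 1, 0, 1,
        0, 0, 1, 0,
        0, 0, 0, 1);
    cv::setIdentity(kf.measurementMatrix);
    cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-4));
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-1));
    cv::setIdentity(kf.errorCovPost, cv::Scalar::all(1.0));
    return kf;
}

// Per frame: predict first, then correct with the measured position
// (e.g. a tracked feature). The prediction bridges short occlusions.
cv::Point2f step(cv::KalmanFilter& kf, cv::Point2f measured)
{
    cv::Mat prediction = kf.predict();
    cv::Mat measurement = (cv::Mat_<float>(2, 1) << measured.x, measured.y);
    kf.correct(measurement);
    return cv::Point2f(prediction.at<float>(0), prediction.at<float>(1));
}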

Optical flow and finger tracking

The 4th parameter of calcOpticalFlowPyrLK (here track) will contain the calculated new positions of input features in the second image (here nGmask).

In the simple case, you can estimate the centroids of fingers and track separately, and infer the movement from them. A decision can be made from the direction and magnitude of the vector pointing from the centroid of fingers to the centroid of track.
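
A small sketch of that centroid comparison (assuming fingers and track are the non-empty std::vector<cv::Point2f> input and output of calcOpticalFlowPyrLK):

#include <opencv2/opencv.hpp>
#include <cmath>
#include <utility>

// Mean position of a (non-empty) point set
cv::Point2f centroid(const std::vector<cv::Point2f>& pts)
{
    cv::Point2f sum(0.f, 0.f);
    for (const cv::Point2f& p : pts) sum += p;
    return sum * (1.0f / static_cast<float>(pts.size()));
}

// Direction (radians) and magnitude of the vector pointing from the
// centroid of `fingers` to the centroid of `track`
std::pair<double, double> motionOf(const std::vector<cv::Point2f>& fingers,
                                   const std::vector<cv::Point2f>& track)
{
    cv::Point2f v = centroid(track) - centroid(fingers);
    return { std::atan2(v.y, v.x), std::hypot(v.x, v.y) };
}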

Furthermore, complex movements can be treated as time series, since a movement consists of a number of successive measurements made over a time interval. These measurements could be the direction and magnitude of the vector mentioned above. So any movement can be represented as below:

("label of movement", time_series), where
time_series = {(d1, m1), (d2, m2), ..., (dn, mn)}, where
di is direction and mi is magnitude of the ith vector (i=1..n)

So the time series consists of n * 2 measurements (sampling n times); the only remaining question is how to recognize movements.

If you have prior information about the movement, i.e. you know what a circular movement, writing a letter, etc. looks like, then the question reduces to: how do we align time series to one another?

Here the well-known Dynamic Time Warping (DTW) comes in. It can also be considered a generative model, but here it is used between pairs of sequences. DTW is an algorithm for measuring similarity between two temporal sequences which may vary in time or speed (as in our case).

In general, DTW calculates an optimal match between two given time series with certain restrictions. The sequences are warped non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension.
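
A minimal DTW sketch over such (direction, magnitude) series (the per-sample distance and the plain O(n*m) formulation are simplifying assumptions; practical use would normalize the measurements and often adds a warping window):

#include <vector>
#include <cmath>
#include <algorithm>
#include <limits>

struct Sample { double direction, magnitude; };

// Distance between two samples; equal weighting is an assumption.
double sampleDist(const Sample& a, const Sample& b)
{
    return std::abs(a.direction - b.direction) +
           std::abs(a.magnitude - b.magnitude);
}

// Classic dynamic-programming DTW between two time series.
double dtw(const std::vector<Sample>& s, const std::vector<Sample>& t)
{
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<std::vector<double>> D(
        s.size() + 1, std::vector<double>(t.size() + 1, INF));
    D[0][0] = 0.0;

    for (size_t i = 1; i <= s.size(); ++i)
        for (size_t j = 1; j <= t.size(); ++j)
            D[i][j] = sampleDist(s[i - 1], t[j - 1]) +
                      std::min({ D[i - 1][j],        // insertion
                                 D[i][j - 1],        // deletion
                                 D[i - 1][j - 1] }); // match
    return D[s.size()][t.size()];
}

A recorded movement would then be classified by the label of the template series with the smallest DTW distance.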

Opencv Optical flow tracking: stop condition

You can try a bidirectional confidence measure on your tracked points.
To do this, estimate the feature positions from img0 to img1, and then track the resulting positions backwards from img1 to img0. If the doubly tracked features land near their original positions (the distance should be less than 1 or 0.5 pixels), they have been tracked successfully. This is a bit more reliable than the SSD used by the status flag of OpenCV's PLK. If a certain number of features could not be tracked, raise the stop event.
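
A sketch of this forward-backward check (assuming grayscale frames img0 and img1, starting points pts0, and the 0.5-pixel threshold suggested above; the minimum feature count is an assumption):

#include <opencv2/opencv.hpp>
#include <cmath>

// Track pts0 from img0 to img1, then back again, and keep only features
// whose back-tracked position lands close to where they started.
// Returns the number of reliably tracked features; `status` marks them.
int forwardBackwardCheck(const cv::Mat& img0, const cv::Mat& img1,
                         const std::vector<cv::Point2f>& pts0,
                         std::vector<cv::Point2f>& pts1,
                         std::vector<uchar>& status)
{
    std::vector<cv::Point2f> ptsBack;
    std::vector<uchar> statusBack;
    std::vector<float> err;

    cv::calcOpticalFlowPyrLK(img0, img1, pts0, pts1, status, err);
    cv::calcOpticalFlowPyrLK(img1, img0, pts1, ptsBack, statusBack, err);

    int tracked = 0;
    for (size_t i = 0; i < pts0.size(); ++i)
    {
        cv::Point2f d = pts0[i] - ptsBack[i];
        bool ok = status[i] && statusBack[i] &&
                  std::hypot(d.x, d.y) < 0.5; // threshold from above
        status[i] = ok ? 1 : 0;
        if (ok) ++tracked;
    }
    return tracked; // stop tracking when this drops below some minimum
}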

opencv- vehicle tracking using optical flow

I might be going a bit over the line here, but I would suggest you check out OpenTLD. OpenTLD (aka Predator) is one of the most efficient tracking algorithms. Zdenek Kalal implemented OpenTLD in MATLAB, and George Nebehay has made a very efficient C++ OpenCV port of it.

It's very easy to install and tracking is really efficient.

OpenTLD uses a Median Flow tracker for tracking and implements the P-N learning algorithm. In this YouTube video, Zdenek Kalal demonstrates the use of OpenTLD.

If you just want to implement a Median Flow Tracker, follow this link https://github.com/gnebehay/OpenTLD/tree/master/src/mftracker

If you want to use it in Python, I have made a Median Flow tracker and also a Python port of OpenTLD. But the Python port isn't very efficient.

OpenCV - Feature Matching vs Optical Flow

I would like to add a few thoughts about that theme since I found this a very interesting question too.
As said before, feature matching is a technique that is based on:

  • A feature detection step which returns a set of so-called feature points. These feature points are located at positions with salient image structures, e.g. edge-like structures when you are using FAST or blob-like structures if you are using SIFT or SURF.

  • The second step is the matching: the association of feature points extracted from two different images. The matching is based on local visual descriptors, e.g. histograms of gradients or binary patterns, that are extracted around the feature positions. Each descriptor is a feature vector, and associated feature point pairs are the pairs with minimal feature-vector distance.

Most feature matching methods are scale and rotation invariant and are robust to changes in illumination (e.g. caused by shadows or different contrast). Thus these methods can be applied to image sequences, but they are more often used to align image pairs captured from different viewpoints or with different devices. The disadvantages of feature matching methods are the difficulty of controlling where feature matches are spawned, and the fact that the feature pairs (which in an image sequence are motion vectors) are in general very sparse. In addition, the subpixel accuracy of matching approaches is very limited, as most detectors are quantized to integer positions.
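
To make the two steps above concrete, here is a minimal sketch using ORB with brute-force matching (ORB is chosen only for illustration because it ships in the main OpenCV trunk; SIFT and SURF follow the same detect/describe/match pattern):

#include <opencv2/opencv.hpp>

// Sketch: detect, describe and match features between two images.
std::vector<cv::DMatch> matchFeatures(const cv::Mat& img1, const cv::Mat& img2)
{
    cv::Ptr<cv::ORB> orb = cv::ORB::create();

    // Step 1: feature detection + local descriptor extraction
    std::vector<cv::KeyPoint> kps1, kps2;
    cv::Mat desc1, desc2;
    orb->detectAndCompute(img1, cv::noArray(), kps1, desc1);
    orb->detectAndCompute(img2, cv::noArray(), kps2, desc2);

    // Step 2: associate descriptors by minimal distance
    // (Hamming distance, since ORB descriptors are binary)
    cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(desc1, desc2, matches);
    return matches; // each match pairs keypoint indices in img1 and img2
}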

From my experience, the main advantage of feature matching approaches is that they can handle very large motions/displacements.

OpenCV offers some feature matching methods, but there are many more recent, faster and more accurate approaches available online, e.g.:

  • DeepMatching, which relies on deep learning and is often used to initialize optical flow methods to help them deal with long-range motions.
  • StereoScan, a very fast approach originally proposed for visual odometry.

Optical flow methods, in contrast, rely on the minimization of the brightness constancy assumption combined with additional constraints such as smoothness. Thus they derive motion vectors from the spatial and temporal image gradients of a sequence of consecutive frames, and are better suited to image sequences than to image pairs captured from very different viewpoints. The main challenges in estimating motion with optical flow are large motions, occlusion, strong illumination changes, changes in the appearance of the objects, and, above all, runtime. However, optical flow methods can be highly accurate and can compute dense motion fields that respect the shared motion boundaries of the objects in a scene.

That said, the accuracy of different optical flow methods varies greatly. Local methods such as PLK (pyramidal Lucas-Kanade) are in general less accurate, but they allow motion vectors to be computed only at preselected positions and can thus be very fast. (In recent years we have done some research to improve the accuracy of the local approach; see here for further information.)

The main OpenCV trunk offers global approaches such as Farneback, but this is a rather outdated approach. Try the OpenCV contrib trunk, which contains more recent methods. To get a good overview of the most recent methods, take a look at the public optical flow benchmarks, where you will also find code and implementations, e.g.:

  • MPI-Sintel optical flow benchmark
  • KITTI 2012 optical flow benchmark

Both offer links, e.g. to Git repositories or source code, for some of the newer methods, such as FlowFields.

From my point of view, I would not reject either approach, matching or optical flow, at an early stage. Try as many of the implementations available online as possible and see which is best for your application.


