Priority Queue with Limited Space: Looking for a Good Algorithm

priority queue with limited space: looking for a good algorithm

Array based heaps seem ideal for your purpose. I am not sure why you rejected them.

You use a max-heap.

Say you have an N element heap (implemented as an array) which contains the N smallest elements seen so far.

When an element comes in you check against the max (O(1) time), and reject if it is greater.

If the value coming in is lower, you modify the root to be the new value and sift-down this changed value - worst case O(log N) time.

The sift-down process is simple: Starting at root, at each step you exchange this value with it's larger child until the max-heap property is restored.

So, you will not have to do any deletes which you probably will have to, if you use std::priority_queue. Depending on the implementation of std::priority_queue, this could cause memory allocation/deallocation.

So you can have the code as follows:

  • Allocated Array of size N.
  • Fill it up with the first N elements you see.
  • heapify (you should find this in standard text books, it uses sift-down). This is O(N).
  • Now any new element you get, you either reject it in O(1) time or insert by sifting-down in worst case O(logN) time.

On an average, though, you probably will not have to sift-down the new value all the way down and might get better than O(logn) average insert time (though I haven't tried proving it).

You only allocate size N array once and any insertion is done by exchanging elements of the array, so there is no dynamic memory allocation after that.

Check out the wiki page which has pseudo code for heapify and sift-down: http://en.wikipedia.org/wiki/Heapsort

Algorithm for Broken Priority Queue

Yes, I believe you're correct.

A priority-queue is typically implemented as a heap:

A heap is a specialized [binary-]tree-based data structure ... the keys of parent nodes are less than or equal to those of the children and the lowest key is in the root node (min heap).

broken_min can be implemented in O(1) since one of the children will be the second largest element, so we can just check both.

But insert will take Ω(log n), which isn't o(log n) (but would indeed be O(log n)).

I doubt you'll find a priority-queue implementation that could satisfy the given running times, but saying it's impossible would be irresponsible.

Note: I assume you meant little-o (a strict upper bound, as opposed to big-O, which is a greater-or-equal upper bound) given the tag, although big-O is used most of them time, and little-o is almost never used (as far as I've seen).

Need Idea to customize an Algorithm in Data Structure Using Priority Queue

As rafalio noted, the size of your queue will increase continually because an average web page contains more than 1 outbound link and you will not reach closure over the whole internet :-)

But rather than searching the top N links from each page as rafalio suggests, I would instead recommend that you set a global maximum cap on the priority queue (say 30,000, for example). This way, as it begins to grow, each new link you traverse (the next highest priority link in the queue) will have accumulated greater and greater relative priority to everything else encountered thus far. You could be almost certain that by the time the queue reaches the cap, the highest priority item in it will be more important than the top N links on the current page (or any arbitrary page you could visit next).

Keep in mind that if your priority queue is array-backed, the insertion and removal ops will be O(n). A sorted linked list will also have O(n) insertion time. A binary-heap-backed priority queue should give the best overall performance for large n, with no worse than O(log n) on both insert and remove, and O(1) on min/max item lookup.

Why can't a priority queue wrap around like an ordinary queue?

The most common priority queue implementation is a binary heap, which would not benefit from wrapping around. You could create a priority queue that's implemented in a circular buffer, but performance would suffer.

It's important to remember than priority queue is an abstract data structure. It defines the operations, but not the implementation. You can implement priority queue as a binary heap, a sorted array, an unsorted array, a binary tree, a skip list, a linked list, etc. There are many different ways to implement a priority queue.

Binary heap, on the other hand, is a specific implementation of the priority queue abstract data type.

As for stack vs queue: in actuality, stacks and queues are just specializations of the priority queue. If you consider time as the priority, then what we call a queue (a FIFO data structure), is actually a priority queue in which the oldest item is the highest priority. A stack (a LIFO data structure) is a priority queue in which the newest item is the highest priority.

Stable k-largest elements algorithm

Use the heap-based algorithm for finding the k largest value, i.e. use a min heap (not a max heap) that never exceeds a size of k. Once it exceeds that size, keep pulling the root from it to restore it to a size of k.

At the end the heap's root will be k largest value. Let's call it m.

You could then scan the original input again to collect all values that are at least equal to m. This way you'll have them in their original order.

When that m is not unique, you could have collected too many values. So check the size of the result and determine how much longer it is than k. Go backwards through that list and mark the ones that have value m as deleted until you have reached the right size. Finally collect the non-deleted items.

All these scans are O(n). The most expensive step is the first one: O(nlogk).

Priority queue with dynamic item priorities

I would suggest first trying the head-in approach, to update a priority:

  • delete the item from the queue
  • re-insert it with the new priority

In C++, this could be done using a std::multi_map, the important thing is that the object must remember where it is stored in the structure to be able to delete itself efficiently. For re-insert, it's difficult since you cannot presume you know anything about the priorities.

class Item;

typedef std::multi_map<int, Item*> priority_queue;

class Item
{
public:
void add(priority_queue& queue);
void remove();

int getPriority() const;
void setPriority(int priority);

std::string& accessData();
const std::string& getData() const;

private:
int mPriority;
std::string mData;

priority_queue* mQueue;
priority_queue::iterator mIterator;
};

void Item::add(priority_queue& queue)
{
mQueue = &queue;
mIterator = queue.insert(std::make_pair(mPriority,this));
}

void Item::remove()
{
mQueue.erase(mIterator);
mQueue = 0;
mIterator = priority_queue::iterator();
}

void Item::setPriority(int priority)
{
mPriority = priority;
if (mQueue)
{
priority_queue& queue = *mQueue;
this->remove();
this->add(queue);
}
}

High level description of a priority queue with adjustable priority

import java.util.HashMap;
import java.util.NoSuchElementException;

public class Heap<Key extends Comparable<Key>> {
private Key[] heap;
private int maxN, n;
private HashMap<Key, Integer> map;
@SuppressWarnings("unchecked")
public Heap(int maxN) {
if(maxN < 0 ) throw new IllegalArgumentException();
this.maxN = maxN;
n = 0;
heap = (Key[]) new Comparable[maxN];
map = new HashMap<>(maxN);
}

boolean isEmpty() {
return n == 0;
}

boolean insert(Key e) {
if(n +1 > maxN) throw new IllegalArgumentException("maximum capacity reached " + maxN);
heap[n] = e;
map.put(e,n);
int i = n++;
swim(i);
return true;
}

Key extractMin() {
if(n == 0) throw new NoSuchElementException("Priority queue underflow ");
Key min = heap[0];
swap(0, n-1);
map.remove(min);
n--;
sink(0);
return min;
}

void delete(Key e){
if(!map.containsKey(e)) return; //throw new NoSuchElementException(e+" does not exist ");
int j = map.get(e);
swap(j, n-1);
map.remove(e);
n--;
if(!swim(j))
sink(j);
}

void decreasePriority(Key e){
if(map.containsKey(e)){
int j = map.get(e);
swim(j);
}
else insert(e);
}

private void swap(int i, int j) {
Key t = heap[i];
heap[i] = heap[j];
heap[j] = t;
map.replace(heap[i], i);
map.replace(heap[j], j);
}
private boolean swim(int j){
boolean change = false;
int parent;
while( (parent = (j-1)/2 ) >= 0){
if(heap[j].compareTo(heap[parent]) < 0){
swap(j,parent);
j = parent;
change = true;
}
else break;
}
return change;
}
private void sink(int j){
while(j <= n/2 - 1){
int leftChild = j*2 + 1, rightChild = leftChild + 1, s;
if(rightChild >= n)
s = leftChild;
else
s = heap[leftChild].compareTo(heap[rightChild]) < 0 ? leftChild : rightChild;
if(heap[j].compareTo(heap[s]) > 0){
swap(j,s);
j = s;
}
else break;
}
}

@Override
public String toString() {
String res = "[";
int i;
for (i = 0; i < n-1; i++){
res += heap[i] + ", ";
}
res += heap[i]+"]";
return res;
}
}


Related Topics



Leave a reply



Submit