Hadoop MapReduce – word count (improve)

About the MapReduce Code


 1. Ordering with MapReduce

  (A) Binary Search

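We are going to write a MapReduce program that returns the top N keywords, ranked by the number of times each one appears.
The code below uses a PriorityQueue for this; the peek, add, remove and size calls it makes match java.util.PriorityQueue from the Java standard library. Calling peek returns the head of the queue, which is the keyword with the lowest frequency currently held (assuming ItemFreq orders by ascending frequency). A new item is added only if the queue holds fewer than N items or its frequency is higher than that minimum, and the smallest item is removed whenever the queue grows past the top N that the user requested.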

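import java.util.PriorityQueue;

// ItemFreq is assumed to be Comparable by ascending frequency,
// so peek() returns the entry with the smallest frequency.
public static void insert(PriorityQueue<ItemFreq> queue, String item, Long lValue, int topN) {
  // Nothing to keep when topN is not positive (also avoids a null head below).
  if (topN <= 0) {
    return;
  }
  ItemFreq head = queue.peek();

  // If the queue holds fewer than topN elements, or the incoming frequency
  // is greater than the smallest frequency in the queue,
  if (queue.size() < topN || head.getFreq() < lValue) {
    ItemFreq itemFreq = new ItemFreq(item, lValue);
    // add it to the queue first,
    queue.add(itemFreq);
    // then remove the smallest element if the queue now holds more than topN.
    if (queue.size() > topN) {
        queue.remove();
    }
  }
}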
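
The insert method above depends on an ItemFreq class that is not shown in this post. As a rough, self-contained sketch of how the pieces might fit together, the example below defines a hypothetical ItemFreq that compares by ascending frequency, repeats the same insert logic, and adds a topN helper (also an assumption, not part of the original code) that drains the queue into a list ordered from most to least frequent. In a real job the counts would come from the reducer input rather than from a hand-built map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class TopNDemo {

    // Hypothetical stand-in for the ItemFreq class used by insert().
    static class ItemFreq implements Comparable<ItemFreq> {
        private final String item;
        private final long freq;

        ItemFreq(String item, long freq) {
            this.item = item;
            this.freq = freq;
        }

        String getItem() { return item; }
        long getFreq() { return freq; }

        // Ascending frequency, so the queue head is the least frequent item.
        @Override
        public int compareTo(ItemFreq other) {
            return Long.compare(freq, other.freq);
        }
    }

    // Same logic as the insert method in the post: keep at most topN items.
    static void insert(PriorityQueue<ItemFreq> queue, String item, long freq, int topN) {
        if (topN <= 0) {
            return;
        }
        ItemFreq head = queue.peek();
        if (queue.size() < topN || head.getFreq() < freq) {
            queue.add(new ItemFreq(item, freq));
            if (queue.size() > topN) {
                queue.remove(); // drops the current minimum
            }
        }
    }

    // Assumed helper: returns the topN keywords, most frequent first.
    static List<String> topN(Map<String, Long> counts, int topN) {
        PriorityQueue<ItemFreq> queue = new PriorityQueue<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            insert(queue, e.getKey(), e.getValue(), topN);
        }
        List<String> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            result.add(queue.remove().getItem()); // comes out smallest first
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("hadoop", 7L);
        counts.put("map", 5L);
        counts.put("reduce", 3L);
        counts.put("queue", 1L);
        System.out.println(topN(counts, 2)); // prints [hadoop, map]
    }
}
```

Because the queue never holds more than N items, each insert costs O(log N) and memory stays bounded no matter how many distinct keywords the job sees.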

(B)
