Paginating Lucene Search Results


Use IndexSearcher.searchAfter
/**
 * Pages through results with IndexSearcher.searchAfter. The caller keeps the
 * last ScoreDoc of each page (lastBottom) and passes it to the next call.
 */
private void useSearcherAfter(DirectoryReader indexReader,
    IndexSearcher searcher, int pageSize) throws IOException {
  Query query = new TermQuery(new Term("title", "java"));
  // query = new MatchAllDocsQuery();
  ScoreDoc lastBottom = null;
  while (true) {
    // lastBottom is null on the first call, which starts from the top hit.
    TopDocs paged = searcher.searchAfter(lastBottom, query, pageSize);
    if (paged.scoreDocs.length == 0) {
      // no more results
      break;
    }
    ScoreDoc[] scoreDocs = paged.scoreDocs;
    for (ScoreDoc scoreDoc : scoreDocs) {
      Utils.printDoc(searcher.doc(scoreDoc.doc), "id", "title");
    }

    lastBottom = paged.scoreDocs[paged.scoreDocs.length - 1];
  }
}
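
To make the cursor idea concrete without a Lucene index, here is a minimal, Lucene-free sketch of what paging "after" a hit means when results are ranked by score. The Hit class, the RANK comparator and this searchAfter method are hypothetical stand-ins written for illustration, not Lucene's implementation; the sketch assumes ties in score are broken by the lower doc id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Simplified model of cursor paging over hits ranked by score (not Lucene code). */
class SearchAfterModel {

  /** Hypothetical stand-in for ScoreDoc: a document id and its score. */
  static final class Hit {
    final int doc;
    final float score;

    Hit(int doc, float score) {
      this.doc = doc;
      this.score = score;
    }
  }

  /** Relevance order: higher score first, ties broken by lower doc id (assumed). */
  static final Comparator<Hit> RANK = (a, b) -> {
    int byScore = Float.compare(b.score, a.score);
    return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
  };

  /** Returns up to n hits ranked strictly after the cursor; a null cursor starts at the top. */
  static List<Hit> searchAfter(Hit after, List<Hit> allHits, int n) {
    List<Hit> sorted = new ArrayList<>(allHits);
    sorted.sort(RANK);
    List<Hit> page = new ArrayList<>();
    for (Hit hit : sorted) {
      if (after != null && RANK.compare(hit, after) <= 0) {
        continue; // at or before the cursor: already returned on an earlier page
      }
      if (page.size() == n) {
        break;
      }
      page.add(hit);
    }
    return page;
  }

  public static void main(String[] args) {
    List<Hit> hits = Arrays.asList(new Hit(0, 1.0f), new Hit(1, 2.0f),
        new Hit(2, 1.0f), new Hit(3, 0.5f), new Hit(4, 2.0f));
    Hit lastBottom = null;
    while (true) {
      List<Hit> page = searchAfter(lastBottom, hits, 2);
      if (page.isEmpty()) {
        break;
      }
      for (Hit hit : page) {
        System.out.println("doc=" + hit.doc + " score=" + hit.score);
      }
      // Remember the last hit of this page; it is the cursor for the next page.
      lastBottom = page.get(page.size() - 1);
    }
  }
}
```

The loop in main has the same shape as useSearcherAfter: the only state carried between pages is the last hit of the previous page.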

Skip Previous Docs
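This approach is slow and memory-hungry for deep pages: to return one page, the searcher has to collect and rank the top pageStart - 1 + pageSize hits, and everything before pageStart is then thrown away.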
private void skipPreviousRows(DirectoryReader indexReader,
    IndexSearcher searcher, int pageStart, int pageSize)
    throws IOException {
  Query query = new TermQuery(new Term("title", "java"));
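  // pageStart is the 1-based position of the first row on the page.
  int pageEnd = pageStart - 1 + pageSize;
  // Collect every hit up to the end of the requested page.
  TopDocs hits = searcher.search(query, pageEnd);

  // scoreDocs may hold fewer than pageEnd hits, so bound the loop by its length.
  int end = Math.min(pageEnd, hits.scoreDocs.length);
  for (int i = pageStart - 1; i < end; i++) {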
    int docId = hits.scoreDocs[i].doc;

    // load the document
    Document doc = searcher.doc(docId);
    Utils.printDocAndExplain(doc, searcher, query, docId, "id", "title");
  }
}
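
The cost of skipping can be put into numbers with a little arithmetic. The sketch below only counts how many top hits each approach asks the searcher to keep for a given page; PagingCost and its methods are hypothetical names used for illustration. Both approaches still visit every matching document, so this is a rough comparison of collected hits rather than a benchmark.

```java
/** Back-of-the-envelope count of the top hits each approach asks the searcher to keep. */
class PagingCost {

  /** 1-based row at which a 1-based page number starts. */
  static long pageStart(long pageNumber, int pageSize) {
    return (pageNumber - 1) * pageSize + 1;
  }

  /** Skipping: search(query, pageEnd) keeps pageStart - 1 + pageSize hits. */
  static long hitsKeptBySkipping(long pageNumber, int pageSize) {
    return pageStart(pageNumber, pageSize) - 1 + pageSize;
  }

  /** searchAfter: only the requested page is kept, whatever its depth. */
  static long hitsKeptBySearchAfter(int pageSize) {
    return pageSize;
  }

  public static void main(String[] args) {
    int pageSize = 10;
    for (long page : new long[] {1, 100, 10_000}) {
      System.out.println("page " + page
          + ": skipping keeps " + hitsKeptBySkipping(page, pageSize)
          + " hits, searchAfter keeps " + hitsKeptBySearchAfter(pageSize));
    }
  }
}
```

With 10 rows per page, page 10,000 requires the skipping approach to collect 100,000 hits, while searchAfter still collects 10.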

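In Solr 4.7 and later, deep paging is done with the cursorMark parameter (SOLR-5463). Pass cursorMark=* on the first request, then send the nextCursorMark value from each response as cursorMark on the next request. The sort must include the uniqueKey field as a tie-breaker, and start must be 0.
Further reading: "Solr Deep Pagination Problem Fixed in Solr-5463" and "Sorting, Paging, and Deep Paging in Solr".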
http://solr1:8080/solr/select?q=accesstime:[* TO NOW-5YEAR/DAY]&sort=accesstime desc, contentid asc&sort=accesstime desc,id asc&rows=1000&start=0&cursorMark=*

http://solr1:8080/solr/select?q=accesstime:[* TO NOW-5YEAR/DAY]&sort=accesstime desc, contentid asc&sort=accesstime desc,id asc&rows=1000&start=0&cursorMark=AoJ42tmu%2FZ4CKTQxMDMyMzEwMw%3D%3D
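
The cursor value has to be URL-encoded before it goes into the next request, which is why the second URL contains %2F and %3D. Below is a minimal sketch that builds such a request URL using only the JDK. CursorMarkUrl and build are hypothetical helper names; the sketch sends a single sort parameter and assumes id is the uniqueKey field, which may not match your schema.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Builds a Solr select URL for cursor-based paging (illustrative helper, not part of SolrJ). */
class CursorMarkUrl {

  /** cursorMark is "*" for the first page, then the nextCursorMark of the previous response. */
  static String build(String baseUrl, String q, String sort, int rows, String cursorMark) {
    return baseUrl
        + "?q=" + enc(q)
        + "&sort=" + enc(sort)
        + "&rows=" + rows
        + "&cursorMark=" + enc(cursorMark);
  }

  /** Form-encodes one parameter value as UTF-8. */
  private static String enc(String value) {
    try {
      return URLEncoder.encode(value, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is required on every Java platform, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String base = "http://solr1:8080/solr/select";
    String q = "accesstime:[* TO NOW-5YEAR/DAY]";
    String sort = "accesstime desc,id asc"; // assumes id is the uniqueKey field
    System.out.println(build(base, q, sort, 1000, "*"));
    System.out.println(build(base, q, sort, 1000, "AoJ42tmu/Z4CKTQxMDMyMzEwMw=="));
  }
}
```

Keep q, sort and rows identical from one request to the next and change only cursorMark; when the returned nextCursorMark equals the one you sent, there are no more results.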
