r1 - 29 Mar 2007 - 22:52:50 - OferNave

Extending Lucene Classes

   Many areas of the Lucene API expect the programmer to provide their own
   implementation or specialization of a feature where the default is
   inappropriate. Text analyzers and tokenizers, for example, are an area
   where many parameters and environmental or cultural factors call for
   customization.

   PyLucene enables this by providing the Java extension points listed
   below, which serve as proxies for Java to call back into the Python
   implementations of these customizations.

   To learn more about this topic, please refer to the PyLucene paper
   included earlier.

   Unless otherwise documented, it is sufficient to pass the Python
   extension instance wherever a wrapped Java instance returned by
   PyLucene is normally expected. The Python instance is then wrapped on
   the Java side for Java's use.

   Each extension point below enumerates the methods that a Python class
   needs to implement in order to function as an 'extension' of the
   corresponding Java Lucene class.

   . org.apache.lucene.analysis.Analyzer extension point:
         TokenStream tokenStream(fieldName, reader)

   . org.apache.lucene.analysis.CharTokenizer extension point:
         boolean isTokenChar(char)
         char normalize(char)

     In order to instantiate such a custom char tokenizer, the additional
     charTokenizer() factory method defined on
     org.apache.lucene.analysis.TokenStream instances needs to be invoked
     with the Python extension instance.
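
     As an illustration of the two methods listed above, here is a minimal
     sketch of the Python side of such a tokenizer. It is a plain Python
     class and deliberately imports nothing from PyLucene. The class name
     is hypothetical, and the sketch assumes that each character arrives
     as a one-character Python string.

```python
class AlnumLowerCaseTokenizer(object):
    """Hypothetical Python side of a CharTokenizer extension."""

    def isTokenChar(self, c):
        # Letters and digits belong to a token; any other character ends it.
        # Assumption: c is a one-character string.
        return c.isalnum()

    def normalize(self, c):
        # Fold every token character to lower case.
        return c.lower()
```

     An instance of this class would then be handed to the charTokenizer()
     factory method described above.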

   . org.apache.lucene.analysis.TokenFilter extension point:
         Token next()

     In order to instantiate such a custom token filter, the additional
     tokenFilter() factory method defined on
     org.apache.lucene.analysis.TokenStream instances needs to be invoked
     with the Python extension instance.
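
     The following sketch shows what the next() method of a token filter
     might look like. The class name and its constructor arguments are
     hypothetical. The sketch assumes that the wrapped stream exposes
     next(), that the end of the stream is signalled by None (Java null),
     and that tokens offer the Lucene 2.x termText() accessor.

```python
class MinLengthFilter(object):
    """Hypothetical Python side of a TokenFilter that drops short tokens."""

    def __init__(self, tokenStream, minLength):
        self.input = tokenStream      # the stream being filtered
        self.minLength = minLength    # shortest token text to let through

    def next(self):
        # Pull tokens from the wrapped stream until one is long enough.
        while True:
            token = self.input.next()
            if token is None:
                # Assumption: None marks the end of the stream.
                return None
            if len(token.termText()) >= self.minLength:
                return token
```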

   . org.apache.lucene.analysis.TokenStream extension point:
         Token next()

   . org.apache.lucene.queryParser.QueryParser extension point:
         Query getBooleanQuery(super, clauses)
         Query getFieldQuery(super, fieldName, queryText, slop=None)
         Query getFuzzyQuery(super, fieldName, termText, minSimilarity)
         Query getPrefixQuery(super, fieldName, termText)
         Query getRangeQuery(super, fieldName, part1, part2, inclusive)
         Query getWildcardQuery(super, fieldName, termText)

     The 'super' argument is provided to invoke the default Java
     implementation of these methods as needed. 

     In order to instantiate such a custom query parser, the additional
     queryParser() factory method defined on
     org.apache.lucene.analysis.Analyzer instances needs to be invoked
     with the Python extension instance.

     Please refer to the AdvancedQueryParserTest.py and
     CustomQueryParser.py 'Lucene in Action' samples for more details.
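
     To make the role of the 'super' argument concrete, here is a hedged
     sketch of a parser extension that rejects fuzzy and wildcard queries
     and defers to the default implementation for field queries. The
     class name and the use of ValueError are hypothetical. The sketch
     also assumes that the 'super' object exposes the default methods
     under the same names without the extra argument; the samples named
     above are the authoritative reference for the actual calling
     convention.

```python
class RestrictedQueryParser(object):
    """Hypothetical Python side of a QueryParser extension."""

    def getFuzzyQuery(self, superParser, fieldName, termText, minSimilarity):
        # Refuse fuzzy queries outright (exception type is an assumption).
        raise ValueError("Fuzzy queries not allowed")

    def getWildcardQuery(self, superParser, fieldName, termText):
        # Refuse wildcard queries outright.
        raise ValueError("Wildcard queries not allowed")

    def getFieldQuery(self, superParser, fieldName, queryText, slop=None):
        # Fall back to the default Java behaviour through 'super'.
        # Assumption: superParser offers getFieldQuery with these arguments.
        if slop is None:
            return superParser.getFieldQuery(fieldName, queryText)
        return superParser.getFieldQuery(fieldName, queryText, slop)
```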

   . org.apache.lucene.search.Filter extension point:
         BitSet bits(indexReader)

   . org.apache.lucene.search.FilteredTermEnum extension point:
         float difference()
         boolean termCompare(term)
         boolean endEnum()
         void setEnum(termEnum)

   . org.apache.lucene.search.HitCollector extension point:
         void collect(docNum, score)
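
     A hit collector only needs the single collect() method. The sketch
     below keeps every hit whose score reaches a threshold. The class
     name, the minScore parameter and the best() helper are hypothetical
     additions for illustration.

```python
class ThresholdCollector(object):
    """Hypothetical Python side of a HitCollector extension."""

    def __init__(self, minScore=0.0):
        self.minScore = minScore
        self.hits = []                # (docNum, score) pairs, in call order

    def collect(self, docNum, score):
        # Called once per matching document; keep only good enough hits.
        if score >= self.minScore:
            self.hits.append((docNum, score))

    def best(self, n):
        # Helper, not part of the extension point: top n hits by score.
        return sorted(self.hits, key=lambda hit: -hit[1])[:n]
```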

   . org.apache.lucene.search.ScoreDocComparator extension point:
         int compare(scoreDoc0, scoreDoc1)
         int sortType()
         Comparable sortValue(scoreDoc)

     Please refer to the DistanceComparatorSource.py and
     DistanceSortingTest.py 'Lucene in Action' samples for more details on
     writing custom sorting code in Python.

   . org.apache.lucene.search.SortComparator extension point:
         ScoreDocComparator newComparator(indexReader, fieldName)
         Comparable getComparable(termText)

     Please refer to the DistanceComparatorSource.py and
     DistanceSortingTest.py 'Lucene in Action' samples for more details on
     writing custom sorting code in Python.

   . org.apache.lucene.search.SortComparatorSource extension point:
         ScoreDocComparator newComparator(indexReader, fieldName)

     Please refer to the DistanceComparatorSource.py and
     DistanceSortingTest.py 'Lucene in Action' samples for more details on
     writing custom sorting code in Python.

   . org.apache.lucene.search.Searchable extension point:
         void close()
         int docFreq(term)
         Document doc(n)
         int maxDoc()
         void searchAll(query, filter, hitCollector)
         TopDocs search(query, filter, n)
         TopFieldDocs searchSorted(query, filter, n, sort)
         Query rewrite(query)
         Explanation explain(query, docNum)

   . org.apache.lucene.search.Similarity extension point:
         float coord(overlap, maxOverlap)
         float idf(term, searcher)
         float idf(terms, searcher)
         float idf(docFreq, numDocs)
         float lengthNorm(fieldName, numTokens)
         float queryNorm(sumOfSquaredWeights)
         float sloppyFreq(distance)
         float tf(freq)
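
     As a rough illustration, the sketch below fills in these methods with
     formulas modeled on Lucene's DefaultSimilarity. The class name is
     hypothetical. Only the numeric idf(docFreq, numDocs) form is covered;
     how the term-based idf overloads reach a single Python method is not
     specified here, so they are left out rather than guessed.

```python
import math


class SketchSimilarity(object):
    """Hypothetical Python side of a Similarity extension."""

    def coord(self, overlap, maxOverlap):
        # Fraction of the query terms that matched in the document.
        return overlap / float(maxOverlap)

    def idf(self, docFreq, numDocs):
        # Numeric form only; term-based overloads are not handled.
        return math.log(numDocs / float(docFreq + 1)) + 1.0

    def lengthNorm(self, fieldName, numTokens):
        # Shorter fields weigh more.
        return 1.0 / math.sqrt(numTokens)

    def queryNorm(self, sumOfSquaredWeights):
        return 1.0 / math.sqrt(sumOfSquaredWeights)

    def sloppyFreq(self, distance):
        # Closer phrase matches weigh more.
        return 1.0 / (distance + 1)

    def tf(self, freq):
        return math.sqrt(freq)
```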

   . org.apache.lucene.search.highlight.Formatter extension point:
         string highlightTerm(originalText, tokenGroup)

   . org.apache.lucene.store.Directory extension point:
         void close()
         IndexOutput createOutput(name)
         void deleteFile(name)
         boolean fileExists(name)
         long fileLength(name)
         long fileModified(name)
         string[] list()
         Lock makeLock(name)
         IndexInput openInput(name)
         void renameFile(from, to)
         void touchFile(name)

   . org.apache.lucene.store.IndexInput extension point:
         void close(isClone)
         long length()
         string read(length, pos)
         void seek(pos)

     Because IndexInput instances may be cloned, the close() method takes
     an extra argument in Python telling whether a clone is being closed.
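
     The sketch below shows one way the four methods could fit together
     for an in-memory input. The class name is hypothetical. It assumes
     that data is exchanged as byte strings, that read() returns up to
     'length' bytes starting at 'pos', and that only the original (not a
     clone) should release the underlying resource on close().

```python
class BytesIndexInput(object):
    """Hypothetical Python side of an IndexInput over in-memory bytes."""

    def __init__(self, data):
        self.data = data      # assumption: a byte string
        self.pos = 0
        self.closed = False

    def close(self, isClone):
        # Assumption: clones share the buffer, so only the original
        # marks it as released.
        if not isClone:
            self.closed = True

    def length(self):
        return len(self.data)

    def read(self, length, pos):
        # Return up to 'length' bytes starting at 'pos'.
        chunk = self.data[pos:pos + length]
        self.pos = pos + len(chunk)
        return chunk

    def seek(self, pos):
        self.pos = pos
```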

   . org.apache.lucene.store.IndexOutput extension point:
         void close()
         long length()
         void write(string)
         void seek(pos)
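
     A matching in-memory output might look like the sketch below. The
     class name is hypothetical, and the sketch assumes that write()
     receives a byte string and that writing after a seek() overwrites
     existing bytes in place.

```python
class BytesIndexOutput(object):
    """Hypothetical Python side of an IndexOutput writing to memory."""

    def __init__(self):
        self.buffer = bytearray()
        self.pos = 0
        self.closed = False

    def close(self):
        self.closed = True

    def length(self):
        return len(self.buffer)

    def write(self, string):
        # Pad with zero bytes if a seek moved past the current end.
        if self.pos > len(self.buffer):
            self.buffer.extend(b"\0" * (self.pos - len(self.buffer)))
        end = self.pos + len(string)
        # Overwrite in place; the buffer grows when writing past its end.
        self.buffer[self.pos:end] = string
        self.pos = end

    def seek(self, pos):
        self.pos = pos
```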

   . org.apache.lucene.store.Lock extension point:
         boolean isLocked()
         boolean obtain()
         boolean obtain(lockWaitTimeout)
         void release()
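
     Since Python has no method overloading, one plausible shape for the
     two obtain() forms is a single method with an optional argument, as
     in the sketch below. That mapping is an assumption, as are the class
     name and the choice to return False on timeout. The timeout is
     treated as milliseconds, as in Lucene.

```python
import threading


class InProcessLock(object):
    """Hypothetical Python side of a Lock valid within one process."""

    def __init__(self):
        self._lock = threading.Lock()

    def isLocked(self):
        return self._lock.locked()

    def obtain(self, lockWaitTimeout=None):
        # No timeout (or a non-positive one): try once without blocking.
        if lockWaitTimeout is None or lockWaitTimeout <= 0:
            return self._lock.acquire(False)
        # Otherwise wait up to lockWaitTimeout milliseconds.
        return self._lock.acquire(True, lockWaitTimeout / 1000.0)

    def release(self):
        self._lock.release()
```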

   . java.io.Reader extension point:
         void close()
         string read(len)
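
     A reader over an in-memory string could be sketched as follows. The
     class name is hypothetical, and returning an empty string at the end
     of the input is an assumption about the expected end-of-stream
     convention.

```python
class StringReader(object):
    """Hypothetical Python side of a java.io.Reader extension."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.closed = False

    def close(self):
        self.closed = True

    def read(self, length):
        # Return up to 'length' characters and advance the position.
        # Assumption: an empty string signals the end of the input.
        chunk = self.text[self.pos:self.pos + length]
        self.pos += len(chunk)
        return chunk
```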

   . java.lang.Comparable extension point:
         int compareTo(object)
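
     For compareTo(), the usual Java contract is a negative, zero or
     positive integer. The sketch below follows that contract for a
     hypothetical Version class; it assumes the other object is an
     instance of the same Python class.

```python
class Version(object):
    """Hypothetical Python side of a java.lang.Comparable extension."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def compareTo(self, other):
        # -1 if self sorts first, 0 if equal, 1 if other sorts first.
        # Assumption: 'other' is also a Version.
        return (self.key > other.key) - (self.key < other.key)
```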

   . java.lang.Runnable extension point:
         void run()
 
Open Source Applications Foundation
Except where otherwise noted, this site and its content are licensed by OSAF under a Creative Commons License, Attribution Only 3.0.
See list of page contributors for attributions.