r1 - 29 Mar 2007 - 22:51:57 - OferNave

Pythonic Extensions to the Java API

Java is a verbose language. Python, on the other hand, offers many syntactically attractive constructs for iteration, property access, and so on. As the Java Lucene samples from the 'Lucene in Action' book were ported to Python, PyLucene received a number of Pythonic extensions, which are listed here.

Hits as a List

Iterating over search hits is a very common operation, so Hits instances are iterable in Python. Each iteration returns two values: the zero-based number of the document in the Hits instance and the document instance itself.

The Java loop:

        for (int i = 0; i < hits.length(); i++) {
            Document doc = hits.doc(i);
            System.out.println(hits.score(i) + " : " + doc.get("title"));
        }

is better written in Python:

        for i, doc in hits:
            print hits.score(i), ':', doc['title']

Hits instances partially implement the Python 'list' protocol.

The Java expressions:

        hits.length()
        doc = hits.doc(i)

are better written in Python:

        len(hits)
        doc = hits[i]
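
To make the protocol concrete, here is a minimal pure-Python sketch of an object that behaves the way this section describes. `SketchHits` is a hypothetical stand-in, not PyLucene's implementation: it stores plain dicts instead of Document instances and assumes only the behaviours named above (`len()`, indexing, and iteration yielding index/document pairs).

```python
class SketchHits(object):
    """Hypothetical stand-in for a Hits object; not part of PyLucene."""

    def __init__(self, docs, scores):
        self._docs = list(docs)
        self._scores = list(scores)

    def __len__(self):
        # Plays the role of Java's hits.length()
        return len(self._docs)

    def __getitem__(self, i):
        # Plays the role of Java's hits.doc(i)
        return self._docs[i]

    def __iter__(self):
        # Yield (zero-based index, document) pairs
        for i in range(len(self._docs)):
            yield i, self._docs[i]

    def score(self, i):
        return self._scores[i]


# Plain dicts stand in for Document instances here.
hits = SketchHits([{'title': 'first'}, {'title': 'second'}], [0.9, 0.4])
lines = ['%s : %s' % (hits.score(i), doc['title']) for i, doc in hits]
```

Because `__iter__` yields pairs, the loop variable unpacks exactly as in `for i, doc in hits` above.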

IndexReader as a List

Similarly, IndexReader instances partially implement the 'list' protocol and can be iterated over for their documents.

The Java expressions:

        indexReader.maxDoc()
        indexReader.document(i)

are better written in Python:

        len(indexReader)
        indexReader[i]

The Java loop:

        for (int i = 0; i < indexReader.maxDoc(); i++) {
            Document doc = indexReader.document(i);
            ...
        }

is better written in Python:

        for i, doc in indexReader:
            ...
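
The mapping from the Java accessors to the Python protocol can be sketched as a small adapter around any object that exposes Java-style `maxDoc()` and `document(i)` methods. Both `JavaStyleReader` and `ListView` are hypothetical names invented for this illustration; PyLucene offers the behaviour on IndexReader instances directly, so no such wrapper is needed in practice.

```python
class JavaStyleReader(object):
    """Hypothetical object offering only Java-style accessors."""

    def __init__(self, docs):
        self._docs = list(docs)

    def maxDoc(self):
        return len(self._docs)

    def document(self, i):
        return self._docs[i]


class ListView(object):
    """Hypothetical adapter adding len(), indexing and pair iteration."""

    def __init__(self, reader):
        self._reader = reader

    def __len__(self):
        # len(view) delegates to maxDoc()
        return self._reader.maxDoc()

    def __getitem__(self, i):
        # view[i] delegates to document(i), with list-style bounds checking
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self._reader.document(i)

    def __iter__(self):
        # Yield (document number, document) pairs
        for i in range(len(self)):
            yield i, self._reader.document(i)


# Strings stand in for Document instances here.
index_reader = ListView(JavaStyleReader(['doc-a', 'doc-b', 'doc-c']))
pairs = list(index_reader)
```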

Document as a Dict

Document instances have fields whose values can be accessed through the dict and attribute protocols.

The Java expressions:

        doc.get("title")
        doc.getField("title")
        doc.removeField("title")

are better written in Python:

        doc['title']
        doc.title
        del doc.title

Document instances can also be iterated over for their fields.

The Java loop:

        Enumeration fields = doc.fields();
        while (fields.hasMoreElements()) {
            Field field = (Field) fields.nextElement();
            ...
        }

is better written in Python:

        for field in doc:
            ...
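
As a rough illustration of how key access, attribute access, deletion and iteration can coexist on one object, here is a pure-Python sketch. `SketchDocument` is a hypothetical stand-in, not PyLucene's Document: it stores plain strings, returns the same value for both access styles, and iterates over (name, value) pairs in place of Field instances.

```python
class SketchDocument(object):
    """Hypothetical stand-in for a Document; not part of PyLucene."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getitem__(self, name):
        # doc['title']
        return self._fields[name]

    def __getattr__(self, name):
        # doc.title; only called when normal attribute lookup fails
        fields = self.__dict__.get('_fields', {})
        try:
            return fields[name]
        except KeyError:
            raise AttributeError(name)

    def __delattr__(self, name):
        # del doc.title
        try:
            del self._fields[name]
        except KeyError:
            raise AttributeError(name)

    def __iter__(self):
        # (name, value) pairs stand in for Field instances, sorted by name
        return iter(sorted(self._fields.items()))


doc = SketchDocument(title='first', author='someone')
title_by_key = doc['title']
title_by_attr = doc.title
del doc.title
remaining = [name for name, value in doc]
```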
 
Open Source Applications Foundation
Except where otherwise noted, this site and its content are licensed by OSAF under a Creative Commons License, Attribution Only 3.0.
See list of page contributors for attributions.