Pythonic Extensions to the Java API
Java is a verbose language. Python, by contrast, offers concise
syntax for iteration, property access, and similar operations. As the
Java Lucene samples from the 'Lucene in Action' book were ported to
Python, PyLucene received a number of Pythonic extensions, listed
here:
Hits as a List
Iterating search hits is a very common operation. Hits instances are
iterable in Python. Each iteration returns two values: the
zero-based number of the document in the Hits instance and the
document instance itself.
The Java loop:
    for (int i = 0; i < hits.length(); i++) {
        Document doc = hits.doc(i);
        System.out.println(hits.score(i) + " : " + doc.get("title"));
    }
is better written in Python:
    for i, doc in hits:
        print hits.score(i), ':', doc['title']
Hits instances partially implement the Python 'list' protocol.
The Java expressions:
    hits.length()
    doc = hits.doc(i)
are better written in Python:
    len(hits)
    doc = hits[i]
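
To make the 'list' protocol described above concrete, here is a minimal sketch of how such a wrapper could be layered on the Java-style accessors. FakeHits is a hypothetical stand-in written for illustration, not PyLucene's implementation, and plain dicts stand in for Document instances so the snippet runs without an index. Raising IndexError for out-of-range positions is an assumption of this sketch.

```python
class FakeHits(object):
    """Hypothetical stand-in for a Hits instance; not PyLucene code."""

    def __init__(self, scored_docs):
        # scored_docs: list of (score, document) pairs, best match first
        self._scored_docs = list(scored_docs)

    # Java-style accessors, as used in the Java loop above
    def length(self):
        return len(self._scored_docs)

    def doc(self, i):
        return self._scored_docs[i][1]

    def score(self, i):
        return self._scored_docs[i][0]

    # Python 'list' protocol, layered on the accessors
    def __len__(self):
        return self.length()

    def __getitem__(self, i):
        # assumption of this sketch: out-of-range positions raise IndexError
        if not 0 <= i < self.length():
            raise IndexError(i)
        return self.doc(i)

    def __iter__(self):
        # yields (zero-based number, document) pairs
        for i in range(self.length()):
            yield i, self.doc(i)


# Plain dicts stand in for Document instances
hits = FakeHits([(0.9, {'title': 'Lucene in Action'}),
                 (0.4, {'title': 'Another Book'})])
rows = [(i, hits.score(i), doc['title']) for i, doc in hits]
```

With this stand-in, len(hits), hits[i], and the two-value for loop all behave as the text describes.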
IndexReader as a List
Similarly, IndexReader instances partially implement the 'list'
protocol and can be iterated over for their documents.
The Java expressions:
    indexReader.maxDoc()
    indexReader.document(i)
are better written in Python:
    len(indexReader)
    indexReader[i]
The Java loop:
    for (int i = 0; i < indexReader.maxDoc(); i++) {
        Document doc = indexReader.document(i);
        ...
    }
is better written in Python:
    for i, doc in indexReader:
        ...
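
Because Hits and IndexReader instances both iterate as (number, document) pairs, a helper written against that shape can serve either one. The following is a minimal sketch, assuming the iteration behaves as described above. The titles function is a hypothetical helper, and enumerate() over a plain list of dicts stands in for a real reader so the snippet runs without an index.

```python
def titles(source):
    """Collect the 'title' value of every document in source.

    source is anything that iterates as (number, document) pairs.
    """
    return [doc['title'] for i, doc in source]


# Stand-in data: enumerate() yields the same (number, document) pair shape
docs = [{'title': 'First'}, {'title': 'Second'}]
found = titles(enumerate(docs))
```

The helper never calls maxDoc() or document(i) itself, so it does not care which kind of object it is handed.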
Document as a Dict
Document instances have fields whose values can be accessed through
the dict and attribute protocols.
The Java expressions:
    doc.get("title")
    doc.getField("title")
    doc.removeField("title")
are better written in Python:
    doc['title']
    doc.title
    del doc.title
Document instances can be iterated over for their fields.
The Java loop:
    Enumeration fields = doc.fields();
    while (fields.hasMoreElements()) {
        Field field = (Field) fields.nextElement();
        ...
    }
is better written in Python:
    for field in doc:
        ...
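
The mapping between the Java calls and the Python protocols can be sketched with a small stand-in. FakeField and FakeDocument are hypothetical classes written for illustration, not PyLucene's implementation. How missing fields are handled here (None from item access, AttributeError from attribute access) is an assumption of this sketch.

```python
class FakeField(object):
    """Hypothetical stand-in for a Field: a name and a string value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class FakeDocument(object):
    """Hypothetical stand-in for a Document; not PyLucene code."""

    def __init__(self, fields):
        self._fields = list(fields)

    def _find(self, name):
        for field in self._fields:
            if field.name == name:
                return field
        return None

    def __getitem__(self, name):
        # like doc.get(name): the field's string value
        # (assumption: None when the field is absent)
        field = self._find(name)
        return field.value if field is not None else None

    def __getattr__(self, name):
        # like doc.getField(name): the field object itself;
        # only reached when normal attribute lookup fails
        if name.startswith('_'):
            raise AttributeError(name)
        field = self._find(name)
        if field is None:
            raise AttributeError(name)
        return field

    def __delattr__(self, name):
        # like doc.removeField(name): drop the first field with that name
        field = self._find(name)
        if field is None:
            raise AttributeError(name)
        self._fields.remove(field)

    def __iter__(self):
        # iterate over the fields, in insertion order
        return iter(self._fields)


doc = FakeDocument([FakeField('title', 'Lucene in Action'),
                    FakeField('body', 'sample text')])
value = doc['title']               # string value
field = doc.title                  # field object
names = [f.name for f in doc]      # iteration over fields
del doc.title
remaining = [f.name for f in doc]
```

Note the distinction the stand-in preserves: item access returns the field's value, while attribute access returns the field object.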