Class AbstractDocumentCollection
- java.lang.Object
  - it.unimi.di.big.mg4j.document.AbstractDocumentSequence
    - it.unimi.di.big.mg4j.document.AbstractDocumentCollection
- All Implemented Interfaces:
DocumentCollection, DocumentSequence, SafelyCloseable, FlyweightPrototype<DocumentCollection>, Closeable, AutoCloseable
- Direct Known Subclasses:
ConcatenatedDocumentCollection, FileSetDocumentCollection, JavamailDocumentCollection, JdbcDocumentCollection, SimpleCompressedDocumentCollection, SubDocumentCollection, TRECDocumentCollection, WikipediaDocumentCollection, ZipDocumentCollection
public abstract class AbstractDocumentCollection extends AbstractDocumentSequence implements DocumentCollection, SafelyCloseable
An abstract, safely closeable implementation of a document collection.
This class provides a ready-made implementation of the iterator() method. Concrete subclasses are encouraged to provide optimised, reuse-oriented versions of the iterator() method. Note that since this implementation uses document() to implement the iterator, creating two iterators concurrently will usually lead to unpredictable results.
As a convenience, the ensureDocumentIndex(long) method can be called whenever it is necessary to check that a document index is not out of range.
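The idea of building an iterator on top of indexed access can be illustrated with a minimal, self-contained sketch. It does not use MG4J: the class IndexBackedIterationSketch, its Cursor, and the drain helper are hypothetical stand-ins, and documents are modelled as plain strings. The only convention borrowed here is a cursor that returns null once it runs out of documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a random-access collection; this is not MG4J code.
final class IndexBackedIterationSketch {
    private final List<String> docs;

    IndexBackedIterationSketch(List<String> docs) {
        this.docs = docs;
    }

    long size() {
        return docs.size();
    }

    // Random access by index, playing the role of document(long).
    String document(long index) {
        return docs.get((int) index);
    }

    // Sequential cursor built purely on document(); returns null when exhausted.
    final class Cursor {
        private long next = 0;

        String nextDocument() {
            return next < size() ? document(next++) : null;
        }
    }

    Cursor iterator() {
        return new Cursor();
    }

    // Collects every document by walking a single cursor to the end.
    static List<String> drain(IndexBackedIterationSketch collection) {
        List<String> out = new ArrayList<>();
        Cursor cursor = collection.iterator();
        for (String d = cursor.nextDocument(); d != null; d = cursor.nextDocument()) {
            out.add(d);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new IndexBackedIterationSketch(Arrays.asList("d0", "d1", "d2"))));
    }
}
```

In this toy version document() is stateless, so several cursors could coexist. A real collection may reuse streams or buffers inside document(), which is one plausible reason concurrent iterators are discouraged above.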
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | AbstractDocumentCollection.PropertyKeys | Symbolic names for common properties of a DocumentCollection.
-
Field Summary
-
Fields inherited from interface it.unimi.di.big.mg4j.document.DocumentCollection
DEFAULT_EXTENSION
-
-
Constructor Summary
Constructors
Constructor | Description
AbstractDocumentCollection() |
-
Method Summary
Modifier and Type | Method | Description
protected void | ensureDocumentIndex(long index) | Checks that the index is correct (between 0, inclusive, and DocumentCollection.size(), exclusive), and throws an IndexOutOfBoundsException if the index is not correct.
DocumentIterator | iterator() | Returns an iterator over the sequence of documents.
static void | main(String[] arg) |
static void | printAllDocuments(DocumentSequence seq) | Prints all documents in a given sequence.
String | toString() |
-
Methods inherited from class it.unimi.di.big.mg4j.document.AbstractDocumentSequence
close, filename, finalize, load
-
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface it.unimi.di.big.mg4j.document.DocumentCollection
copy, document, metadata, size, stream
-
Methods inherited from interface it.unimi.di.big.mg4j.document.DocumentSequence
close, factory, filename
Method Detail
-
ensureDocumentIndex
protected void ensureDocumentIndex(long index)
Checks that the index is correct (between 0, inclusive, and DocumentCollection.size(), exclusive), and throws an IndexOutOfBoundsException if the index is not correct.
- Parameters:
index - the index to be checked.
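The documented contract (accept 0 inclusive up to size() exclusive, otherwise throw IndexOutOfBoundsException) can be sketched as a standalone range check. This is an illustration rather than MG4J's source: the class DocumentIndexCheckSketch, the static ensureIndex helper with an explicit size parameter, and the exception message are all assumptions made so the example runs on its own.

```java
// Hypothetical, standalone illustration of the documented range check; not MG4J's source.
final class DocumentIndexCheckSketch {

    // Accepts indices in [0, size) and throws IndexOutOfBoundsException otherwise.
    static void ensureIndex(long index, long size) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException(
                "Document index " + index + " is outside [0, " + size + ")");
        }
    }

    // Convenience wrapper reporting whether the check rejects the index.
    static boolean isRejected(long index, long size) {
        try {
            ensureIndex(index, size);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long size = 3; // assumed collection size for the demonstration
        for (long i = -1; i <= size; i++) {
            System.out.println(i + " rejected: " + isRejected(i, size));
        }
    }
}
```

A subclass would typically run such a check at the top of each index-based accessor, so that an invalid index fails early instead of surfacing later as a less informative error.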
-
iterator
public DocumentIterator iterator() throws IOException
Description copied from interface: DocumentSequence
Returns an iterator over the sequence of documents.
Warning: this method can be safely called just one time. For instance, implementations based on standard input will usually throw an exception if this method is called twice.
Implementations may decide to override this restriction (in particular, if they implement DocumentCollection). Usually, however, it is not possible to obtain two iterators at the same time on a collection.
- Specified by:
iterator in interface DocumentSequence
- Returns:
an iterator over the sequence of documents.
- Throws:
IOException
- See Also:
DocumentCollection
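The call-once restriction mentioned in the warning can be pictured with a small guard around a consumable source. The sketch below is hypothetical and independent of MG4J: OneShotSequenceSketch, its use of java.util.Iterator<String> as the underlying stream, and the choice of IllegalStateException are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical one-shot sequence over a consumable source; not an MG4J class.
final class OneShotSequenceSketch {
    private final Iterator<String> source;
    private boolean handedOut;

    OneShotSequenceSketch(Iterator<String> source) {
        this.source = source;
    }

    // Hands out the underlying stream once; a second request fails fast.
    Iterator<String> iterator() {
        if (handedOut) {
            throw new IllegalStateException("iterator() may be called only once on this sequence");
        }
        handedOut = true;
        return source;
    }

    // Requests two iterators and reports whether the second request was refused.
    static boolean secondCallFails(OneShotSequenceSketch sequence) {
        sequence.iterator();
        try {
            sequence.iterator();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        OneShotSequenceSketch sequence =
            new OneShotSequenceSketch(Arrays.asList("d0", "d1").iterator());
        System.out.println("second call refused: " + secondCallFails(sequence));
    }
}
```

A collection with random access does not need such a guard, which is why implementations of DocumentCollection may lift the restriction.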
-
printAllDocuments
public static void printAllDocuments(DocumentSequence seq) throws IOException
Prints all documents in a given sequence.
- Parameters:
seq - the sequence.
- Throws:
IOException