Issue

- I created a subclass of [org.apache.tika.parser.microsoft.ExcelExtractor], shown below, to extract the documents embedded in an Excel (*.xls) file.

    public class CustomExcelExtractor extends ExcelExtractor {

        private CustomAbstractPOIFSExtractor poi;

        public CustomExcelExtractor(ParseContext context, Metadata metadata, Path outputDir) {
            super(context, metadata);
            poi = new CustomAbstractPOIFSExtractor();
        }

        @Override
        public void parse(DirectoryNode root, XHTMLContentHandler xhtml, Locale locale)
                throws IOException, SAXException, TikaException {
            // Extract embedded documents
            for (Entry entry : root) {
                if (entry.getName().startsWith("MBD") && entry instanceof DirectoryEntry) {
                    try {
                        poi.extractEmbeddedOfficeDoc((DirectoryEntry) entry, null, xhtml, embeddedCnt);
                    } catch (TikaException e) {
                        // ignore parse errors from embedded documents
                    }
                }
            }
        }

        private class CustomAbstractPOIFSExtractor {

            private TikaConfig config = TikaConfig.getDefaultConfig();

            /**
             * Handle an office document that's embedded at the POIFS level
             */
            protected void extractEmbeddedOfficeDoc(
                    DirectoryEntry dir, String resourceName, XHTMLContentHandler xhtml, int embeddedCnt)
                    throws IOException, SAXException, TikaException {
                if (dir.hasEntry("Package")) {
                    return;
                }

                // It's regular OLE2:
                POIFSDocumentType type = POIFSDocumentType.detectType(dir);
                try {
                    if (type == POIFSDocumentType.WORDDOCUMENT) {
                        FileOutputStream fos = new FileOutputStream(new File("test.doc"));
                        HWPFDocument document = new HWPFDocument((DirectoryNode) dir);
                        document.write(fos);
                        document.close();
                    } else if (type == POIFSDocumentType.POWERPOINT) {
                        FileOutputStream fos = new FileOutputStream(new File("test.ppt"));
                        HSLFSlideShowImpl document = new HSLFSlideShowImpl((DirectoryNode) dir);
                        document.write(fos);
                        document.close(); // After calling this method, I cannot continue to extract embedded documents
                    }
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            }
        }
    }

- The [extractEmbeddedOfficeDoc] method writes the stream of
embedded documents to files. When I call [HSLFSlideShowImpl.close()] to close the "test.ppt" document, I can no longer continue the loop to extract the other embedded documents. The following exception is thrown:

    java.lang.IndexOutOfBoundsException: Block 1079 not found
        at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getBlockAt(NPOIFSFileSystem.java:486)
        at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:169)
        at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:142)
        at org.apache.poi.poifs.filesystem.NDocumentInputStream.readFully(NDocumentInputStream.java:257)
        at org.apache.poi.poifs.filesystem.NDocumentInputStream.readUShort(NDocumentInputStream.java:305)
        at org.apache.poi.poifs.filesystem.DocumentInputStream.readUShort(DocumentInputStream.java:182)
        at org.apache.poi.hssf.record.RecordInputStream$SimpleHeaderInput.readRecordSID(RecordInputStream.java:115)
        at org.apache.poi.hssf.record.RecordInputStream.readNextSid(RecordInputStream.java:198)
        at org.apache.poi.hssf.record.RecordInputStream.<init>(RecordInputStream.java:132)
        at org.apache.poi.hssf.record.RecordInputStream.<init>(RecordInputStream.java:120)
        at org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:184)
        at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:491)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:348)
        at extractor.CustomExcelExtractor$CustomAbstractPOIFSExtractor.extractEmbeddedOfficeDoc(CustomExcelExtractor.java:324)
        at extractor.CustomExcelExtractor.parse(CustomExcelExtractor.java:119)
        at extractor.CustomOfficeParser.parse(CustomOfficeParser.java:76)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
        at extractor.ExtractEmbeddedByTika.extract(ExtractEmbeddedByTika.java:37)
        at main.ExcelEmbeddedtExtractor.main(ExcelEmbeddedtExtractor.java:61)
    Caused by: java.lang.IndexOutOfBoundsException: Unable to read 512 bytes from 552960 in stream of length -1
        at org.apache.poi.poifs.nio.ByteArrayBackedDataSource.read(ByteArrayBackedDataSource.java:42)
        at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getBlockAt(NPOIFSFileSystem.java:484)
        ... 18 more

Additional Information

- The exception did NOT occur after I called [HWPFDocument.close()].

I would like to fix this issue, but I don't know its root cause. Please help me! Thanks in advance.
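One possible reading of the trace, stated as an assumption and not a confirmed diagnosis: the "stream of length -1" in the cause suggests that the data source backing the whole .xls container had already been released when the next embedded entry was read. That would happen if closing a document opened on one embedded directory also closed the container shared by all entries. The sketch below models only that idea in plain Java. It makes no POI or Tika calls, and `SharedStore`, `EmbeddedDoc` and `extractAll` are hypothetical names invented for the illustration.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

final class SharedStoreCloseDemo {

    /** Hypothetical stand-in for the container that backs every embedded entry. */
    static final class SharedStore implements Closeable {
        private byte[] buffer;
        private int size;

        SharedStore(byte[] data) {
            this.buffer = data;
            this.size = data.length;
        }

        byte read(int position) {
            // Once the store is closed, size is -1 and every read fails here.
            if (position >= size) {
                throw new IndexOutOfBoundsException(
                        "Unable to read 1 byte from " + position + " in stream of length " + size);
            }
            return buffer[position];
        }

        @Override
        public void close() {
            buffer = null;
            size = -1;
        }
    }

    /** Hypothetical stand-in for a document object opened on one entry of the store. */
    static final class EmbeddedDoc implements Closeable {
        private final SharedStore store;
        private final int offset;
        private final boolean closesStore;

        EmbeddedDoc(SharedStore store, int offset, boolean closesStore) {
            this.store = store;
            this.offset = offset;
            this.closesStore = closesStore;
        }

        byte firstByte() {
            return store.read(offset);
        }

        @Override
        public void close() {
            // ASSUMPTION: models a document whose close() also closes the shared container.
            if (closesStore) {
                store.close();
            }
        }
    }

    /** Reads three entries from one shared store and records what happens to each. */
    static List<String> extractAll(boolean closeStoreWithDoc) {
        SharedStore store = new SharedStore(new byte[] {10, 20, 30});
        List<String> log = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            try (EmbeddedDoc doc = new EmbeddedDoc(store, i, closeStoreWithDoc)) {
                log.add("entry " + i + " -> " + doc.firstByte());
            } catch (IndexOutOfBoundsException e) {
                log.add("entry " + i + " failed: " + e.getMessage());
            }
        }
        return log;
    }

    public static void main(String[] args) {
        extractAll(true).forEach(System.out::println);
        extractAll(false).forEach(System.out::println);
    }
}
```

In this model, with `closeStoreWithDoc` set to `true` the first entry is read and every later entry fails with a "length -1" message; with `false` all three entries are read. If the same mechanism applies to the code above, the closing of the per-entry document objects would be the place to look, but that remains to be confirmed against the POI version in use.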