TableCodec isn't properly indexable due to buffering done by OpenCSV reader. #3440

cmnbroad · 2017-08-14T18:36:28Z

TableCodec has traditionally taken advantage of a quirk of the htsjdk implementation of tabix indexing, in which the input stream being indexed was closed and then reopened between reading the header and indexing the features. That quirk had several failure modes (see samtools/htsjdk#393 and samtools/htsjdk#943). These are fixed in samtools/htsjdk#906, and the stream is no longer closed by htsjdk.

However, TableCodec required a modification to remain indexable with these fixes, because it uses a CSV reader (indirectly, through TableReader) that buffers input, which thwarts feature-by-feature indexing. We should find a better long-term fix for this: either a way to prevent OpenCSV from buffering, or possibly a different CSV implementation.
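To illustrate why read-ahead buffering is a problem here, the sketch below uses only JDK classes, not OpenCSV, TableReader, or htsjdk. It assumes that an indexer needs the underlying stream position to sit at the end of each record once that record has been read. `BufferingDemo`, `CountingInputStream`, and `bytesConsumedAfterFirstLine` are hypothetical names invented for this example; `BufferedReader` stands in for any reader that buffers its input.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of GATK, htsjdk, or OpenCSV.
class BufferingDemo {

    // Counts how many bytes have been pulled from the wrapped stream.
    static final class CountingInputStream extends FilterInputStream {
        long bytesRead = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) {
                bytesRead++;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                bytesRead += n;
            }
            return n;
        }
    }

    // Reads one line through a buffering reader and reports how many bytes
    // the underlying stream has handed out by that point.
    static long bytesConsumedAfterFirstLine(String data) {
        byte[] bytes = data.getBytes(StandardCharsets.US_ASCII);
        CountingInputStream counting =
                new CountingInputStream(new ByteArrayInputStream(bytes));
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(counting, StandardCharsets.US_ASCII))) {
            reader.readLine(); // consume exactly one record at the reader level
            return counting.bytesRead;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String table = "chr1\t1\t10\nchr1\t20\t30\nchr2\t5\t15\n";
        long firstRecordEnd = table.indexOf('\n') + 1;
        long consumed = bytesConsumedAfterFirstLine(table);
        System.out.println("first record ends at byte " + firstRecordEnd
                + ", but the stream has consumed " + consumed + " bytes");
    }
}
```

After a single `readLine()`, the underlying stream has already advanced past the end of the first record, so an offset taken from the stream at that point would not mark the record boundary. Any long-term fix presumably has to either prevent the read-ahead or track record offsets independently of the underlying stream position.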
