Tabix
The Tabix module provides support for working with Tabix-indexed files.
vcfpy.tabix.TabixFile
- class vcfpy.tabix.TabixFile(*, filename: Path | str, index: Path | str | None = None)
Provides read access to tabix-indexed files.
- fetch(*, reference: str | None = None, start: int | None = None, end: int | None = None, region: str | None = None) -> TabixFileIter
Return an iterator over the given region.
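As a usage sketch for `fetch`, the helper below collects everything the returned iterator yields for one region. `collect_region` is a hypothetical name, not part of `vcfpy.tabix`. The coordinate convention of `start` and `end` and the type of the yielded items are not stated above, so the helper passes the values through unchanged and makes no assumption about what comes back.

```python
from __future__ import annotations

from typing import Any


def collect_region(tabix_file: Any, reference: str, start: int, end: int) -> list:
    """Collect all items that ``fetch`` yields for one region.

    ``tabix_file`` is expected to behave like ``vcfpy.tabix.TabixFile``:
    ``fetch`` is called with keyword-only arguments, as in its signature.
    """
    iterator = tabix_file.fetch(reference=reference, start=start, end=end)
    return list(iterator)


# Hypothetical usage (the file name is a placeholder):
#
#     from vcfpy.tabix import TabixFile
#     tabix_file = TabixFile(filename="calls.vcf.gz")
#     items = collect_region(tabix_file, "chr1", 1_000, 2_000)
```

Taking the opened file as an argument keeps the helper independent of how the file and its index are located.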
vcfpy.tabix.TabixFileIter
- class vcfpy.tabix.TabixFileIter(*, index: TabixIndex, reference: str, start: int | None = None, end: int | None = None, bgzf_file: BgzfReader | None = None)
Allows for easy iteration over a tabix file.
vcfpy.tabix.TabixIndex
- class vcfpy.tabix.TabixIndex(format: FileFormat, col_seq: int, col_beg: int, col_end: int, meta: bytes, skip: int, indices: dict[str, SequenceIndex], num_no_coord: int | None = None)
Tabix index as read from an index file, holding the fields that remain relevant after reading.
- col_beg: int
Column for begin position.
- col_end: int
Column for end position.
- col_seq: int
Column for sequence name.
- format: FileFormat
Format of underlying file.
- indices: dict[str, SequenceIndex]
Per-sequence indices.
- meta: bytes
Meta character, i.e., the leading character of comment lines.
- num_no_coord: int | None = None
Optional number of unmapped reads.
- skip: int
Lines to skip at the beginning.
Tabix Data Structures
- class vcfpy.tabix.Chunk(beg: int, end: int)
Chunk of the indexed file, delimited by a begin and an end virtual offset.
- beg: int
Begin virtual offset.
- end: int
End virtual offset.
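The virtual offsets stored in a chunk follow the BGZF convention used by tabix: the upper 48 bits give the file offset of a compressed block, and the lower 16 bits give the position inside that block after decompression. A minimal sketch of that encoding follows; `split_virtual_offset` and `make_virtual_offset` are hypothetical helper names and not part of `vcfpy.tabix`.

```python
from __future__ import annotations


def split_virtual_offset(voffset: int) -> tuple[int, int]:
    """Split a BGZF virtual offset into (block_offset, within_block)."""
    block_offset = voffset >> 16      # start of the compressed block in the file
    within_block = voffset & 0xFFFF   # byte position in the decompressed block
    return block_offset, within_block


def make_virtual_offset(block_offset: int, within_block: int) -> int:
    """Combine a block offset and a within-block offset into a virtual offset."""
    return (block_offset << 16) | within_block


voffset = make_virtual_offset(1000, 42)
print(split_virtual_offset(voffset))  # (1000, 42)
```

Because the block offset occupies the high bits, comparing two virtual offsets as plain integers orders them by position in the file.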
- class vcfpy.tabix.Bin(number: int, chunks: list[Chunk])
Bin with chunks.
- number: int
Bin number.
Tabix Utilities
- vcfpy.tabix.read_index(path_tbi: Path | str) -> TabixIndex
Read tabix index from given path.
- Parameters:
path_tbi – path to the tabix index file
- Returns:
the read index
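Once an index has been read, its documented attributes can be inspected directly. The sketch below gathers a few of them into a dictionary. `summarize_index` is a hypothetical helper, and it relies only on the `TabixIndex` attributes listed above (`col_seq`, `col_beg`, `col_end`, `skip`, `indices`).

```python
from __future__ import annotations

from typing import Any


def summarize_index(index: Any) -> dict[str, Any]:
    """Summarize the column layout and sequence names of a tabix index.

    ``index`` is expected to expose the attributes of
    ``vcfpy.tabix.TabixIndex``.
    """
    return {
        "columns": (index.col_seq, index.col_beg, index.col_end),
        "skip": index.skip,
        # Keys of ``indices`` are the sequence names covered by the index.
        "sequences": sorted(index.indices),
    }


# Hypothetical usage (the path is a placeholder):
#
#     from vcfpy.tabix import read_index
#     print(summarize_index(read_index("calls.vcf.gz.tbi")))
```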
- vcfpy.tabix.reg2bins(beg: int, end: int) -> list[int]
Get list of bins that may overlap a region [beg, end).
Based on the UCSC binning scheme used by tabix.
- Parameters:
beg – 0-based start position (inclusive)
end – 0-based end position (exclusive)
- Returns:
list of bin numbers that may overlap the region (in reverse order)
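To illustrate the binning scheme, here is a sketch that follows the reference `reg2bins` routine published with the tabix and SAM specifications. It is not the `vcfpy` implementation: `reg2bins_sketch` is a hypothetical name, and it lists bins from the coarsest level to the finest, which may differ from the order that `vcfpy.tabix.reg2bins` returns.

```python
from __future__ import annotations

# (number of the first bin on the level, right shift for that level's bin size)
# Bin sizes: 64 Mbp, 8 Mbp, 1 Mbp, 128 kbp, 16 kbp. Bin 0 spans 512 Mbp.
_LEVELS = ((1, 26), (9, 23), (73, 20), (585, 17), (4681, 14))


def reg2bins_sketch(beg: int, end: int) -> list[int]:
    """List bins that may overlap the 0-based, half-open region [beg, end)."""
    end -= 1  # switch to an inclusive end position
    bins = [0]  # the single top-level bin overlaps every region
    for first_bin, shift in _LEVELS:
        lo = first_bin + (beg >> shift)
        hi = first_bin + (end >> shift)
        bins.extend(range(lo, hi + 1))
    return bins


print(reg2bins_sketch(0, 1))      # [0, 1, 9, 73, 585, 4681]
print(reg2bins_sketch(0, 16385))  # [0, 1, 9, 73, 585, 4681, 4682]
```

The second call crosses a 16 kbp boundary, so it picks up one extra bin on the finest level while the coarser levels are unchanged.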