Parser

The parser module provides low-level parsing functionality for VCF files.

vcfpy.parser.Parser

class vcfpy.parser.Parser(stream: TextIOWrapper, path: Path | str | None = None, record_checks: Iterable[Literal['FORMAT', 'INFO']] | None = None)[source]

Class for line-wise parsing of VCF files

In most cases, you want to use vcfpy.reader.Reader instead.

Parameters:
  • stream – file-like object to read from

  • path (str) – path the VCF is parsed from, for display purposes only, optional

header

The vcfpy.header.Header, once it has been read

parse_header(parsed_samples: list[str] | None = None)[source]

Read and parse vcfpy.header.Header from file, set into self.header and return it

Parameters:

parsed_samples (list) – list of str for subsetting the samples to parse

Returns:

vcfpy.header.Header

Raises:

vcfpy.exceptions.InvalidHeaderException in the case of problems reading the header

parse_line(line: str)[source]

Parse the given line without reading another one from the stream

parse_next_record()[source]

Read, parse and return next vcfpy.record.Record

Returns:

next VCF record or None if at end

Raises:

vcfpy.exceptions.InvalidRecordException in the case of problems reading the record

print_warn_summary()[source]

If there were any warnings, print summary with warnings

record_checks

checks to perform, can contain ‘INFO’ and ‘FORMAT’

samples

vcfpy.header.SamplesInfos with sample information; set on parsing the header

vcfpy.parser.RecordParser

class vcfpy.parser.RecordParser(header: Header, samples: SamplesInfos, record_checks: Iterable[Literal['FORMAT', 'INFO']] | None = None)[source]

Helper class for parsing VCF records

header

Header with the meta information

parse_line(line_str: str) → Record | None[source]

Parse line from file (including trailing line break) and return resulting Record

record_checks

The checks to perform, can contain ‘INFO’ and ‘FORMAT’

samples

SamplesInfos with sample information
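
To illustrate the "Record | None" return shape of parse_line, here is a standalone sketch that does not use vcfpy at all. The function split_record_line and the FIXED_COLUMNS tuple are hypothetical names invented for this example. It is assumed here that a line that is empty after removing the line break yields None; the real RecordParser builds a full Record with typed fields, which this sketch does not attempt.

```python
from typing import Dict, Optional

# The eight fixed, tab-separated columns of a VCF data line.
FIXED_COLUMNS = ("CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO")


def split_record_line(line_str: str) -> Optional[Dict[str, str]]:
    """Split one VCF data line into its fixed columns (hypothetical helper)."""
    # The line may still carry its trailing line break.
    line_str = line_str.rstrip("\r\n")
    if not line_str:
        return None  # nothing to parse on an empty line
    fields = line_str.split("\t")
    if len(fields) < len(FIXED_COLUMNS):
        raise ValueError(f"expected at least 8 columns, got {len(fields)}")
    # zip() stops after the fixed columns; FORMAT and sample columns are ignored.
    return dict(zip(FIXED_COLUMNS, fields))


row = split_record_line("20\t14370\trs6054257\tG\tA\t29\tPASS\tDP=14\n")
print(row["CHROM"], row["POS"], row["INFO"])
```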

vcfpy.parser.HeaderParser

class vcfpy.parser.HeaderParser[source]

Helper class for parsing a VCF header

parse_line(line: str) → HeaderLine[source]

Parse VCF header line (trailing ‘\r\n’ or ‘\n’ is ignored)

Parameters:
  • line (str) – str with line to parse

  • sub_parsers (dict) – dict mapping header line types to appropriate parser objects

Returns:

appropriate HeaderLine parsed from line

Raises:

vcfpy.exceptions.InvalidHeaderException if there was a problem parsing the file

sub_parsers

Sub parsers to use for parsing the header lines
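
The idea of mapping header line types to sub parsers can be sketched without vcfpy. Everything below is hypothetical and only illustrates the dispatch pattern: split_header_line, parse_header_line, the SUB_PARSERS dict and the tuple return values are invented for this example and are not the objects HeaderParser uses.

```python
from typing import Callable, Dict, Tuple

Parsed = Tuple[str, str, str]


def split_header_line(line: str) -> Tuple[str, str]:
    """Split '##key=value' into (key, value), ignoring a trailing line break."""
    if not line.startswith("##"):
        raise ValueError("header lines must start with '##'")
    body = line[2:].rstrip("\r\n")
    if "=" not in body:
        raise ValueError("header lines must contain '='")
    key, value = body.split("=", 1)
    return key, value


def parse_simple(key: str, value: str) -> Parsed:
    # Fallback for plain key/value lines such as ##fileformat=VCFv4.2
    return ("simple", key, value)


def parse_structured(key: str, value: str) -> Parsed:
    # Structured lines wrap their content in angle brackets.
    return ("structured", key, value.strip("<>"))


# Hypothetical mapping from header line type to the function handling it.
SUB_PARSERS: Dict[str, Callable[[str, str], Parsed]] = {
    "INFO": parse_structured,
    "FORMAT": parse_structured,
}


def parse_header_line(line: str) -> Parsed:
    key, value = split_header_line(line)
    handler = SUB_PARSERS.get(key, parse_simple)
    return handler(key, value)


print(parse_header_line("##fileformat=VCFv4.2\r\n"))
print(parse_header_line("##INFO=<ID=DP,Number=1,Type=Integer>\n"))
```

Looking up the handler by key keeps line splitting and type-specific parsing separate, so a new line type only needs a new dict entry.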

Parser Utilities

vcfpy.parser.parse_field_value(field_info: FieldInfo, value: str | bool) → bool | int | float | str | list[bool | int | float | str | None] | None[source]

Parse value according to field_info

vcfpy.parser.convert_field_value(type_: Literal['Integer', 'Float', 'Flag', 'Character', 'String'], value: str) → bool | int | float | str | None[source]

Convert atomic field value according to the type
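
The difference between converting one atomic value and parsing a whole field value can be sketched in plain Python. This is an illustration under stated assumptions, not vcfpy's implementation: convert_atomic and parse_value are hypothetical names, and the field description is reduced to a type name and a Number entry passed as plain arguments instead of a FieldInfo object. The sketch relies only on VCF conventions: "." marks a missing value, lists are comma-separated, and flags have Number=0.

```python
from typing import List, Union

Atomic = Union[bool, int, float, str, None]


def convert_atomic(type_: str, value: str) -> Atomic:
    """Convert one raw string according to a VCF type name (hypothetical)."""
    if value == ".":
        return None  # "." is the VCF missing-value marker
    if type_ == "Integer":
        return int(value)
    if type_ == "Float":
        return float(value)
    if type_ == "Flag":
        return True  # a flag carries no value; being present means True
    if type_ in ("Character", "String"):
        return value
    raise ValueError(f"unknown VCF type: {type_!r}")


def parse_value(
    type_: str, number: Union[int, str], value: str
) -> Union[Atomic, List[Atomic]]:
    """Parse a raw field value into a scalar or a list (hypothetical)."""
    if number == 0:
        return True  # flags have no value to convert
    if number == 1:
        return convert_atomic(type_, value)
    # Any other Number (e.g. 2, "A", "R", ".") is treated as a list here.
    return [convert_atomic(type_, item) for item in value.split(",")]


print(parse_value("Integer", 1, "14"))
print(parse_value("Float", "A", "0.5,."))
print(parse_value("Flag", 0, ""))
```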