class vcfpy.Record(CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, calls)

Represent one record from the VCF file

Record objects are iterable; iterating over a record yields its calls

ALT = None

A list of alternative allele records of type AltRecord

CHROM = None

A str with the chromosome name


FILTER = None

A list of strings for the FILTER column


FORMAT = None

A list of strings for the FORMAT column

ID = None

A list of the semicolon-separated values of the ID column

INFO = None

An OrderedDict giving the values of the INFO column; flags are mapped to True
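As a rough illustration of this mapping, the sketch below turns a raw INFO string into an OrderedDict. The helper name parse_info is hypothetical and not part of vcfpy, and the sketch keeps every value as a str, whereas a real parser would presumably convert values according to the header.

```python
from collections import OrderedDict


def parse_info(info_str):
    """Hypothetical helper (not part of vcfpy): parse a raw INFO column."""
    result = OrderedDict()
    if info_str == ".":  # "." marks an empty INFO column in VCF
        return result
    for entry in info_str.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            result[key] = value  # kept as str in this sketch
        else:
            result[entry] = True  # flags carry no value and map to True
    return result


info = parse_info("DP=14;DB;AF=0.5")
print(list(info.items()))
```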

POS = None

An int with a 1-based begin position

QUAL = None

The quality value, can be None

REF = None

A str with the REF value


Add label to FILTER if not set yet, removing PASS entry if present
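The behaviour described above can be sketched on a plain list of filter labels. The function below is a hypothetical stand-in that does not call vcfpy; it only mirrors the documented rule (add the label if missing, drop PASS when doing so).

```python
def add_filter(filters, label):
    """Hypothetical sketch: return the FILTER list after adding label."""
    if label in filters:
        return list(filters)  # already set: nothing changes
    # Drop any PASS entry, then append the new label.
    return [f for f in filters if f != "PASS"] + [label]


print(add_filter(["PASS"], "q10"))
print(add_filter(["q10"], "s50"))
```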

add_format(key, value=None)

Add an entry to format

If key is already in FORMAT, nothing is done. Otherwise, key is added to FORMAT and, if value is not None, data[key] is set to value for each of the record’s calls that does not have it set yet.
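A minimal sketch of these rules, using a plain list for FORMAT and plain dicts for the per-call data. The function and variable names are illustrative only, and the sketch assumes the key is appended to FORMAT when it is new.

```python
def add_format(format_keys, calls_data, key, value=None):
    """Hypothetical sketch of the documented add_format behaviour."""
    if key in format_keys:
        return  # key already in FORMAT: nothing is done
    format_keys.append(key)
    if value is not None:
        for data in calls_data:
            data.setdefault(key, value)  # only set when not yet present


format_keys = ["GT"]
calls_data = [{"GT": "0/1"}, {"GT": "1/1", "DP": 7}]
add_format(format_keys, calls_data, "DP", 0)     # new key with a default
add_format(format_keys, calls_data, "GT", "./.")  # existing key: no effect
print(format_keys, calls_data)
```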


Return affected end position in 0-based coordinates

For SNVs, MNVs, and deletions, the behaviour is based on the start position and the length of the REF. In the case of insertions, the position behind the insert position is returned, yielding a 0-length interval together with affected_start()


Return affected start position in 0-based coordinates

For SNVs, MNVs, and deletions, the result is the start position. In the case of insertions, the position behind the insert position is returned, yielding a 0-length interval together with affected_end()
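The two descriptions above can be read together as an interval computation. The sketch below is a hypothetical helper, not the library code; it assumes POS is 1-based, the end coordinate is exclusive, and the caller already knows whether the record consists only of insertions.

```python
def affected_interval(pos, ref, only_insertions):
    """Hypothetical helper: return (start, end) in 0-based coordinates.

    pos is the 1-based POS, ref the REF string, and only_insertions
    states whether all ALT alleles are insertions.
    """
    if only_insertions:
        # Position behind the insert position; start == end, length 0.
        return pos, pos
    start = pos - 1  # 0-based position of the first REF base
    return start, start + len(ref)  # end assumed to be exclusive


snv = affected_interval(100, "A", False)
deletion = affected_interval(100, "ATG", False)
insertion = affected_interval(100, "A", True)
print(snv, deletion, insertion)
```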

begin = None

An int with a 0-based begin position

call_for_sample = None

A mapping from sample name to entry in self.calls

calls = None

A list of genotype Call objects

end = None

An int with a 0-based end position


Return True if it is a SNV


class vcfpy.Call(sample, data, site=None)

The information for a genotype call

According to the VCF specification, this should always include the genotype information and can contain an arbitrary number of further annotations, e.g., the coverage at the variant position.

called = None

whether or not the variant is fully called

data = None

an OrderedDict with the key/value pair information from the call’s data

gt_alleles = None

the allele numbers (0, 1, …) in this call or None for no-call


Return the actual genotype bases, e.g. if VCF genotype is 0/1, could return (‘A’, ‘T’)
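The lookup from allele numbers to bases can be sketched as follows. The function is a hypothetical stand-in that takes REF, the ALT values, and the allele numbers as plain Python values; passing a no-call allele through as None is an assumption of this sketch.

```python
def gt_bases(ref, alts, gt_alleles):
    """Hypothetical sketch: map allele numbers to their base sequences."""
    alleles = [ref] + list(alts)  # allele 0 is REF, 1.. index into ALT
    # Assumption: a no-call allele (None) is passed through unchanged.
    return tuple(None if a is None else alleles[a] for a in gt_alleles)


het = gt_bases("A", ["T"], [0, 1])            # genotype 0/1
hom_alt = gt_bases("A", ["T", "G"], [2, 2])   # genotype 2/2
print(het, hom_alt)
```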


Return character to use for phasing


The type of genotype; one of HOM_REF, HOM_ALT, or HET.
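One plausible way to derive the genotype type from the allele numbers is sketched below. The string constants are placeholders for the library's own HOM_REF, HOM_ALT, and HET values, and treating a call with two different ALT alleles (e.g. 1/2) as HET is an assumption of this sketch.

```python
# Placeholder constants; the library defines its own values.
HOM_REF, HET, HOM_ALT = "HOM_REF", "HET", "HOM_ALT"


def gt_type(gt_alleles):
    """Hypothetical sketch: classify a fully called genotype."""
    if all(a == 0 for a in gt_alleles):
        return HOM_REF  # every allele is the reference allele
    if len(set(gt_alleles)) == 1:
        return HOM_ALT  # all alleles identical and not reference
    return HET  # at least two different alleles


print(gt_type([0, 0]), gt_type([0, 1]), gt_type([1, 1]))
```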

is_filtered(require=None, ignore=None)

Return True for filtered calls

Parameters:
  • ignore (iterable) – if set, the filters to ignore; make sure to include ‘PASS’ when setting this. Default is ['PASS'].
  • require (iterable) – if set, the filters to require for returning True
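The interplay of ignore and require can be sketched on a plain list of filter labels. This is a hypothetical stand-in, not the library code; it assumes the call's filter labels are available as a list and that a call without any labels counts as not filtered.

```python
def is_filtered(call_filters, require=None, ignore=None):
    """Hypothetical sketch of the documented require/ignore semantics."""
    ignore = ignore or ["PASS"]  # default: only PASS is ignored
    for label in call_filters:
        if label in ignore:
            continue  # ignored labels never make a call filtered
        if not require or label in require:
            return True  # any remaining (or required) label counts
    return False


print(is_filtered(["PASS"]), is_filtered(["q10"]))
```

Note that a custom ignore list replaces the default, which is why ‘PASS’ has to be listed again when ignore is set.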

Return True for heterozygous calls


Return boolean indicating whether this call is phased


Return True for non-hom-ref calls

plodity = None

the number of alleles in this sample’s call

sample = None

the name of the sample for which the call was made

site = None

the Record of this Call


class vcfpy.AltRecord(type_=None)

An alternative allele Record

Currently, this can be a substitution, an SV placeholder, or a breakend


Return str with representation for VCF file

type = None

String describing the type of the variant. This can be SNV or MNV, or any of the types described in the ALT header lines, such as DUP, DEL, INS, …


class vcfpy.Substitution(type_, value)

A basic alternative allele record describing a REF->ALT substitution

Note that this subsumes MNVs, insertions, and deletions.

value = None

The alternative base sequence to use in the substitution
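To illustrate how a single REF->ALT substitution can cover SNVs, MNVs, insertions, and deletions, the sketch below derives a type label from the two sequences. It is a deliberately simplified, hypothetical classifier based on sequence length only; it is not the library's logic and does not handle complex indels separately.

```python
def substitution_type(ref, alt):
    """Hypothetical, simplified classifier based on sequence lengths."""
    if len(ref) == len(alt):
        # Same length: single base change or multi-base change.
        return "SNV" if len(ref) == 1 else "MNV"
    if len(ref) > len(alt):
        return "DEL"  # ALT is shorter than REF
    return "INS"  # ALT is longer than REF


for ref, alt in [("A", "T"), ("AT", "GC"), ("ATG", "A"), ("A", "ATG")]:
    print(ref, alt, substitution_type(ref, alt))
```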




class vcfpy.BreakEnd(mate_chrom, mate_pos, orientation, mate_orientation, sequence, within_main_assembly)

A placeholder for a breakend

mate_chrom = None

chromosome of the mate breakend

mate_orientation = None

orientation of the breakend’s mate

mate_pos = None

position of the mate breakend

orientation = None

orientation of this breakend

sequence = None

breakpoint’s connecting sequence


Return string representation for VCF

within_main_assembly = None

bool specifying if the breakend mate is within the assembly (True) or in an ancillary assembly (False)


class vcfpy.SingleBreakEnd(orientation, sequence)

A placeholder for a single breakend


class vcfpy.SymbolicAllele(value)

A placeholder for a symbolic allele

The allele symbol must be defined in the header using an ALT header line before being parsed. Usually, this is used for succinct descriptions of structural variants or IUPAC ambiguity codes.

value = None

The symbolic value, e.g. ‘DUP’
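The relation between ALT header lines and symbolic alleles can be sketched with plain strings. Both helpers below are hypothetical and do not use vcfpy; raising ValueError for an undefined symbol is a choice made for this sketch, not documented library behaviour.

```python
import re

# Matches the ID of an ALT header line such as ##ALT=<ID=DUP,...>
ALT_HEADER = re.compile(r"^##ALT=<ID=([^,>]+)")


def defined_symbols(header_lines):
    """Collect the allele symbols declared in ALT header lines."""
    matches = (ALT_HEADER.match(line) for line in header_lines)
    return {m.group(1) for m in matches if m}


def parse_symbolic(alt_str, symbols):
    """Return the symbolic value of an ALT entry such as '<DUP>'."""
    value = alt_str[1:-1]  # strip the surrounding angle brackets
    if value not in symbols:
        raise ValueError("symbol not defined in header: " + value)
    return value


header = [
    '##ALT=<ID=DUP,Description="Duplication">',
    '##ALT=<ID=DEL,Description="Deletion">',
]
symbols = defined_symbols(header)
print(parse_symbolic("<DUP>", symbols))
```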