Records

vcfpy.Record

class vcfpy.Record(CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, calls)[source]

Represent one record from the VCF file

Record objects are iterable; iterating over a record yields its calls

ALT = None

A list of alternative allele records of type AltRecord

CHROM = None

A str with the chromosome name

FILTER = None

A list of strings for the FILTER column

FORMAT = None

A list of strings for the FORMAT column

ID = None

A list of the semicolon-separated values of the ID column

INFO = None

An OrderedDict giving the values of the INFO column; flags are mapped to True
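The mapping described here can be illustrated with a small standalone sketch. The helper name parse_info_sketch is hypothetical and not part of vcfpy. As a simplifying assumption, every non-flag value is kept as a list of raw strings, whereas a real parser would convert values according to the INFO definitions in the header.

```python
from collections import OrderedDict


def parse_info_sketch(info_str):
    """Parse a raw INFO column into an OrderedDict (simplified sketch)."""
    result = OrderedDict()
    if info_str in ("", "."):
        return result  # "." marks an empty INFO column
    for entry in info_str.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            # Assumption: keep raw strings, no type conversion
            result[key] = value.split(",")
        else:
            result[entry] = True  # a bare key is a flag
    return result


info = parse_info_sketch("DP=14;AF=0.5,0.25;DB")
```

In this sketch, info keeps the column order (DP, AF, DB) and the flag DB is stored as True.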

POS = None

An int with a 1-based begin position

QUAL = None

The quality value, can be None

REF = None

A str with the REF value

add_filter(label)[source]

Add label to FILTER if not set yet, removing PASS entry if present
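A minimal sketch of the described behaviour, operating on a plain list of filter labels. The function name add_filter_sketch is hypothetical; it only mirrors the documented semantics and is not the library's implementation.

```python
def add_filter_sketch(filters, label):
    """Add label to the list of filters unless it is already present.

    A PASS entry is removed first, since a record cannot both pass
    and carry a failing filter label.
    """
    if label not in filters:
        if "PASS" in filters:
            filters.remove("PASS")
        filters.append(label)
    return filters


passing = add_filter_sketch(["PASS"], "LowQual")
unchanged = add_filter_sketch(["LowQual"], "LowQual")
extended = add_filter_sketch(["q10"], "s50")
```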

add_format(key, value=None)[source]

Add an entry to format

For each of the record’s calls, data[key] will be set to value if it is not yet set and value is not None. If key is already in FORMAT then nothing is done.
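The rule above can be sketched on plain Python containers. The names add_format_sketch, format_keys, and calls_data are hypothetical stand-ins for the record's FORMAT list and the data mappings of its calls; this is an illustration under those assumptions, not vcfpy's code.

```python
def add_format_sketch(format_keys, calls_data, key, value=None):
    """Declare key in FORMAT and optionally fill in a default per call."""
    if key in format_keys:
        return  # already declared: nothing is done
    format_keys.append(key)
    if value is not None:
        for data in calls_data:
            # Only set the value where the call has none yet
            data.setdefault(key, value)


format_keys = ["GT"]
calls_data = [{"GT": "0/1"}, {"GT": "1/1", "DP": 30}]
add_format_sketch(format_keys, calls_data, "DP", 0)
```

After the call, DP is listed in format_keys, the first call receives the default 0, and the second call keeps its existing value 30.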

affected_end

Return affected end position in 0-based coordinates

For SNVs, MNVs, and deletions, the result is derived from the start position and the length of REF. In the case of insertions, the position behind the insert position is returned, yielding a 0-length interval together with affected_start

affected_start

Return affected start position in 0-based coordinates

For SNVs, MNVs, and deletions, the result is the start position. In the case of insertions, the position behind the insert position is returned, yielding a 0-length interval together with affected_end
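The coordinate arithmetic for both properties can be sketched as follows. The helper affected_interval_sketch and its is_insertion flag are hypothetical; the sketch assumes a 1-based POS and that "the position behind the insert position" equals POS when expressed in 0-based coordinates.

```python
def affected_interval_sketch(pos, ref, is_insertion):
    """Return (start, end) of the affected region, 0-based, end-exclusive."""
    if is_insertion:
        # Zero-length interval directly behind the anchoring base
        return pos, pos
    start = pos - 1  # convert the 1-based POS to a 0-based start
    return start, start + len(ref)


snv = affected_interval_sketch(100, "A", False)
deletion = affected_interval_sketch(100, "ACGT", False)
insertion = affected_interval_sketch(100, "A", True)
```

Under these assumptions, an SNV at POS 100 covers (99, 100), a deletion with a four-base REF covers (99, 103), and an insertion yields the empty interval (100, 100).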

begin = None

An int with a 0-based begin position

call_for_sample = None

A mapping from sample name to entry in self.calls

calls = None

A list of genotype Call objects

end = None

An int with a 0-based end position

is_snv()[source]

Return True if it is a SNV

vcfpy.Call

class vcfpy.Call(sample, data, site=None)[source]

The information for a genotype call

According to the VCF specification, this should always include the genotype information and can contain an arbitrary number of further annotations, e.g., the coverage at the variant position.

called = None

whether or not the variant is fully called

data = None

an OrderedDict with the key/value pair information from the call’s data

gt_alleles = None

the allele numbers (0, 1, …) in this call, or None for a no-call

gt_bases

Return the actual genotype bases, e.g. if VCF genotype is 0/1, could return (‘A’, ‘T’)
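A standalone sketch of how allele numbers in a genotype string map to bases. The function gt_bases_sketch is hypothetical and takes the REF and ALT sequences as plain strings rather than reading them from a Record; treating missing alleles (".") as None is an assumption.

```python
import re


def gt_bases_sketch(gt, ref, alts):
    """Map a genotype string such as "0/1" to a tuple of base sequences."""
    alleles = [ref] + list(alts)  # allele 0 is REF, 1.. index into ALT
    bases = []
    for token in re.split(r"[/|]", gt):  # "/" is unphased, "|" is phased
        bases.append(None if token == "." else alleles[int(token)])
    return tuple(bases)


het = gt_bases_sketch("0/1", "A", ["T"])
multi = gt_bases_sketch("1|2", "A", ["T", "G"])
missing = gt_bases_sketch("./.", "A", ["T"])
```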

gt_phase_char

Return character to use for phasing

gt_type

The type of genotype; one of HOM_REF, HOM_ALT, or HET.
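The classification can be sketched from a list of allele numbers. The function gt_type_sketch is hypothetical, the string labels stand in for the library's constants, and returning None for calls that are not fully called is an assumption of this sketch.

```python
def gt_type_sketch(gt_alleles):
    """Classify allele numbers as HOM_REF, HOM_ALT, or HET."""
    if not gt_alleles or any(a is None for a in gt_alleles):
        return None  # assumption: no type for calls that are not fully called
    if all(a == 0 for a in gt_alleles):
        return "HOM_REF"  # every allele is the reference
    if len(set(gt_alleles)) == 1:
        return "HOM_ALT"  # all alleles identical and non-reference
    return "HET"  # at least two different alleles


types = [gt_type_sketch(g) for g in ([0, 0], [1, 1], [0, 1], [1, 2], [None, 1])]
```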

is_filtered(require=None, ignore=None)[source]

Return True for filtered calls

Parameters:
  • ignore (iterable) – if set, the filters to ignore; make sure to include ‘PASS’ when setting this. The default is ['PASS']
  • require (iterable) – if set, the filters to require for returning True
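How the two parameters interact can be sketched on a plain list of filter labels. The function is_filtered_sketch is hypothetical, and call_filters is assumed here to hold the labels of the call's FT field; the logic follows the parameter descriptions above rather than the library's source.

```python
def is_filtered_sketch(call_filters, require=None, ignore=None):
    """Return True if any relevant filter label is set on the call."""
    ignore = list(ignore) if ignore else ["PASS"]
    for label in call_filters or []:
        if label in ignore:
            continue  # ignored labels never count as filtered
        if not require or label in require:
            return True  # any label counts, or only the required ones
    return False


results = [
    is_filtered_sketch(["PASS"]),
    is_filtered_sketch(["LowQual"]),
    is_filtered_sketch(["LowQual"], ignore=["PASS", "LowQual"]),
    is_filtered_sketch(["LowQual"], require=["MinDP"]),
]
```

Note that passing a custom ignore list replaces the default, which is why 'PASS' has to be listed explicitly.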

is_het

Return True for heterozygous calls

is_phased

Return boolean indicating whether this call is phased

is_variant

Return True for non-hom-ref calls

plodity = None

the number of alleles in this sample’s call

sample = None

the name of the sample for which the call was made

site = None

the Record of this Call

vcfpy.AltRecord

class vcfpy.AltRecord(type_=None)[source]

An alternative allele Record

Currently, this can be a substitution, an SV placeholder, or a breakend

serialize()[source]

Return str with representation for VCF file

type = None

String describing the type of the variant. This could be SNV or MNV, or any of the types described in the ALT header lines, such as DUP, DEL, INS, …

vcfpy.Substitution

class vcfpy.Substitution(type_, value)[source]

A basic alternative allele record describing a REF->AltRecord substitution

Note that this subsumes MNVs, insertions, and deletions.
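Since one class covers several variant kinds, the type string is what distinguishes them. A rough sketch of how a type could be derived from the REF and ALT sequences follows; the function classify_substitution_sketch and the exact rules (including the INDEL label for mixed cases) are assumptions for illustration.

```python
def classify_substitution_sketch(ref, alt):
    """Derive a variant type label from REF and ALT base sequences."""
    if len(ref) == len(alt):
        return "SNV" if len(ref) == 1 else "MNV"
    if len(ref) > len(alt):
        # Pure deletion: only the anchoring first base remains
        return "DEL" if len(alt) == 1 and ref[0] == alt[0] else "INDEL"
    # Pure insertion: REF is just the anchoring first base
    return "INS" if len(ref) == 1 and ref[0] == alt[0] else "INDEL"


labels = [
    classify_substitution_sketch("A", "T"),
    classify_substitution_sketch("AC", "TG"),
    classify_substitution_sketch("ACG", "A"),
    classify_substitution_sketch("A", "ACG"),
    classify_substitution_sketch("AC", "TGA"),
]
```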

value = None

The alternative base sequence to use in the substitution

vcfpy.SV

vcfpy.BreakEnd

class vcfpy.BreakEnd(mate_chrom, mate_pos, orientation, mate_orientation, sequence, within_main_assembly)[source]

A placeholder for a breakend

mate_chrom = None

chromosome of the mate breakend

mate_orientation = None

orientation of the breakend’s mate

mate_pos = None

position of the mate breakend

orientation = None

orientation of this breakend

sequence = None

breakpoint’s connecting sequence

serialize()[source]

Return string representation for VCF

within_main_assembly = None

bool specifying if the breakend mate is within the assembly (True) or in an ancillary assembly (False)
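To show how these attributes combine into the breakend notation of the VCF ALT column, here is a standalone sketch. The function serialize_breakend_sketch is hypothetical. Its bracket and sequence_first parameters are stand-ins for the two orientation attributes, because this section does not state how orientations are encoded.

```python
def serialize_breakend_sketch(mate_chrom, mate_pos, sequence, bracket,
                              sequence_first, within_main_assembly=True):
    """Build a VCF breakend ALT string such as "G]17:198982]"."""
    if bracket not in ("[", "]"):
        raise ValueError("bracket must be '[' or ']'")
    # Contigs outside the main assembly are written in angle brackets
    contig = mate_chrom if within_main_assembly else "<{}>".format(mate_chrom)
    mate = "{b}{c}:{p}{b}".format(b=bracket, c=contig, p=mate_pos)
    # The connecting sequence goes before or after the mate position
    return sequence + mate if sequence_first else mate + sequence


after = serialize_breakend_sketch("17", 198982, "G", "]", True)
before = serialize_breakend_sketch("13", 123456, "T", "]", False)
ancillary = serialize_breakend_sketch("ctg1", 1, "C", "[", True,
                                      within_main_assembly=False)
```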

vcfpy.SingleBreakEnd

class vcfpy.SingleBreakEnd(orientation, sequence)[source]

A placeholder for a single breakend

vcfpy.SymbolicAllele

class vcfpy.SymbolicAllele(value)[source]

A placeholder for a symbolic allele

The allele symbol must be defined in the header using an ALT header line before being parsed. Usually, this is used for succinct descriptions of structural variants or IUPAC ambiguity codes.

value = None

The symbolic value, e.g. ‘DUP’
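The requirement that a symbol be declared in the header before use can be sketched with two small helpers. Both function names are hypothetical, and raising ValueError for an undeclared symbol is an assumption of this sketch, not documented library behaviour.

```python
import re


def declared_alt_ids_sketch(header_lines):
    """Collect the IDs declared by ##ALT header lines."""
    ids = set()
    for line in header_lines:
        match = re.match(r"##ALT=<ID=([^,>]+)", line)
        if match:
            ids.add(match.group(1))
    return ids


def symbolic_value_sketch(alt_field, declared_ids):
    """Return the symbol inside "<...>", or None if the field is not symbolic."""
    if not (alt_field.startswith("<") and alt_field.endswith(">")):
        return None
    value = alt_field[1:-1]
    if value not in declared_ids:
        raise ValueError("symbolic allele {!r} is not declared".format(value))
    return value


header = ['##fileformat=VCFv4.3', '##ALT=<ID=DUP,Description="Duplication">']
declared = declared_alt_ids_sketch(header)
value = symbolic_value_sketch("<DUP>", declared)
```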