Alignment methods
-
biscot.Alignment.
get_leftmost_label
(label_list, channel, reference_map) Looks up the position of every label in a list of label ids on the reference maps and returns the id of the label with the minimum position on the anchor
- Parameters
label_list (list(int)) – List of label ids
channel (int (1 or 2)) – Enzyme channel to consider to extract label position
reference_map (dict(int, Map)) – Dict containing anchor Map objects
- Returns
Returns the label id that satisfies label_position = min(all_label_positions)
- Return type
int
-
biscot.Alignment.
get_rightmost_label
(label_list, channel, reference_map) Looks up the position of every label in a list of label ids on the reference maps and returns the id of the label with the maximum position on the anchor
- Parameters
label_list (list(int)) – List of label ids
channel (int (1 or 2)) – Enzyme channel to consider to extract label position
reference_map (dict(int, Map)) – Dict containing anchor Map objects
- Returns
Returns the label id that satisfies label_position = max(all_label_positions)
- Return type
int
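The two label-selection helpers above can be illustrated with a minimal sketch. `ToyMap` and its `labels` layout are hypothetical stand-ins, not the BiSCoT `Map` class, and the sketch takes a single anchor map rather than the `reference_map` dict, because how the real functions pick the anchor is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMap:
    """Hypothetical stand-in for an anchor Map (not the BiSCoT class)."""
    # labels[channel][label_id] -> position of the label on the anchor
    labels: dict = field(default_factory=dict)


def leftmost_label(label_list, channel, anchor):
    """Return the label id with the smallest position on the anchor."""
    positions = anchor.labels[channel]
    return min(label_list, key=lambda label_id: positions[label_id])


def rightmost_label(label_list, channel, anchor):
    """Return the label id with the largest position on the anchor."""
    positions = anchor.labels[channel]
    return max(label_list, key=lambda label_id: positions[label_id])


anchor = ToyMap(labels={1: {10: 5000.0, 11: 1200.0, 12: 9800.0}})
print(leftmost_label([10, 11, 12], 1, anchor))   # 11
print(rightmost_label([10, 11, 12], 1, anchor))  # 12
```

Using `min`/`max` with a `key` returns the label id directly, which matches the documented return value (an id, not a position).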
Parses two Alignment objects and returns the anchor map label ids to which both contig maps are aligned
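A shared-label lookup of this kind amounts to a set intersection. The sketch below assumes each alignment exposes its label pairs as a list of `(anchor_label_id, contig_label_id)` tuples; that representation and the function name `shared_anchor_labels` are hypothetical.

```python
def shared_anchor_labels(pairs_1, pairs_2):
    """Return the sorted anchor label ids that appear in both alignments."""
    anchor_ids_1 = {anchor_id for anchor_id, _ in pairs_1}
    anchor_ids_2 = {anchor_id for anchor_id, _ in pairs_2}
    return sorted(anchor_ids_1 & anchor_ids_2)


# Two contig maps whose alignments overlap on anchor label 103 only.
aln_1_pairs = [(101, 1), (102, 2), (103, 3)]
aln_2_pairs = [(103, 9), (104, 8), (105, 7)]
print(shared_anchor_labels(aln_1_pairs, aln_2_pairs))  # [103]
```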
-
biscot.Alignment.
line_to_alignment
(line, channel) Converts an xmap line to an Alignment object
- Parameters
line (str) – A line of an xmap file
channel (int) – Enzyme channel to consider
- Returns
An alignment object
- Return type
Alignment
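As a rough illustration of the conversion, the sketch below parses one record, assuming the 14-column tab-separated Bionano XMAP layout (XmapEntryID, QryContigID, RefContigID, QryStartPos, QryEndPos, RefStartPos, RefEndPos, Orientation, Confidence, HitEnum, QryLen, RefLen, LabelChannel, Alignment). `ToyAlignment` and its field names are hypothetical and do not mirror the real `Alignment` class; two-enzyme xmaps may carry additional columns that this sketch ignores.

```python
import re
from dataclasses import dataclass


@dataclass
class ToyAlignment:
    """Hypothetical container; field names are illustrative only."""
    map_id: int
    reference_id: int
    map_start: float
    map_end: float
    reference_start: float
    reference_end: float
    orientation: str
    confidence: float
    channel: int
    label_pairs: list  # [(anchor_label_id, contig_label_id), ...]


def parse_xmap_line(line, channel):
    """Split one tab-separated XMAP record into a ToyAlignment."""
    cols = line.rstrip("\n").split("\t")
    # The last column holds "(ref,qry)" label pairs written back to back.
    label_pairs = [
        (int(ref_label), int(qry_label))
        for ref_label, qry_label in re.findall(r"\((\d+),(\d+)\)", cols[13])
    ]
    return ToyAlignment(
        map_id=int(cols[1]),
        reference_id=int(cols[2]),
        map_start=float(cols[3]),
        map_end=float(cols[4]),
        reference_start=float(cols[5]),
        reference_end=float(cols[6]),
        orientation=cols[7],
        confidence=float(cols[8]),
        channel=channel,
        label_pairs=label_pairs,
    )


# Made-up record used only to exercise the parser.
example = "\t".join([
    "1", "12", "3", "1500.0", "98000.5", "20000.0", "116500.5",
    "+", "25.4", "4M1D3M", "100000.0", "500000.0", "1", "(5,1)(6,2)(7,3)",
])
aln = parse_xmap_line(example, channel=1)
print(aln.reference_id, aln.label_pairs)  # 3 [(5, 1), (6, 2), (7, 3)]
```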
-
biscot.Alignment.
parse_xmap
(reference_maps_dict, xmap_1_path, xmap_2_path, deleted_xmap_records, xmap_two_enzymes_path='', only_confirmed_positions=False) Parses one to three xmap files and converts their lines to Alignment objects
- Parameters
reference_maps_dict (dict(int, Map)) – Dict containing anchor Map objects
xmap_1_path (str) – Path to the first xmap file
xmap_2_path (str) – Path to the second xmap file
deleted_xmap_records (dict(int, Alignment)) – Dict containing Alignment objects that were deleted due to a larger alignment being found
xmap_two_enzymes_path (str, optional) – Path to the 2-enzyme xmap file, defaults to “”
only_confirmed_positions (bool, optional) – If True, only alignments found in xmap_1 or xmap_2 AND in the 2-enzyme xmap are kept, defaults to False
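The effect of only_confirmed_positions can be pictured as a membership filter. The sketch below is an assumption about the idea, not the actual implementation: alignments are plain dicts, and an alignment counts as "confirmed" when its (anchor id, contig map id) pair also occurs in the 2-enzyme xmap. The real matching criterion is not stated in this page.

```python
def keep_confirmed(single_enzyme_alns, two_enzyme_alns):
    """Keep single-enzyme alignments that also occur in the 2-enzyme set.

    Assumption: two alignments match when they share the same
    (reference_id, map_id) pair. BiSCoT may use a different criterion.
    """
    confirmed = {(a["reference_id"], a["map_id"]) for a in two_enzyme_alns}
    return [
        a for a in single_enzyme_alns
        if (a["reference_id"], a["map_id"]) in confirmed
    ]


xmap_1 = [{"reference_id": 1, "map_id": 10}, {"reference_id": 1, "map_id": 11}]
xmap_2 = [{"reference_id": 2, "map_id": 12}]
xmap_two_enzymes = [{"reference_id": 1, "map_id": 10}, {"reference_id": 2, "map_id": 12}]

kept = keep_confirmed(xmap_1 + xmap_2, xmap_two_enzymes)
print([a["map_id"] for a in kept])  # [10, 12]
```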
-
biscot.Alignment.
print_agp
(reference_maps_dict, key_dict, deleted_xmap_records, contigs_map_dict) Searches for shared labels between two Alignment objects and calls the matching AGP-printing function (with or without intersection)
- Parameters
reference_maps_dict (dict(int, Map)) – Dict containing anchor maps
key_dict (dict((int, int, int), (str, int, int, int))) – Dict containing the correspondence between contigs and contig maps
deleted_xmap_records (dict(int, Alignment)) – Dict containing smaller alignments that weren’t retained when parsing xmaps
contigs_map_dict (dict(int, Map)) – Dict containing contig maps
- Returns
List containing contig maps that were used to build current scaffold
- Return type
list(int)
-
biscot.Alignment.
print_agp_line_no_intersection
(aln_1, aln_2, previous_ref_end, previously_scaffolded_maps, key_dict, previous_part_number) Prints an AGP line, formatted following the AGP2 standard. Used when two contig maps don’t share anchor labels.
- Parameters
aln_1 (Alignment) – Alignment that has the smallest reference_start
aln_2 (Alignment) – Alignment that has the highest reference_start
previous_ref_end (int) – Current position in the scaffold being built
previously_scaffolded_maps (list(int)) – Contig map ids that were previously used to build the current scaffold
key_dict (dict((int, int, int), (str, int, int, int))) – Dict containing the correspondence between contigs and contig maps
previous_part_number (int) – Current AGP line id
- Returns
Current position in the scaffold being built after applying the changes
- Return type
int
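For orientation, an AGP 2.0 component line has nine tab-separated columns: object, object_beg, object_end, part_number, component_type, component_id, component_beg, component_end, orientation. The sketch below formats such a line and returns the new scaffold position, mirroring the "current position" return value described above. The function name, its arguments, and the scaffold and contig names are hypothetical.

```python
def agp_component_line(scaffold, previous_end, part_number,
                       contig, contig_start, contig_end, orientation):
    """Format one AGP 2.0 'W' (WGS contig) line placed right after previous_end.

    Returns the line and the updated end position in the scaffold.
    """
    length = contig_end - contig_start + 1  # AGP coordinates are 1-based, inclusive
    object_beg = previous_end + 1
    object_end = previous_end + length
    fields = [scaffold, object_beg, object_end, part_number, "W",
              contig, contig_start, contig_end, orientation]
    return "\t".join(str(f) for f in fields), object_end


line, current_end = agp_component_line(
    "Super-Scaffold_1", 0, 1, "contig_7", 1, 2500, "+"
)
print(line)
print(current_end)  # 2500
```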
-
biscot.Alignment.
print_agp_line_with_intersection
(aln_1, aln_2, previous_ref_end, previously_scaffolded_maps, contigs_map_dict, previous_part_number, key_dict) Prints a line formatted by following the AGP2 standard. Used when two contig maps share labels.
- Parameters
aln_1 (Alignment) – Alignment that has the smallest reference_start
aln_2 (Alignment) – Alignment that has the highest reference_start
previous_ref_end (int) – Current position in the scaffold that is being built
previously_scaffolded_maps (list(int)) – Contig map ids that were previously used to build the current scaffold
contigs_map_dict (dict(int, Map)) – Dict containing contig maps
previous_part_number (int) – Id of the previous line
key_dict (dict((int, int, int), (str, int, int, int))) – Dict containing the correspondence between contigs and contig maps
- Returns
Current position in the scaffold after applying the changes
- Return type
int
-
biscot.Alignment.
print_gap_line
(aln_1, aln_2, reference_id, previous_reference_end, previous_part_number, key_dict) Prints an AGP line, formatted following the AGP2 standard. Used to print an ‘N’ line.
- Parameters
aln_1 (Alignment) – Alignment that has the smallest reference_start
aln_2 (Alignment) – Alignment that has the highest reference_start
reference_id (int) – Id of the anchor map
previous_reference_end (int) – Current position in the scaffold being built
previous_part_number (int) – Current AGP line id
key_dict (dict((int, int, int), (str, int, int, int))) – Dict containing the correspondence between contigs and contig maps
- Returns
Current position in the scaffold being built after applying changes
- Return type
int
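In AGP 2.0, an 'N' line describes a gap of known length and uses columns 6 to 9 for gap_length, gap_type, linkage and linkage_evidence. The sketch below assumes a `scaffold` gap with `map` linkage evidence, which is a plausible choice for optical-map scaffolding but is not confirmed by this page. The function name and the way the gap length is supplied are hypothetical.

```python
def agp_gap_line(scaffold, previous_end, part_number, gap_length):
    """Format one AGP 2.0 'N' gap line placed right after previous_end.

    Returns the line and the updated end position in the scaffold.
    Assumption: gap_type 'scaffold', linkage 'yes', evidence 'map'.
    """
    object_beg = previous_end + 1
    object_end = previous_end + gap_length
    fields = [scaffold, object_beg, object_end, part_number, "N",
              gap_length, "scaffold", "yes", "map"]
    return "\t".join(str(f) for f in fields), object_end


gap_line, gap_end = agp_gap_line("Super-Scaffold_1", 2500, 2, 1300)
print(gap_line)
print(gap_end)  # 3800
```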
-
biscot.Alignment.
solve_alignment_containment
(reference_maps_dict, contigs_map_dict, key_dict) Calls the contained alignment solver function for each alignment couple
- Parameters
contained_alignments – Tuple containing the contained alignment (second position) and the large alignment (first position)
contigs_map_dict (dict(int, Map)) – Dict containing contig maps
key_dict (dict((int, int, int), (str, int, int, int))) – Dict containing the correspondence between contigs and contig maps
-
biscot.Alignment.
solve_containment
(aln_couple, reference_maps_dict, contig_maps_dict, key_dict) Tries to integrate a small map into a larger one. Consider a Map 1 that is aligned on the reference from position 1 to 100 and a Map 2 that is aligned on the reference from position 25 to 75. The goal of this function is to break the alignment of Map 1 into two alignments (1-25 and 75-100).
- Parameters
aln_couple (tuple(Alignment, Alignment)) – Two Alignment objects. The first one being the ‘small alignment’ and the second, the ‘large alignment’
reference_maps_dict (dict(int, Map)) – Dict of anchor Map objects
contig_maps_dict (dict(int, Map)) – Dict of contig Map objects
key_dict (dict((int, int, int), (str, int, int, int))) – Dict containing the correspondence between Map objects and actual sequences
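The Map 1 / Map 2 example in the description reduces to splitting one reference interval around another. The sketch below works on plain `(start, end)` tuples and reproduces only that coordinate arithmetic; the real function operates on Alignment and Map objects and presumably adjusts labels as well. The name `split_around` is hypothetical.

```python
def split_around(large, small):
    """Split the (start, end) interval `large` around the contained `small`."""
    large_start, large_end = large
    small_start, small_end = small
    if not (large_start <= small_start and small_end <= large_end):
        raise ValueError("small interval is not contained in the large one")
    # Left piece ends where the small alignment starts;
    # right piece starts where the small alignment ends.
    return (large_start, small_start), (small_end, large_end)


print(split_around((1, 100), (25, 75)))  # ((1, 25), (75, 100))
```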
-
biscot.Alignment.
write_unplaced_contigs
(key_dict, contigs_sequence_dict, scaffolded_maps) Incorporates contigs that weren’t scaffolded into the AGP file
- Parameters
key_dict (dict((int, int, int), (str, int, int, int))) – Dict containing the correspondence between contigs and contig maps
contigs_sequence_dict (dict(str, str)) – Dict containing fasta sequences
scaffolded_maps (list(int)) – List containing contig map ids that were scaffolded
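One way to picture this step: every contig that was not placed becomes its own single-component AGP object. The sketch below is a simplification under stated assumptions. It takes a set of already-placed contig names instead of resolving contig map ids through key_dict, and it names each object after its contig; neither choice is confirmed by this page.

```python
def unplaced_agp_lines(contigs_sequence_dict, placed_contigs):
    """Build one AGP 2.0 'W' line per contig that is absent from placed_contigs.

    Assumption: each unplaced contig is emitted whole, in '+' orientation,
    as an object carrying the contig's own name.
    """
    lines = []
    for name, sequence in contigs_sequence_dict.items():
        if name in placed_contigs:
            continue
        length = len(sequence)
        fields = [name, 1, length, 1, "W", name, 1, length, "+"]
        lines.append("\t".join(str(f) for f in fields))
    return lines


sequences = {"ctg1": "ACGT", "ctg2": "ACGTAC"}
unplaced = unplaced_agp_lines(sequences, placed_contigs={"ctg1"})
print(unplaced)
```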