BAC contig assembler is a computational tool for assembling contigs of minimally overlapping BAC clones across a genomic sequence of interest in the mouse and human genomes. The resulting minimal tiling path BAC contig covers up to 99% of the region, with an average genomic representation of 1.3x and a density of 7-8 BAC clones per 1 Mb of sequence.
BAC contig assembly is based on a WU-BLAST search of the genomic sequence against the TIGR BAC End Sequences Database. When both end sequences of a BAC are available, the clone can be mapped onto the genomic sequence with the BLAST algorithm. Searching the TIGR database retrieves a redundant set of BAC clones mapped to the sequence (Fig. 1A). The BAC contig assembler algorithm uses this redundant set to assemble a minimal tiling path BAC contig, which comprises a non-redundant subset of minimally overlapping clones (Fig. 1B).
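The reduction from a redundant clone set to a tiling path can be illustrated with a classic greedy interval cover. This is only a sketch: the source does not state how BAC_assembler.pl selects clones, and the `Clone` type, the function name and the example coordinates are hypothetical.

```python
from typing import List, NamedTuple


class Clone(NamedTuple):
    name: str
    start: int  # leftmost mapped position in the query (inclusive)
    end: int    # rightmost mapped position in the query (inclusive)


def minimal_tiling_path(clones: List[Clone]) -> List[Clone]:
    """Greedy cover: always extend the path with the overlapping clone
    that reaches farthest to the right. If no clone overlaps the current
    path, the next clone opens a new contig and a gap is left behind."""
    ordered = sorted(clones, key=lambda c: (c.start, -c.end))
    path: List[Clone] = []
    reach = None  # rightmost query position covered so far
    i = 0
    while i < len(ordered):
        if reach is None or ordered[i].start > reach:
            # Nothing overlaps the covered region: start a new contig.
            best = ordered[i]
            i += 1
        else:
            best = None
            # Examine every clone that starts inside the covered region.
            while i < len(ordered) and ordered[i].start <= reach:
                c = ordered[i]
                if c.end > reach and (best is None or c.end > best.end):
                    best = c
                i += 1
            if best is None:
                continue  # all of these clones were redundant
        path.append(best)
        reach = best.end
    return path


# Hypothetical redundant set: B is redundant, E lies beyond a gap.
example = [
    Clone("A", 1, 100),
    Clone("B", 50, 180),
    Clone("C", 90, 200),
    Clone("D", 150, 300),
    Clone("E", 400, 500),
]
tiling = minimal_tiling_path(example)
print([c.name for c in tiling])
```

Choosing the farthest-reaching clone at each step keeps the number of clones small, which also tends to keep overlaps short; a production tool might apply further criteria, such as library preference or minimum overlap length.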
Fig1A.
Fig1B.

Fig. 1. BAC contig assembly strategy.
A) Mapping of all BAC clones from the TIGR database onto the
genomic region of interest. B) Minimal tiling path BAC contig assembled
with the BAC_assembler.pl Perl program.
BAC clones are mapped with WU-BLAST and represented by
thin horizontal bars joining short thick bars, which mark the positions of
the BAC end sequences in the query sequence.
Taking into account the average BAC insert size
of 150-200 kb, we use only those BAC clones whose end sequences are
mapped within 300 kb of each other, which allows for short gaps of
uncertain size within the genomic assembly.
The X axis represents nucleotide positions in the genomic sequence.
Visualization is implemented with the Genome Cryptographer
software.
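The 300 kb rule from the caption can be sketched as a filter over paired end hits. The input layout (one tuple per BLAST hit holding clone name, end label and query coordinates) and the function name are assumptions made for illustration; they do not reflect the actual WU-BLAST output format or the internals of BAC_assembler.pl. The sketch also assumes at most one hit per clone end.

```python
from collections import defaultdict

MAX_SPAN = 300_000  # both end sequences must map within 300 kb of each other


def place_clones(hits, max_span=MAX_SPAN):
    """Return {clone: (left, right)} for clones whose two end sequences
    both hit the query and lie within max_span of each other.

    hits: iterable of (clone, end_label, q_start, q_end) tuples
    (hypothetical layout, one hit per clone end)."""
    by_clone = defaultdict(dict)
    for clone, end_label, q_start, q_end in hits:
        # Normalise so that start <= end regardless of hit orientation.
        by_clone[clone][end_label] = (min(q_start, q_end), max(q_start, q_end))

    placed = {}
    for clone, ends in by_clone.items():
        if len(ends) != 2:
            continue  # only one end mapped: the clone cannot be placed
        (s1, e1), (s2, e2) = ends.values()
        left, right = min(s1, s2), max(e1, e2)
        if right - left + 1 <= max_span:
            placed[clone] = (left, right)
    return placed


# Hypothetical hits: A is accepted, B spans too far, C has one end only.
example_hits = [
    ("A", "F", 18295, 18900),
    ("A", "R", 179497, 178900),
    ("B", "F", 1000, 1500),
    ("B", "R", 900000, 900600),
    ("C", "F", 5000, 5600),
]
placed = place_clones(example_hits)
print(placed)
```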
The BAC contig assembler program allows construction of minimal tiling
path contigs across regions of the mouse and human genome assemblies
generated by the UCSC Genome Project.
The following BAC libraries may be used for contig construction:
==========================================================
query name "name_of_the_sequence"
UCSC assembly
RPCI11_R-135P13 18295 179497 8 123718296 chr8:123718296-123879498 1506910454 1507071656
RPCI11_R-580G9 130099 302656 8 123830100 chr8:123830100-124002657 1507022258 1507194815
CIT-HSP-D_3218D9 288803 323966 8 123988804 chr8:123988804-124023967 1507180962 1507216125
RPCI11_R-755E24 383461 564016 8 124083462 chr8:124083462-124264017 1507275620 1507456175
RPCI11_R-37N22 563889 745693 8 124263890 chr8:124263890-124445694 1507456048 1507637852
RPCI11_R-701E22 702226 893282 8 124402227 chr8:124402227-124593283 1507594385 1507785441
.......
1. the query length: 10000000 bp
2. the number of BACs: 79
3. mean value of BACs overlap.: 34.31818 %
4. coverage: 95.90789 %
5. genome representation: 1.3250137 x
==========================================================
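A report like the one above can be read back into records for further processing. The sketch below is not part of the tool. The field names are guesses from the sample (clone name, query start and end, chromosome, chromosome start, UCSC range, and two further coordinates whose meaning the report does not state), and the parser simply looks for eight whitespace-separated fields, so it works whether or not a record wraps onto a second line.

```python
import re
from typing import List, NamedTuple


class Record(NamedTuple):
    name: str
    q_start: int      # start in the query sequence
    q_end: int        # end in the query sequence
    chrom: str
    chrom_start: int
    ucsc_range: str
    extra_start: int  # meaning not stated in the report
    extra_end: int    # meaning not stated in the report


RECORD_RE = re.compile(
    r"(?<!\S)(\S+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\d+)"
    r"\s+(chr\S+:\d+-\d+)\s+(\d+)\s+(\d+)(?!\S)"
)


def parse_report(text: str) -> List[Record]:
    """Extract BAC records; header and summary lines are ignored."""
    records = []
    for m in RECORD_RE.finditer(text):
        name, qs, qe, chrom, cs, rng, xs, xe = m.groups()
        records.append(
            Record(name, int(qs), int(qe), chrom, int(cs), rng, int(xs), int(xe))
        )
    return records


def covered_bases(records: List[Record]) -> int:
    """Number of query positions covered by at least one clone."""
    total, reach = 0, 0
    for r in sorted(records, key=lambda r: r.q_start):
        start = max(r.q_start, reach + 1)
        if r.q_end >= start:
            total += r.q_end - start + 1
            reach = r.q_end
    return total


SAMPLE = """
RPCI11_R-135P13 18295 179497 8 123718296 chr8:123718296-123879498
1506910454 1507071656
RPCI11_R-580G9 130099 302656 8 123830100 chr8:123830100-124002657
1507022258 1507194815
CIT-HSP-D_3218D9 288803 323966 8 123988804 chr8:123988804-124023967
1507180962 1507216125
RPCI11_R-755E24 383461 564016 8 124083462 chr8:124083462-124264017
1507275620 1507456175
RPCI11_R-37N22 563889 745693 8 124263890 chr8:124263890-124445694
1507456048 1507637852
RPCI11_R-701E22 702226 893282 8 124402227 chr8:124402227-124593283
1507594385 1507785441
"""

records = parse_report(SAMPLE)
offsets = {r.chrom_start - r.q_start for r in records}
total_clone_bases = sum(r.q_end - r.q_start + 1 for r in records)
print(len(records), offsets, covered_bases(records), total_clone_bases)
```

In the six sample records the difference between chromosome start and query start is constant, which suggests the query begins at a fixed chromosome position. Dividing `covered_bases` by the query length would give a coverage figure, and dividing `total_clone_bases` by the query length would give one possible reading of "genome representation"; the report does not define either statistic, so both formulas are assumptions.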

Fig. 2. Effect of the genomic assembly on contig construction. Thick
bars represent BAC end sequences mapped onto the query sequence. In some
cases the length of the homologous region far exceeds that of a BAC end
sequence, which might be due to the presence of repeated stretches of unique
DNA in the local region of the genomic assembly. Such clones are not
considered for contig assembly.
Is it possible to assemble a library-specific contig?
The BAC contig assembler program allows one to construct both library-specific and mixed contigs. See Section 3 on the submission page.
How can I fill in gaps in a contig?
The absence of a sequence from a library causes gaps in a library-specific contig. In this case the gaps may be filled with clones from other libraries. Click Yes in Section 7 on the submission page to have the redundant set of BACs mapped to the sequence of interest. Use the "All BAC clones mapped to sequence_name" and "List of all BAC clones mapped to sequence_name" files to insert the necessary clones into the minimal tiling path contig.
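Selecting gap-filling clones by hand can be supported with a small script. The sketch below assumes that both the tiling path and the redundant clone list have already been reduced to (name, start, end) tuples in query coordinates; the function names and the example data are hypothetical and are not produced by the tool in this form.

```python
def find_gaps(path, query_length):
    """Return (start, end) intervals of the query not covered by the path."""
    gaps = []
    reach = 0  # rightmost covered position so far
    for _name, start, end in sorted(path, key=lambda c: c[1]):
        if start > reach + 1:
            gaps.append((reach + 1, start - 1))
        reach = max(reach, end)
    if reach < query_length:
        gaps.append((reach + 1, query_length))
    return gaps


def candidates_for_gap(gap, all_clones):
    """Names of clones overlapping the gap, best coverage of the gap first."""
    g_start, g_end = gap
    scored = []
    for name, start, end in all_clones:
        covered = min(end, g_end) - max(start, g_start) + 1
        if covered > 0:
            scored.append((covered, name))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _covered, name in scored]


# Hypothetical library-specific path with an internal and a trailing gap.
path = [("A", 1, 100), ("B", 90, 200), ("C", 300, 400)]
# Hypothetical clones from other libraries mapped to the same query.
others = [("X", 150, 320), ("Y", 250, 350), ("Z", 500, 600)]

gaps = find_gaps(path, query_length=450)
fillers = candidates_for_gap(gaps[0], others)
print(gaps, fillers)
```

Ranking by the number of gap bases covered is one simple heuristic; the final choice of clone may also depend on library availability and on how much the clone overlaps its neighbours.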
Why has no contig been assembled across the genomic sequence?
There might be several reasons for the absence of any
clones spanning the region of interest: (i) high repeat content of the
sequence; (ii) no sequence is available in the given range of
coordinates (you may check your sequence with the UCSC Genome
Browser; note that coordinates are freeze-specific); (iii) low
quality of the assembly in the genomic region; (iv) the region is
recalcitrant to cloning, which leads to its absence from genomic libraries.