
martinghunt edited this page Oct 20, 2015 · 3 revisions

Task: fixstart

This task changes the start position of each contig so that it begins at a dnaA gene, if one is found. Otherwise, the contig starts at the gene predicted by prodigal that is nearest the middle of the contig. Matches to dnaA genes are found by running promer. Circlator comes with a default set of dnaA genes (made with get_dnaa), but the user can supply an alternative FASTA file of genes instead.
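To make the rearrangement concrete, here is a minimal Python sketch of rotating a circular contig so that it starts at a chosen gene. It is an illustration only, not Circlator's own code: the function names are hypothetical, and the coordinate convention (a 0-based index of the gene's first base, read on the gene's own strand) is an assumption made for this example.

```python
# Illustrative sketch only; not taken from Circlator's source.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def rotate_to_start(seq, break_point, gene_reversed=False):
    """Rotate a circular contig so that it starts at break_point.

    break_point is assumed to be the 0-based index, in the original
    sequence, of the first base of the gene as read on the gene's strand.
    If the gene is on the reverse strand, the contig is reverse
    complemented first so that the gene ends up on the forward strand.
    """
    if gene_reversed:
        seq = reverse_complement(seq)
        # Index i in the original maps to len(seq) - 1 - i after reversal.
        break_point = len(seq) - 1 - break_point
    return seq[break_point:] + seq[:break_point]


if __name__ == "__main__":
    print(rotate_to_start("AAACGT", 3))
    print(rotate_to_start("AAACGT", 4, gene_reversed=True))
```

Because the contig is treated as circular, the rotation moves bases from the front to the back without adding or removing any, so the output always has the same length as the input.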

Usage and options

The general usage is:

circlator fixstart [options] <assembly.fasta> <outprefix>

The options are:

  • --genes_fa FILENAME: FASTA file of nucleotide sequences of genes to search for and use as the start point.
  • --ignore FILENAME: absolute path to a file of contig names that should not be changed. One contig name per line. By default, the start position of every input contig will be changed.
  • --min_id FLOAT: minimum percent identity of promer match to dnaA gene. Default: 70
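
When fixstart is called from a script, the command line can be assembled from the usage and options described above. The following Python sketch is one possible wrapper: the helper name `build_fixstart_command` and the example file names are hypothetical, and only the flags listed on this page are used. Converting the `--ignore` path with `os.path.abspath` is a convenience added here because that option expects an absolute path.

```python
import os
import shlex


def build_fixstart_command(assembly, outprefix, genes_fa=None, ignore=None, min_id=None):
    """Build the argument list for: circlator fixstart [options] <assembly.fasta> <outprefix>."""
    cmd = ["circlator", "fixstart"]
    if genes_fa is not None:
        cmd += ["--genes_fa", str(genes_fa)]
    if ignore is not None:
        # --ignore is documented as taking an absolute path.
        cmd += ["--ignore", os.path.abspath(str(ignore))]
    if min_id is not None:
        cmd += ["--min_id", str(min_id)]
    cmd += [str(assembly), str(outprefix)]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_fixstart_command("assembly.fasta", "fixed", min_id=80)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where circlator is installed.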

Output files

The output file of rearranged contigs is called outprefix.fasta and logging information is written to outprefix.log. An example log file is:

  [fixstart] id       break_point  gene_name  gene_reversed  new_name  skipped
  [fixstart] contig1  -            -          -              -         skipped
  [fixstart] contig2  1234567      dnaa_1     no             -         -
  [fixstart] contig3  1000         prodigal   yes            -         -

Contig1 was skipped because it was listed in the file given to the --ignore option. Contig2 had a match to the gene dnaa_1 starting at position 1234567, so it was rearranged to start at that position. Contig3 had no match to a dnaA gene, so a gene predicted by prodigal was used. That gene was on the reverse strand, so the contig was rearranged to start with the gene on its forward strand. The new start point of the contig is at position 1000. The column new_name is not currently used by Circlator and can be ignored.
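
For downstream scripts, the log can be read into one record per contig. The sketch below assumes that every relevant line begins with `[fixstart]` and that columns are separated by whitespace, as in the example above; the real file may use a different separator. The function name `parse_fixstart_log` is hypothetical.

```python
def parse_fixstart_log(lines):
    """Parse fixstart log lines into a list of dicts keyed by column name.

    A "-" value is stored as None, and break_point is converted to int.
    Lines that do not start with "[fixstart]" are ignored.
    """
    header = None
    records = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "[fixstart]":
            continue
        fields = fields[1:]
        if header is None:
            # The first [fixstart] line holds the column names.
            header = fields
            continue
        record = {key: (None if value == "-" else value)
                  for key, value in zip(header, fields)}
        if record.get("break_point") is not None:
            record["break_point"] = int(record["break_point"])
        records.append(record)
    return records


EXAMPLE_LOG = """\
[fixstart] id       break_point  gene_name  gene_reversed  new_name  skipped
[fixstart] contig1  -            -          -              -         skipped
[fixstart] contig2  1234567      dnaa_1     no             -         -
[fixstart] contig3  1000         prodigal   yes            -         -
"""

records = parse_fixstart_log(EXAMPLE_LOG.splitlines())
for record in records:
    print(record["id"], record["break_point"], record["gene_name"])
```

With a real run, the lines would come from outprefix.log, for example via `open("outprefix.log")`.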
