v1.6.0 - One egg for many baskets

@fpusan fpusan released this 10 Sep 07:39
· 125 commits to master since this release
21ce1ff

New features

  • The script restart.pl has been removed. Project restart is now achieved by calling SqueezeMeta.pl --restart -p <project_name>. The flags -step <STEP> --force-overwrite can be added to this call in order to restart the pipeline from a specific step.
  • Users can now control whether the source of bin taxonomy is the LCA algorithm from SqueezeMeta, or the taxonomic assignment performed by CheckM. This can be controlled with the flag -taxbinmode. Options are s (SqueezeMeta only, default), c (CheckM), s+c (SqueezeMeta, missing ranks will be completed with CheckM taxonomy when possible) or c+s (CheckM, missing ranks will be completed with SqueezeMeta taxonomy when possible).
  • Users can now control the minimum percentage of genes from the same taxon needed in order to taxonomically annotate a contig. This can be done with the flag -consensus.
  • sqm_longreads.pl will now consider partial hits completely contained inside a long read as valid hits. Before, partial hits were only considered valid if they occurred at the beginning or end of the reads. This has a noticeable impact on the annotation percentages. The old behaviour can be reinstated with the flags -n or -nopartialhits.
  • sqm2pavian.pl now works with results from sqm_reads.pl and sqm_longreads.pl.
  • Added the option --filter to sqm_mapper.pl. When this flag is present, the script will filter a set of input sequences, returning only the ones that did not map to the reference.
  • SQMtools: SQM objects now track the length, abundance, mapped bases, coverage and coverage per million reads of bins. The corresponding matrices can be found under the SQM$bins list. When running subsetContigs, these values will be updated taking into account only the contigs from each bin that were selected.
  • SQMtools: added the subsetSamples function to generate subsetted SQM objects containing only the requested samples.
  • SQMtools: added the plotBins function to generate barcharts with the distribution of bins across samples.
  • SQMtools: unmapped reads for functions are no longer tracked, since it led to inconsistent results in some cases (see #442). This also affects the tables generated by sqm2tables.py.
  • SQMtools: added the mostVariable function, which will return the most variable rows (based on their coefficient of variation) from a data.frame or matrix. The interface is otherwise similar to the mostAbundant function.
  • SQMtools: SQM objects now track the coverage per million reads of orfs, contigs, bins and functions. Each can be accessed inside the corresponding list under the cpm name. "cpm" is also a valid count option for plotFunctions and plotBins.
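
The four -taxbinmode options described above can be read as a "primary source, optional fallback" rule. The following Python sketch only illustrates that reading and is not the SqueezeMeta implementation: the function name merge_bin_taxonomy, the RANKS list and the dictionary representation of a taxonomy are all assumptions made for this example, and it does not check whether the two sources agree with each other before filling in a rank.

```python
# Hypothetical rank list; the ranks actually used by SqueezeMeta may differ.
RANKS = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]


def merge_bin_taxonomy(sqm, checkm, mode="s"):
    """Combine two per-rank taxonomies for one bin.

    sqm and checkm are dicts mapping a rank name to a taxon name;
    a rank that is absent (or None) is treated as unassigned.
    mode mirrors the -taxbinmode values: "s", "c", "s+c" or "c+s".
    """
    if mode not in ("s", "c", "s+c", "c+s"):
        raise ValueError(f"unknown mode: {mode}")
    # The first letter selects the primary source.
    primary, secondary = (sqm, checkm) if mode.startswith("s") else (checkm, sqm)
    merged = {}
    for rank in RANKS:
        taxon = primary.get(rank)
        # Only the combined modes fall back to the other source.
        if taxon is None and "+" in mode:
            taxon = secondary.get(rank)
        merged[rank] = taxon
    return merged


# Toy data, invented for the example.
sqm_tax = {"superkingdom": "Bacteria", "phylum": "Proteobacteria"}
checkm_tax = {
    "superkingdom": "Bacteria",
    "phylum": "Proteobacteria",
    "class": "Gammaproteobacteria",
}

only_sqm = merge_bin_taxonomy(sqm_tax, checkm_tax, mode="s")
sqm_then_checkm = merge_bin_taxonomy(sqm_tax, checkm_tax, mode="s+c")
```

With this toy input, mode "s" leaves the class rank unassigned, while "s+c" fills it from the CheckM dictionary.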

Minor changes / bugfixes

  • SQMtools will from now on follow the same version numbers as the corresponding SqueezeMeta releases.
  • Updated DIAMOND version to 2.0.15.
  • Fixed a bug when adding taxonomic assignments to bins, in which a lack of consensus at a high level prevented looking for consensus at deeper levels.
  • Fixed a bug in which data.table could make DAStool crash if it was called with a very high number of threads.
  • Fixed a bug in which both reads of a pair were counted as mapped even if only one of them actually mapped to the reference. This had little impact on real datasets, but is corrected now.
  • Fixed a bug in which custom arguments passed to bowtie2 with -mapping_options conflicted in some cases with the --very-sensitive-local option that we use by default when calling bowtie2. --very-sensitive-local is now skipped when the user provides custom arguments to bowtie2.
  • Fixed an uncommon issue in which contigs could end up being assigned to more than one bin after restarting the pipeline.
  • Fixed a bug in sqm_longreads.pl when using several input files from the same sample.
  • loadSQM now removes redundant info from the orfs and contigs tables when loading a project into SQMtools, resulting in lower memory usage.
  • Fixed a bug in which loading a project with loadSQM could randomly cause an error.
  • We no longer provide a PDF manual for SQMtools. The documentation for each function can still be accessed from the R terminal or RStudio.
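
The new handling of -mapping_options amounts to "use the default preset only when the user supplies nothing". A minimal Python sketch of that rule follows; it is an illustration under assumptions, not the pipeline's own code. The function name bowtie2_extra_options and the constant DEFAULT_BOWTIE2_OPTIONS are hypothetical, and the sketch only builds the option list rather than running bowtie2.

```python
import shlex

# Default preset named in the release notes.
DEFAULT_BOWTIE2_OPTIONS = ["--very-sensitive-local"]


def bowtie2_extra_options(mapping_options=None):
    """Return the extra options to pass to bowtie2.

    mapping_options stands in for the string given to -mapping_options.
    If it is non-empty, it replaces the default preset entirely, so the
    two can no longer conflict.
    """
    if mapping_options and mapping_options.strip():
        return shlex.split(mapping_options)
    return list(DEFAULT_BOWTIE2_OPTIONS)


default_opts = bowtie2_extra_options()
custom_opts = bowtie2_extra_options("--local -N 1")
```

Replacing the preset instead of appending to it means a user who still wants --very-sensitive-local alongside other arguments would need to include it explicitly in the custom string.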

Compatibility changes

  • Results generated by previous versions of SqueezeMeta will not load into SQMtools 1.6.0 (which corresponds to SqueezeMeta release 1.6.0). Running 19.getcontigs.pl /path/to/project will make a project generated with SqueezeMeta v1.5 compatible with the new version of SQMtools.