Releases: MultiQC/MultiQC


22 Jun 13:34
Contains fixes of multiple bugs collected after the last release along with a few minor improvements.

MultiQC fixes

  • Fix the re_contents search patterns when pattern is found in the middle of the file. Fixes finding logs from several Picard submodules, like CollectRnaSeqMetrics and CollectWgsMetrics in some cases (#2610)
  • Fix use of the run_modules option when the module anchor doesn't match the module entry point ID (e.g. DRAGEN and dragen) (#2633)
  • Fix use of custom search patterns for custom content (#2647)
  • Fix plot export with export_plots: true or --export (#2637)
  • Correctly handle old-style label sections in x_lines or y_lines in line plot configs (#2648)
  • Fix disabling sort_rows in custom content by subclassing TableConfig from ValidatedConfig and use deprecated (#2604)
  • When user provides a search pattern dictionary in config, recursively update instead of replacing (#2620)
  • Fix config update when dict replaced with list, e.g. a search_patterns item is a list that's replaced with a dict (c388178)
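
The difference between replacing a search pattern dictionary and recursively updating it can be sketched as follows. This is a minimal illustration of the general technique, not MultiQC's actual code: the helper name update_nested and the "mytool" patterns are hypothetical.

```python
from copy import deepcopy


def update_nested(base: dict, override: dict) -> dict:
    """Return a copy of `base` with `override` merged in, key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend instead of overwriting.
            merged[key] = update_nested(merged[key], value)
        else:
            # Scalars, lists, or a type change (dict <-> list): replace.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults and user config, for illustration only.
defaults = {"mytool": {"fn": "*.log", "contents": "mytool v1"}}
user = {"mytool": {"fn": "*_mytool.txt"}}

merged = update_nested(defaults, user)
print(merged)
```

With a plain replacement, the "contents" key of the default pattern would be lost; with the recursive update, only "fn" changes.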

MultiQC updates

  • Add unit tests for core and some modules (see picard or samtools), as well as codecov report (#2624)
    • Now MultiQC checks if every module does something productive with the provided test data in test-data.
    • For modules with many submodules (picard, dragen), additionally check if every submodule parses the expected number of samples from test-data files.
    • Users can put module tests in test subfolders, e.g. https://github.com/MultiQC/MultiQC/tree/main/multiqc/modules/picard/tests
    • Use pytest for all core unit tests (#2623)
    • Move unit tests from the test-data repo into tests folder (#2622)
  • Plot config validation:
    • Validate line plot x_lines, x_bands, etc. with a Pydantic model, including label subsections (#2648)
    • Validate line plot series and extra_series with a Pydantic model (#2573)
    • Validate table config (#2604)
    • Make the "unrecognised field" error a warning
    • Rename deprecated plot config fields in internal modules (#2636)
  • Show progress bar for exporting flat plot images (#2639)
  • Better error message for incorrect run_modules (#2635)
  • Increase flat plots sample number threshold to 1000 (#2615)
  • Small speed-up of the line block iterator (#2588)
  • Update README logos for better compatibility (#2603)
  • Docs: don't use raw markdown links (#2642)
  • Allow overriding showlegend in line plot configs. Defaults to hidden for large datasets to avoid bloated legends (#2615)
  • Show error message if failed to parse custom content header (d736846)
  • Load every found config file once (422b39b)

Module fixes and updates

  • Picard
    • Fix finding CollectRnaSeqMetrics and CollectWgsMetrics logs by fixing the re_contents search patterns (#2610)
  • biobambam2
    • Fix parsing markdups logs
    • Coverage histograms: fix duplicated label suffix (#2619)
    • Fix the gc_metrics submodule (#2629)
    • vc_metrics: pre-filter numbers can be zero (#2618)
  • FastQC
    • Default to showlegend: false, as we don't distinguish the sample colors, unless fastqc_config: status_checks: false is set (#2615)
  • BBTools
    • Fix incorrect calculation of % Q30 Bases (#2628)
  • Samtools
    • markdup: resolve inconsistent non-optical pair duplicate variable name in samtools markdup module (#2626)
  • NanoStat
    • Support different Q cutoffs (#2645)
  • Salmon
    • Fix ignored parsed library_types when its type is list (#2617)
  • UMI-tools
    • Improve extract plots (#2614)
  • BCL Convert
    • Fix 'pecent' typo (#2612)


31 May 18:30
Bug fix release. Two main issues are fixed:

  • Running the same module twice with path_filters (e.g. trimmed vs. raw FastQC) works again.
  • The report_saved_raw_data raw data is again included in multiqc_data.json by default.

MultiQC fixes

  • Fix running the same module multiple times in the report (e.g. trimmed vs. raw FastQC) (#2592)
  • Preserve report_saved_raw_data in multiqc_data.json by keeping preserve_module_raw_data: false by default (#2591)
  • Table headers: do not set namespace to None when there is a single namespace (#2590)
  • Re-enable falling back to flat plots for large datasets (#2580)
  • Reset in multiqc.run(*) to allow running it twice interactively (#2598)
  • Fix scatter plot in --flat mode when there are categorical axes (#2600)
  • Fix hiding table column with all empty values in custom content (#2599)
  • Table "Copy" button: include headers (#2594)

Module fixes and updates

    • Underscore attributes captured by lambdas to avoid wiping them after the module is finished (#2581)
  • Cell Ranger
    • Handle missing vdj_annotation and vdj_enrichment sections (#2579)
  • fgbio
    • Fix links in fgbio.md (#2586)
  • Custom content
    • Support DOI for custom content (#2582)


17 May 11:10
Fix running as a Nextflow job

This bug fix release addresses the file search problem when MultiQC is executed as a typical Nextflow job. See #2575 for detail.


15 May 15:04

Highlights - notebooks and performance

Version 1.22 brings some major behind-the-scenes refactoring to MultiQC. This unlocks a number of new features, such as the ability to use MultiQC as a Python library in scripts / notebooks, and run-time validation of plot config attributes.

This release also introduces some huge performance improvements thanks to @rhpvorderman.
Compared to v1.21, a typical v1.22 run is 53% faster and has a 6x smaller peak-memory footprint - well worth updating! 🏃🏻‍♂️ 💨

Finally, support for the deprecated Highcharts plotting library is fully removed in v1.22, bringing to a close a long-standing project to migrate to Plotly.

For more information, please see the upcoming MultiQC release blog article on the Seqera website: https://seqera.io/blog/

MultiQC updates

  • Remove the highcharts template and Highcharts and Matplotlib dependencies (#2409)
  • Remove CSP.txt and the linting check, move the script that prints missing hashes under scripts. Admins of servers with Content Security Policy can use it to print missing hashes when they install a new MultiQC version with: python scripts/print_missing_csp.py --report full_report.html (#2421)
  • Do not maintain change log between releases (#2427)
  • Use native clipboard API (#2419)
  • Profile runtime: visualize per-module memory and run time (#2548, #2547)
  • Refactoring for performance:
    • Search file blocks rather than individual lines for faster results (#2513)
    • Refactor file content search for a 40% speed increase (#2505)
    • Sort filepatterns for faster searching (#2506)
    • Use array.array for in-memory plot data, stream to render Jinja and dump JSON to reduce memory requirement (#2515)
    • Speed up all modules by caching spectra.scale and using sets instead of lists (#2509)
    • Stream json data to a file to save 30% of the memory (#2510)
    • Do replace_nan in place rather than creating a new object (#2529)
    • Use gzip rather than lzstring for compression and decompression of the plot data (#2504)
    • Use gzip level 6 for faster json compression (#2553)
    • Clean up module raw data after running each module, significantly reduces the memory footprint (#2551)
  • Refactoring for interactivity and validation:
    • Top-level functions for MultiQC use as a library (#2442)
    • Pydantic models for plots and datasets (#2442)
    • Validating plot configs with Pydantic (#2534)
    • Use dataclasses for table and violin columns (#2546)
    • Break up the main run function into submodules (#2446)
    • Deprecate multiqc.utils.config and multiqc.utils.report in favour of multiqc.config and multiqc.report (#2542)
    • Static typing of the report and config modules (#2445)
    • Add type hints into core codebase (#2434)
    • Consistent config options: rename decimalPlaces to tt_decimals (#2451)
    • Remove encoding and shebang headers from module files (#2425)
    • Refactor line plot categories: keep boolean throughout the code, and data points as pairs for simplicity (#2418)
  • Fixes:
    • Fix error when using default sort (#2544)
    • Do not attempt to render flat plot when no data (#2490)
    • Fix export plots with --export and always export data (#2489)
    • Fix: make sure modify lambda not present in JSON dump (#2455)
    • Enable --export even when writing interactive plots (#2444)
    • Replace NaN with null in exported JSON (#2432)
    • Fix y_minrange option (#2415)
  • Reduce report size: exclude plot data for sections in remove_sections (#2460)
  • Add ge and le to cond_formatting_rules (#2494)
  • CI: use uv pip (#2352)
  • Lint check for use of f["content_lines"] (#2485)
  • Allow setting the style of a line graph (lines or lines+markers) per plot (#2413)
  • Add CMD to Dockerfile so a default run without any parameters displays the --help (#2279)
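
The switch from lzstring to gzip for plot data (#2504, #2553) can be illustrated with the standard library alone. This is a sketch of the general technique (JSON, then gzip at level 6, then base64 so the result can be embedded as text), not MultiQC's exact code; the function names and the base64 step are assumptions.

```python
import base64
import gzip
import json


def compress_plot_data(data: dict, level: int = 6) -> str:
    """Serialise to JSON, gzip it, and return base64 text for embedding."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(gzip.compress(raw, compresslevel=level)).decode("ascii")


def decompress_plot_data(blob: str) -> dict:
    """Reverse of compress_plot_data."""
    return json.loads(gzip.decompress(base64.b64decode(blob)).decode("utf-8"))


# Made-up, highly repetitive data, for illustration only.
plot_data = {"sample 1": [1, 2, 3] * 100, "sample 2": [4, 5, 6] * 100}
blob = compress_plot_data(plot_data)
restored = decompress_plot_data(blob)
print(len(json.dumps(plot_data)), "->", len(blob))
```

Level 6 is the usual middle ground in gzip between speed and compression ratio, which is consistent with the "faster json compression" note above.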

New modules

  • Hostile (#2501)
    • New module: Hostile is a short and long host reads removal tool
  • Sequali (#2441)
    • New module: Sequali Universal sequencing QC

Module updates

  • Adapter Removal
    • Standardize module names: use camel case (#2433)
  • Bamdst
    • Fix chromosome reports when contig data labels are missing (#2479)
    • Fix for the case when chromosomes.report is not provided (#2477)
    • Stress file name requirements for chromosomes report (#2478)
  • BBTools
    • Set missing values to None for bbmap qahist (#2411)
  • Bcftools
    • Stats: add multiallelic sites column (#2414)
  • BCL Convert
    • Show message when no undetermined reads instead of error (#2526)
    • Fix for absent index reads (#2511)
    • Add all file types to sources (#2456)
  • Busco
    • Fix barplot colors (#2453)
  • Cell Ranger
    • Fix parsing antibody tab without antibody_treemap_plot (#2525)
  • Cutadapt
    • Speed up module by caching parsing versions (#2528)
    • Add ploidy estimation table (#2496)
  • fastp
    • When the sample name cannot be parsed from the command (e.g. input from stdin), use the file name and proceed (#2536)
  • FastQC
    • Skip per tile sequence quality section in FastQC reports for better performance (#2552)
    • Fix a ZeroDivisionError (#2462)
    • Fix memory leak, making the module 7 times faster and using 10 times less memory (#2552)
    • Do not keep intermediate data in memory to reduce memory footprint further (#2516)
    • Add option to ignore FastQC quality thresholds (#2486)
  • goleft indexcov
    • Work correctly even if no valid contigs in input (#2540)
  • mosdepth
    • Fix absolute coverage plot (#2488)
  • nonpareil
    • Change write_data_file label to be consistent with other modules (#2472)
  • Picard
    • WgsMetrics: coverage plot: show % based ≥x, not >x (#2473)
    • CrosscheckFingerprints: support multiple files, preserve sample order in heatmap (#2454)
  • qc3C
    • Fix detecting sample name for relative path (#2502)
  • QualiMap
    • BamQC: when trimming long tails, keep at least 20x (#2431)
  • Samtools
    • Add support for markdup (#2254)
    • Add violin multiple datasets & samtools flagstat percentage switch (#2430)
  • Space Ranger
    • Fix for missing genomic_dna section (#2429)
  • xengsort
    • Fix parsing long files (do not use content_lines) (#2484)


MultiQC version 1.21

28 Feb 13:55


Box plot

Added a new plot type: box plot. It's useful to visualise a distribution when you have a set of values for each sample.

from multiqc.plots import box

plot = box.plot(
    {
        "sample 1": [4506, 4326, 3137, 1563, 1730, 3254, 2259, 3670, 2719, ...],
        "sample 2": [2145, 2011, 3368, 2132, 1673, 1993, 6635, 1635, 4984, ...],
        "sample 3": [1560, 1845, 3247, 1701, 2829, 2775, 3179, 1724, 1828, ...],
    },
    pconfig={"title": "Iso-Seq: Insert Length"},
)

Note the difference with the violin plot: the box plot visualises the distributions of many values within one sample, whereas the violin plot shows the distribution of one metric across many samples.


The setup.py file has been superseded by pyproject.toml for the build configuration.
Note that now for new modules, an entry point should be added to pyproject.toml instead of setup.py, e.g.:

afterqc = "multiqc.modules.afterqc:MultiqcModule"
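
An entry point value such as the one above has the form package.module:attribute. The sketch below shows how a string of that shape can be resolved with the standard library. It is illustrative only (the packaging machinery normally does this for you), and load_object is a hypothetical helper, demonstrated on a standard-library class so that it runs without MultiQC installed.

```python
import importlib


def load_object(spec: str):
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, _, attr = spec.partition(":")
    if not module_name or not attr:
        raise ValueError(f"Expected 'module:attribute', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Demonstrated on the standard library; an entry point value like
# "multiqc.modules.afterqc:MultiqcModule" has the same shape.
cls = load_object("collections:OrderedDict")
print(cls.__name__)
```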


The heatmap plot now supports passing a dict as input data, and also supports a zlab
parameter to set the label for the z-axis:

from multiqc.plots import heatmap

plot = heatmap.plot(
    {
        "sample 1": {"sample 2": 0, "sample 3": 1},
        "sample 2": {"sample 1": 0, "sample 3": 0},
        "sample 3": {"sample 1": 1, "sample 2": 0, "sample 3": 1},
    },
    pconfig={"title": "Sample comparison", "zlab": "Match"},
)

MultiQC updates

  • New plot type: box plot (#2358)
  • Add "Export to CSV" button for tables (#2394)
  • Replace setup.py with pyproject.toml (#2353)
  • Heatmap: allow a dict of dicts as data (#2386)
  • Heatmap: add zlab config parameter. Show xlab, ylab, zlab in tooltip (#2387)
  • Warn if run_modules contains a non-existent module (#2322)
  • Catch non-hashable values (dicts, lists) passed as a table cell value (#2348)
  • Always create JSON even when MegaQC upload is disabled (#2330)
  • Use generic font family for Plotly (#2368)
  • Use a padded span with nowrap instead of &nbsp; before suffixes in table cells (#2395)
  • Refactor: fix unescaped regex strings (#2384)


  • Pin the required Plotly version and add a runtime version check (#2325)
  • Bar plot: preserve the sample order (#2339)
  • Bar plot: fix inner gap in group mode (#2321)
  • Violin: filter Inf values (#2380)
  • Table: Fix use of the no_violin (ex-no_beeswarm) table config flag (#2376)
  • Heatmap: prevent from parsing numerical sample names (#2349)
  • Work around call of full_figure_for_development to avoid Kaleido errors (#2359)
  • Auto-generate plot id when pconfig=None (#2337)
  • Fix: infinite dmax or dmin fail JSON dump load in JavaScript (#2354)
  • Fix: dump pconfig for MegaQC (#2344)

New modules

  • IsoSeq
    • Iso-Seq contains the newest tools to identify transcripts in PacBio single-molecule sequencing data (HiFi reads). cluster and refine commands are supported.
  • Space Ranger
    • Works with data from 10X Genomics Visium. Processes sequencing reads and images created using
      the 10x Visium platform to generate count matrices with spatial information.
    • New MultiQC module parses Space Ranger quality reports.

Module updates

  • bcl2fastq: fix the top undetermined barcodes plot (#2340)
  • DRAGEN: add a few coverage metrics in general stats (#2341)
  • DRAGEN: fix showing the number of found samples (#2347)
  • DRAGEN: support gvcf_metrics (#2327)
  • fastp: fix detection of JSON files (#2334)
  • HTSeq Count: robust file reading loop, ignore .parquet files (#2364)
  • Illumina InterOp Statistics: do not set 'scale': False as a default (#2350)
  • mosdepth: fix regression in showing general stats (#2346)
  • Picard: Crosscheck Fingerprints updates (#2388)
    • add a heatmap for LOD scores besides a table
    • if too many pairs in table, skip those with Expected status
    • use the warn status for Inconclusive
    • add a separate sample-wise table instead of general stats
    • sort tables by status, not by sample name
    • add a column "Best match" and "Best match LOD" in tables
    • hide the LOD Threshold column
  • PURPLE: support v4.0.1 output without version column (#2366)
  • Samtools: support new coverage command (#2356)
  • UMI-tools: support new extract command (#2296)
  • Whatshap: make robust when a stdout is appended to TSV (#2361)


Full Changelog: v1.20...v1.21

MultiQC version 1.20

12 Feb 12:59


New plotting library

MultiQC v1.20 comes with totally new plotting code for MultiQC reports. This is a huge change to the report output. We've done our best to maintain feature parity with the previous plotting code, but please do let us know if you spot any bugs or changes in behaviour by creating a GitHub issue.

This change comes with many improvements and new features, and paves the way for more in the future. To find out more, read the associated blog post.

For now, you can revert to the previous plotting code by using the highcharts report template (multiqc --template highcharts). This will be removed in v1.21.

Note that there are several plotting configuration options which have been removed:

  • click_func
  • cursor
  • tt_percentages (use tt_suffix: "%")
  • Bar plot:
    • use_legend (automatically hidden if there is only 1 category)
  • Line plot:
    • labelSize
    • xDecimals, yDecimals (automatic if all values can be cast to int)
    • xLabelFormat, yLabelFormat (use tt_label)
    • pointFormat
  • Heatmap:
    • datalabel_colour
    • borderWidth

Moved GitHub and docker repositories

The v1.20 release is also the first release we've had since we moved the MultiQC repositories. Please note that the code is now at MultiQC/MultiQC (formerly ewels/MultiQC) and the same for the Docker repository. The GitHub repo should automatically redirect, but it's still good to update any references you may have.

MultiQC updates

  • Support Plotly as a new backend for plots (#2079)
    • The default template now uses Plotly for all plots
    • Added a new plot type violin (replaces beeswarm)
    • Moved legacy Highcharts/Matplotlib code under an optional template highcharts
  • Move GitHub repository to MultiQC organisation (#2243)
  • Update all GitHub actions to their latest versions (#2242)
  • Update docs to work with Astro 4 (#2256)
  • Remove unused dependency on future library (#2258)
  • Fix incorrect scale IDs caught by linting (#2272)
  • Docs: fix missing v prefix in docker image tags (#2273)
  • Unicode file reading errors: attempt to skip non-unicode characters (#2275)
  • Heatmap: check if value is numeric when calculating min and max (#2276)
  • Add filesearch_file_shared config option, remove unnecessary per-module shared flags in search patterns (#2227)
  • Use alternative method to walk directory using pathlib (#2277)
  • Export config.output_dir in MegaQC JSON (#2287)
  • Drop support for module tags (#2278)
  • Pin Pillow package, wrap add_logo in try-except (#2312)
  • Custom content: support multiple datasets (#2291)
  • Configuration: fix reading config.output_fn_name and --filename (#2314)

New modules

  • Bamdst (#2161)
    • Bamdst is a lightweight tool to stat the depth coverage of target regions of bam file(s).
  • MetaPhlAn (#2262)
    • MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.
  • MEGAHIT (#2222)
    • MEGAHIT is an ultra-fast and memory-efficient NGS assembler
  • Nonpareil (#2215)
    • Estimate metagenomic coverage and sequence diversity.

Module updates

  • Bcftools: order variant depths plot categories (#2289)
  • Bcftools: add missing self.ignore_samples in stats (#2288)
  • BCL Convert: add index, project names to sample statistics and calculate mean quality for lane statistics. (#2261)
  • BCL Convert: fix duplicated yield for 3.9.3+ when the yield is provided explicitly in Quality_Metrics (#2253)
  • BCL Convert: handle samples with zero yield (#2297)
  • Bismark: fix old link in Bismark docs (#2252)
  • Cutadapt: support JSON format (#2281)
  • HiFiasm: account for lines with no asterisk (#2268)
  • HUMID: add cluster statistics (#2265)
  • mosdepth: add additional summaries to general stats (#2257)
  • Picard: fix using multiple times in report: do not pass module.anchor to self.find_log_files (#2255)
  • QualiMap: address NBSP as thousands separators (#2282)
  • Seqera Platform CLI: updates for v0.9.2 (#2248)
  • Seqera Platform CLI: handle failed tasks (#2286)


Full Changelog: v1.19...v1.20

MultiQC version 1.19

18 Dec 09:49


An early Christmas present 🎁 Happy holidays everyone! 🎄

This release is mostly bugfixes and minor additions, whilst we lay the groundwork for some bigger updates coming in the new year. Still, there are plenty of goodies in here. Enjoy!

See the full changes in this release here: v1.18...v1.19

MultiQC updates

  • Add missing table id in DRAGEN modules, and require id in plot configs in strict mode (#2228)
  • Config table_columns_visible and table_columns_name: support flat config and table_id as a group (#2191)
  • Add sort_samples: false config option for bar graphs (#2210)
  • Upgrade the jQuery tablesorter plugin to v2 (#1666)
  • Refactor pre-Python-3.6 code, prefer f-strings over .format() calls (#2224)
  • Allow specifying default sort columns for tables with defaultsort (#1667)
  • Create CODE_OF_CONDUCT.md (#2195)
  • Add .cram to sample name cleaning defaults (#2209)

MultiQC bug fixes

  • Re-add run into the multiqc namespace (#2202)
  • Fix the "square": True flag to scatter plot to actually make the plot square (#2189)
  • Fix running with the --no-report flag (#2212)
  • Fix guessing custom content plot type: do not assume first row of a bar plot data are sample names (#2208)
  • Fix detection of changed specific module in Changelog CI (#2234)

Module updates

  • BCLConvert: fix mean quality, fix count-per-lane bar plot (#2197)
  • deepTools: handle missing data in plotProfile (#2229)
  • Fastp: search content instead of file name (#2213)
  • GATK: square the BaseRecalibrator scatter plot (#2189)
  • HiC-Pro: add missing search patterns and better handling of missing data (#2233)
  • Kraken: fix UnboundLocalError (#2230)
  • Kraken: fixed column keys in genstats (#2205)
  • QualiMap: fix BamQC for global-only stats (#2207)
  • Picard: add more search patterns for MarkDuplicates, including MarkDuplicatesSpark (#2226)
  • Salmon: add library_types, compatible_fragment_ratio, strand_mapping_bias to the general stats table (#1485)


Full Changelog: v1.18...v1.19

MultiQC Version 1.18

17 Nov 14:37


Better configs

As of this release, you can now set all of your config variables via environment variables! (see docs).

Better still, YAML config files can now use string interpolation to parse environment variables within strings (see docs), e.g.:

report_header_info:
  - Contact E-mail: !ENV "${NAME:info}@${DOMAIN:example.com}"
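
The ${NAME:default} placeholders in the example above can be mimicked in a few lines of Python. This is a hedged sketch of the interpolation idea only, not MultiQC's implementation: the function name interpolate_env is hypothetical, and treating an unset variable without a default as an empty string is an assumption.

```python
import os
import re

# Matches ${NAME} or ${NAME:default}
_ENV_PATTERN = re.compile(r"\$\{(\w+)(?::([^}]*))?\}")


def interpolate_env(text: str, env=None) -> str:
    """Replace ${NAME:default} placeholders with values from `env`."""
    env = os.environ if env is None else env

    def _sub(match):
        name, default = match.group(1), match.group(2)
        # Fall back to the default (or an empty string) when NAME is unset.
        return env.get(name, default if default is not None else "")

    return _ENV_PATTERN.sub(_sub, text)


template = "${NAME:info}@${DOMAIN:example.com}"
print(interpolate_env(template, env={}))
print(interpolate_env(template, env={"NAME": "qc-team"}))
```

With no variables set, the template resolves to its defaults; setting only NAME changes the part before the @ and leaves the default domain in place.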

Picard refactoring

In this release, there was a significant refactoring of the Picard module.
It has been generalized for better code sharing with other Picard-based software, like Sentieon and Parabricks.
As a result of this, the standalone Sentieon module was removed: Sentieon QC files will be interpreted directly as Picard QC files.

If you were using the Sentieon module in your pipelines, make sure to update any places that reference the module name:

  • MultiQC command line (e.g. replace --module sentieon with --module picard).
  • MultiQC configs (e.g. replace sentieon with picard in options like run_modules, exclude_modules, module_order).
  • Downstream code that relies on names of the files in multiqc_data or multiqc_plots saves (e.g., multiqc_data/multiqc_sentieon_AlignmentSummaryMetrics.txt becomes multiqc_data/multiqc_picard_AlignmentSummaryMetrics.txt).
  • Code that parses data files like multiqc_data/multiqc_data.json.
  • Custom plugins and templates that rely on HTML anchors (e.g. #sentieon_aligned_reads becomes #picard_AlignmentSummaryMetrics).
  • Also, note that Picard fetches sample names from the commands it finds inside the QC headers (e.g. # net.sf.picard.analysis.CollectMultipleMetrics INPUT=Szabo_160930_SN583_0215_AC9H20ACXX.bam ... -> Szabo_160930_SN583_0215_AC9H20ACXX), whereas the removed Sentieon module prioritized the QC file names. To revert to the old Sentieon approach, use the use_filename_as_sample_name config flag.
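
The difference in sample naming described in the last point can be illustrated with a small parser for the header line quoted above. This is a simplified sketch, not the Picard module's real logic (which handles more argument styles and applies MultiQC's sample name cleaning); sample_name_from_header is a hypothetical helper.

```python
import os
import re


def sample_name_from_header(line: str):
    """Extract a sample name from an INPUT=... argument in a header line."""
    match = re.search(r"\bINPUT=(\S+)", line)
    if match is None:
        return None
    # Drop the directory and the final extension (e.g. ".bam").
    basename = os.path.basename(match.group(1))
    return os.path.splitext(basename)[0]


header = (
    "# net.sf.picard.analysis.CollectMultipleMetrics "
    "INPUT=Szabo_160930_SN583_0215_AC9H20ACXX.bam"
)
print(sample_name_from_header(header))
```

A file-name-based approach would instead derive the sample name from the QC file itself, which is what the use_filename_as_sample_name flag restores.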

MultiQC updates

  • Config can be set with environment variables, including env var interpolation (#2178)
  • Try to find config in ~/.config or $XDG_CONFIG_HOME (#2183)
  • Better sample name cleaning with pairs of input filenames (#2181)
  • Software versions: allow any string as a version tag (#2166)
  • Table columns with non-numeric values now trigger a linting error if scale is set (#2176)
  • Stricter config variable typing (#2178)
  • Remove position:absolute CSS from table values (#2169)
  • Fix column sorting in exported TSV files from a matplotlib linegraph plot (#2143)
  • Fix custom anchors for kraken (#2170)
  • Fix logging spillover bug (#2174)

New Modules

  • Seqera Platform CLI (#2151)
    • Seqera Platform CLI reports statistics generated by the Seqera Platform CLI.
  • Xenome (#1860)
    • A tool for classifying reads from xenograft sources.
  • xengsort (#2168)
    • xengsort is a fast xenograft read sorter based on space-efficient k-mer hashing

Module updates

  • fastp: add version parsing (#2159)
  • fastp: correctly parse sample name from --in1/--in2 in bash command. Prefer file name if not fastp.json; fallback to file name when error (#2139)
  • Kaiju: fix division by zero error (#2179)
  • Nanostat: account for both tab and spaces in v1.41+ search pattern (#2155)
  • Pangolin: update for v4: add QC Note, update tool versions columns (#2157)
  • Picard: Generalize to directly support Sentieon and Parabricks outputs (#2110)
  • Sentieon: Removed the module in favour of directly supporting parsing by the Picard module (#2110)
    • Note that any code that relies on the module name needs to be updated, e.g. -m sentieon will no longer work
    • The exported plot and data files will now be prefixed as picard instead of sentieon, etc.
    • Note that the Sentieon module used to fetch the sample names from the file names by default, and now it follows the Picard module's logic, and prioritizes the commands recorded in the logs. To override, use the use_filename_as_sample_name config flag

MultiQC Version 1.17

17 Oct 15:39
The one with the new logo


  • Introducing the new MultiQC logo!
  • Adding support for Python 3.12 and dropping support for Python 3.7
  • New --require-logs to fail if expected tool outputs are not found
  • Rename --lint to --strict
  • Modules should now raise ModuleNoSamplesFound instead of UserWarning when no logs are found
  • 2 new modules and updates to 9 modules.

MultiQC updates

  • Add CI action changelog.yml to populate the changelog from PR titles, triggered by a comment @multiqc-bot changelog (#2025, #2102, #2115)
  • Add GitHub Actions bot workflow to fix code linting from a PR comment (#2082)
  • Use custom exception type instead of UserWarning when no samples are found. (#2049)
  • Lint modules for missing self.add_software_version (#2081)
  • Strict mode: rename config.lint to config.strict, crash early on module or template error. Add MULTIQC_STRICT=1 (#2101)
  • Matplotlib line plots now respect xLog: True and yLog: True in config (#1632)
  • Fix matplotlib linegraph and bargraph for the case when xmax < xmin in config (#2124)
  • Add --require-logs flag to error out if requested modules not used (#2109)
  • Fixes for python 3.12
    • Replace removed distutils (#2113)
    • Bundle lzstring (#2119)
  • Drop Python 3.6 and 3.7 support, add 3.12 (#2121)
  • Just run CI on the oldest + newest supported Python versions (#2074)
  • New logo
  • Set name and anchor for the custom content "module" (#2131)
  • Fix use of shutil.copytree when overriding existing template files in tmp_dir (#2133)

New Modules

  • Bracken
    • A highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.
  • Truvari (#1751)
    • Truvari is a toolkit for benchmarking, merging, and annotating structural variants

Module updates

  • Dragen: make sure all inputs are recorded in multiqc_sources.txt (#2128)
  • Cellranger: Count submodule updated to parse Antibody Capture summary (#2118)
  • fastp: parse unescaped sample names with white spaces (#2108)
  • FastQC: Add top overrepresented sequences table (#2075)
  • HiCPro: Fix parsing scientific notation in hicpro-ashic. Thanks @Just-Roma (#2126)
  • HTSeq Count: allow counts files with more than 2 columns (#2129)
  • mosdepth: fix prioritizing region over global information (#2106)
  • Picard: Adapt WgsMetrics to parabricks bammetrics outputs (#2127)
  • Picard: MarkDuplicates: Fix parsing mixed strings/numbers, account for missing trailing 0 (#2083, #2094)
  • Samtools: Add MQ0 reads to the Percent Mapped barplot in Stats submodule (#2123)
  • WhatsHap: Process truncated input with no ALL chromosome (#2095)

MultiQC Version 1.16

22 Sep 14:35
Highlight: Reporting software versions

New in v1.16 - software version information can now be automatically parsed from log output where available, and added to MultiQC in a standardised manner. It's shown in the MultiQC report next to section headings and in a dedicated report section, as well as being saved to multiqc_data. Where version information is not available in logs, it can be submitted manually by using a new special file type with filename pattern *_mqc_versions.yml. There's the option of representing groups of versions, useful for a tool that uses sub-tools, or pipelines that want to report version numbers per analysis step.

There are a handful of new config scopes to control behaviour: software_versions, skip_versions_section, disable_version_detection, versions_table_group_header.
See the documentation for more (writing modules, supplying stand-alone)

Huge thanks to @pontushojer for the contribution (#1927). This idea goes way back to issue #290, made in 2016! 🎉

MultiQC updates

  • Removed simplejson unused dependency (#1973)
  • Give config custom_plot_config priority over column-specific settings set by modules
  • When exporting plots, make a more clear error message for unsupported FastQC dot plot (#1976)
  • Fixed parsing of plot_type: "html" data in json custom content
  • Replace deprecated pkg_resources
  • Fix the module groups configuration for modules where the namespace is passed explicitly to general_stats_addcols. Namespace is now always appended to the module name in the general stats (#2037)
  • Do not call sys.exit() in the multiqc.run() function, to avoid breaking interactive environments (#2055)
  • Fixed the DOI exports in multiqc_data to include more than just the MultiQC paper (#2058)
  • Fix table column color scaling when there are negative numbers (#1869)
  • Export plots as static images and data in a ZIP archive. Fixes the issue when only 10 plots maximum were downloaded due to the browser limitation.

New Modules

  • Bakta
    • Rapid and standardized annotation of bacterial genomes, MAGs & plasmids.
  • mapDamage
    • mapDamage2 is a computational framework written in Python and R, which tracks and quantifies DNA damage patterns among ancient DNA sequencing reads generated by Next-Generation Sequencing platforms.
  • Sourmash
    • Quickly search, compare, and analyze genomic and metagenomic data sets.

Module updates

  • BcfTools
    • Stats: fix parsing multi-sample logs (#2052)
  • Custom content
    • Don't convert sample IDs to floats (#1883)
    • Make DRAGEN module use fn_clean_exts instead of hardcoded file names. Fixes working with arbitrary file names (#1994)
  • FastQC
    • Fix UnicodeDecodeError when parsing fastqc_data.txt: try latin-1 or fail gracefully (#2024)
  • Kaiju
    • Fix UnboundLocalError on outputs when Kaiju was run with the -e flag (#2023)
  • Kraken
    • Parametrize top-N through config (#2060)
    • Fix bug where ranks incorrectly assigned to tabs (#1766).
  • Mosdepth
    • Add X/Y relative coverage plot, analogous to the one in samtools-idxstats (#1978)
    • Added the perchrom_fraction_cutoff option into the config to help avoid clutter in contig-level plots
    • Fix a bug happening when both region and global coverage histograms for a sample are available (i.e. when mosdepth was run with --by, see mosdepth docs). In this case, data was effectively merged. Instead, summarise it separately and add a separate report section for the region-based coverage data.
    • Do not fail when all input samples have no coverage (#2005).
  • NanoStat
    • Support new format (#1997).
  • RSeQC
    • Fix max() arg is an empty sequence error (#1985)
    • Fix division by zero on all-zero input (#2040)
  • Samtools
    • Stats: fix "Percent Mapped" plot when samtools was run with read filtering (#1972)
  • Qualimap
    • BamQC: Include % On Target in General Stats table (#2019)
  • WhatsHap
    • Bugfix: ensure that TSV is only split on tab character. Allows sample names with spaces (#1981)


Full Changelog: v1.15...v1.16