[EXP] add an Index class that loads directly from a screed-loadable file #1783

Open
wants to merge 1 commit into base: latest


ctb (Contributor) commented Jan 6, 2022

This class makes the following command work:

sourmash search podar-ref/1.fa.sig podar-ref/?.fa

by wrapping FASTA/FASTQ files in an Index subclass that generates the "right" MinHash sketches dynamically.

Proof of concept/thought experiment for now. Relevant to #1647, although it does the dirty work via Index rather than via a SourmashSignature.
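
To make the idea concrete, here is a minimal, dependency-free sketch of a container that wraps a FASTA file and only builds sketches when asked. It is an illustration under stated assumptions, not the code in this PR: `LazyFastaIndex`, `read_fasta`, and `kmer_hashes` are hypothetical names, the file reader is a toy stand-in for screed, and the "sketch" is simply the set of all k-mer hashes rather than a real sourmash MinHash (no canonical k-mers, no `scaled` or `num` downsampling).

```python
import hashlib


def read_fasta(path):
    """Yield (name, sequence) pairs from a plain-text FASTA file (toy reader)."""
    name, chunks = None, []
    with open(path) as fp:
        for line in fp:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:], []
            elif line:
                chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def kmer_hashes(seq, ksize):
    """Toy stand-in for a sketch: 64-bit hashes of every k-mer in seq."""
    return {
        int.from_bytes(hashlib.sha1(seq[i:i + ksize].encode()).digest()[:8], "big")
        for i in range(len(seq) - ksize + 1)
    }


class LazyFastaIndex:
    """Hypothetical container: records select() parameters, sketches on demand."""

    def __init__(self, path, ksize=None):
        self.path = path
        self.ksize = ksize

    def select(self, ksize):
        # Only remember the requested k-mer size; no sequence is read here.
        return LazyFastaIndex(self.path, ksize=ksize)

    def sketches(self):
        # The file is read and hashed only when the caller iterates.
        if self.ksize is None:
            raise ValueError("call select(ksize=...) before requesting sketches")
        for name, seq in read_fasta(self.path):
            yield name, kmer_hashes(seq, self.ksize)
```

The point of the design is that `select` is cheap and returns a new object, so a search command can narrow down the parameters it needs from the query first and pay the sketching cost once, with exactly those parameters.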

Options I considered and may try:

  • can implement as a container (e.g. a zip container or whatever) that tracks selects and builds a compatible MinHash when needed (WHAT I DID)
  • can provide an individual Signature-style wrapper that overloads .minhash
  • can provide load_signature???
  • can subclass MinHash to be a sequence-containing class that dynamically produces a MinHash from a sequence
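
For comparison, the second option (a signature-style wrapper whose `.minhash` is computed on first access) could look roughly like the following. This is a hedged sketch only: `LazySignature` and `kmer_set` are hypothetical names, the wrapper does not subclass or mimic the real `SourmashSignature` interface, and the default sketch function returns a plain set of k-mer strings instead of a MinHash.

```python
from functools import cached_property


def kmer_set(sequence, ksize):
    """Toy sketch function: the set of all k-mers in the sequence."""
    return {sequence[i:i + ksize] for i in range(len(sequence) - ksize + 1)}


class LazySignature:
    """Hypothetical wrapper: holds a sequence and sketches it on first use."""

    def __init__(self, name, sequence, ksize, sketch_fn=kmer_set):
        self.name = name
        self.sequence = sequence
        self.ksize = ksize
        self._sketch_fn = sketch_fn

    @cached_property
    def minhash(self):
        # Computed on first attribute access, then cached on the instance.
        return self._sketch_fn(self.sequence, self.ksize)
```

Compared with the container approach, this keeps the laziness per signature, but the k-mer size has to be known when the wrapper is created, which is the part the container handles by tracking selects.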

other thoughts for this PR

should probably add options for (a) multiple fasta files/one signature per, (b) overriding name.

this could connect to sketch lists, too.

codecov bot commented Jan 6, 2022

Codecov Report

Merging #1783 (8a36bc5) into latest (73aeb15) will increase coverage by 6.19%.
The diff coverage is 33.84%.

@@            Coverage Diff             @@
##           latest    #1783      +/-   ##
==========================================
+ Coverage   83.44%   89.64%   +6.19%     
==========================================
  Files         113       88      -25     
  Lines       12145     8504    -3641     
  Branches     1614     1627      +13     
==========================================
- Hits        10134     7623    -2511     
+ Misses       1752      620    -1132     
- Partials      259      261       +2     
Flag Coverage Δ
python 89.64% <33.84%> (-0.47%) ⬇️
rust ?

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
src/sourmash/screed_index.py 25.86% <25.86%> (ø)
src/sourmash/sourmash_args.py 92.21% <100.00%> (-0.49%) ⬇️
src/core/src/index/storage.rs
src/core/src/sketch/nodegraph.rs
src/core/src/sketch/hyperloglog/mod.rs
src/core/src/encodings.rs
src/core/src/sketch/hyperloglog/estimators.rs
src/core/src/lib.rs
src/core/src/errors.rs
... and 19 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 73aeb15...8a36bc5.
