Using mappy across multiple processes #125
Comments
This will be technically difficult. It is possible to access memory from different processes using shared memory, but implementing that in minimap2 will be nontrivial. Can you use multiple threads? You can create one mappy.Aligner object and call Aligner.map on different threads, like this (I have not tested this part, though):

import mappy
aligner = mappy.Aligner(fn_index)
# then in each thread
thr_buf = mappy.ThreadBuffer()
for hit in aligner.map(seq, buf=thr_buf):
    ...
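A minimal sketch of how the suggestion above might be fleshed out, assuming a thread pool and one buffer per worker thread. The function name `map_reads_threaded`, its parameters, and the index path in the usage comment are hypothetical and not part of mappy; only `mappy.Aligner`, `mappy.ThreadBuffer`, `Aligner.map(seq, buf=...)`, and the hit fields `ctg`, `r_st`, `r_en` come from mappy itself. The aligner and buffer factory are passed in as arguments so the helper does not import mappy directly.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def map_reads_threaded(aligner, reads, make_buffer, n_threads=4):
    """Map (name, seq) pairs using one shared aligner and one buffer per thread.

    aligner     -- object with a map(seq, buf=...) method, e.g. mappy.Aligner
    reads       -- iterable of (name, sequence) tuples
    make_buffer -- zero-argument callable, e.g. mappy.ThreadBuffer
    """
    local = threading.local()

    def map_one(read):
        name, seq = read
        # Create this worker thread's private buffer on first use.
        if not hasattr(local, "buf"):
            local.buf = make_buffer()
        # Collect plain tuples inside the worker so nothing tied to the
        # buffer is consumed later from another thread.
        hits = [(hit.ctg, hit.r_st, hit.r_en)
                for hit in aligner.map(seq, buf=local.buf)]
        return name, hits

    # pool.map returns results in the same order as the input reads.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(map_one, reads))


# Hypothetical usage (requires mappy and a real index file):
#   import mappy
#   aligner = mappy.Aligner("ref.mmi")
#   results = map_reads_threaded(aligner, reads, mappy.ThreadBuffer)
```

Because all threads share the single Aligner, the index is held in memory only once per process, which is the point of the suggestion. Whether this actually runs in parallel depends on mappy releasing the GIL during mapping, which this sketch assumes rather than verifies.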
I figured this might be quite an undertaking if it were not already built into minimap2/mappy. I am using Python's multiprocess module. I will try some tests to see if this is feasible. Thanks for the pointer and the use case.
I have tested the interface and it works quite well. Mixing multiprocessing with multithreading was quite a headache, but it is possible. For others who might be trying this: I had to open all multiprocess objects (Pipes in my case) and start all processes before opening any threading objects. While it would be nice to have a version that could share state across processes, I think this is a sufficient workaround, so I would consider this issue resolved.
I am working on incorporating mappy into tombo (for nanopore modified base detection) and I am having some issues using mappy on larger genomes across many cores. My main issue is that right now I am opening a new mappy.Aligner object in each Python process (via the multiprocess module). For larger genomes, this leaves a large memory footprint. I wanted to be safe about using this object across multiple processes, so I opened a new Aligner in each new Python multiprocess, but I am wondering if there is an existing solution that might allow the in-memory minimap2 index to be shared across multiple Python processes via the mappy API to help decrease this memory footprint.