Details of mmseqs filterresult #807

Open
RobinEllison opened this issue Jan 24, 2024 · 0 comments

Expected

  • Understand the filtering rules of mmseqs filterresult.
  • Understand the output written by mmseqs filterresult.
  • Have a basic understanding of the format produced by mmseqs filterresult; it looks like the alignment format.

Current Behavior

When I ran mmseqs filterresult db db db_clu db_clu_nr --diff 2 to keep only the 2 most divergent sequences in each cluster, I observed the following:

  • Unaligned sequences (clusters with only one member) were discarded.
  • When a cluster had only 2 members, only the non-representative sequence was kept.
  • When a cluster had more than 3 members, 1 or 2 non-representative members were kept.

Steps to Reproduce (for bugs)

Take the full set of human proteins as an example:

mmseqs createdb human_protein.fasta human_protein_db

mmseqs linclust human_protein_db human_protein_db_clu

mmseqs filterresult human_protein_db human_protein_db human_protein_db_clu human_protein_db_clu_filter --diff 2
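
To check the behavior systematically rather than by eye, the cluster membership before and after filtering could be compared with a small script. The following is only a minimal sketch. It assumes both result databases have been exported to tab-separated text whose first two columns are the representative ID and the member ID (for example with mmseqs createtsv; whether that works on the filtered output has not been verified here). The function names and the toy data are hypothetical and not part of MMseqs2.

```python
import csv
import io
from collections import defaultdict


def read_clusters(tsv_text):
    """Map representative ID -> list of member IDs from two-column TSV text."""
    clusters = defaultdict(list)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        clusters[row[0]].append(row[1])
    return dict(clusters)


def compare(before, after):
    """For each original cluster, report what survived the filtering step."""
    report = {}
    for rep, members in before.items():
        kept = after.get(rep, [])
        report[rep] = {
            "size_before": len(members),
            "size_after": len(kept),
            "rep_kept": rep in kept,
        }
    return report


# Hypothetical toy data that imitates the pattern described in this issue:
# a singleton cluster (A), a 2-member cluster (B) and a 4-member cluster (D).
BEFORE = "A\tA\nB\tB\nB\tC\nD\tD\nD\tE\nD\tF\nD\tG\n"
AFTER = "B\tC\nD\tE\nD\tF\n"

if __name__ == "__main__":
    summary = compare(read_clusters(BEFORE), read_clusters(AFTER))
    for rep, info in summary.items():
        print(rep, info)
```

With real data, the two strings would be replaced by the contents of the exported files; clusters where size_after is 0 or rep_kept is False are the cases in question.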

Your Environment

Include as many relevant details about the environment you experienced the bug in.

  • Git commit used (The string after "MMseqs Version:" when you execute MMseqs without any parameters): 14.7e284
  • Operating system and version: Ubuntu 22.04