Weird self alignment #10

Open
gt1 opened this issue Aug 2, 2017 · 1 comment

gt1 commented Aug 2, 2017

Hi,

I see the following somewhat weird data (SAM format) coming out of minimap2:

L0/46/0_12879	256	L0/46/0_12879	1	0	8963M116I28M116D203M110I9M1I15M1D22M110D3412M	*	0	0	*	*	tp:A:S	cm:i:15	s1:i:126	NM:i:454	ms:i:24372	AS:i:24744	nn:i:0
L0/46/0_12879	256	L0/46/0_12879	1	0	8963M116D28M116I203M110D9M1D15M1I22M110I3412M	*	0	0	*	*	tp:A:S	cm:i:15	s1:i:127	NM:i:454	ms:i:24372	AS:i:24744	nn:i:0

This is weird for two reasons:

  1. This describes an end-to-end alignment of a read against itself, but it is clearly not the optimal alignment between the two regions specified.
  2. There are two versions.
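
Both observations can be checked directly from the two CIGAR strings above. The following is a minimal sketch, not part of minimap2; the helper names (`parse_cigar`, `consumed`, `swap_indels`) are made up for illustration. It shows that each record spans all 12879 bases on both the query and the reference side, that the second CIGAR is the first with insertions and deletions swapped, and that the indel lengths add up to the reported NM:i:454.

```python
import re

# One CIGAR operation: a length followed by an op code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into a list of (length, op) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def consumed(ops):
    """Return (query bases, reference bases) consumed by the alignment."""
    query = sum(n for n, op in ops if op in "MIS=X")
    ref = sum(n for n, op in ops if op in "MDN=X")
    return query, ref


def swap_indels(ops):
    """Mirror an alignment by exchanging insertions and deletions."""
    swap = {"I": "D", "D": "I"}
    return [(n, swap.get(op, op)) for n, op in ops]


first = parse_cigar("8963M116I28M116D203M110I9M1I15M1D22M110D3412M")
second = parse_cigar("8963M116D28M116I203M110D9M1D15M1I22M110I3412M")

print(consumed(first))                              # (12879, 12879)
print(consumed(second))                             # (12879, 12879)
print(swap_indels(first) == second)                 # True
print(sum(n for n, op in first if op in "ID"))      # 454
```

Because the indel bases alone account for NM:i:454, the aligned (M) columns contain no mismatches: the alignment follows the main diagonal except where it detours through paired insertions and deletions of equal length.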

minimap2 was run as

minimap2/minimap2 -ax ava-pb dup.fasta dup.fasta

where dup.fasta contains a single read (synthetic E. coli, though I think this should not matter).

The single read in dup.fasta is

>L0/46/0_12879
AGTCCTGCGGAAAGCGCCAGGGCGAGGATTGCTGTGGCAGGTTTACGTAATTGCATATCCAACTCCTTTATCTCTCTGCG
TTAAGAACGCACTGGAATACCCGTTGTGAGTGTTTTGTGTTGTTACGTCTGCAACTTTATTGTGCAGTGTGTGCCTGTTA
GGGAAGGTGCGAATAAGCTGGGGAAATTCTTCTCGGCTGACTCAGTCATTTCATTTCTTCATGTTTGAGCGATTTTTTCT
CCCGTAAATGCCTTGAATCAGCCTATTTAGACCGTTTCTTCGCCATTTAAGGCGTTATCCCCAGTTTTTAGTGAGATCTC
TCCCACTGACGTATCATTTGGTCCGCCCGAAGACAGGTTGGGCCAGCGTGAATAACATCGCCAGTTGGTTATCGTTTTTC
AGCAACCCCTTCGGTATCTGGCTTTCACGTAAGCCGAACTGTCGCTTGATGATGCGAAATGGGTGCTCCACCCCTGGCCC
GGATGCTGGGCTTTCATGTATTCGATGTTGATGGCCGTTTTGTTCTTGCGTGGAATGCTGTTTCAAGGTACTACCTTGCC
GGGGCCGCTCGGCGATCAGCCAGTCCAATATCCACCTCGGCCAGCTCCTCGCGCTGTGGCGCCCCTTGGTAGCCGGCATC
GGCTTAGACAAATTGCTCCTCTCCATGCAGCCAGATTACCCAGCTGATTGAAGGTCATGCTCGTTGGCCGCGAGTGGTGA
CCAGGCTGTGGGTCAGGCCACTCTTGGCATCGACACCAATGTGGGCCTTCATGCCAAAGTGCCACTGATTGCCTTTCTTG
GTCTGATGCATCTCCGGACTCGCGTTGCTGCTCTTTGTTCTTGGTCGAGCTGGGTGCCTCAATGATGGTGGCATCGACCA
AGGTGCCTTGAGTCATCATGACGCCTGCCTTCGGACCAGCCAGTCGATTGATGGTCTTGAACAATTGGCGGGCCAGTTGG
ATGCTGCTCCAGCAGGTGGCGGAAATTCATGATGGTGGTGCGGTCCGGCAAGGCGCTATCCAGGGATAACCGGGCAAACA
GACGCATGGAGGCGATTTCGTACAGAGCATCTTCCATCGCGCCATCGCTCAGGTTGTATCCAATGCTGCATGCAGTGAAT
GCGTAGCATGGTTTCCAGCGGAAAAGGTCGCCGGTCACATTACCAGCCTTGGGGTAAAACGGCTCGCTGACTTCCACCAT
GTGTTTTGCCATGGCAGAATCTGCTCCATGCGGGACAAGAAAATCTCTTTTCTGGTCTGAACGGCGCTTACTGCTGAATT
CACTGTCGGCGAAGGTAAGTTGATGACTCATGATGAACCCTGTTACTATGGCTCCAGATGACAAACATGATCTCATATCA
GGGACTTGTTCGCACCTTCCTTAGGTAACATTTAGTTTGGCTAAATGTAAAGATATTGCTGTTTTATTGTTTGTTTTTGC
GAGATGCGCCGCACCATTCCGAAGCAAAATTCTTAAAATGCACTCTTTTAGTGCTACCGCTGGATTACTGTGGTGCAACT
AGGTTGTACTGATGCTGTTTCAGGGTTGCCTTGTATAACAAAGCAATAGATCGTGCCAAAGTTGGATAGGAAATATGTTA
TCCGGATAATGCACTGATGCCGCATCCGGTGAGCGTGGCCGAAATATGGGATGTATTCCGGCACGATAAGAAGGGATTAT
TTACGTCGCTGACGGCAGACTCATCAACACAGCAGCAAAACCAAAACAATGCCGTCAGCACCCACAGTCGGACCAGTTGC
CGAGTACGTGCGTGATGGTGTGAGTTACCGGTGGTCGGCGTACGTTAGTGGTTAACACCTCGCGGGTGAACTGCGGGATC
ATCGCCTGAATTTCTCACCCTGCGGGCCAATCACCGCCGTAATGCCGTTGTTGGTGCTGCGCAACAGTGGGCAGCGCCAG
CTCCAGCGCACGCATTCGCGCCATCTGGAAGTGTTGCCATGGACCAATAGGTTTACCAAACCACGCATCCGTTGGAGATA
GTCAGCAGATAGTCGGTATCCGGGCGGAAGTTATCGCGCACTTGCTCGCCGAGAATGATCTCGTAGCAAATAGCCGCAGT
AAGCTCAATACCATTTGCCGACAGCGGCGGCTGGATATATGGCCCACGGCTGAACGACGACATCGGCAGATCAAAGAACG
GATGCTAACGGACGCAGAATCGACTCAGCGGGACAAACTCGCCAAACGGCACCAGATGGTTTTTGTTATAGCGATCGGCT
GAGTTCGTAGCTGTACGGGCGCACCTTTACCCAGCGTGATGATGGTCGTTGTAGGTATCGTAGCGGTTCTGCTTATTGAT
GACGCGCCTCGACAATCCCGGTTACCAGCGAGCTACCTTTATCACGCAACTCACCGTCCAGTTGCTTTGAGGAACGGTTG
CTGGTTAATTTCCAGATGCGGTTGATCGCCGACTCCGGCCAGATAATCAACGATGATTTGGCTCCATCAGCGGTGCCGTT
GACGTTGTAGTAAATCTTCAGCGTATTAAGAAGCTGGCCTTCGTCCCATTTCAGCGATTGCGGAATATCGCCCTGAACCA
TCGAAACCTGAATGGTTTTCTCCGGTTGTGGGGTAAACCCACTGGATGAACGTCAGACGGGAAGGGAGACGGCAAACATG
CACGACGGCCACCACCAGCTGGACGCCAGTTGCGTTTGACCAACGCCAGTGCCAGCAGGCCACTAACCATCATCAGCAGG
AAGTTAATGGCTTCCACGCCCATTATCGGTGCCAGCCCTTTTAACGGGACCATCAATCTGGCTATAGTCCGAACTTGTAA
CCACGGGTGAAGCCGGTCAAGTACCCAACCGCGCACGAAACTCGGTCACTTGCCAGAGGGCAGGGGGCGGCAATCGCTAC
GCGCAGCCAGGTGGTTTTCGGCCACAGACGCGACAGCACGCCAGCAAACAGTCCGGTATACAGCGACAAATACGCCGCCA
GCTGCACCACCAGGAAGATGTTAACCGGGCCAGACATTCCGCCAAAGGTCGCGATGCTGACATAGACCCAGTTAATACCG
CTGCCAAAGAGGCCAAATCCCCAGCAAAAGGCCAATAGCGGCAGACTGGAGTGGACGGCGGTTAAAGGTCAACGCCTGGC
AAGCCCCATCAGCGAAATAATCCGCCGCAGGCCAGACGTTCGTAAGGAGAGAAGGCCCATGCGCTTCCGCAGGCACCGAA
TAATAACGCCAGCAGCAGGCGAATGCGCTGGCGTTTCAATTACATGAGGCAAAAGCCATGTAAGTATATCTATCCAGTTT
CGGTTTATTCATCCAGCTTCGGCTGGGGTGAGTATCCGGGATTTTGACATGAACCTGAATAATACGCCGACTGTCGGCCG
TCGCCACTTTGAACTGGATAACCGTCGATGTCGATAGTTTCGCCACGCGCCGGAAAGATCCCAAATGCCTGCATCACCAG
ACCACCGATAGTCGTCGACTTCTTCATCGCTAAAGTGGGTGCCGAACGCTTCGTTGAAGTCTTCAATGGAAGCCCAGTGC
GCGTACGGTCCAGGTATGACGACTCAGCTGACGGAAGTCGATATCATCTTCTTCGTCATACTCGTCTTCTAATACTCACG
CAACAATCAGTTCCAGGATGTCTTCAATGGTCACCAGACCGGAAACCCCATCGAATTCGTCAATAACGTATCGCCATGTG
GTAACAGCTGAGAGCGAAACTCTTTCAGCATCCGGCTACGCGCTTACTTTCAGGAAGCGACAACCGCCTGACGTAACACT
TTGTCCATGCTGAAGGCTTCAGCATCGCTGCGCATAAACGGCAGCAAGTTCGTTTCGCCATCAGAATCCCTATCAATGTG
ATCTTTGTCTTCGCTAATCATCCGGGAAGACGTAGAGTGGGCGGACTCGTATAGATGACATCAAGACATTCGGTCCACGC
GTCTGGTTGCGTTTCAGGGTAATCATGCCTGGGAGCGGGGGATCATGATGTCGCGAGACGCGTTGGTCTGCGATGTCCAT
CACCCCTCTCGAGCATATCGCGCGTATCTTCGTCGATATAGGTCGTTCTGCCCGGAATCACGGATCAGCGCCAGCGTTCG
TCACGGTTTTTCGGTTCCACCGTTGGATAAAGTTGGCTGCAGTAACAGGGAGAAAAATCCCCTTTCTTGTTGCTTATCGT
GTCACTACTGTGTGAATTGTCGTCGCTCATGGCGTGTATGGGTTCTCATGTTAGTTAATCAAAACGCCGTCGTTAATCAC
CAACGGCGGGGACGTCTGCCAGTCAAATGCCTGGCAATGTATTCTTTCTCGGCAATGTACGGATCCTCATAGCCCAGAGC
AAGCATAATCTCTGTTTCGAGGGCTTCCATTTCTGTCTGCTTCGTCATCTTCGATGTGATCGCTAACCTAACAAATGCAG
ACTGCCGTGCACCACCATATGCGCCCAGTCGCGCCTCCAGTGGTTTGCCTTGCGTCCTGAGCTTCCTTCTCAACCACTGT
ACGGCAGATAACCAGATCGCCCAGTAGCGACATATTCCAGTGCCAGGCGGCACTTCAAACGGGAAGGAGAGCACGTTGGT
CGGCTTATCCTTACCGCGATAGGTCAGATTCAGACTGTGGCTTTCGGCGGTATCGACCACGCTGAATCGTCACTTCCGAT
TCTCCTGAAACTGCGGGATCACCGCATTCAGCCATGTCTGAAACTGGCTCTCTTCGCGGTAACCCGGAATTATCTTCACA
TGCCAGTGCTAAATCGAGGATCACCTGACTCATTTTTGTTCCTCTGTTCTTCGCGCTTGCTTCTGCTGCCAGCGCCGCTT
TTCGTTTTTGTCTCGGCTTCTTCCCATGGCTTCATAGGCGTTAACGATACTGCGCCACCACAGGGTGACGAACCACGTCT
TCGCTGTGGAAGAAGTTAAAGACTGATCTCTTCGAACATCGGCCAGCACTTCGATGGCGTGACGTAAGCCTGATTTAGTA
TTACGCGGCCAGGTCCGATCTGTGTGACGTCGCCGGTGGATAACCGCTTTTGAGTTAAAACCGATACGGGTCAGGAACAT
CTTCATCTGTTCGATGGTGGTGTTGCTGGCTCTCATCGAGAATGATTAAACGCGTCGTTCAGCGTACGACCACGCATATA
GGCCATGCGGTGCGACTTCAACTAACGTTGCCGCTCAATCAGTTTCTCGACTTTCTCAAAGCCCAGCATTTCAAACAGCG
CGTCGTACAGCGGGCGCAGATACGGGTCTACTTTCTGGCTTAAATCGCCAGGCGAGGAAGCCCAGTTTTCACCGGCTTCT
ACTGCCGGACGAGTCAGCAGAATACGGCGAATTTCCTCGACGCTCCAGGGCATCAACTGCCGCAGCCACTGCCAGGTAGG
TTTTACCCGTACCCGCCGGGCCAACGCCGAAGGTAATGTCATGGGTCGAGAATATTGGCGATGTACTGCGCCTAGGTTTG
CGTGCGCGGCTTAAAATTACGCCGCGTTTGGTTTTGATATTGACCGCTTTGCCGTACTCCGGCACGCTCTCCGCGCTGTC
TGCTCCAGGACCACGCGCTTCTGTAATTCGACAACGGTAGGCAATCTCGTTCCGGTTCGATATCCTGAATCTGACCCGCG
CATCGGGGTCAGTGATCGACATACAGGCTACGCCAGAATGTCTGACCGCAGCGGGACGCAAATCGGACGGCCGTGTCAGT
TTAAAGTGGTTATCGCGGCGATTGATCTCGATGCCGAGACGGCGTTCGAGCTGCTTGATGTTGTCATCAAACGGGCCGCA
CAGGCTCAACAGACGCGCATTGTCTGCTGGCTCCAGGGTTGATTTCGCGAGTGTCTATGTTCAAACCGTCCTCTTATCTG
TATGCCGCCGGAAGCTGAACATTCACCGGCCTATAAGGAAATTATTCACGCCACAGGAAAAAGGCGCAAGCGATTGCAAT
ATAAGATGGGGATAAAGAGAGAAAAAACAAGGCCCGACCGGAACGGCAGGCCTGAGAATTACGGCTGATAATAACCCACG
CCAAAGGTCGTTTTCTTTGACGGGTACGGGCAATCACTGATTCCCGGTGTTTCTGCCACGCGCCAGACCCATTTCATCTT
CAGTACGCACCACTTTACCGCGCAGAGAGTTCGGGTAGACGTCGGTAATTTCTACATCGACGAATTTACCGATCATATCC
GGCGTGCCTTCGAAGTTGACCACGCGGTTATTTTCCGTACGCCCGGAAAGCTCGCATGATGCTCTTCACGCGATGTACCT
TCCTACCAGATATACGCGGGTGGTGCCGAGCATCCGGCGGCTCCACGCCATCGCTTGCTGATTAATTGCGCTCTTGCATG
AATATACAGACGCTGCTTCTTCTCTTCTTCCGGAACATCATCAACCATATCGGCGGCCTGGATGTACCCGGACGTGCAGA
TAAGATAAAGTGTAGCTCATGTCGAAATTGACGTCGGCAATCAGCTTCATCGTTTTTCTCGAAGTCTTCGGTGGTTTCGC
CATGGGAAGCCAACGTATGAAATCAGAACTGATCTGAATATCTGGACGCGCCGCACGCAGTTTACGGATGATCGCTTTGT
ACTCCAGCGCCGTAATGGGTACGGCCCATCAGGTTCAGAATGCAGATCGGAACCGCTCTGTACCGGCAGATGCAGGAAGC
TCACCAGCTCCGGCGTGTCGCGATACACTTCGATGATATCGCTCGGTGAATTCGATACGGATGGCTCGGTGGTAAAGCGA
ATACGATCGATCCCGTCGATCGCAGCAACCAGACGCAGCAGATCGGCAAACGATCCGGTGGTGCCGGTCGTAGTTTTCAC
CACGCCAGGCGTTCACGTTCTGACTCGTAGCAGGTTGACTTCACGCACGCCCTGAGCCGCAAGCTGGTGCATATCTCAAA
CAGAATATCGTCGGACGGACGGCTGACCTCTTCACCACGGGTGTGAAGGCACCACGCAGTAGGTGCAATATTTATTGCAG
CCTTCCATGATGGAGACAAACGCGGTCGGCCCTTCGGCGCGCGGTTCCGGTAGGACGGTCAAACTATCTCGATTTCCGGG
AAGCTGATATCTACAACCGGGCTGCGGTCGCCACGCACGGAGTTGATCATCTGCCGGCAGACGGTGCAGCGTGTTGCGGC
CCAAAAATAATATCGACATCAGTGGGCGCGCTGGCGAATGTGCTCGCCTTCTTGCGATGCCACGCAGCCACCGACGCCGA
TAATCAGGTCTGGATTCTTCTCTTTTAACAGTTTCCAGCGACCTCAACTGATGGAAGACTTTTTCCTGAGCCTTCTCGCG
GGTTGAGCAGGTGATTCAGCAGCAGCACATCCGCTTTCTGTCCGCCTACGTCGGTCAGTTGATAGCCGTGGGTGGCATCC
ACAGATCGGCCATCTTCGAATGAATCGTACTCGTTCATCTGACAGCCCCAGGTTTTAATATGGAGTTTTTTGGGTCATCG
ACTTGCTCTTGCGAAATAGTAGCCAGGAATGCAGGGCGTCATAGTGTAATGCTTTGCTGACCGTTGTGACCAGTATGAGC
GTTATCAGCCCTTAGGGGTAAAAATCCTGTAAACTTAAAGCAGTATTGCTAACAGGATGATTGACCATGACAAATCAACC
AACGGAAATTGCCATTGTCGGCGGAGGAATGGTCGGCGGCGCACTGGCGCTGGGGCTGGCACAGCACGGATTTGCGGTAA
CGGTGATCGGAGCACGCAGAACCAGCGCCGTTTGTCGCTGATAGCCAACGGACGTCGGATCTCGGCGATCAGCGCGGCTT
CGGTATACATTGCTTAAAGGGTTAGGGTCTGGGATGCAGTACAGGCTATGCGTTGCCATCCTTACCGCAGACTGGAAACG
TGGGGAGTGGGAAACGGCGCATGTGGTGTTTGACGCCGCTTGAACTTAAGCTACCGCTGCTTGGCTATATGGTGGAAAAC
ACTGTCCTGCAACAGGCGTTGTGGCAGGCGCTGGAAGCCGCATCCGAAAGTAACGTTATCGTCGTGCCAGGCTCGCTGAT
TGCGCTGCATCGCCATGATGATCTTCAGGAGCTGGAGACTGAAAGGCGGGAAGTGATTTCGCGCGAAGCTGGTGATTGGT
ACCGACGGCGCAAATTCGCAGGTGCGGCAGATGGCGGGAATTGGCGTTCATGACATGGCAGTATGGCGCAGTCGTGCATG
GTTGATTAGCGTCCAGTGCGAGAACGATCCCGGCGACAGCACCTGGCAGCAATTTACTCCGGACGGGACCGCGTGCGTTT
CTGCCGTTGTTTGATAACTGGCATACGCTTAGGGATTGGGTATGTGACTCTGCCCGGCGTCGTGTATGTCGCCAGTTGCA
GAATATGAGTATGCGCACAGCTCCAGAGCGGAAATCGCGAAGCATTTCCCGTCGCGTTCTGGGTTACGTTACACCGCTTG
TCCGCTGGTGCGTTTCCGCTGACGCGTCGCCATGCGTAGCAGTACAGTGCAGCAGGGCTTGCGCTGGTGGGCGATGCCGC
GCATACCATCCATCCGCTGGCGGGGCAGGGAGTGAATCTTGGTTATTCGTGATGTCGATGCCCTGACTTGATGTTCTGGT
CAACGCCCGCAGCTACGGCGAAGCGTGGGCCAGTTATCACTGTCCTGCAAGCGGTACCAGATGCGGCGCATGGCGGATAA
CTTCATTATGCAAAGCGGTATGGATCTGTTTATGCACGGATTCAGCAATAATCTGCCACCACTGCGTTTTATGCGTAATC
TACGGGTTAATGGCGGCGGAGCGTGCTGGCGTGTTGAAACGTCAGGCGCTGAAATATGCGTTAAGGGTTGTAGCCTTACA
ACATTGCCGGGATGACGTGCCTAACCGTAGGTCGGATAAGACGCGGCAGCGTCGCATCCGACATTGAAGGATAAGACGTG
TCAACGATCGCATTCGACATTGAATGAACGCAGAAAAGCAAAAAGGCTCGCCAGAAGCGAGCTTTTTTAATGTGGCTGGG
GTACGAGGATTCGAACCTCGGAATGCCGGAATCAGAATCCCGTGCCTTACCGCTTGGCGATACCCCAACTGGGTGCACTT
AACTAAGGTAAGCGTCTTGACATAAATTGGCTGGGGTAGCGAGGATTCGAACCTCGGAATGCCGGAATCAGAATCCGGTG
CCTTACACGCTTGGCGATACCCCAACAAATTGGTTTTGAATTTGCCGAACATATTCGATACATTCAGAATTTGGTGGCTA
CGACGGGATTCGAACCTGTGACCCCATCATTATGAGTGATGTGCATCTAACCAACTGACGCTATCGTAGCCAGATTGTTT
CTTCGATGGCTGGGGTACCTGGATTCGAACCAGGGAATGCCGGTATCAAAAACCGGGTGCCTTACCGCTTGGCGATACCC
CAATAACCGGGTCGGTGAACCGCTTACTCGAAGAAGATGGCTGGGGTACCTGGATTCGAACAGGGAATGCCCGGTATCAA
AAACCGGTGCCTTACCGCTTGGCGATACCCCATCCGGTACAACGCTTTCGTGGTGAATGGTGCGGAGAGGCGAGACTTGG
AACTCGCACACCTTGCGGGCGCCAGAACCTAAATCTGGTGCGTCTACTCAATTTCGCCACTCCCGCAAAAAAAAGATGTG
TGGCTACGACGGGATTCGAACTGTGCACGCCCACCATTATGAGTGATGTGCTCTAACCAACTGAGCTACGTAGCCATCTT
TTTTTTCGCGATACCTTATCGGCGTTGCGGGGGCGCGATTATGCGTCGTAGAGCCTTAGCAGTCGTCAACCGTCTTTTTC
AAGGAAAATTGCTCGAAAGTGACTGTTTGGTTAGGTTGGAACAGCGTGGCGCTATATTCGTCAATTATTGTTTACTTTGT
GTTTGTTTCCAACCCTACAGCCCATTCTTTTGTCATACAGGATGAAATTCGGAATTTAACAATAGTGGTGGTGAAATTAA
TCTATGAAATACTGGCCTACAGTGGATGAGTTGTCAAACAGTGATGTGGCAAACCCGGAACATTTCCTTACTGCATATCC
AGAATCAACAAGCTACCTCAATAACTGTAAACAGCCCCGGATTTCACCGGGGCTGTTTCGCATTTCTTACTTATACGCCG
ACTGAGTGAACCACCAACCGCGCGACCAGACGGATCGTCCATTTTCTTGAACGCTTTCATCCCATTCGACTCGCTTTAGC
GGTAAGAACAAGCGACGGAAGCGGACGCCCGGCACGCACTCAGCGGCGCTCGGAAGCGGGAATAGTCTTCAAAGATCTCC
CGATACAAGTACGCTTCTTTAGAGGTTGGCGGTGTTGTACGGGAAAGCGGAAGCGCGGCAGTTTCGCAGTTGCTGATCAG
AAACCTTGCTGCGCAGCCACTTCTTTCAGGGTGTCGATCCATACTGTAACCAGACGCCATCGGGAGAACTGCTCTTTCTG
CCGCCAGGCCAGCTTGCAGGCAGATACGCTTCAAAACATTCACGCAGGATGTGTTTTTCCATTTTGCCGTTACCGCACAT
TTTATCCTGTGGGTTAATACGCATCAGCCACATCAAGGAATTTGTTTGTCGAGGAACGGAACGCGTGCTTCCACGCACCG
CAGGCTGACTCGCTTTGTTGGCACGCGCGCAGTCATACATATGCAAGGGCCAGCAGTTTACGCACCGTCCTCCTCATGCA
GTTCTTTGGCATTCGGGGCTTTGTGGAAGTAAAGATAACCGCCGAACACTTTCATCAGCAACCTTCACCGGACAGCACCA
TTTTAATGCCCATCGCCTTTGATCTTACGCGACATTAAATACATCGGTGTTGAAGCGCGAATAGTGGTCACATCATAAGG
TTTCGATAGTGGTAAATCACGTACGCGGATGGCATCCAGACCTTCCTGTACAGTGAAGTGAATTTCGTGATGCACCGTGC
CCAGATGGTTTGCCACTTCCTGGGCTGCTTTCAGATCCGGTGAACCCGGCAGACCTACCAGCAAAGGTAGTGTAACTGCC
GGCCACCAGGGCTTCAGAGCGTTCCTGATCTTGCCACGCGACGGGCGCGTATTTCTTGGTGATAGCGGAAATAATTGAGG
AATCCAGACCACCAGAAAGCAGCACACGCGTAAGGCACACAGACATCAGATGGCCTTTTAACTGAATCTTCCAGTGCCGA
AGCACTCGTTTTTGTCAGGTCACGTTATCTTTCACCGCTATCGTAGTCGAACCAAGTCGGCGATGATAGTAAGAACGGAT
TTCGCCGTCCTGCGCTCCACAAATAGCTCCCCGCCGGGAACTCTTTAATCGTGCGGCAAACTGGCACCAGCGCTTTCATT
TCTGAGGCCACATACAGCTGACCGTGTTCGTCATACCCCATACACAGTGGGATGATCCCCAGATGCGTCGCGACCAATCA
GGTAGGCATCTTTTTCGCTCGTCGTACAGTGCAAAGGCAAACATGCCCTGCAAGTCGTCGAGAAATTCCGGCCCTTTCTT
CCTGATACAGCGCGAGGATCACTTCACAGTCAGACCCGGTCTGGAACTGGTAACGAATCGCCATATTCGGCGCGCGAATG
CCTGGTGGTTGTAGATTTCACCGTTTACCTGCCAGTACGTGGGTTTTTTGTTAGGTTGTATGAGAGGTTGCGCCCCCGCG
TTAACGTCAACAATTGACAACCGTTCGTGGGCGAGAATGGCGTTATCGCTGGCATAAATACCGTGACCAGTCCGGGCCAC
GATGACGCATGCAGGCGTGACAGCTCGAGGGGCTTTCTTACGCAGCTAAGTGCGTCTGTTTTGATATCAGAATAGCGCCA
AAAATTGAACACATAACCTTCTCCGTTAACCTGGTATTTGTTGCTTGTTGTGTTTGCTTGTTTAAAAAAATGCCGCAAAG
CAGCACTGTGCGCAGTCCGATTTGGATGGGTGAAAAAATAAAGAAAAAGTAATTGGATAGACTCTTGTGGATTTGGTGCA
TAAAAAGGTCTGGTGTGAGGATATATTTATTGATTGAATCGATAATTTTTAGCGGGTTTTATTGAATGTTATATTTTACT
TGGGGGCCAAATTTGCTGACAAAGTGCGAGTTTGTTCATGCCGGAATGCGGCGTGAACGCCTTATCCGGCCACAAAAGGC
ATGAAAATTCAATATATTAGCAGGAGCTGCGTAGGCCGTGATAAGCGAGCGCCATCAGGCAGTTTGGCGTTTAGTCATCA
GAGCCAACCACGTCCGCAGACGTGGTTGCTATTCGAAACGTCGATTTCAGCGACTGACCGGGTAAATCCAGCTGGGGCGA
AAAGGCATACCTGTCGATATCGTCGAGCGACGAAACACCAGAATGCACCAGAATCGTCTCCAGACCTGCCTGGAAGCCGG
CCAGAATACGGTACGCAGGTTATCGCCGACAATCACCGTTTCTATCCGAATGCGCCTGCATTATGGTTTAATGCTGCGCG
GATGATCCACGGGCTGGGCTTACAAACATAAGAACGGTTCTGCGCCCGGAAGATTTTCTCAATCCCTGCACAACAACGCG
CGCACAAGCGGGATAAAAACCGCGCGCCGTGGGTGATCCGGATTGGTGGCGATAAAACGTGCACCGTTAGCGACGAAATA
GGCTGCTTTATGCAATCATGTCCCAGTTGTAGGAACCGCGTTTCGCCCAACAATCACGAAAATCAAGGGTTCACATCGGT
AATAGTGAAACCGAGCTTTGTACAGTTCATGAATCAGTGCGCCTTCGCCCACCACATACGCTTTTCTTGCCTTCCTGGCG
ACTGGAGGAATCGTGCAGTCGCCATCGGCAGAGGTATAAAACACGCTGTGCAGGTACATCGACACCTGCGGTGGCAAAGC
GGTTCGCCACGATCTTGCCCAGTCTGCGAAGGATAGTTGGTCTAGCGAACAGCAGCGGCAGGCCTTTATCCATAATCCC

Best,
German

Owner
lh3 commented Aug 2, 2017

It is not recommended to generate CIGAR for read overlapping because 1) generating CIGAR for every overlap is very slow; 2) it is usually not necessary to have CIGAR for read overlapping. SAM is also the wrong format for read overlapping. No read overlappers output SAM.

On your example, minimap/minimap2 ignore anchors with the same position if the read name is the same, so you shouldn't see a perfect alignment. However, with CIGAR on, alignment extension may still produce a nearly perfect alignment from different seeds. This is a problem with minimap2, which I will try to fix at some point. Thanks.
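
To illustrate the behaviour described above, here is a toy sketch of such a filter. It is not minimap2's actual code: the `Anchor` type, its fields, and `drop_self_anchors` are hypothetical names chosen for this example. It only shows why dropping on-diagonal self-anchors does not remove every seed between a read and itself.

```python
from typing import NamedTuple


class Anchor(NamedTuple):
    """A hypothetical seed match between a query and a target position."""
    query_name: str
    query_pos: int
    target_name: str
    target_pos: int


def drop_self_anchors(anchors):
    """Drop anchors that match a read to the same position in itself."""
    return [
        a for a in anchors
        if not (a.query_name == a.target_name and a.query_pos == a.target_pos)
    ]


read = "L0/46/0_12879"
anchors = [
    Anchor(read, 100, read, 100),    # trivial self-match: dropped
    Anchor(read, 9000, read, 9116),  # off-diagonal seed: kept
    Anchor(read, 100, "other", 100), # different read, same position: kept
]

for a in drop_self_anchors(anchors):
    print(a.query_pos, a.target_name, a.target_pos)
```

Seeds that lie off the main diagonal survive the filter, and base-level extension from such seeds can then be stretched across nearly the whole read, which is consistent with the near-perfect alignments reported in this issue.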

@lh3 lh3 closed this as completed in 12a5a5f Jan 31, 2018
@lh3 lh3 reopened this Jan 31, 2018