
Interpretation of results and integration with SNVs #31

Closed
karini925 opened this issue Jan 15, 2021 · 3 comments

@karini925

Hi again, I successfully ran your tool on WGS data from a patient with two tumour samples from two distinct spatial regions. My full data set contains 20 samples for this patient, but for now I am just testing the tool on two samples. I have been reading your manuscript in more detail and am interested in the integration with VAFs from SNVs. I would like to try to merge the results from HATCHet with my mutations so that I can run them through PyClone, for example.

I was wondering if you can please suggest the correct way to do this. In the results file 'best.seg.ucn', I see for each segment in the genome, the major and minor allele copy number status for each clone along with the abundance of each clone in a given sample. I am just having a bit of a hard time wrapping my head around how to correctly match a given SNV in my data with the appropriate clone results.

Thank you in advance!

Karin

@simozacca
Contributor

Thank you for your interest in HATCHet and I would be happy to help with your questions related to SNV analysis.

Using the resulting segmentation, you can get the allele- and clone-specific copy numbers for an SNV POS by retrieving the copy numbers of a segment in the same chromosome with START and END such that START <= POS < END. Note that the results are reported in a data frame and the results for the same segment in different samples are reported in different rows. Also, for every clone, the results report the corresponding allele-specific copy numbers (separated by |) and the proportion of the related clone; note that a proportion of 0 indicates that the clone and the corresponding copy numbers are not present in that sample.
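The lookup described above can be sketched in a few lines of Python. This is only a minimal illustration, not part of HATCHet: the inline rows are made-up example data, and the `#CHR`, `START`, `END` and `SAMPLE` column names are assumptions about the header of `best.seg.ucn` that should be checked against your own file.

```python
import csv
import io

# Made-up rows for illustration; the #CHR/START/END/SAMPLE column names are assumed.
EXAMPLE_UCN = """#CHR\tSTART\tEND\tSAMPLE\tcn_normal\tu_normal\tcn_clone1\tu_clone1\tcn_clone2\tu_clone2
chr1\t1000\t5000\tA\t1|1\t0.1\t2|0\t0.3\t1|1\t0.6
chr1\t1000\t5000\tB\t1|1\t0.2\t2|0\t0.0\t1|1\t0.8
chr1\t5000\t9000\tA\t1|1\t0.1\t2|1\t0.3\t1|1\t0.6
"""


def load_segments(handle):
    """Read a tab-separated segment table into a list of dict rows."""
    return list(csv.DictReader(handle, delimiter="\t"))


def segments_for_snv(rows, chrom, pos):
    """Return one row per sample for the segment with START <= pos < END."""
    return [
        r for r in rows
        if r["#CHR"] == chrom and int(r["START"]) <= pos < int(r["END"])
    ]


rows = load_segments(io.StringIO(EXAMPLE_UCN))
for r in segments_for_snv(rows, "chr1", 4200):
    # u_clone1 is 0.0 in sample B: clone1 is not present there.
    print(r["SAMPLE"], r["cn_clone1"], r["u_clone1"])
```

For a real file, `load_segments(open("best.seg.ucn"))` would replace the inline string. A linear scan is fine for a handful of SNVs; for many SNVs an interval index per chromosome would be faster.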

Also, nearly all existing methods for computing CCFs and/or clustering SNVs do not take clone-specific copy numbers as input, but only the total proportion of cells with each allele-specific copy-number state. Thus, consider the following example for the copy numbers of a segment with a specific SNV:

| cn_normal | u_normal | cn_clone1 | u_clone1 | cn_clone2 | u_clone2 |
|-----------|----------|-----------|----------|-----------|----------|
| 1,1       | 0.1      | 2,0       | 0.3      | 1,1       | 0.6      |

Then you can conclude that the SNV is in a segment with copy numbers 1|1 for 70% of cells (including normal) and 2|0 in the remaining 30% of cells.
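As a rough sketch of that aggregation, the Python snippet below sums the clone proportions per allele-specific copy-number state for a single row. The function name `state_proportions` and the dict-based row are hypothetical choices for illustration; only the `cn_*` / `u_*` column naming comes from the example above.

```python
from collections import defaultdict


def state_proportions(row):
    """Sum clone proportions per allele-specific copy-number state."""
    totals = defaultdict(float)
    for key, value in row.items():
        if key.startswith("cn_"):
            clone = key[len("cn_"):]
            # Accept both "1,1" and "1|1" spellings of the same state.
            state = value.replace(",", "|")
            totals[state] += float(row["u_" + clone])
    # Drop states whose total proportion is 0 (not present in this sample).
    return {s: round(p, 6) for s, p in totals.items() if p > 0}


row = {
    "cn_normal": "1,1", "u_normal": "0.1",
    "cn_clone1": "2,0", "u_clone1": "0.3",
    "cn_clone2": "1,1", "u_clone2": "0.6",
}
print(state_proportions(row))
```

For the example row this yields a proportion of 0.7 for 1|1 and 0.3 for 2|0, which is the form of input that most CCF and clustering tools expect.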

Moreover, we will soon release a new method called DeCiFer, which clusters SNVs and infers the related CCFs, and it should be easy to integrate with HATCHet; unlike other methods, DeCiFer can properly deal with subclonal CNAs.

Please do let us know if you have any further questions.

@karini925
Author

Thanks for your response and looking forward to trying out DeCiFer!

I was looking into your script: https://github.com/raphael-group/hatchet-paper/blob/master/analysis/explainMutationsCCF.py
It worked for me and I was able to calculate CCF for each of my mutations and most were labelled as "Explained" which is promising. Ideally I would like to now cluster the mutations based on their CCF using something like SciClone to then build a clonal phylogenetic tree. To use SciClone though I would first need to convert my CCF values into adjusted VAF (CCF/2) and I am not sure if this is the most appropriate way to go about it. Was just looking to get your insight into this in case you have come across this type of question before? Otherwise feel free to close this issue. Thanks again!

@simozacca
Contributor

If you would simply like to cluster the SNVs based on the CCFs computed by the explained-mutation analysis, I would recommend using PyClone in the alternative mode, i.e. the mode in which the user directly provides CCFs and lets PyClone cluster them.

Unfortunately, I do not think that obtaining the adjusted VAF as CCF/2 is an appropriate approach in the presence of CNAs, since the adjustment would depend on the number of mutated copies (the number of copies of a segment carrying the SNV can differ from 1, for example when the SNV occurs on an allele that is subsequently amplified).

I am closing the issue for now, but please feel free to let us know if you have any other questions or would like further details about the above.
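To make the point concrete, here is a small Python sketch using the standard relation that the expected VAF is the proportion-weighted number of mutated copies divided by the proportion-weighted total number of copies. It reuses the proportions from the earlier example; the assumption that the SNV is present only in clone1, on both copies of the amplified allele, is made up purely for illustration.

```python
def expected_vaf(populations):
    """populations: list of (proportion, total_copies, mutated_copies)."""
    mutated = sum(u * m for u, total, m in populations)
    copies = sum(u * total for u, total, m in populations)
    return mutated / copies


# (proportion, total copies of the segment, copies carrying the SNV)
normal = (0.1, 2, 0)  # 1|1, no SNV
clone1 = (0.3, 2, 2)  # 2|0, assumed: SNV on the allele that was amplified
clone2 = (0.6, 2, 0)  # 1|1, assumed: SNV absent

vaf = expected_vaf([normal, clone1, clone2])
# CCF = fraction of tumour cells carrying the SNV.
ccf = clone1[0] / (clone1[0] + clone2[0])

print("expected VAF:", round(vaf, 3))
print("CCF / 2:     ", round(ccf / 2, 3))
```

Under these assumptions the expected VAF is about 0.3, while CCF/2 is about 0.17, so a clustering tool given CCF/2 as an "adjusted VAF" would see a noticeably different value from the one the data actually support.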
