Make sure Sally and Matt get the same results before finalizing pipeline (evolutionary-rates-analysis-pipeline, issue closed)

m-orton commented on July 17, 2024

Make sure Sally and Matt get the same results before finalizing pipeline

from evolutionary-rates-analysis-pipeline.

Comments (9)

m-orton commented on July 17, 2024

Hi Sally,

I don't think there is a random element in the code, as far as I am aware. I ran through Annelida when I was testing it and got the same result on multiple run-throughs.
But I agree that we should rerun the analysis after I finish coding the added components. I will resend the workspace so we can make sure we start from the same point.

Best Regards,
Matt

sadamowi commented on July 17, 2024

OK. Thanks for your thoughts on that. Let's both do another run-through, starting from the same data and using the revised code, once it is ready.

sadamowi commented on July 17, 2024

Hi Matt,

I finished running the revised Annelida code today (Dec. 6th). It ran very well, and I was happy with the alignments. Thank you!

I got p-values similar to yours, but not exactly the same. (Last time, our results were much more divergent.)

I'm not sure why the results would be slightly different. I will need to look into that in more detail before closing this issue.

Would you please confirm: when you independently ran the analysis twice on Annelida (using the same workspace you sent me), did you get exactly the same results each time? Thank you.

Cheers,
Sally

sadamowi commented on July 17, 2024

Hi Matt,

I am still puzzled by this issue. Our alignments appear identical. I compared the trimmed final Polychaeta alignments and they are exactly the same length (number of nucleotides), they contain the same number of sequences, and they have gaps in the same places. So, this is a good sign. I also checked this for the Clitellata.

I did find that my results contain 3 more pairs than yours (dfrelativedist). I have 70 relative distances, and you have 67. I have 2 extra for Clitellata and 1 extra for Polychaeta compared to you. I have no idea why this would be the case, given we are running the same code, and that the alignments are the same. My resulting p-values are very similar to yours but just slightly different, as would be expected if there is a different total number of "trials".

The pseudoreplicate results look the same. There is the same number of values for each of us (6), and the relative distances line up perfectly for those.

I did a comparison of our relative distance results. All of the other distances also line up perfectly except that I have 3 extras. In themselves, they are "normal" values, i.e. within the range of the other values.
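
For reference, a comparison like the one described above could be sketched in R roughly as follows. This is only an illustration: the data frames `sally` and `matt`, and the columns `pair_id` and `rel_dist`, are hypothetical stand-ins, since the actual structure of dfrelativedist is not shown in this thread.

```r
# Hypothetical stand-ins for the two result tables; the real columns of
# dfrelativedist may differ.
sally <- data.frame(pair_id  = c("p1", "p2", "p3", "p4"),
                    rel_dist = c(1.25, 0.80, 2.10, 1.05),
                    stringsAsFactors = FALSE)
matt  <- data.frame(pair_id  = c("p1", "p2", "p4"),
                    rel_dist = c(1.25, 0.80, 1.05),
                    stringsAsFactors = FALSE)

# Pairs that appear in one table but not in the other
extra_in_sally <- setdiff(sally$pair_id, matt$pair_id)
extra_in_matt  <- setdiff(matt$pair_id, sally$pair_id)

# For the shared pairs, check that the distances agree within a small tolerance
shared <- merge(sally, matt, by = "pair_id", suffixes = c("_sally", "_matt"))
shared$agree <- abs(shared$rel_dist_sally - shared$rel_dist_matt) < 1e-9

print(extra_in_sally)
print(extra_in_matt)
print(shared)
```

Listing the extra pair identifiers directly, rather than only counting rows, makes it easier to trace which step of the pipeline admitted or dropped them.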

One idea I had is that there could be a slight difference in R. What version are you running? I suggest that we should double check that we are running the same version.

Another idea that I had is that we should be careful about rounding. We want to avoid rounding errors. (I haven't found information about this yet, but it could be that different R versions have slightly different rounding approaches. There are different approaches to rounding off, and it is possible that different versions used different defaults.) Therefore, I'd like to suggest that we use more digits for interim calculations. For example, I suggest using 6 digits past the decimal rather than 3 or 4 (e.g. lines 1341, 1345, 1591).

If you agree, then I suggest searching for the term "round" and changing the number of digits to 6. Of course, more substantial rounding isn't a problem for plotting purposes at the end. Even if this doesn't solve our issue of the slightly different results, I suggest this would be a good idea anyhow, given the nature of our data. I think we want to avoid rounding issues at interim steps.
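
To illustrate why rounding at interim steps could matter, here is a small sketch in R. The two distance values are made up for illustration and do not come from the Annelida data; the point is only that rounding to 3 digits can collapse two different values into a tie, while 6 digits keeps them distinct.

```r
# Made-up distances for two lineages (illustrative values only)
d1 <- 0.0104
d2 <- 0.0096

# Ratio computed after rounding the interim values to 3 digits
ratio_3 <- round(d1, 3) / round(d2, 3)

# Ratio computed after rounding the interim values to 6 digits
ratio_6 <- round(d1, 6) / round(d2, 6)

# With 3 digits both distances become 0.01, so the pair looks like a tie;
# with 6 digits the difference between the lineages is preserved.
print(ratio_3)
print(ratio_6)
```

If a later step filters or signs pairs based on which lineage has the larger distance, an artificial tie of this kind could plausibly change how a pair is treated, so rounding only at the final reporting or plotting stage seems the safer choice.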

I will continue to try to figure out why I am getting 3 extra relative distances. I will also see if that result is replicated if I run the script again.

After that, I'd like to suggest that we fix up the very minor issues, where possible, and then move on to another phylum. Having another case may help us to figure this out.

Cheers,
Sally

m-orton commented on July 17, 2024

Hi Sally, it is good to hear that the alignments went well for you. I'm not sure why you are getting more pairing results than I am, but different R versions seem a likely explanation.

I'm running:
R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree"
RStudio: Version 0.99.491 – © 2009-2015 RStudio, Inc.

With regard to rounding, I can change the code to round to more digits, as you suggested.

Best regards,
Matt

m-orton commented on July 17, 2024

The script has now been edited in the Annelida branch to round to 6 digits at the lines you mentioned (1341, 1345, 1591).

Best Regards,
Matt

sadamowi commented on July 17, 2024

Hi Matt,

Thank you for changing the number of digits for rounding. I think that's a good idea overall even without the issue of the slight discrepancy.

In terms of software versions, I am running:

R version 3.3.2 (Sincere Pumpkin Patch) - I love these names!
RStudio: Version 1.0.44

I clicked "check for updates" in RStudio and was told I am running the current version of RStudio.

I checked online, and there is a newer version of R: R version 3.2.5 (Very, Very Secure Dishes)

What do you think about us updating to these newest versions just to be sure we are avoiding any versioning-related differences between us?

Cheers,
Sally

m-orton commented on July 17, 2024

Hi Sally,

I updated R and RStudio to:
R version 3.3.2 (Sincere Pumpkin Patch)
RStudio: Version 1.0.44

I then ran through the script again and I also managed to get 3 additional pairings!

I have just sent you an email with the updated Annelida results in an Excel file. Could you confirm that the p-values are the same as yours?

I also agree it would be good for us to both update to the most recent R version - Very, Very Secure Dishes. It sounds more secure haha.

Thanks,
Matt

sadamowi commented on July 17, 2024

Hi Matt,

Great news! Our results now match up. Thanks for updating your software and rerunning the script for comparison.

So, it does appear that using the same version is important. (However, I'm glad the results turned out very similar between versions. One would want the results to be stable through time. I will be sure to add the R version to the manuscript; that is standard practice. I also suggest adding a note about the version used for testing near the top of the comments in the code itself.)
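
A minimal sketch of what such a note near the top of the script could look like, assuming R 3.3.2 is recorded as the tested version. The helper `check_r_version` and the variable `tested_r_version` are hypothetical names, not part of the existing pipeline.

```r
# Tested with: R version 3.3.2 (Sincere Pumpkin Patch), RStudio 1.0.44
tested_r_version <- "3.3.2"  # hypothetical variable name

# Warn (without stopping) if the running R version differs from the tested one.
check_r_version <- function(expected) {
  same <- getRversion() == expected
  if (!same) {
    warning("This script was tested with R ", expected,
            " but is running under R ", as.character(getRversion()),
            "; results may differ slightly.")
  }
  invisible(same)
}

check_r_version(tested_r_version)

# Print the full version string so it ends up in any saved log or output
print(R.version.string)
```

Calling `sessionInfo()` at the end of a run and saving its output alongside the results would additionally record package versions, which may be worth comparing if a discrepancy like this one shows up again.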

If you'd like to update again, I'll update to Very, VERY secure dishes too.

Cheers,
Sally
