Skip to content

Remove hardcoded limit on max homopolymer call#8088

Merged
ilyasoifer merged 9 commits intobroadinstitute:masterfrom
Ultimagen:fix_flow_matrix_mods_hc
Dec 7, 2022
Merged

Remove hardcoded limit on max homopolymer call#8088
ilyasoifer merged 9 commits intobroadinstitute:masterfrom
Ultimagen:fix_flow_matrix_mods_hc

Conversation

@ilyasoifer
Copy link
Collaborator

No description provided.

Copy link
Contributor

@meganshand meganshand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Do we have test cases that include this max class value in the header? If not it would be good to add at least one test file with the new header.

@ilyasoifer ilyasoifer force-pushed the fix_flow_matrix_mods_hc branch from eb84f1d to 232a84f Compare November 21, 2022 05:58
@ilyasoifer ilyasoifer force-pushed the fix_flow_matrix_mods_hc branch from acd85de to 097ee71 Compare December 1, 2022 20:18
@ilyasoifer ilyasoifer requested a review from meganshand December 3, 2022 15:53
@ilyasoifer
Copy link
Collaborator Author

@meganshand can you take a look again please?

  1. Added the test for parsing mc tag in the header correctly
  2. Generated a mechanism that automatically determines the correct value of flow-collapse-hmer-length parameter
  3. (unrelated) fixed a small bug in the assembly complexity annotation that created inconsistent results between SDK11 and SDK8

Copy link
Contributor

@meganshand meganshand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor questions about expected test output and default argument values.

FILTER_ALLELES, "true",
FILTER_ALLELES_SOR_THRESHOLD, "3",
FLOW_ASSEMBLY_COLLAPSE_HMER_SIZE_LONG_NAME, "12",
FLOW_ASSEMBLY_COLLAPSE_HMER_SIZE_LONG_NAME, String.valueOf(AssemblyBasedCallerUtils.DETERMINE_COLLAPSE_THRESHOLD),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change the default for STANDARD/ADVANCED flow mode for older samples that don't have the mc tag?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not, if there is no mc - then it will be 12

chr9 81153767 . A <NON_REF> . . END=81153783 GT:DP:GQ:MIN_DP:PL 0/0:25:72:24:0,72,802
chr9 81153784 . A <NON_REF> . . END=81153785 GT:DP:GQ:MIN_DP:PL 0/0:28:22:27:0,22,996
chr9 81153786 . G <NON_REF> . . END=81153800 GT:DP:GQ:MIN_DP:PL 0/0:28:81:28:0,81,1175
chr9 81153801 . C CA,<NON_REF> 0 . ASSEMBLED_HAPS=10;AS_RAW_BaseQRankSum=|0.0,1|NaN;AS_RAW_MQ=90000.00|10800.00|0.00;AS_RAW_MQRankSum=|0.0,1|NaN;AS_RAW_ReadPosRankSum=|0.9,1|NaN;AS_SB_TABLE=8,17|3,0|0,0;BaseQRankSum=0.000;DP=29;ExcessHet=0.0000;FILTERED_HAPS=6;HAPCOMP=4,0;HAPDOM=0.500,0.00;HEC=55,3,2,1,0,0;MLEAC=0,0;MLEAF=0.00,0.00;MQRankSum=0.000;RAW_MQandDP=104400,29;ReadPosRankSum=0.929;SUSP_NOISY_ADJACENT_TP_VARIANT GT:AD:DP:GQ:PL:SB 0/0:25,3,0:28:18:0,18,58,75,692,115:8,17,3,0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change in HAPCOMP annotations is expected?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I fixed a small bug in assemblycomplexity annotation that caused a different result between two different java versions, it subtly changed the output, but it should be fine

? new LongHomopolymerHaplotypeCollapsingEngine(argumentCollection.flowAssemblyCollapseHKerSize, argumentCollection.flowAssemblyCollapsePartialMode, fullReferenceWithPadding,
int collapseHmerSize = argumentCollection.flowAssemblyCollapseHKerSize;
if (collapseHmerSize == DETERMINE_COLLAPSE_THRESHOLD){
collapseHmerSize = AssemblyBasedCallerUtils.determineFlowAssemblyColapseHmer(header);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
collapseHmerSize = AssemblyBasedCallerUtils.determineFlowAssemblyColapseHmer(header);
collapseHmerSize = AssemblyBasedCallerUtils.determineFlowAssemblyCollapseHmer(header);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

}
}

private static int determineFlowAssemblyColapseHmer(SAMFileHeader readsHeader) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static int determineFlowAssemblyColapseHmer(SAMFileHeader readsHeader) {
private static int determineFlowAssemblyCollapseHmer(SAMFileHeader readsHeader) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@ilyasoifer ilyasoifer merged commit 84ade40 into broadinstitute:master Dec 7, 2022
@ilyasoifer ilyasoifer deleted the fix_flow_matrix_mods_hc branch December 7, 2022 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants