A user recently contacted me about not being able to open a particular file published as a supplement. The paper I'm referring to is http://www.pnas.org/content/109/32/13052.long and the data file is ftp://ftp.broadinstitute.org/pub/malaria/pnas-park-2012-suppfile-2.vcf.gz
It looks like the consensus call VCF file has some malformed lines. I've attached a single line showing the problem (I'm not sure how many others there are in the file). A data field looks like:
0/0:39.03:24:<0,.,.>:.
The data values in brackets are supposed to be a comma-separated list of numbers. A single "." would also be fine, because it would be treated as a missing value. However, providing 0 followed by "." causes an error when that period is parsed as an integer.
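To make the failure concrete, here is a minimal Python sketch of the strict parsing behaviour described above, plus a check that could be used to count how many lines in a file are affected. The function names (parse_int_list, has_mixed_missing, count_suspect_lines) are hypothetical and are not taken from vcftools, GATK, or any other parser. It assumes the angle brackets in the example only mark the problem value and are not part of the data, and that sample columns start at the tenth tab-separated column, as in a standard VCF.

```python
def parse_int_list(value):
    """Strict parse: a lone "." means missing; otherwise every item must be an integer."""
    if value == ".":
        return None
    # int(".") raises ValueError, which is the error described in the post.
    return [int(item) for item in value.split(",")]


def has_mixed_missing(value):
    """Return True if a comma-separated value mixes "." with non-missing entries."""
    items = value.split(",")
    return "." in items and any(item != "." for item in items)


def count_suspect_lines(lines):
    """Count data lines where any sample subfield mixes "." with other values."""
    count = 0
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        columns = line.rstrip("\n").split("\t")
        samples = columns[9:]  # assumption: 8 fixed columns + FORMAT, then samples
        if any(has_mixed_missing(sub) for s in samples for sub in s.split(":")):
            count += 1
    return count


if __name__ == "__main__":
    sample = "0/0:39.03:24:0,.,.:."  # the field above, without the angle brackets
    for sub in sample.split(":"):
        if has_mixed_missing(sub):
            print("suspect subfield:", sub)
```

To scan the published file, the lines could be read with gzip.open(path, "rt") and passed to count_suspect_lines. Whether a mixed value like this is actually invalid depends on the VCF version and on the parser, so this only flags candidates for inspection.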
I contacted the author, and this was his response:
"It appears that vcftools is producing those errors when merging together separate vcf files. I've regenerated a new one using GATK CombineVariants. At least for that one example row you showed, it looks corrected now."
I don't use vcftools myself so I haven't attempted to reproduce it. The bad line is attached.
Perhaps this has something to do with vcf-merge? Without the original entries that went into the merge, it might be difficult to replicate and track down.