AsmVar

AsmVar is a software package for the discovery, genotyping and characterization of structural variants and novel sequences at single-nucleotide resolution from de novo assemblies on a population scale.

Contributors: Shujia Huang, Siyang Liu, Junhua Rao & Weijian Ye (contributed equally)
Contact : liusiyang@genomics.cn & huangshujia@genomics.cn
Institute : BGI & KU in the GenomeDK consortium
Latest Version : 2015-04-16

Dependency

Installation

$ cd src/AsmvarDetect
$ make

A short shell script, ~/demo/AsmVarDetector/Test.sh, can be run to verify that AsmVar is installed correctly. The script runs automatically and produces the output files t.age, t4.1025.vcf, t4.1025.svd, t4.1025.summary and t4.1025.gap.bed.
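The check on the test output can be sketched as a small shell function. This is a hypothetical helper, not part of AsmVar: the function name check_asmvar_outputs is an assumption, and so is the idea that all five files end up in a single directory, which you pass as the argument. It only checks that the files named above exist; it does not validate their contents.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with AsmVar): report whether the output
# files listed for Test.sh are present in a directory.
# Usage: check_asmvar_outputs DIR   (DIR defaults to the current directory)
check_asmvar_outputs() {
    dir="${1:-.}"
    status=0
    for f in t.age t4.1025.vcf t4.1025.svd t4.1025.summary t4.1025.gap.bed; do
        if [ -f "$dir/$f" ]; then
            echo "ok       $f"
        else
            echo "missing  $f"
            status=1
        fi
    done
    # Return 0 only if every expected file was found.
    return "$status"
}
```

After running Test.sh, calling the function on the directory where the script wrote its output (the location is an assumption you need to confirm for your setup) prints one line per file and returns a non-zero status if any file is missing.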

Running AsmVar workflow

The demo shell script ./demo/Cookbook/scripts/DemoPipelineGuideline.sh provides an example of applying AsmVar to detect, genotype and recalibrate variant calls.
Due to GitHub storage limitations, the full demo, including the output files, is available at this link: https://www.dropbox.com/sh/fapzzvvo1whizmn/AAAeBsTSRmTyOn8zVJc24fgca?dl=0
We are currently improving the characterization modules and will therefore keep the code for those modules closed for a while, until 1 July 2015.

The AsmVar workflow consists of 5 main steps.

Immediate plans for further development

  1. Reopen the code for the characterization modules.
  2. Improve the user experience of AsmVar based on feedback, including the possibility of using BAM instead of MAF as input files, and the overall level of user-friendliness.
  3. Streamline the evaluation of de novo assemblies with regard to continuity, completeness and accuracy.
  4. Explore novel statistical approaches, especially for the alternative alignment, genotyping and recalibration modules, to resolve multi-allelic structural variations and to improve time and memory efficiency.
  5. Apply AsmVar to haplotype-resolved de novo assemblies.

Contribution

AsmVar was initially developed for the DanishPanGenome project on the GenomeDK platform. The AsmVar framework construction, applications, statistical methods and evaluation protocols were established by Siyang Liu, Shujia Huang, Junhua Rao and Weijian Ye, led by Siyang Liu and Shujia Huang. The coding contributions to the different modules are recorded in the header of each source file. Most of the initial code was written in C++, Python and Perl by Shujia Huang and in Perl, Python and R by Siyang Liu (all modules, especially SV discovery and genotyping), Junhua Rao (especially SV mechanism) and Weijian Ye (especially ancestral state and novel sequence). Shujia Huang and Siyang Liu are responsible for the software architecture and the efficiency of the algorithms. The work is supervised by Anders Krogh and Jun Wang.

Please cite the paper: Discovery, genotyping and characterization of structural variants and novel sequence at single nucleotide resolution from de novo assemblies on a population scale (manuscript in submission)

Acknowledgement : Bioinformatics teams in the GenomeDK consortium

LICENSE

Released under the MIT License. Copyright © 2014-2015