IVDB Sequence Polymorphism Help

IV sequence polymorphism analysis

About the sequence polymorphism analysis

Because multiple sequence alignment is time-consuming, we pre-computed alignments of the genes and proteins of the different influenza A virus strains to support users' in-depth research. We grouped the sequences by host, subtype and segment and performed multiple alignments between groups. The aligned sequence groups have been manually corrected to remove redundant sequences. Polymorphisms are presented graphically as an SNP distribution plot and a minor allele distribution plot, and as tabular statistics for each position relative to the consensus sequence. Users can search sequence polymorphisms by host, subtype, and segment, and also have instant access to pre-made alignments, phylogenetic trees, and geographical distributions on a world map.
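
The grouping step can be pictured with a short Python sketch. This is only an illustration, not the IVDB pipeline: the record fields (id, host, subtype, segment) and the sample values are hypothetical.

```python
from collections import defaultdict

def group_sequences(records):
    """Group sequence IDs by their (host, subtype, segment) key."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["host"], rec["subtype"], rec["segment"])
        groups[key].append(rec["id"])
    return dict(groups)

# Hypothetical records; real entries would come from the database.
records = [
    {"id": "seq1", "host": "Avian", "subtype": "H5N1", "segment": 4},
    {"id": "seq2", "host": "Avian", "subtype": "H5N1", "segment": 4},
    {"id": "seq3", "host": "Human", "subtype": "H3N2", "segment": 4},
]

for key, ids in group_sequences(records).items():
    print(key, ids)
```

Each group would then be aligned separately, so a search by host, subtype and segment maps directly onto one pre-made alignment.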

For Nucleotide Sequences

There are two ways to illustrate the sequence polymorphisms. One is the SNP distribution plot together with the minor allele distribution plot for each polymorphic site. The other is the consensus sequence with a list of A, T, C, G statistics for each position.
An SNP is defined as a position at which the nucleotide differs between sequences. The major allele at SNP position i is the nucleotide that occurs most often at that position; the other nucleotides at the position are minor alleles. For example, if there are 260 As, 5 Ts and 1 C at position i, the major allele is A and the minor alleles are T and C.
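
The definitions above can be sketched in Python as follows. This is a minimal illustration, not IVDB's code: the function names are hypothetical, and ignoring gaps and ambiguity codes is an assumption that the text does not state.

```python
from collections import Counter

BASES = "ATCG"

def count_bases(column):
    """Count A, T, C, G in one alignment column.

    Assumption: gaps and ambiguity codes are ignored.
    """
    return Counter(b for b in column.upper() if b in BASES)

def is_snp(column):
    """A position is an SNP if more than one nucleotide occurs there."""
    return len(count_bases(column)) > 1

def classify_alleles(column):
    """Return (major, minor) allele lists for one alignment column."""
    counts = count_bases(column)
    if not counts:
        return [], []
    top = max(counts.values())
    major = sorted(b for b, n in counts.items() if n == top)
    minor = sorted(b for b, n in counts.items() if n < top)
    return major, minor

# The example from the text: 260 As, 5 Ts and 1 C at position i.
column = "A" * 260 + "T" * 5 + "C"
print(is_snp(column))            # True
print(classify_alleles(column))  # (['A'], ['C', 'T'])
```

The major allele is returned as a list so that ties in the top count remain visible to the caller.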

The SNP distribution plot:

A part of the minor allele distribution plot:

In the second method, the consensus sequence is defined as the sequence of major alleles. If two alleles tie as the major allele at a position, the consensus nucleotide is defined as N. A picture of the A, T, C, G distribution at every position is also shown.
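
A minimal Python sketch of this consensus rule is given below. It assumes equal-length aligned sequences, ignores gaps and ambiguity codes, and emits N for a column with no A, T, C or G; these choices and the function names are assumptions for illustration, not IVDB's implementation.

```python
from collections import Counter

BASES = "ATCG"

def position_counts(aligned):
    """Return one Counter of A, T, C, G per alignment column."""
    return [Counter(b for b in col if b in BASES) for col in zip(*aligned)]

def consensus(aligned):
    """Major-allele consensus; 'N' where the top count is tied."""
    result = []
    for counts in position_counts(aligned):
        ranked = counts.most_common()
        if not ranked:
            result.append("N")  # assumption: no callable base here
        elif len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result.append("N")  # tie between major alleles
        else:
            result.append(ranked[0][0])
    return "".join(result)

print(consensus(["ATGC", "ATGA", "ACGA"]))  # ATGA
print(consensus(["AT", "AC"]))              # AN
```

The per-column counters returned by position_counts correspond to the A, T, C, G statistics listed for each position.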

For Protein Sequences

The distribution of each amino acid at each polymorphic site is shown in a table.
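
As an illustration, such a table could be computed from aligned protein sequences as sketched below. The function name and the toy sequences are hypothetical, and skipping gap characters is an assumption.

```python
from collections import Counter

def polymorphic_site_table(aligned_proteins):
    """Map 1-based position -> residue counts, for polymorphic sites only."""
    table = {}
    for pos, column in enumerate(zip(*aligned_proteins), start=1):
        counts = Counter(r for r in column if r != "-")  # skip gaps
        if len(counts) > 1:
            table[pos] = counts
    return table

# Toy aligned sequences, for illustration only.
proteins = ["MKAIL", "MKTIL", "MKAIV"]
for pos, counts in polymorphic_site_table(proteins).items():
    row = ", ".join(f"{aa}={n}" for aa, n in sorted(counts.items()))
    print(pos, row)
```

Positions where every sequence carries the same residue are omitted, so the table lists only the polymorphic sites.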

Further Analysis

For sequences you have chosen for polymorphism analysis, we offer tools for further research.

  • Download the alignment result file.
  • Get the corresponding protein polymorphisms from nucleotide sequences, and vice versa.
  • Build a phylogenetic tree for these sequences.
  • Draw the data distribution map using our IV Sequence Distribution Tool.

IVDB is supported by institutional grant from Beijing Institute of Genomics, CAS