REDHORSE Package
How to run it?
The listAlleles utility of the REDHORSE package takes an allele file and summarizes it in terms of nucleotide frequencies. Two filters can be applied at this stage: a minimum frequency (-m) and a minimum read depth (-n). Type java -jar REDHORSE.jar listAlleles -h for options. The parameters are described below, followed by an example command:
- The -m parameter is the minimum frequency required to call a base. A frequency threshold of 0.4 means that a nucleotide at any given position with a frequency greater than or equal to 40% is called. If more than one nucleotide reaches the 40% threshold at a given genomic position, which is typical of heterozygous sites in aneuploid genomes, then both are listed at this stage. Depending on the threshold specified, more than two nucleotides may be listed at a given position.
- The -n parameter is the minimum read depth (coverage). A nucleotide is called at a position only if the read depth at that position is greater than or equal to the number specified. If the coverage is less than the number specified, a "-" is reported.
- Finally, the -o parameter is the location where the output file is written.
java -jar REDHORSE.jar listAlleles -i "C:\AsisKhan\softwareManuscript\data\AlleleFiles\VAND.allele" -o "C:\AsisKhan\softwareManuscript\data\ListFiles\VAND.list" -n 5 -m 0.8
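The two filters can be illustrated with a minimal Python sketch. This is not the REDHORSE implementation; the function name call_bases, the per-base read-count dictionary, and the behavior when no base reaches the threshold are assumptions made only for illustration.

```python
def call_bases(counts, min_freq, min_depth):
    """Illustrative sketch of the -m / -n filters (not REDHORSE's own code).

    counts    -- hypothetical dict mapping each base to its read count
    min_freq  -- plays the role of -m (minimum frequency to call a base)
    min_depth -- plays the role of -n (minimum read depth)
    """
    depth = sum(counts.values())
    # Coverage filter: below the minimum depth, report "-" instead of a base.
    if depth < min_depth:
        return ["-"]
    # Frequency filter: keep every base whose frequency is >= the threshold.
    # Assumption: an empty list is returned if no base qualifies.
    return [base for base, n in counts.items() if n / depth >= min_freq]


# A site covered by 19 reads: 1 read supports A, 18 reads support C.
site = {"A": 1, "T": 0, "C": 18, "G": 0}
print(call_bases(site, min_freq=0.15, min_depth=10))  # only C passes
print(call_bases(site, min_freq=0.05, min_depth=10))  # A and C both pass
print(call_bases(site, min_freq=0.15, min_depth=20))  # depth too low
```

With a threshold of 0.4, a site with two bases near 50% each would return both bases, which corresponds to the heterozygous case described above.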
Output
The output of the program is as follows:
TGME49_chrVIII
21 A 13.0 1.0 0.0 0.0 0.0 4 9 129.69
22 C 15.0 0.0 0.0 1.0 0.0 5 10 132.4
23 C 15.0 0.0 0.0 1.0 0.0 5 10 132.4
24 C 15.0 0.0 0.0 1.0 0.0 5 10 132.4
25 T 16.0 0.0 1.0 0.0 0.0 6 10 133.5
26 A 16.0 1.0 0.0 0.0 0.0 6 10 133.5
27 A 16.0 1.0 0.0 0.0 0.0 6 10 133.5
28 C 16.0 0.0 0.0 1.0 0.0 6 10 133.5
29 C 16.0 0.0 0.0 1.0 0.0 6 10 133.5
30 C 16.0 0.0 0.0 1.0 0.0 6 10 133.5
31 T 16.0 0.0 1.0 0.0 0.0 6 10 133.5
32 A 17.0 1.0 0.0 0.0 0.0 7 10 134.47
33 A 18.0 1.0 0.0 0.0 0.0 7 11 135.3
34 C 19.0 0.052 0.0 0.94 0.0 8 11 136.10
35 C 19.0 0.0 0.0 1.0 0.0 8 11 136.1
36 C 19.0 0.0 0.0 1.0 0.0 8 11 136.1
........
The list file consists of the chromosome name followed by one line of information for each genomic position. To generate this list file, a minimum read depth of 10 and a minimum frequency of 15% were specified; note that these values differ from the -n 5 -m 0.8 used in the example command above. For example, the line "34 C 19.0 0.05263157894736842 0.0 0.9473684210526315 0.0 8 11 136.10526315789474" in the list file means the following:
- 34: Genomic position.
- C: Nucleotide called. C is called as it occurs with 94.7% frequency. A is not called as its frequency (5.3%) is below the 15% threshold that was specified when running the utility.
- 19.0: Read depth at that position.
- 0.052 0.0 0.94 0.0: Normalized frequency of each base.
- 8 11: Forward and reverse reads contributing to the alleles listed.
- 136.1: Average mapping quality.
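The field layout described above can be sketched as a small Python parser. The names ListRow and parse_list_row are hypothetical and not part of REDHORSE. The sketch assumes whitespace-separated fields, a single called nucleotide per line, and four frequency columns; the sample rows suggest the first three columns correspond to A, T and C, and the fourth is presumed to be G.

```python
from typing import NamedTuple, Tuple


class ListRow(NamedTuple):
    position: int             # genomic position
    called: str               # nucleotide called ("-" if coverage is too low)
    depth: float              # read depth at the position
    freqs: Tuple[float, ...]  # normalized frequency of each base (4 columns)
    forward: int              # forward reads contributing to the alleles
    reverse: int              # reverse reads contributing to the alleles
    mapq: float               # average mapping quality


def parse_list_row(line: str) -> ListRow:
    """Split one position line of a list file into named fields (sketch)."""
    f = line.split()
    return ListRow(
        position=int(f[0]),
        called=f[1],
        depth=float(f[2]),
        freqs=tuple(float(x) for x in f[3:7]),
        forward=int(f[7]),
        reverse=int(f[8]),
        mapq=float(f[9]),
    )


row = parse_list_row(
    "34 C 19.0 0.05263157894736842 0.0 0.9473684210526315 0.0 "
    "8 11 136.10526315789474"
)
print(row.position, row.called, row.depth)
# In this example the forward and reverse reads add up to the read depth.
print(row.forward + row.reverse == row.depth)
```

In this row, 8 forward plus 11 reverse reads give the depth of 19, and the two non-zero frequencies correspond to 1/19 and 18/19.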