BETA Input Files Format Description
BETA basic and BETA plus require both a binding file and an expression file; BETA minus requires only a binding file.
Required: Binding data, BED format with 3 columns (chrom, chromStart, chromEnd) or 5 columns (chrom, chromStart, chromEnd, name, score)
Example
chrom | start | end | name | score |
---|---|---|---|---|
chr1 | 1208689 | 1209509 | AR_LNCaP_2 | 51.58 |
chr1 | 1334246 | 1335348 | AR_LNCaP_7 | 54.55 |
chr1 | 2179351 | 2180790 | AR_LNCaP_9 | 257.72 |
*** Do not include a header line in the BED file, and make sure it is tab-delimited.
Required: Differential expression data
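The requirements above (3 or 5 columns, tab-delimited, no header) can be checked before running BETA. The following is a minimal sketch, not part of BETA itself; the function name `check_binding_bed` is hypothetical, and treating the score column as numeric is an assumption based on the example rows.

```python
def check_binding_bed(lines):
    """Return a list of (line_number, message) problems found in BED lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) not in (3, 5):
            problems.append(
                (number, f"expected 3 or 5 tab-separated columns, found {len(fields)}")
            )
            continue
        # A header row such as "chrom<TAB>start<TAB>end" fails this integer check.
        if not (fields[1].isdigit() and fields[2].isdigit()):
            problems.append((number, "start/end are not integers (header line?)"))
            continue
        if len(fields) == 5:
            try:
                float(fields[4])  # assumption: the score column is numeric
            except ValueError:
                problems.append((number, "score is not numeric"))
    return problems


example = [
    "chr1\t1208689\t1209509\tAR_LNCaP_2\t51.58",
    "chr1\t1334246\t1335348\tAR_LNCaP_7\t54.55",
    "chr1 2179351 2180790",  # space-separated on purpose, so it is reported
]
for number, message in check_binding_bed(example):
    print(f"line {number}: {message}")
```

For a file on disk, the same function can be called with an open file handle, since it only iterates over lines.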
 • LIMMA standard output (LIM)
    ID (optional), RefseqID, logFC, AveExpre, T, Pvalue, Adj.P.Value, B
Example
ID | RefseqID | logFC | AveExpre | T | Pvalue | Adj.P.Value | B |
---|---|---|---|---|---|---|---|
12196 | NM_001548_at | -6.945783684 | 9.632803007 | -138.2402671 | 6.92E-10 | 2.08E-05 | 11.83285762 |
15675 | NM_005409_at | -6.11280866 | 6.322508161 | -117.5664651 | 1.51E-09 | 2.08E-05 | 11.57790488 |
12213 | NM_001565_at | -6.352395593 | 7.838465214 | -113.6000902 | -113.6000902 | 2.08E-05 | 11.51589687 |
 • Cuffdiff standard output (CUF)
    Test_id, gene_id, gene, locus, sample1, sample2, status, value_1, value_2, log2(foldchange), test_stat, p_value, q_value, significant
Example
Test_id | gene_id | gene | locus | sample1 | sample2 | status | value_1 | value_2 | log2(foldchange) | test_stat | p_value | q_value | significant |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
NM_000014 | NM_000014 | - | chr12:9217772-9268558 | q1 | q2 | NOTEST | 0.102845 | 0.0820513 | -0.325878 | 0.498271 | 0.618293 | 1 | no |
NM_000015 | NM_000015 | - | chr8:18248754-18258723 | q1 | q2 | NOTEST | 0.127358 | 0.30975 | 1.28221 | -1.32328 | 0.185744 | 1 | no |
NM_000016 | NM_000016 | - | chr1:76190042-76229355 | q1 | q2 | NOTEST | 0 | 0 | 0 | 0 | 1 | 1 | no |
NM_000017 | NM_000017 | - | chr12:121163570-121177811 | q1 | q2 | NOTEST | 3.47702 | 3.62422 | 0.0598207 | -0.195815 | 0.844755 | 1 | no |
 • BETA specific format (BSF)
    GeneID, regulatory status (a signed value, + or -), statistical value (e.g. FDR or P value; the smaller the value, the more significant it is)
Example
GeneID | Regulatory status | Statistical value |
---|---|---|
NM_000014 | -0.325878 | 0.618293 |
NM_000015 | 1.28221 | 0.185744 |
NM_000016 | 0 | 1 |
NM_000017 | 0.0598207 | 0.844755 |
 • Other formats
    BETA also supports other differential expression formats, as long as they include a gene ID, a regulatory status, and a statistical value. Specify the corresponding columns via the --info parameter.
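The BSF example rows match the Test_id, log2(foldchange), and p_value columns of the Cuffdiff example, so a Cuffdiff table can be reduced to the three-column layout by picking those fields. The sketch below illustrates this under that assumption; `cuffdiff_to_bsf` is a hypothetical helper, not a BETA function, and choosing p_value rather than q_value as the statistical value is only inferred from the examples.

```python
def cuffdiff_to_bsf(lines):
    """Reduce tab-delimited Cuffdiff rows to 'GeneID<TAB>log2FC<TAB>p_value' rows."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and '#'-prefixed header lines
        fields = line.split("\t")
        if len(fields) != 14:
            raise ValueError(f"expected 14 Cuffdiff columns, found {len(fields)}")
        gene_id = fields[0]   # Test_id
        log2_fc = fields[9]   # log2(foldchange)
        p_value = fields[11]  # p_value (assumption: use q_value, index 12, if preferred)
        rows.append("\t".join((gene_id, log2_fc, p_value)))
    return rows


cuffdiff_rows = [
    "NM_000014\tNM_000014\t-\tchr12:9217772-9268558\tq1\tq2\tNOTEST"
    "\t0.102845\t0.0820513\t-0.325878\t0.498271\t0.618293\t1\tno",
    "NM_000015\tNM_000015\t-\tchr8:18248754-18258723\tq1\tq2\tNOTEST"
    "\t0.127358\t0.30975\t1.28221\t-1.32328\t0.185744\t1\tno",
]
for row in cuffdiff_to_bsf(cuffdiff_rows):
    print(row)
```

The values are kept as strings so that nothing is reformatted or rounded on the way through.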
*** The differential expression file should contain all the genes in the genome. BETA uses the full list to identify the static (unchanged) genes and separates the up-regulated genes from the down-regulated genes based on the threshold you supply.
*** Make sure your differential expression file either has no header line or has a ‘#’ at the start of the header line.
*** If your gene ID is the official gene symbol, add the parameter --gname2.
*** Although you can select the format type of your differential expression file, it is safer to specify the columns explicitly via --info so that BETA reads the correct information, unless your file has exactly the same format as one of the examples above.
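To make the up-regulated / down-regulated / static split concrete, here is a rough sketch that partitions BSF-style rows by the sign of the regulatory status and a cutoff on the statistical value. It is an illustration only and does not reproduce BETA's internal logic; `split_by_regulation` and `stat_cutoff` are hypothetical names, and the cutoff of 0.2 in the usage lines is arbitrary.

```python
def split_by_regulation(rows, stat_cutoff):
    """Partition (gene_id, change, stat) tuples into up, down, and static gene lists."""
    up, down, static = [], [], []
    for gene_id, change, stat in rows:
        significant = stat <= stat_cutoff  # smaller statistical value = more significant
        if significant and change > 0:
            up.append(gene_id)
        elif significant and change < 0:
            down.append(gene_id)
        else:
            static.append(gene_id)  # not significant, or no change at all
    return up, down, static


bsf_rows = [
    ("NM_000014", -0.325878, 0.618293),
    ("NM_000015", 1.28221, 0.185744),
    ("NM_000016", 0.0, 1.0),
    ("NM_000017", 0.0598207, 0.844755),
]
up, down, static = split_by_regulation(bsf_rows, stat_cutoff=0.2)
print("up:", up)
print("down:", down)
print("static:", static)
```

This also shows why the file should list every gene: genes that fail the cutoff are not discarded but form the static group.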
Optional: Boundary file (--bf): BED format (6 columns, as shown below)
Example
chrom | start | end | name | score | strand |
---|---|---|---|---|---|
chr1 | 521336 | 521779 | 3 | 0.986 | + |
chr1 | 839881 | 840447 | 19 | 0.986 | + |
chr1 | 968212 | 968748 | 48 | 1 | + |
*** Do not include a header line in the BED file, and make sure it is tab-delimited.
Optional: Genome annotation (-r): downloaded from UCSC
BETA provides hg38, hg19, hg18, mm10, and mm9 annotations. The annotation reference file should contain the following fields, in this order: refseqID, chroms, strand, txstart, txend, genesymbol.
Example
refseqID | chrom | strand | start | end | gname2 |
---|---|---|---|---|---|
NM_032291 | chr1 | + | 66999824 | 67210768 | SGIP1 |
NM_001301823 | chr1 | + | 33546729 | 33586132 | AZIN2 |
NM_032785 | chr1 | - | 48998526 | 50489626 | AGBL4 |
*** Do not include a header line in the annotation file, and make sure it is tab-delimited.
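As a sanity check on the field order, a custom annotation file can be parsed into named records before it is passed to -r. This is a minimal sketch under the six-field layout described above; `Transcript` and `parse_annotation` are hypothetical names, and the strand check assumes only + and - occur.

```python
from collections import namedtuple

# Field order follows the annotation description: refseqID, chrom, strand, start, end, symbol.
Transcript = namedtuple(
    "Transcript", "refseq_id chrom strand tx_start tx_end gene_symbol"
)


def parse_annotation(lines):
    """Parse tab-delimited annotation lines into Transcript records."""
    records = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(
                f"line {number}: expected 6 tab-separated fields, found {len(fields)}"
            )
        refseq_id, chrom, strand, start, end, symbol = fields
        if strand not in ("+", "-"):
            raise ValueError(f"line {number}: unexpected strand {strand!r}")
        # int() raises ValueError on a header row, which flags it early.
        records.append(Transcript(refseq_id, chrom, strand, int(start), int(end), symbol))
    return records


annotation = [
    "NM_032291\tchr1\t+\t66999824\t67210768\tSGIP1",
    "NM_032785\tchr1\t-\t48998526\t50489626\tAGBL4",
]
for record in parse_annotation(annotation):
    print(record.refseq_id, record.gene_symbol, record.strand)
```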
Optional: Whole genome sequence data (--gs): FASTA format
The file should look like this:
Example
>chr1: xxxx-yyyyy
ATCGGGACTTGACCC…
>chr2: xxxx-yyyyy
AGCGTGACTAGAGCC…
...
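For reference, a FASTA file of this shape can be read with a few lines of Python. The sketch below is a generic FASTA reader, not BETA's own loader; `read_fasta` is a hypothetical name, and the toy sequences are just the truncated fragments shown above, not real chromosome data.

```python
def read_fasta(lines):
    """Return a dict mapping each header (without '>') to its joined sequence."""
    sequences = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            sequences[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            sequences[header].append(line)  # sequences may span several lines
    return {name: "".join(parts) for name, parts in sequences.items()}


toy = [
    ">chr1",
    "ATCGGGAC",
    "TTGACCC",
    ">chr2",
    "AGCGTGACTAGAGCC",
]
for name, sequence in read_fasta(toy).items():
    print(name, len(sequence))
```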