The two clones are synchronized between https://github.com/cfce/chilin/. We have packaged all dependent software and species-specific data into .tar.gz.
Before installation, make sure that gcc, g++, make and java are in place. We provide installation instructions for these dependencies on common systems.
Note
Python must be version 2.7 for MACS2 support.
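A quick way to confirm the interpreter version before going further is sketched below. The helper name `is_supported_python` is hypothetical and not part of ChiLin; it only encodes the 2.7 requirement stated in the note.

```python
from __future__ import print_function
import sys


def is_supported_python(version_info=None):
    """Return True when the (major, minor) version is 2.7."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == (2, 7)


# Report on the interpreter that runs this snippet.
print("Python %d.%d detected; 2.7 required: %s"
      % (sys.version_info[0], sys.version_info[1],
         "ok" if is_supported_python() else "not ok"))
```

Run it with the same `python` command you intend to use for `setup.py`, since several interpreters may be installed side by side.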
Tool | debian/centos/mac | Usage for ChiLin |
---|---|---|
python dev header | apt-get or yum or port | prerequisites |
python setuptools | apt-get or yum or port | prerequisites |
python numpy package | apt-get or yum or port | prerequisites |
cython | apt-get or yum or port | prerequisites |
R | apt-get or yum or manually | prerequisites |
java/gcc/g++ | apt-get/yum install/Xcode | prerequisites |
ghostscript | apt-get or yum or manually | prerequisites |
texlive-latex | apt-get or yum or manually | prerequisites |
ImageMagick | apt-get or yum or manually | prerequisites |
MACS2 | pypi | peak calling |
seqtk | built-in | packaged into chilin |
bx-python | built-in | packaged into chilin |
FastQC | built-in | packaged into chilin |
BWA | built-in | packaged into chilin |
samtools | built-in | packaged into chilin |
bedtools | built-in | packaged into chilin |
bedClip | built-in | UCSC binary (packaged into chilin) |
bedGraphToBigWig | built-in | UCSC binary (packaged into chilin) |
wigCorrelate | built-in | UCSC binary (packaged into chilin) |
wigToBigWig | built-in | UCSC binary (packaged into chilin) |
mdseqpos | built-in | packaged into chilin |
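Before installing, it may help to check which of the system-level prerequisites are already on your `PATH`. The following is a minimal sketch, not part of ChiLin: the function names and the `REQUIRED` list are assumptions for illustration, and `gs` is assumed to be the name of the ghostscript executable on your system.

```python
from __future__ import print_function
import os


def find_on_path(name, search_path=None):
    """Return the full path to executable `name`, or None if it is absent."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names, search_path=None):
    """Return the subset of `names` that cannot be found on the search path."""
    return [n for n in names if find_on_path(n, search_path) is None]


# Compilers and runtimes named in the text, plus system tools from the table.
REQUIRED = ["gcc", "g++", "make", "java", "R", "gs", "pdflatex", "convert"]

print("missing:", ", ".join(missing_tools(REQUIRED)) or "none")
```

Anything reported as missing can then be installed with the package manager listed for it in the table.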
If you have administrator rights, use the following on Debian:
sudo apt-get update
# python
sudo apt-get install python-dev python-numpy python-setuptools cython python-pip
# R
sudo apt-get install r-base
# java
sudo apt-get install default-jre
sudo apt-get install ghostscript
sudo apt-get install imagemagick --fix-missing
# TeX
sudo apt-get install texlive-latex-base
Then continue to install chilin.
For CentOS, use:
sudo yum install python-devel numpy python-setuptools python-pip
rpm -Uvh http://mirror.chpc.utah.edu/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
sudo yum install tcl tcl-devel tk-devel
sudo yum install R
sudo yum install ImageMagick
sudo easy_install Cython
sudo yum install tetex
Then continue to install chilin.
For Mac, we suggest using MacPorts. Before installing MacPorts, you need to have Xcode and Java installed:
## download and install macport
# open https://distfiles.macports.org/MacPorts/ and download the right version
sudo port install py27-setuptools py27-pip py27-nose py27-cython py27-numpy @1.8.1
## or use EPD to replace this
## Install R manually from http://cran.cnr.berkeley.edu/bin/macosx/
## For mac latex, install separately, download and click to install it
wget -c http://mirror.ctan.org/systems/mac/mactex/MacTeX.pkg
Install R, MacTeX, ImageMagick and ghostscript manually.
Then continue to install chilin.
After resolving the prerequisites, install chilin as follows:
git clone http://github.com/cfce/chilin/
cd chilin
python setup.py install -f
Then, check your installation:
source chilin_env/bin/activate
# check ChiLin dependent software and data
python setup.py -l
If any software cannot be installed, consult its official documentation. In most cases, see dependentsoft to check whether all prerequisites are installed; the usual culprits are numpy, cython, the gcc compiler, or the R package seqLogo. Look through the software directory to see what went wrong, and try apt-get, yum, port or pypi to fix the issue.
Lastly, you may need to check that the R package seqLogo, which mdseqpos depends on, is installed. Install and load the dependent R packages from the command line:
R -e "source('http://bioconductor.org/biocLite.R');biocLite('seqLogo');library(seqLogo)"
Remember to activate your Python virtual environment with “source ${ChiLin_ROOT}/chilin_env/bin/activate” every time, or put the command into your ${HOME}/.bashrc or ${HOME}/.bash_profile.
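If you are unsure whether the environment is active in the current shell, a check along these lines can tell you. It is only a sketch and rests on one assumption: that `chilin_env` is a standard virtualenv whose `activate` script exports the `VIRTUAL_ENV` variable. The function name `chilin_env_active` is hypothetical.

```python
from __future__ import print_function
import os


def chilin_env_active(environ=None, env_name="chilin_env"):
    """Return True if the active virtualenv directory is named `env_name`."""
    if environ is None:
        environ = os.environ
    active = environ.get("VIRTUAL_ENV", "")
    return os.path.basename(active.rstrip(os.sep)) == env_name


if not chilin_env_active():
    print("chilin_env is not active; run: "
          "source ${ChiLin_ROOT}/chilin_env/bin/activate")
```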
Note
After installation, the config file is auto-generated, and the species-specific data directory defaults to db under the code root directory.
Under the ChiLin source code root directory, download the reference data:
# download from our cistrome server
mkdir -p db
# change directory to db
cd db
# download the one you need; this is over 10 GB, so make sure your internet access is over 100k/s, or it will be too slow
# human
wget -c http://cistrome.org/chilin/_downloads/hg19.tgz
wget -c http://cistrome.org/chilin/_downloads/hg19.tgz.md5
## check md5
#wget -c http://cistrome.org/chilin/_downloads/hg38.tgz
#wget -c http://cistrome.org/chilin/_downloads/hg38.tgz.md5
# mouse
#wget -c http://cistrome.org/chilin/_downloads/mm9.tgz
#wget -c http://cistrome.org/chilin/_downloads/mm9.tgz.md5
#wget -c http://cistrome.org/chilin/_downloads/mm10.tgz
#wget -c http://cistrome.org/chilin/_downloads/mm10.tgz.md5
# check the md5sum for completeness of hg19
md5sum -c hg19.tgz.md5
tar xvfz hg19.tgz
# download mycoplasma if you are concerned that it may contaminate your samples
wget -c http://cistrome.org/chilin/_downloads/mycoplasma.tgz
wget -c http://cistrome.org/chilin/_downloads/mycoplasma.tgz.md5
md5sum -c mycoplasma.tgz.md5
tar xvfz mycoplasma.tgz
# change back
cd ..
# check your data and software installation, if download is ok
python setup.py -l
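On systems without the `md5sum` command (for example a default Mac install), the same completeness check can be done in Python. This is a minimal sketch, assuming the `.md5` files use the usual `md5sum` layout of a hex digest followed by the file name; the function names are hypothetical.

```python
from __future__ import print_function
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5_path):
    """Read the digest from an md5sum-style file: '<hex digest>  <file name>'."""
    with open(md5_path) as handle:
        return handle.readline().split()[0].lower()


def archive_is_complete(archive_path, md5_path):
    """Return True when the archive's digest matches the recorded one."""
    return md5_of_file(archive_path) == expected_md5(md5_path)
```

From inside the `db` directory, `archive_is_complete("hg19.tgz", "hg19.tgz.md5")` should return `True` for a complete download. Reading in chunks keeps memory use flat even for the multi-gigabyte archives.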
If you
see details about the dependent data.
After preparing the software and reference data, you can skip this part if you are using our prepared hg38, hg19, mm10 or mm9 dependent data, because setup.py already fills in chilin.conf.filled for you. If you have your own reference data, open chilin.conf.filled in your favorite text editor and append a section to add species support, like this.
Fill in the section with the absolute paths to your own data, then append the filled section:
[hg19]
genome_index =
# fasta file separated by chromosome, such as chr1.fa
genome_dir =
chrom_len =
dhs =
## blacklist region
velcro =
conservation =
geneTable =
[mm9]
genome_index =
# fasta file separated by chromosome, such as chr1.fa
genome_dir =
chrom_len =
dhs =
## blacklist region
velcro =
conservation =
geneTable =
[hg38]
genome_index =
# fasta file separated by chromosome, such as chr1.fa
genome_dir =
chrom_len =
dhs =
## blacklist region
velcro =
conservation =
geneTable =
After configuring, see [species] for details about these fields.
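Because every species field must be an absolute path, a small check can catch empty or relative entries before a run. The sketch below is not part of ChiLin: `species_problems` and `SPECIES_KEYS` are hypothetical names, and the key list is simply copied from the species sections shown above. It checks only that values are non-empty and absolute, not that the files exist, since `genome_index` may be an index prefix rather than a single file.

```python
from __future__ import print_function
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7

# Keys that appear in each species section of the example above.
SPECIES_KEYS = ["genome_index", "genome_dir", "chrom_len", "dhs",
                "velcro", "conservation", "geneTable"]


def species_problems(conf_path, species):
    """Return a list of human-readable problems for one species section."""
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep key case, e.g. geneTable
    parser.read(conf_path)
    if not parser.has_section(species):
        return ["section [%s] is missing" % species]
    problems = []
    for key in SPECIES_KEYS:
        value = ""
        if parser.has_option(species, key):
            value = parser.get(species, key).strip()
        if not value:
            problems.append("%s is empty" % key)
        elif not os.path.isabs(value):
            problems.append("%s is not an absolute path: %s" % (key, value))
    return problems
```

For example, `species_problems("chilin.conf.filled", "hg19")` returns an empty list when every field in `[hg19]` holds an absolute path.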
#Define the site-wide defaults for your system using ABSOLUTE PATHS
#When defining reference species (i.e. your species of interest)
#please refer to "Generating Species References" in README.md on how
#to generate the files
#------------------------------------------------------------------------------
# Tools
#------------------------------------------------------------------------------
#ChiLin is dependent on several tools, please specify the absolute path to
#these tools--ALL FIELDS ARE REQUIRED
#put bedClip, bedGraphToBigWig, bowtie, star, bwa, fastqc, bedtools, macs2, samtools, seqtk, wigCorrelate
#in executable PATH
#other system tool includes convert, pdflatex, R, python2.7
[tool]
mdseqpos =
macs2 =
#------------------------------------------------------------------------------
# Tool parameters
#------------------------------------------------------------------------------
#These are optional parameters for some tools defined above
#NOTE: not all tool parameters can be inputted in this conf--e.g. for bwa,
#the only thing you can affect here is the number of threads used.
[macs2]
#refer to the macs2 help message to find out what these mean, species for effective genome size
extsize = 146
# effective genome sizes, support hs, mm, other species, please refer to chromInfo
species =
type = both
fdr = 0.01
keep_dup = 1
[reg]
## regulatory potential score prediction top peaks
peaks = 10000
dist = 100000
[conservation]
## for tf/dnase we suggest 400bp width around summit, for histone 4000
type = tf
peaks = 5000
width = 400
[seqpos]
peaks = 5000
mdscan_width = 200
mdscan_top_peaks = 200
seqpos_mdscan_top_peaks_refine = 500
width = 600
pvalue_cutoff = 0.001
db = cistrome.xml
#------------------------------------------------------------------------------
# Contamination
#------------------------------------------------------------------------------
#OPTIONAL- our contamination module can screen for any species defined below
#specify the species name and the path to the bwa index as follows: e.g.
#ECOLI = /some/path/ecoli
[contamination]
mycoplasma = mycoplasma
# ecoli =
# yeast =
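To see how the tool parameters in this file map to typed values, they can be read back with the standard library. This is a sketch under stated assumptions, not how ChiLin itself loads its configuration; `load_tool_parameters` and the dictionary keys are hypothetical names.

```python
try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7


def load_tool_parameters(conf_path):
    """Read a few tool parameters from the conf file into typed values."""
    parser = configparser.RawConfigParser()
    parser.read(conf_path)
    return {
        "macs2_extsize": parser.getint("macs2", "extsize"),
        "macs2_fdr": parser.getfloat("macs2", "fdr"),
        # kept as text: MACS2 also accepts 'auto' and 'all' for --keep-dup
        "macs2_keep_dup": parser.get("macs2", "keep_dup"),
        "reg_peaks": parser.getint("reg", "peaks"),
        "reg_dist": parser.getint("reg", "dist"),
        "conservation_width": parser.getint("conservation", "width"),
        "seqpos_pvalue_cutoff": parser.getfloat("seqpos", "pvalue_cutoff"),
        # maps each species name to its bwa index path
        "contamination": dict(parser.items("contamination")),
    }
```

A non-numeric value in one of the numeric fields raises `ValueError` here, which makes a typo in the conf file visible before a long run starts.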
Test the installation with the demo data:
# non-cluster server
cd demo
bash foxa1
# if you are using a slurm system
cd demo
# submit cluster script `foxa1`
sbatch foxa1
Check the demo results:
du -h local/local.pdf ## quality report
# mac
open local/local.pdf
# linux
nautilus local/
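On a server without a desktop, you can still confirm that the quality report was produced. The sketch below assumes it is run from the `demo` directory and that the report is at `local/local.pdf`, as in the commands above; `looks_like_pdf` is a hypothetical helper that only checks that the file is non-empty and starts with the PDF signature.

```python
from __future__ import print_function
import os


def looks_like_pdf(path):
    """Return True if `path` is a non-empty file starting with '%PDF-'."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


print("demo report ok:", looks_like_pdf(os.path.join("local", "local.pdf")))
```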
For more options, see Manual.