Introduction

The source code is the command-line version of this website; it can also do the following extra things:

Install

Requirement

Get the source code

git clone git@bitbucket.org:liulab/crisprdo.git

Download static files

The 2bit file and the bwa index (which you must build yourself) are required to run the spacer scan.
2bit file: hg19 hg38 mm9 mm10
Genome chromosome length: hg19 hg38 mm9 mm10
DHS: hg19 hg38 mm9 mm10
SNP: hg19 hg38 mm9 mm10 danRer7 dm6 ce10
exon: hg19 hg38 mm9 mm10 danRer7 dm6 ce10
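
Before editing the settings, it can help to confirm that the downloaded files and the self-built bwa index are actually on disk. The sketch below is only an illustration: the function names and every path in it are hypothetical, and it assumes the index was built with the classic `bwa index <fasta>` command, which writes five files next to the FASTA (.amb, .ann, .bwt, .pac, .sa).

```python
import os

# Suffixes written by the classic `bwa index <prefix>` command.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_files(paths):
    """Return the labels (sorted) whose path does not exist on disk."""
    return [label for label, path in sorted(paths.items())
            if not os.path.isfile(path)]


def missing_bwa_index_files(prefix):
    """Return the bwa index files that are absent for the given prefix."""
    return [prefix + suffix for suffix in BWA_INDEX_SUFFIXES
            if not os.path.isfile(prefix + suffix)]


# Hypothetical layout; replace with wherever you saved the downloads.
example_paths = {
    "2bit": "/data/crisprdo/hg19.2bit",
    "chrom_length": "/data/crisprdo/hg19.len",
}
for label in missing_files(example_paths):
    print("missing static file:", label)
for path in missing_bwa_index_files("/data/crisprdo/hg19.fa"):
    print("missing bwa index file:", path)
```

An empty result from both functions means every expected file was found.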

Configure static data before installation

# Go to the crisprdo directory
cd crisprdo/crisprdo
# Configure the static file paths, following settings.py.sample
cp settings.py.sample settings.py
vim settings.py

Start to install

sudo python setup.py install

# Or use this command to install for your user only:
python setup.py install --user

If you install locally, you may need to add the install location to your $PATH and $PYTHONPATH. Add these two lines to your .bashrc file so they are loaded each time you log in.

export PYTHONPATH=${HOME}/.local/lib/python2.7/site-packages:${PYTHONPATH}
export PATH=${HOME}/.local/bin:${PATH}
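
To check that the exports took effect in a new shell, a small script such as the following can be used. It is a minimal sketch: the helper names are made up for illustration, and it only inspects the PATH variable and looks up the crispr-do executable; it does not verify the Python 2.7 site-packages entry.

```python
import os
import shutil


def local_bin_on_path(environ=None):
    """Return True if $HOME/.local/bin is one of the PATH entries."""
    environ = os.environ if environ is None else environ
    home = environ.get("HOME", os.path.expanduser("~"))
    local_bin = os.path.join(home, ".local", "bin")
    return local_bin in environ.get("PATH", "").split(os.pathsep)


def find_executable(name="crispr-do"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(name)


print("~/.local/bin on PATH:", local_bin_on_path())
print("crispr-do found at:", find_executable())
```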

Test

crispr-do

Usage

Usage: crispr-do <-g genome -c chrX --start=START --end=END> [options]

sgRNA design for a region.

Options:
  --version             show program's version number and exit
  -h, --help            Show this help message and exit.
  -g GENOME             Genome assembly version
  -c CHROM, --chrom=CHROM
                        chromosome, e.g.chr1
  --start=START         start site of the region
  --end=END             end site of the region
  --annotation          If to annotation the sgRNAs with DHS, SNP or exons.
                        default (not set): False
  --lasso-cutoff=LASSO_CUTOFF
                        lasso_cutoff
  --job-id=JOB_ID       any string or hash code to identify this run

To scan spacers in a region without annotation, run:

crispr-do -g GENOME -c CHROM --start=START --end=END --job-id=NAME

To annotate the sgRNAs, add the --annotation option; this requires the extra static files (DHS, SNP and exon).
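
When scanning many regions, the command can be assembled from a script. The sketch below uses only the options listed in the help text above; the wrapper functions are hypothetical helpers, not part of crispr-do, and the genome, coordinates and job name in the example are made-up values.

```python
import subprocess


def build_command(genome, chrom, start, end, job_id, annotation=False):
    """Build the crispr-do argument list for one region."""
    cmd = [
        "crispr-do",
        "-g", genome,
        "-c", chrom,
        "--start=%d" % start,
        "--end=%d" % end,
        "--job-id=%s" % job_id,
    ]
    if annotation:
        # Needs the DHS, SNP and exon static files to be configured.
        cmd.append("--annotation")
    return cmd


def run_scan(genome, chrom, start, end, job_id, annotation=False):
    """Run crispr-do for one region and return its exit code."""
    return subprocess.call(
        build_command(genome, chrom, start, end, job_id, annotation))


# Example values are illustrative only.
print(" ".join(build_command("hg19", "chr1", 1000000, 1001000, "demo")))
```

Passing the arguments as a list avoids shell quoting problems when the job name contains unusual characters.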

Output

The output file is called "spacer" and is written to the directory for your JOB_ID. The columns of that file are:

chrom: the chromosome on which the sgRNA is located
start: start position of the sgRNA in the genome
end: end position of the sgRNA in the genome
hitseq: sgRNA sequence, including the PAM (in red) and 7 bp downstream
strand: the strand on which the sgRNA is located
efficiency_score: the efficiency of the sgRNA, based on its 30 bp sequence
specificity_score: the specificity of the sgRNA; this score ranges from 0 to 100
conservation_score: average conservation score over the 30 bp sequence (20 bp guide + PAM + 7 bp), using UCSC phastCons scores

With the annotation option, three extra columns are added:

DHS_overlap: whether the sgRNA overlaps any DHS region from ENCODE
SNP overlap: whether any SNP is located within the sgRNA
exon_overlap: whether the sgRNA overlaps any exon
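
The column list above can be turned into a small reader for downstream filtering. This is a sketch under several assumptions that the text does not confirm: the file is tab-separated, the columns appear in the order listed, the three score columns are numeric, a header line (if any) starts with "chrom", and higher scores are better. The key SNP_overlap and the cutoff of 50 are arbitrary choices made for this example.

```python
import csv

BASE_COLUMNS = ["chrom", "start", "end", "hitseq", "strand",
                "efficiency_score", "specificity_score",
                "conservation_score"]
# Present only when the annotation option was used.
ANNOTATION_COLUMNS = ["DHS_overlap", "SNP_overlap", "exon_overlap"]
SCORE_COLUMNS = ("efficiency_score", "specificity_score",
                 "conservation_score")


def read_spacers(path):
    """Read a spacer file into a list of dicts keyed by column name."""
    rows = []
    with open(path, newline="") as handle:
        for fields in csv.reader(handle, delimiter="\t"):
            if not fields or fields[0] == "chrom":
                continue  # skip blank lines and a header line, if present
            # zip stops at the shorter sequence, so files without the
            # annotation columns simply lack those three keys.
            row = dict(zip(BASE_COLUMNS + ANNOTATION_COLUMNS, fields))
            row["start"] = int(row["start"])
            row["end"] = int(row["end"])
            for key in SCORE_COLUMNS:
                row[key] = float(row[key])
            rows.append(row)
    return rows


def top_spacers(rows, min_specificity=50.0):
    """Keep sufficiently specific sgRNAs, most efficient first."""
    keep = [r for r in rows if r["specificity_score"] >= min_specificity]
    return sorted(keep, key=lambda r: r["efficiency_score"], reverse=True)
```

A typical call would be `top_spacers(read_spacers("NAME/spacer"))`, where the path is a placeholder for the spacer file in your job directory.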