Inputs / Outputs

craw_coverage

Inputs

bam file

annotation file

The annotation file is a TSV file, that is, a text file whose values are separated by tabulations (not spaces). The first line of the file must contain the column names; every other line contains the values. Each line represents a row.

name    gene    chromosome      strand  Position
YEL072W RMD6    chrV    +       14415
YEL071W DLD3    chrV    +       17845
YEL070W DSF1    chrV    +       21097
YEL066W HPA3    chrV    +       27206
YEL065W SIT1    chrV    +       29543
YEL062W NPR2    chrV    +       36254
YEL058W PCM1    chrV    +       44925
YEL056W HAT2    chrV    +       48373

All lines starting with the ‘#’ character are ignored.

# This is the annotation file for Wild type
# bla bla ...
name    gene    chromosome      strand  Position
YEL072W RMD6    chrV    +       14415
YEL071W DLD3    chrV    +       17845
YEL070W DSF1    chrV    +       21097
YEL066W HPA3    chrV    +       27206
YEL065W SIT1    chrV    +       29543
YEL062W NPR2    chrV    +       36254
YEL058W PCM1    chrV    +       44925
YEL056W HAT2    chrV    +       48373
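
The format rules above (tab-separated values, column names on the first line, ‘#’ lines ignored) can be illustrated with a minimal Python sketch. It is not the parser used by craw_coverage; the function name read_annotation is hypothetical and only the standard library is used.

```python
import csv
import io


def read_annotation(lines):
    """Yield one dict per annotation row, skipping '#' comment lines.

    Hypothetical helper for illustration; not part of craw.
    """
    # Drop comment lines before handing the rest to the csv module.
    data_lines = (line for line in lines if not line.startswith('#'))
    # The first remaining line holds the column names.
    reader = csv.DictReader(data_lines, delimiter='\t')
    for row in reader:
        yield row


# A shortened copy of the example above, with real tab characters.
example = (
    "# This is the annotation file for Wild type\n"
    "name\tgene\tchromosome\tstrand\tPosition\n"
    "YEL072W\tRMD6\tchrV\t+\t14415\n"
    "YEL071W\tDLD3\tchrV\t+\t17845\n"
)
rows = list(read_annotation(io.StringIO(example)))
print(rows[0]["gene"], rows[0]["Position"])
```

All values are returned as strings; converting positions to integers is left to the caller.
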
mandatory columns

columns with fixed name

columns with variable name

name    gene    type    chromosome      strand  annotation_start        annotation_end  has_transcript  transcription_end       transcription_start
YEL072W RMD6    gene    chrV    1       13720   14415   1       14745   13569
YEL071W DLD3    gene    chrV    1       16355   17845   1       17881   16177
YEL070W DSF1    gene    chrV    1       19589   21097   1       21197   19539
YEL066W HPA3    gene    chrV    1       26721   27206   1       27625   26137
YEL065W SIT1    gene    chrV    1       27657   29543   1       29601   27625
YEL062W NPR2    gene    chrV    1       34407   36254   1       36401   34321
YEL058W PCM1    gene    chrV    1       43252   44925   1       44993   43217
YEL056W HAT2    gene    chrV    1       47168   48373   1       48457   47105
YEL052W AFG1    gene    chrV    1       56571   58100   1       58105   56537

All other columns are optional, but they are reported as is in the coverage file.

Outputs

coverage_file

It is a TSV file containing all the columns found in the annotation file, plus the coverage computed position by position, centered on the reference position defined for each line. For instance:

craw_coverage --bam=../data/craw_data_test/WTE1.bam --annot=../data/craw_data_test/annotations.txt \
--ref-col=annotation_start --before=0  --after=2000

In the command line above, column ‘0’ corresponds to the annotation_start position, column ‘1’ to annotation_start + 1, and so on up to ‘2000’ (only the first 3 coverage columns are displayed here).

# Running Counter RnAseq Window
# Version: craw NOT packaged, it should be a development version | Python 3.4
# With the following arguments:
# --after=2000
# --annot=../data/craw_data_test/annotations.txt
# --bam=../data/craw_data_test/WTE1.bam
# --before=0
# --output=WTE1_0+2000.new.cov
# --qual-thr=15
# --ref-col=annotation_start
# --suffix=cov
sense   name    gene    type    chromosome      strand  annotation_start        annotation_end  has_transcript  transcription_end       transcription_start     0       1       2
S       YEL072W RMD6    gene    chrV    +       13720   14415   1       14745   13569   7       7       7
AS      YEL072W RMD6    gene    chrV    +       13720   14415   1       14745   13569   0       0       0
S       YEL071W DLD3    gene    chrV    +       16355   17845   1       17881   16177   31      33      33

Lines starting with ‘#’ are comments and are ignored in further processing. For traceability and reproducibility, craw_coverage records in these comments the version of the program and the arguments used for the run.
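
As an illustration of this layout, here is a minimal Python sketch that reads a coverage file, recovers the arguments stored in the comments, and maps each coverage column back to a position. It is not part of craw: the names read_coverage and is_offset are hypothetical, and the sketch assumes that coverage columns are exactly those whose header is an integer and that coverage values are integers, as in the example above. The mapping column + reference position follows the description given for the example; how the minus strand is handled is not covered here.

```python
import io


def is_offset(name):
    """Return True if a column name is an integer offset such as '0' or '1'."""
    try:
        int(name)
    except ValueError:
        return False
    return True


def read_coverage(lines):
    """Return (args, records) from the lines of a coverage file.

    args    -- dict of the '--key=value' arguments found in the comments
    records -- list of (annotation_fields, coverage_by_offset) tuples
    """
    args, header, records = {}, None, []
    for line in lines:
        line = line.rstrip('\n')
        if not line:
            continue
        if line.startswith('#'):
            text = line.lstrip('#').strip()
            if text.startswith('--') and '=' in text:
                key, value = text[2:].split('=', 1)
                args[key] = value
            continue
        fields = line.split('\t')
        if header is None:
            header = fields  # first non-comment line: column names
            continue
        row = dict(zip(header, fields))
        annotation = {k: v for k, v in row.items() if not is_offset(k)}
        coverage = {int(k): int(v) for k, v in row.items() if is_offset(k)}
        records.append((annotation, coverage))
    return args, records


# Abbreviated version of the output shown above (fewer annotation columns).
example = (
    "# Running Counter RnAseq Window\n"
    "# --after=2000\n"
    "# --before=0\n"
    "# --ref-col=annotation_start\n"
    "sense\tname\tgene\tchromosome\tstrand\tannotation_start\t0\t1\t2\n"
    "S\tYEL072W\tRMD6\tchrV\t+\t13720\t7\t7\t7\n"
    "AS\tYEL072W\tRMD6\tchrV\t+\t13720\t0\t0\t0\n"
)
args, records = read_coverage(io.StringIO(example))
annotation, coverage = records[0]
ref = int(annotation[args["ref-col"]])
# Column 'k' corresponds to the reference position + k.
positions = {ref + offset: value for offset, value in coverage.items()}
print(positions)
```
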

craw_htmp

Inputs

See coverage_file, the output of craw_coverage described above.

Outputs