configfile

Headers

The configfile is an ini file divided into six blocks:

  1. Dirs

  2. Internals

  3. parkour

  4. software

  5. misc

  6. communication
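As a minimal sketch of how such a file could be read and sanity-checked, the snippet below uses Python's standard configparser module. The helper name load_config and the check for missing blocks are illustrative assumptions, not part of dissectBCL itself; only the six block names are taken from the list above.

```python
import configparser

# Block names as listed in this document.
EXPECTED_BLOCKS = ["Dirs", "Internals", "parkour", "software", "misc", "communication"]


def load_config(text):
    """Hypothetical helper: parse ini text and report which blocks are missing."""
    config = configparser.ConfigParser()
    # configparser lowercases keys by default; keep the case of keys like baseDir.
    config.optionxform = str
    config.read_string(text)
    missing = [block for block in EXPECTED_BLOCKS if not config.has_section(block)]
    return config, missing


# A deliberately incomplete config to show the check.
sample = """
[Dirs]
baseDir=/path/to/bcl/folder

[Internals]
seqDir=seqfolderstr
"""

config, missing = load_config(sample)
print(config["Dirs"]["baseDir"])
print(missing)
```

Reporting all missing blocks at once, rather than failing on the first one, makes it easier to fix an incomplete configfile in a single pass.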

Dirs

The Dirs block defines path information to important directories.

  1. baseDir: the base directory into which the sequencer writes its output.

  2. outputDir: the directory where demultiplexing will be performed.

  3. flowLogDir: the directory where dissectBCL will write its log files.

  4. seqFacDir: a directory the sequencing facility has access to. Lightweight QC files will be written here.

  5. piDir: the base directory that holds each principal investigator’s (PI) folder (see PIs).

Internals

The Internals block defines which PIs are internal. Upon completion, projects from internal PIs are copied into the ‘periphery’, while projects from external PIs can be uploaded via fexsend so that external users can download them. Inside this block there are three elements:

  1. PIs: a list of principal investigators.

  2. seqDir: the directory inside a PI’s directory where the sequencing data can be deposited.

  3. fex: Boolean that indicates if an external project (PI not in PIs list) should be packed as a tar and uploaded using fexsend.

If a project is from an internal PI, it will be copied over into:

piDir/PI/seqDir

Note that multiple seqDirs per PI are allowed. For example, if seqDir = sequencing_data, then:

  1. sequencing_data

  2. sequencing_data1

  3. sequencing_data2

can all exist, and the latest (i.e. the one with the highest number) will be used as the destination when copying the data.
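The destination logic described above can be sketched as follows. This is an illustration under stated assumptions, not dissectBCL's actual code: the function names are hypothetical, a folder without a numeric suffix is assumed to count as number 0, and the PIs value is assumed to be written as a bracketed, comma-separated string.

```python
import re
import tempfile
from pathlib import Path


def parse_pi_list(raw):
    """Hypothetical helper: turn '[pi1,pi2]' into ['pi1', 'pi2']."""
    return [p.strip() for p in raw.strip().strip("[]").split(",") if p.strip()]


def latest_seq_dir(pi_root, seq_dir):
    """Return the seqDir folder with the highest numeric suffix, or None."""
    pattern = re.compile(rf"^{re.escape(seq_dir)}(\d*)$")
    best, best_num = None, -1
    for child in Path(pi_root).iterdir():
        match = pattern.match(child.name)
        if child.is_dir() and match:
            # Assumption: no suffix counts as 0, so 'sequencing_data' < 'sequencing_data1'.
            num = int(match.group(1)) if match.group(1) else 0
            if num > best_num:
                best, best_num = child, num
    return best


def destination(pi_dir, pi, seq_dir, internal_pis):
    """Return piDir/PI/<latest seqDir> for internal PIs, None for external ones."""
    if pi not in internal_pis:
        return None
    return latest_seq_dir(Path(pi_dir) / pi, seq_dir)


# Demo in a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sequencing_data", "sequencing_data1", "sequencing_data2"):
        (Path(tmp) / "pi1" / name).mkdir(parents=True)
    pis = parse_pi_list("[pi1,pi2,pi3]")
    chosen_name = destination(tmp, "pi1", "sequencing_data", pis).name
    external = destination(tmp, "external_pi", "sequencing_data", pis)

print(chosen_name)
print(external)
```

Comparing the suffixes as integers rather than as strings avoids sorting sequencing_data10 before sequencing_data2.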

parkour

The parkour block contains all necessary information to communicate with parkour <https://github.com/maxplanck-ie/parkour2>. Note that this block contains sensitive information.

  1. user: the username for API requests

  2. pw: the password for API requests

  3. cert: the pem certificate for API requests

  4. URL: the URL to Parkour2, e.g. https://parkour.yourdomain.tld.

software

The software block contains paths to all the necessary software and files that are NOT included in the conda installation.

  1. bclconvert: path to the bcl-convert executable

  2. fastqc_adapters: a (custom) list of adapters used by fastqc.

  3. kraken2db: path to your kraken database (created with contam, or sourced from elsewhere <https://github.com/DerrickWood/kraken2/blob/master/docs/MANUAL.markdown>)

misc

The misc block contains only one item (for now), which is the jpg file used in the custom multiqc header:

  1. mpiImg: path to jpg file.

communication

The communication block has four elements, all of which are related to email communication by the pipeline.

  1. fromAddress: the email address the emails are sent from.

  2. host: the email host <https://docs.python.org/3/library/smtplib.html>

  3. finishedTo: email address(es) to notify upon completion of a flowcell. Multiple addresses are comma-separated.

  4. bioinfoCore: email address of the core unit, where error messages are sent.
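To illustrate how these four elements might fit together, the sketch below builds a completion notification with Python's standard email package. The function name build_notification, the subject line, the message text and the sample addresses are assumptions made for this example; the actual sending step via smtplib is left commented out because it needs a reachable mail host.

```python
from email.message import EmailMessage


def build_notification(comm, flowcell):
    """Hypothetical helper: build a 'flowcell finished' email from the communication block."""
    # finishedTo may hold several comma-separated addresses.
    recipients = [a.strip() for a in comm["finishedTo"].split(",") if a.strip()]
    msg = EmailMessage()
    msg["From"] = comm["fromAddress"]
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"Flowcell {flowcell} finished"  # assumed wording
    msg.set_content(f"Demultiplexing of flowcell {flowcell} has completed.")
    return msg, recipients


# Sample values, standing in for the [communication] block.
comm = {
    "fromAddress": "sender@dissectbcl.de",
    "host": "hostmail.address.de",
    "finishedTo": "email@seqfacility.de, second@seqfacility.de",
    "bioinfoCore": "email@bioinfocore.de",
}

msg, recipients = build_notification(comm, "FLOWCELL_ID")
print(recipients)

# Sending would look roughly like this (not executed here):
# import smtplib
# with smtplib.SMTP(comm["host"]) as server:
#     server.send_message(msg)
```

Error messages would follow the same pattern, with bioinfoCore as the recipient instead of finishedTo.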

example

[Dirs]
baseDir=/path/to/bcl/folder
outputDir=/path/to/fastq/output/folder
flowLogDir=/path/to/log/folder
seqFacDir=/path/to/share/qc/with/facility
piDir=/base/with/enduser/folders
bioinfoCoreDir=/path/to/share/qc/with/core

[Internals]
PIs=[pi1,pi2,pi3,pi4,pi5]
seqDir=seqfolderstr

[parkour]
pullURL=parkour.pull.url/api/analysis_list/analysis_list
pushURL=parkour.push.url/api/run_statistics/upload
user=parkourUser
password=parkourPw
cert=/path/to/cert.pem
URL=parkour.domain.tld

[software]
bclconvert=/path/to/bclconvert
fastqc_adapters=/path/to/fastqc_adapters.txt
kraken2db=/path/to/kraken2_contaminome/contaminomedb

[misc]
mpiImg=/path/to/multiqc_headerimg.jpg

[communication]
deepSeq=email@seqfacility.de
bioinfoCore=email@bioinfocore.de
fromAddress=sender@dissectbcl.de
host=hostmail.address.de