Run SeekSoulTools

Run tests

Example 1: Basic usage

Prepare the inputs the analysis needs: the paths to the sample FASTQ files, the chemistry version, the STAR genome index, and the gene annotation (GTF) file. Then run SeekSoulTools with the following command:

seeksoultools fast run \
--fq1 /path/to/cellline/cellline_R1.fq.gz \
--fq2 /path/to/cellline/cellline_R2.fq.gz \
--samplename demo \
--genomeDir /path/to/GRCh38/star \
--gtf /path/to/GRCh38/genes/genes.gtf \
--rRNAgenomeDir /path/to/hg38_rRNA/star \
--rRNAgtf /path/to/hg38_rRNA/genes/delete_rRNA5.8-18-28_in_rRNA45s.gtf \
--chemistry DD-Q \
--include-introns \
--core 4
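Before launching a long run, it can help to check the inputs first. The following is a minimal sketch and not part of SeekSoulTools: the helper names `valid_samplename` and `require_file` are hypothetical, and the sample-name rule is the one stated for --samplename in the parameter descriptions (digits, letters, and underscores only).

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; not part of SeekSoulTools.

# Succeeds only if the name is non-empty and contains nothing but
# letters, digits, and underscores.
valid_samplename() {
    case "$1" in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Succeeds if the path is an existing, readable regular file;
# otherwise prints a message to stderr and fails.
require_file() {
    if [ -f "$1" ] && [ -r "$1" ]; then
        return 0
    fi
    echo "missing or unreadable file: $1" >&2
    return 1
}

# Example use (paths are the placeholders from the command above):
# valid_samplename demo &&
# require_file /path/to/cellline/cellline_R1.fq.gz &&
# require_file /path/to/cellline/cellline_R2.fq.gz &&
# require_file /path/to/GRCh38/genes/genes.gtf
```

Chaining the checks with `&&` stops at the first failure, so the analysis command can be appended to the same chain and will only start when every check passes.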

Parameter descriptions

Parameters

Descriptions

--fq1

Paths to R1 fastq files.

--fq2

Paths to R2 fastq files.

--samplename

Sample name. A directory named after the sample is created inside the --outdir directory. Only digits, letters, and underscores are supported.

--outdir

Output directory. Default: ./

--genomeDir

Path to the reference genome index generated by STAR. The STAR version used to build it must be consistent with the STAR version used by SeekSoulTools.

--gtf

Path to the GTF file for the corresponding species.

--rRNAgenomeDir

Path to the rRNA reference genome index generated by STAR, used to evaluate the rRNA proportion. The STAR version used to build it must be consistent with the STAR version used by SeekSoulTools.

--rRNAgtf

Path to the rRNA GTF file for the corresponding species, used to evaluate the rRNA proportion.

--core

Number of threads used for the analysis.

--chemistry

Reagent type, with each type corresponding to a combination of --shift, --pattern, --structure, --barcode, and --sc5p. Available options: DD-Q.
DD-Q corresponds to the SeekOne® DD Single Cell Full-length RNA Sequence Transcriptome-seq Kit.

--skip_misB

If enabled, no base mismatch is allowed in the barcode. By default, 1 mismatch is allowed.

--skip_misL

If enabled, no base mismatch is allowed in the linker. By default, 1 mismatch is allowed.

--skip_multi

If enabled, reads whose barcode can be corrected to more than one whitelisted barcode are discarded. By default, such a barcode is corrected to the candidate with the highest frequency.

--expectNum

Estimated number of captured cells.

--forceCell

Use this parameter with an expected value N when the number of cells obtained from the analysis is abnormal. SeekSoulTools then selects the top N cells ranked by UMI count, from high to low.

--include-introns

When disabled, only exonic reads are used for quantification. When enabled, intronic reads are used as well.

--star_path

Path to an alternative STAR executable to use for alignment. Its version must be compatible with the index given by --genomeDir. By default, the STAR found in the environment is used.
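To illustrate how optional parameters combine with the basic command, the sketch below assembles an argument list and prints it instead of running it. The function name `build_seeksoultools_cmd`, the output directory `./results`, and the cell count 3000 are hypothetical. The flags are the ones listed in the table above, and the sketch assumes that --forceCell takes its value N as the following argument.

```shell
#!/bin/sh
# Hypothetical dry-run helper; prints a command line, does not run it.
# Usage: build_seeksoultools_cmd SAMPLENAME OUTDIR [FORCE_CELL_N]
build_seeksoultools_cmd() {
    samplename=$1
    outdir=$2
    force_cell=${3:-}

    # Arguments from the basic example (placeholder paths), plus --outdir.
    set -- seeksoultools fast run \
        --fq1 /path/to/cellline/cellline_R1.fq.gz \
        --fq2 /path/to/cellline/cellline_R2.fq.gz \
        --samplename "$samplename" \
        --outdir "$outdir" \
        --genomeDir /path/to/GRCh38/star \
        --gtf /path/to/GRCh38/genes/genes.gtf \
        --rRNAgenomeDir /path/to/hg38_rRNA/star \
        --rRNAgtf /path/to/hg38_rRNA/genes/delete_rRNA5.8-18-28_in_rRNA45s.gtf \
        --chemistry DD-Q \
        --include-introns \
        --core 4

    # Append --forceCell only when a value was supplied.
    if [ -n "$force_cell" ]; then
        set -- "$@" --forceCell "$force_cell"
    fi

    # Print the assembled command as a single line.
    echo "$*"
}

build_seeksoultools_cmd demo ./results 3000
```

Printing the command first makes it easy to inspect before committing to a long alignment. To execute it instead, replace the final `echo "$*"` inside the function with `"$@"`.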