Run Seeksoultools

Run tests

Example 1: Basic usage

mkdir -p /demo/myproject/
cd /demo/myproject/
seeksoultools rna run \
--fq1 /demo/data/demo3k_R1_001.fastq.gz \
--fq2 /demo/data/demo3k_R2_001.fastq.gz \
--samplename demo3k \
--outdir /demo/myproject/ \
--genomeDir /demo/refdata/GRCh38-3.0.0/star \
--gtf /demo/refdata/GRCh38-3.0.0/genes/genes.gtf \
--chemistry MM \
--core 4

Example 2: Use a different version of STAR for the analysis. Make sure that the STAR version is compatible with the index given by --genomeDir.

mkdir -p /demo/myproject/
cd /demo/myproject/
seeksoultools rna run \
--fq1 /demo/data/demo3k_R1_001.fastq.gz \
--fq2 /demo/data/demo3k_R2_001.fastq.gz \
--samplename demo3k \
--outdir /demo/myproject/ \
--genomeDir /demo/refdata/GRCh38/star \
--gtf /demo/refdata/GRCh38/genes/genes.gtf \
--chemistry MM \
--core 4 \
--star_path /path/to/cellranger-5.0.0/lib/bin/STAR
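
Before passing a binary to --star_path, it can help to confirm which STAR version it is. The following is a minimal sketch, assuming the binary supports STAR's `--version` option; the helper name `star_version` is hypothetical and not part of seeksoultools.

```shell
# Illustrative helper (not part of seeksoultools): report the version of the
# STAR binary that would be passed to --star_path.
star_version() {
    # Reject paths that are missing, are directories, or are not executable.
    if [ ! -f "$1" ] || [ ! -x "$1" ]; then
        echo "not an executable file: $1" >&2
        return 1
    fi
    "$1" --version
}
```

Calling `star_version /path/to/cellranger-5.0.0/lib/bin/STAR` prints the version string, which can then be compared by hand with the STAR version that was used to build the --genomeDir index.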

Example 3: A sample has multiple sets of fastq files

mkdir -p /demo/myproject/
cd /demo/myproject/
seeksoultools rna run \
--fq1 /demo/data/demo_S1_L001_R1_001.fastq.gz \
--fq1 /demo/data/demo_S1_L002_R1_001.fastq.gz \
--fq2 /demo/data/demo_S1_L001_R2_001.fastq.gz \
--fq2 /demo/data/demo_S1_L002_R2_001.fastq.gz \
--samplename demo \
--outdir /demo/myproject/ \
--genomeDir /demo/refdata/GRCh38/star \
--gtf /demo/refdata/GRCh38/genes/genes.gtf \
--chemistry MM \
--core 4
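
When a sample spans many lanes, listing every --fq1/--fq2 file by hand is error-prone. The following is a minimal sketch that assembles the repeated options, assuming the lane files follow the `_R1_001.fastq.gz` / `_R2_001.fastq.gz` naming used above and that paths contain no whitespace. The helper `build_fq_args` and the variable `FQ_ARGS` are hypothetical names, not part of seeksoultools.

```shell
# Hypothetical helper (not part of seeksoultools): build the repeated
# --fq1/--fq2 options for every lane of one sample.
# Assumes mate files differ only in "_R1_001"/"_R2_001" and that paths
# contain no whitespace.
build_fq_args() {
    dir=$1
    prefix=$2
    fq1_args=""
    fq2_args=""
    for r1 in "$dir"/"$prefix"_*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue                      # glob matched nothing
        r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"   # derive the mate file
        if [ ! -e "$r2" ]; then
            echo "missing R2 mate for $r1" >&2
            return 1
        fi
        fq1_args="$fq1_args --fq1 $r1"
        fq2_args="$fq2_args --fq2 $r2"
    done
    [ -n "$fq1_args" ] || return 1                    # no lanes found
    # All --fq1 options first, then all --fq2 options, in lane order.
    FQ_ARGS="${fq1_args# }$fq2_args"
}
```

After a successful `build_fq_args /demo/data demo`, the unquoted `$FQ_ARGS` can take the place of the four --fq1/--fq2 lines in the command above. The shell expands the glob in sorted order, so R1 and R2 files stay paired lane by lane.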

Example 4: Customize the structure of R1

seeksoultools rna run \
--fq1 /demo/data/demo3k_R1_001.fastq.gz \
--fq2 /demo/data/demo3k_R2_001.fastq.gz \
--samplename demo \
--outdir /demo/myproject/ \
--genomeDir /demo/refdata/GRCh38/star \
--gtf /demo/refdata/GRCh38/genes/genes.gtf \
--barcode /demo/utils/CLS1.txt \
--barcode /demo/utils/CLS2.txt \
--barcode /demo/utils/CLS3.txt \
--linker /demo/utils/Linker1.txt \
--linker /demo/utils/Linker2.txt \
--structure B9L12B9L13B9U8 \
--core 4
  • The structure of read1 is written as B9L12B9L13B9U8: three cell barcode sections of 9 bases each, separated by two linker sections (the first 12 bases long, the second 13 bases long), followed by a UMI section of 8 bases.

  • Use --barcode to specify the three sections of barcodes sequentially, and use --linker to specify the two sections of linkers sequentially.
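
The structure string described above can be decoded mechanically. The following is a minimal sketch, assuming B, L, and U stand for barcode, linker, and UMI as described, and that the structure starts at the first base of R1. The function name `describe_structure` is hypothetical, and the code only illustrates the notation; it is not how seeksoultools parses it. It relies on `grep -oE`, which GNU and BSD grep both provide.

```shell
# Illustrative only: decode a --structure string into its segments.
# B = cell barcode, L = linker, U = UMI; the number is the length in bases.
describe_structure() {
    offset=0
    printf '%s\n' "$1" | grep -oE '[BLU][0-9]+' | while read -r seg; do
        kind=${seg%"${seg#?}"}     # first character: B, L, or U
        len=${seg#?}               # remaining digits: segment length
        case $kind in
            B) name=barcode ;;
            L) name=linker ;;
            U) name=UMI ;;
        esac
        # Print segment type, length, and 1-based start-end positions.
        printf '%s\t%s\t%d-%d\n' "$name" "$len" $((offset + 1)) $((offset + len))
        offset=$((offset + len))
    done
}

describe_structure B9L12B9L13B9U8
```

For B9L12B9L13B9U8 this prints six rows, one per segment, ending with the UMI at positions 53-60, so the barcode, linker, and UMI sections together cover the first 60 bases of R1.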

Parameter descriptions

Parameters

Descriptions

--fq1

Paths to R1 fastq files.

--fq2

Paths to R2 fastq files.

--samplename

Sample name. A directory named after the sample is created inside the output directory (--outdir). Only digits, letters, and underscores are allowed.

--outdir

Output directory. Default: ./

--genomeDir

Path to the reference genome index generated by STAR. The STAR version used to build the index must be consistent with the STAR version used by seeksoultools.

--gtf

Path to the GTF file for the corresponding species.

--core

Number of threads used for the analysis.

--chemistry

Reagent type, with each type corresponding to a combination of --shift, --pattern, --structure, --barcode, and --sc5p. Available options: DDV1, DDV2, DD5V1, MM, MM-D, DD-Q.
DDV1 corresponds to the 3’ transcriptome-seq kit V1 reagent for the DD platform.
DDV2 corresponds to the 3’ transcriptome-seq kit V2 reagent for the DD platform.
DD5V1 corresponds to the 5’ transcriptome-seq kit V1 reagent for the DD platform.
MM corresponds to the 3’ transcriptome-seq kit reagent for the MM platform.
MM-D corresponds to the large-well transcriptome-seq kit for the MM platform.
DD-Q corresponds to the full-length RNA transcriptome-seq kit for the DD platform.

--skip_misB

If enabled, no base mismatches are allowed in the barcode. By default, 1 mismatch is allowed.

--skip_misL

If enabled, no base mismatches are allowed in the linker. By default, 1 mismatch is allowed.

--skip_multi

If enabled, discard reads that can be corrected to multiple white-listed barcodes. Barcodes are corrected to the barcode with the highest frequency by default.

--expectNum

Estimated number of captured cells.

--forceCell

Use this parameter when the number of cells obtained from the analysis is abnormal. Given an expected value N, seeksoultools selects the top N cells ranked by UMI count from high to low.

--include-introns

When disabled, only exon reads are used for quantification. When enabled, intron reads are also used for quantification.

--star_path

Path to another version of STAR to use for alignment. The version must be compatible with the index given by --genomeDir. By default, the STAR found in the environment is used.
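
Several of the constraints above can be checked before starting a long run. The following is a minimal sketch of such a pre-flight check, assuming only what the descriptions state: the sample name rule, a directory for --genomeDir, and existing files for --gtf, --fq1, and --fq2. The function `preflight` and its argument order are hypothetical and not part of seeksoultools.

```shell
# Illustrative pre-flight check (not part of seeksoultools).
# Usage: preflight SAMPLENAME GENOME_DIR GTF [FASTQ...]
preflight() {
    [ "$#" -ge 3 ] || { echo "usage: preflight NAME GENOME_DIR GTF [FASTQ...]" >&2; return 2; }
    samplename=$1
    genome_dir=$2
    gtf=$3
    shift 3
    # --samplename: only digits, letters, and underscores.
    case $samplename in
        ''|*[!A-Za-z0-9_]*)
            echo "invalid sample name: $samplename" >&2
            return 1 ;;
    esac
    # --genomeDir must be a directory, --gtf a regular file.
    [ -d "$genome_dir" ] || { echo "no such directory: $genome_dir" >&2; return 1; }
    [ -f "$gtf" ] || { echo "no such file: $gtf" >&2; return 1; }
    # Every remaining argument is a FASTQ path given to --fq1 or --fq2.
    for fq in "$@"; do
        [ -f "$fq" ] || { echo "no such file: $fq" >&2; return 1; }
    done
}
```

A call such as `preflight demo3k /demo/refdata/GRCh38/star /demo/refdata/GRCh38/genes/genes.gtf /demo/data/demo3k_R1_001.fastq.gz /demo/data/demo3k_R2_001.fastq.gz` returns 0 only when every check passes. It does not verify that the STAR version matches the index.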