## Introduction

**TGAC/eisca** is a bioinformatics pipeline that performs analysis of single-cell RNA-seq data. The pipeline is built using [Nextflow](https://www.nextflow.io/), and its processes (some implemented, some still to be implemented) are as follows:

- **Primary analysis**
  - FastQC - Raw read QC
  - TrimGalore - Adapter and quality trimming of FastQ files
  - Kallisto & Bustools - Mapping & quantification by Kallisto & Bustools
  - Salmon Alevin - Mapping & quantification by Salmon Alevin
  - STARsolo - Mapping & quantification by STAR
- **Secondary analysis**
  - QC & cell filtering - Cell filtering and QC on raw data and filtered data
  - Clustering analysis - Single-cell clustering analysis
  - Merging/integration of samples
- **Tertiary analysis**
  - Cell type annotation
  - Differential expression analysis
  - Cell-cell communication analysis
  - Trajectory & pseudotime analysis (to be implemented)
  - Other downstream analyses (to be implemented)
- **Pipeline reporting**
  - Analysis report - Single-cell analysis report
  - MultiQC - Aggregate report describing results and QC for tools registered in nf-core
  - Pipeline information - Report metrics generated during the workflow execution

## Usage

> [!NOTE]
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set up Nextflow.
Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

`samplesheet.csv`:

```csv
sample,fastq_1,fastq_2
pbmc8k,pbmc8k_S1_L007_R1_001.fastq.gz,pbmc8k_S1_L007_R2_001.fastq.gz
pbmc8k,pbmc8k_S1_L008_R1_001.fastq.gz,pbmc8k_S1_L008_R2_001.fastq.gz
pbmc5k,pbmc5k_S1_L003_R1_001.fastq.gz,pbmc5k_S1_L003_R2_001.fastq.gz
```

Each row represents a FastQ file (single-end) or a pair of FastQ files (paired-end). Rows that share the same `sample` value, such as the two `pbmc8k` rows above, refer to the same sample.

Now, you can run the pipeline using the command below, replacing each `<...>` placeholder with your own value:

```bash
nextflow run TGAC/eisca \
   -profile <PROFILE> \
   --input samplesheet.csv \
   --genome_fasta GRCm38.p6.genome.chr19.fa \
   --gtf gencode.vM19.annotation.chr19.gtf \
   --protocol 10XV2 \
   --aligner <ALIGNER> \
   --outdir <OUTDIR>
```

> [!WARNING]
> Please provide pipeline parameters via the CLI or the Nextflow `-params-file` option. Custom config files, including those provided by the `-c` Nextflow option, can be used to provide any configuration _**except for parameters**_;
> see [docs](https://nf-co.re/usage/configuration#custom-configuration-files).

For more details and further functionality, please refer to the [usage documentation](https://github.com/TGAC/eisca/blob/master/docs/usage.md).

## Pipeline output

To see the results of an example test run with a full-size dataset, refer to the [results](https://nf-co.re/eisca/results) tab on the nf-core website pipeline page.
For more details about the output files and reports, please refer to the [output documentation](https://github.com/TGAC/eisca/blob/master/docs/output.md).

## Credits

nf-core/eisca was originally written by Huihai Wu.

We thank the following people for their extensive assistance in the development of this pipeline:

## Contributions and Support

If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).
For further information or help, don't hesitate to get in touch on the [Slack `#eisca` channel](https://nfcore.slack.com/channels/eisca) (you can join with [this invite](https://nf-co.re/join/slack)).

## Citations

An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.

You can cite the `nf-core` publication as follows:

> **The nf-core framework for community-curated bioinformatics pipelines.**
>
> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
>
> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).