omicverse.alignment.amplicon_16s_pipeline
- omicverse.alignment.amplicon_16s_pipeline(fastq_dir=None, samples=None, workdir=None, db_fasta=None, *, primer_fwd=None, primer_rev=None, backend='vsearch', threads=4, jobs=None, merge_max_diffs=10, merge_min_overlap=16, filter_max_ee=1.0, filter_min_len=0, filter_max_len=0, derep_min_uniq=2, unoise_alpha=2.0, unoise_minsize=2, chimera_removal=True, otutab_identity=0.97, sintax_cutoff=0.8, sintax_strand='both', sample_metadata=None, overwrite=False)
Run the full 16S amplicon pipeline.
- Parameters:
fastq_dir (Optional[str] (default: None)) – Directory containing paired Illumina FASTQs. Samples are auto-discovered by R1/R2 naming. Mutually exclusive with samples.
samples (Optional[Sequence[Tuple[str, str, Optional[str]]]] (default: None)) – Explicit list of (sample, fq1, fq2) tuples (fq2 may be None for single-end). Mutually exclusive with fastq_dir.
workdir (Optional[str] (default: None)) – Root directory for all intermediate files. No $HOME fallback.
db_fasta (Optional[str] (default: None)) – Path to a SINTAX-formatted 16S reference FASTA (.fa or .fa.gz). If None, taxonomy assignment is skipped.
primer_fwd, primer_rev (Optional[str] (default: None)) – Forward and reverse PCR primer sequences. When both are provided, cutadapt() runs first; otherwise primer trimming is skipped (e.g. the mothur MiSeq SOP test dataset ships with primers already removed).
backend (str (default: 'vsearch')) – 'vsearch' (default): UNOISE3 via vsearch. 'dada2': pure-Python DADA2 via omicverse.alignment.dada2_pipeline() (needs pip install pydada2). 'emu' / 'qiime2' still raise NotImplementedError.
threads (int (default: 4)) – CPU parallelism.
overwrite (bool (default: False)) – If True, re-run each step regardless of existing outputs.
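As a rough illustration of the samples form described above, the sketch below builds (sample, fq1, fq2) tuples from a directory and passes them to the pipeline. build_samples and run_pipeline are hypothetical helpers, not part of omicverse, and the _R1/_R2 filename pattern is an assumption; the pipeline's own auto-discovery for fastq_dir may follow different rules. Only keyword arguments that appear in the signature are used.

```python
import re
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# Assumed naming convention: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz,
# optionally with an Illumina-style numeric suffix such as _001.
_READ_TAG = re.compile(
    r"^(?P<sample>.+?)_R(?P<read>[12])(?:_\d+)?\.(?:fastq|fq)(?:\.gz)?$"
)


def build_samples(fastq_dir: str) -> List[Tuple[str, str, Optional[str]]]:
    """Hypothetical helper: collect (sample, fq1, fq2) tuples from a directory."""
    reads_by_sample: Dict[str, Dict[str, str]] = {}
    for path in sorted(Path(fastq_dir).iterdir()):
        match = _READ_TAG.match(path.name)
        if match is None:
            continue  # not a FASTQ file following the assumed pattern
        reads_by_sample.setdefault(match["sample"], {})[match["read"]] = str(path)

    samples: List[Tuple[str, str, Optional[str]]] = []
    for name in sorted(reads_by_sample):
        reads = reads_by_sample[name]
        if "1" not in reads:
            continue  # an R2 file without its R1 mate is ignored
        # fq2 is None for single-end samples, as the samples parameter allows
        samples.append((name, reads["1"], reads.get("2")))
    return samples


def run_pipeline(fastq_dir: str, workdir: str, db_fasta: Optional[str] = None):
    """Hypothetical wrapper around the documented call."""
    import omicverse as ov  # imported lazily so build_samples works without it

    return ov.alignment.amplicon_16s_pipeline(
        samples=build_samples(fastq_dir),  # pass samples OR fastq_dir, not both
        workdir=workdir,
        db_fasta=db_fasta,  # None skips taxonomy assignment
        backend="vsearch",
        threads=4,
        overwrite=False,
    )
```

Passing fastq_dir directly is the shorter route when the files follow the naming the pipeline expects; an explicit samples list is mainly useful for single-end data or nonstandard file names.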
- Returns:
Samples × ASVs matrix with taxonomy / sequence / confidence in var. Sample metadata (if provided) is merged into obs.
- Return type: