Genome Sequencing
De novo and Reference-based Genome Assembly
Genome Analysis
Whole genome sequencing provides information on the entire genetic material of an organism. There are two approaches for assembling high-throughput sequencing reads into longer contiguous genomic sequences:
- De novo Genome Assembly: This approach is used for non-model genomes where no reference genome is available. Sequenced reads are compared to each other, and overlapping reads are used to build longer contiguous sequences. Contigs are oriented and ordered using long reads.
- Reference-based Genome Assembly: This approach maps each read to a reference genome sequence to identify genetic variations such as single nucleotide polymorphisms (SNPs), insertions and deletions (indels), and copy number variants. The resulting variant data support downstream work such as genome-wide association studies (GWAS) and haplotype construction.
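The overlap idea behind de novo assembly can be illustrated with a toy example. The sketch below is a minimal greedy merge, assuming error-free reads that all come from the same strand; the function names and the `min_overlap` threshold are hypothetical, and production assemblers rely on overlap or de Bruijn graphs rather than this brute-force pairing.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            # No remaining pair overlaps by at least min_overlap bases.
            return contigs
        # Join the two sequences, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Reads that share no sufficiently long overlap are returned as separate contigs, which mirrors how gaps in coverage fragment a real assembly.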
Eurofins Genomics offers a variety of sequencing platforms, such as Illumina MiSeq, NextSeq, and NovaSeq, Oxford Nanopore Technologies (ONT), and PacBio, with different read lengths and library sizes (paired-end) for whole genome sequencing of humans, animals, plants, and microorganisms such as bacteria, viruses, and fungi. Our long read sequencing can handle any genome size, from bacterial genomes to large and complex eukaryotic genomes. Long paired-end reads determine the orientation and relative position of the contigs generated during data assembly.
Genome Assembly Services Provided by Eurofins Genomics
- Bacterial/Fungus De novo Genome Assembly
- PacBio/Nanopore Bacterial/Fungus De novo Assembly
- De novo Genome Assembly up to 1Gb
- Large Genome De novo Assembly >1Gb
- Fungus Hybrid De novo Assembly and Analysis
- Large Genome Hybrid De novo Assembly and Analysis
- Reference-Guided Genome Analysis
- PacBio/Nanopore Reference-Guided Analysis
Contact Eurofins Genomics for all your Genome Assembly needs.
Bioinformatics Workflow & Deliverables for Genome Assembly Requests
- Quality Check of Raw Reads:
  - Quality filtration and adapter trimming.
  - Removal of primer sequences, poly(A) tails, and reads from ribosomal DNA templates.
  - Only high-quality data are used for downstream analysis.
- De novo Assembly:
  - Multiple assembly runs with different k-mer lengths to optimize the assembly.
  - Paired-end (PE) data assembled using various parameters such as k-mer length, coverage cut-off, insert length, and its standard deviation.
  - Best assembly selected based on scaffold N50 and maximum scaffold length.
  - Final assembly evaluated on metrics such as scaffold N50, assembly coverage, GC content, completeness, and accuracy.
- Reference-Based Analysis:
  - Downloading the reference genome and gene information from public databases.
  - Aligning high-quality reads against the reference genome with optimized parameters.
- Gene Prediction:
  - Using statistical models to find gene features such as start and stop codons and the coding sequences (CDS) of genes.
  - Predicting coding regions in the given sample.
- Annotation:
  - Annotating predicted coding regions against databases such as NCBI nr, Swiss-Prot, KEGG, and COG using BLASTX.
  - Mapping coding regions to reference canonical pathways in KEGG.
  - Assigning Gene Ontology (GO) terms for functional categorization.
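Several of the assembly metrics named in the workflow (scaffold N50, maximum scaffold length, GC content) are simple to compute once the scaffolds are available as strings. The following is a minimal sketch, assuming the scaffolds have already been loaded into a Python list; the function names are hypothetical and this is not the pipeline's own implementation.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that scaffolds of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def gc_content(sequences: list[str]) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T); N is ignored."""
    gc = at = 0
    for seq in sequences:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        at += upper.count("A") + upper.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def summarize(scaffolds: list[str]) -> dict:
    """Collect the basic statistics used to compare candidate assemblies."""
    lengths = [len(s) for s in scaffolds]
    return {
        "num_scaffolds": len(scaffolds),
        "total_length": sum(lengths),
        "max_scaffold_length": max(lengths, default=0),
        "scaffold_n50": n50(lengths),
        "gc_fraction": gc_content(scaffolds),
    }


if __name__ == "__main__":
    print(summarize(["ACGTACGTGG", "GGCC", "AT"]))
```

Running `summarize` on the output of each k-mer run gives one row per candidate assembly, so the runs can be ranked by scaffold N50 and maximum scaffold length as described above.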
Deliverables
De novo Genome Assembly:
- Quality filtration of reads
- De novo assembly generating scaffolds/contigs
- Assembly statistics
- In silico validation using RNA-Seq data (for large, complex plant genomes)
- GC percentage
- Repeat identification
- Gene prediction
- Gene annotation
- GO analysis
- SSR discovery
- Phylogenetic analysis
- KEGG pathway analysis
- Comparative genomics with closely related genomes
- Circos plot
- Clusters of Orthologous Groups (COG) analysis
- Antimicrobial resistance (AMR) and virulence factor analysis
- Comprehensive report with publication-standard methodology, graphs, and tables
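As an illustration of one deliverable in this list, simple sequence repeat (SSR) discovery can be approximated with a regular expression that looks for a short motif repeated in tandem. This is a rough sketch only: the unit sizes and repeat threshold are assumed values, `find_ssrs` is a hypothetical name, and only perfect repeats are detected.

```python
import re


def find_ssrs(sequence: str, min_unit: int = 2, max_unit: int = 6,
              min_repeats: int = 4) -> list[tuple[int, str, int]]:
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest repeat unit first.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        # Whole match is unit repeated n times, so n = total length / unit length.
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


if __name__ == "__main__":
    print(find_ssrs("TTGACACACACACGGT"))
```

With these defaults a homopolymer run such as `AAAAAAAA` is reported as a dinucleotide repeat of `AA`, so a stricter filter would be needed if mononucleotide runs must be excluded.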
Reference-based Genome Analysis:
- Quality filtration of reads
- Mapping of high-quality reads to the reference genome
- Alignment summary (number of reads mapped, uniquely mapped, and unmapped; genome coverage)
- Consensus sequence in FASTA format
- Gene prediction using GTF/GFF
- SNP/indel identification
- SNP/indel annotation
- Core gene analysis
- Comparative genomics
- Phylogenetic analysis
- Comprehensive report with publication-standard methodology, graphs, and tables
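The alignment summary in this list boils down to a few counts plus a per-base depth profile. The sketch below shows one way those numbers could be derived, assuming alignments have already been parsed into simple records; the `Alignment` class and its fields are hypothetical stand-ins for what would normally be read from a SAM/BAM file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alignment:
    read_id: str
    start: Optional[int]  # 0-based start on the reference, None if unmapped
    end: Optional[int]    # exclusive end on the reference, None if unmapped
    num_hits: int = 0     # number of reference locations the read aligned to


def alignment_summary(alignments: list[Alignment], genome_length: int) -> dict:
    """Count mapped, uniquely mapped, and unmapped reads, and compute coverage."""
    depth = [0] * genome_length
    mapped = unique = unmapped = 0
    for aln in alignments:
        if aln.start is None or aln.end is None or aln.num_hits == 0:
            unmapped += 1
            continue
        mapped += 1
        if aln.num_hits == 1:
            unique += 1
        # Add one unit of depth to every reference base the read spans.
        for pos in range(max(aln.start, 0), min(aln.end, genome_length)):
            depth[pos] += 1
    covered = sum(1 for d in depth if d > 0)
    return {
        "reads_mapped": mapped,
        "reads_uniquely_mapped": unique,
        "reads_unmapped": unmapped,
        "breadth_of_coverage": covered / genome_length if genome_length else 0.0,
        "mean_depth": sum(depth) / genome_length if genome_length else 0.0,
    }


if __name__ == "__main__":
    reads = [
        Alignment("r1", 0, 5, 1),
        Alignment("r2", 3, 8, 2),
        Alignment("r3", None, None, 0),
    ]
    print(alignment_summary(reads, genome_length=10))
```

Here "genome coverage" is reported two ways, as breadth (fraction of reference bases covered at least once) and as mean depth, since the term is used for both in practice.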
Frequently Asked Questions (FAQs)
What is the difference between De novo and reference-guided genome assembly?
De novo genome assembly is used for organisms without a reference genome, building sequences from scratch by overlapping reads. Reference-guided assembly maps reads to an existing reference genome to identify genetic variations.
Which sequencing platforms do you use for genome assembly?
We use a variety of sequencing platforms, including Illumina (HiSeq, MiSeq, NextSeq) and PacBio, each with different read lengths and library types to suit various genome sizes and complexities.
What types of organisms can you perform genome assembly for?
We offer genome assembly services for a wide range of organisms including humans, animals, plants, bacteria, viruses, and fungi. Our platforms and methodologies are adaptable to different genome sizes and complexities.
Can you provide custom analysis and reporting for specific research needs?
Yes, we offer customizable analysis and reporting options to meet specific research needs. We can tailor our bioinformatics workflows and deliverables to include additional analyses or focus on areas of interest.
What are the applications of genome assembly in research and industry?
Genome assembly has numerous applications including evolutionary studies, identifying genetic variations, disease research, agricultural improvement, and biotechnological innovations. It provides critical insights into the genetic makeup and potential of organisms.