
LncRNA Sequencing

Long non-coding RNA sequencing (lncRNA-seq) is a next-generation sequencing method for detecting non-protein-coding transcripts longer than 200 nt. lncRNAs regulate the expression of coding genes at several levels, including epigenetic inheritance, polyadenylation, splicing, and pre-transcriptional, transcriptional and post-transcriptional control. High-throughput sequencing of lncRNAs, combined with bioinformatics analysis, can reveal the abundance, functional enrichment and strand orientation of the target transcripts, as well as their regulatory relationships.

Workflow & Deliverables

Bioinformatics workflow for lncRNA-seq analysis:

a. Quality check of raw reads:

The raw reads are subjected to quality filtering and adapter trimming. Primer sequences, poly(A) tails and reads originating from ribosomal RNA (rRNA) are removed. The resulting high-quality reads are used for downstream analysis.
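As an illustration of the filtering step above, here is a minimal Python sketch of adapter trimming, poly(A) removal and quality filtering for a single read. The function names, the exact-match adapter search and the thresholds (`min_run`, `min_len`, `min_mean_q`) are assumptions made for this example, not the parameters of the actual pipeline; production workflows normally rely on dedicated trimming tools.

```python
def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter (assumed exact match)."""
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx], qual[:idx]
    return seq, qual


def trim_poly_a(seq, qual, min_run=10):
    """Remove a trailing run of 'A' bases if it is at least min_run long."""
    stripped = seq.rstrip("A")
    if len(seq) - len(stripped) >= min_run:
        return stripped, qual[:len(stripped)]
    return seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def clean_read(seq, qual, adapter, min_len=30, min_mean_q=20):
    """Return the trimmed (seq, qual) pair, or None if the read fails the filters."""
    seq, qual = trim_adapter(seq, qual, adapter)
    seq, qual = trim_poly_a(seq, qual)
    if len(seq) < min_len or mean_phred(qual) < min_mean_q:
        return None
    return seq, qual
```

A read that still passes the length and mean-quality thresholds after trimming is returned in trimmed form; any other read is dropped by returning `None`.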

b. Read alignment to the reference genome:

The high-quality reads are aligned against the reference sequences using an aligner with optimized parameters.

c. Structural Analysis (Alternative Splicing (AS) & Variant Calling)

  • Splice sites are identified from the sequence reads using the aligner. Putative variants are discovered from the BAM-format alignment file that the aligner generates, and are called using a standard pipeline with optimized parameters.
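One common way to recover splice junctions from spliced alignments is to read the `N` (skipped region) operations in each alignment's CIGAR string. The sketch below assumes the SAM conventions for the POS and CIGAR fields; the function name `splice_junctions` is hypothetical, and this is not necessarily how the aligner used in the service reports splice sites.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def splice_junctions(pos, cigar):
    """Return (intron_start, intron_end) pairs, 1-based inclusive, for each N operation.

    pos is the 1-based leftmost mapping position (the SAM POS field).
    """
    junctions = []
    ref = pos  # reference coordinate of the next aligned base
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            junctions.append((ref, ref + length - 1))
        # Only these operations consume reference bases.
        if op in "MDN=X":
            ref += length
    return junctions
```

For a read aligned at position 100 with CIGAR `50M200N50M`, the function reports one intron spanning reference positions 150 to 349.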

d. lncRNA Identification & Annotation

lncRNAs are identified and functionally annotated. Their interacting proteins are predicted with a neural-network-based method that uses both sequence and structure information.
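The sketch below illustrates only the length criterion from the lncRNA definition (transcripts longer than 200 nt), applied to assembled transcripts in FASTA format. The helper names `parse_fasta` and `length_candidates` are hypothetical, and a real identification step would also need to assess coding potential, which is not shown here.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def length_candidates(transcripts, min_len=200):
    """Return names of transcripts strictly longer than min_len nucleotides."""
    return [name for name, seq in transcripts.items() if len(seq) > min_len]
```

Transcripts that pass this filter are only candidates; the length cut-off alone does not distinguish lncRNAs from long protein-coding transcripts.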

e. Differential Expression Profiling

Cufflinks or RSEM is used to assemble and quantify lncRNAs. Differentially expressed lncRNAs are identified using the DESeq2 package. FPKM values are used to calculate the log fold change as log2(FPKM_Treated/FPKM_Control). lncRNAs with a log2 fold change (FC) greater than zero are considered up-regulated and those with a value less than zero down-regulated; a P-value threshold of 0.05 is applied for statistical significance.
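The fold-change rule above can be written out as a short Python sketch. The function names are hypothetical; the optional `pseudocount` argument is an assumption added to avoid division by zero and is not part of the stated formula, and treating "threshold of 0.05" as a strict `p < 0.05` test is likewise an assumption.

```python
import math


def log2_fold_change(fpkm_treated, fpkm_control, pseudocount=0.0):
    """log2(FPKM_Treated / FPKM_Control).

    pseudocount is an assumed extra: pass a small positive value when
    either FPKM may be zero, since the plain ratio is then undefined.
    """
    return math.log2((fpkm_treated + pseudocount) / (fpkm_control + pseudocount))


def classify(log2fc, p_value, alpha=0.05):
    """Label a lncRNA by the sign of its log2 fold change and its P-value."""
    if p_value >= alpha:
        return "not significant"
    if log2fc > 0:
        return "up-regulated"
    if log2fc < 0:
        return "down-regulated"
    return "unchanged"
```

For example, an FPKM of 8 in the treated sample against 2 in the control gives a log2 fold change of 2, which is labelled up-regulated if its P-value is below 0.05.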

f. Functional Analysis

The predicted lncRNAs are functionally annotated using the reference information available in GTF/GFF files. Pathways are annotated with the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The lncRNAs are classified mainly under KEGG categories including Metabolism, Cellular processes, Genetic information processing and Environmental information processing. The output of the KEGG analysis, obtained with the KEGG Automatic Annotation Server (KAAS), includes KEGG Orthology (KO) assignments, the corresponding Enzyme Commission (EC) numbers and the metabolic pathways of the predicted lncRNAs.

Gene Ontology (GO) annotations of the coding genes are determined with Blast2GO. GO terms are then assigned to lncRNAs for functional categorization into three domains: biological process, molecular function and cellular component.
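The categorization into the three GO domains can be pictured as a simple grouping step. The sketch below assumes the GO assignments are already available as (lncRNA ID, GO ID) pairs together with a lookup from GO ID to domain; the function name `group_by_domain` and both input structures are hypothetical and do not reflect the Blast2GO output format.

```python
GO_DOMAINS = ("biological_process", "molecular_function", "cellular_component")


def group_by_domain(annotations, term_domain):
    """Group lncRNA IDs by the GO domain of their assigned terms.

    annotations: iterable of (lncrna_id, go_id) pairs.
    term_domain: dict mapping go_id to one of GO_DOMAINS.
    Terms missing from term_domain are skipped.
    """
    grouped = {domain: set() for domain in GO_DOMAINS}
    for lncrna_id, go_id in annotations:
        domain = term_domain.get(go_id)
        if domain in grouped:
            grouped[domain].add(lncrna_id)
    return {domain: sorted(ids) for domain, ids in grouped.items()}
```

The result lists, for each domain, the lncRNAs that carry at least one term from that domain, so a single lncRNA can appear under more than one domain.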

g. lncRNA Target Gene Prediction

Target genes of lncRNAs are identified from databases.

Deliverables

  • Quality filtration of reads
  • List of identified splice variants
  • List of putative variants
  • List of lncRNAs along with their annotations
  • Differentially expressed lncRNAs
  • Gene ontology (GO) annotations of lncRNAs
  • KEGG pathway annotations of lncRNAs
  • List of target genes of lncRNAs
  • Compiled report