Identification and functional annotation of genic simple sequence repeats from leaves tissue transcriptome dataset of Stevia rebaudiana Bertoni

Bibliographic Details
Main Author: Azmi Murad, Azrul Afiq
Format: Thesis
Language: English
Published: 2021
Online Access: http://psasir.upm.edu.my/id/eprint/104312/1/AZRUL%20AFIQ%20BIN%20AZMI%20MURAD%20-%20IR.pdf
http://psasir.upm.edu.my/id/eprint/104312/
Description
Summary: Stevia rebaudiana is an important agricultural crop that yields diterpenoid steviol glycosides (SGs), which are commonly used as sugar substitutes in food products and nutraceuticals. Despite the increasing demand for Stevia leaf and Stevia-based products, the genetic background of this crop remains poorly elucidated, and very few genetic markers are available for the species. The current study investigated an in-house leaf tissue transcriptome dataset of Stevia rebaudiana and developed genic-SSR markers for the species using in silico approaches. In total, 103,890 de novo assembled contig sequences were analysed. Of these, 8,065 contigs containing 8,789 genic-SSR loci were identified using the MIcroSAtellite identification (MISA) tool. Of the 8,065 contigs containing genic-SSRs (CCGS), 7,400 contained a single genic-SSR per locus, while 665 contained multiple SSRs per locus (ML). Furthermore, amongst the 8,789 genic-SSRs, 8,302 were identified as pure genic-SSRs, 105 as complex genic-SSRs and the remaining 382 as compound genic-SSRs. In the functional annotation of the 8,065 CCGS, 6,447 were annotated with functional genes, while the remaining 1,618 were unannotated. Of the 6,447 annotated CCGS, 5,494 matched significantly to protein sequences of various plant species at an E-value cut-off of 1.0E-15. Among these 5,494 CCGS, 3,069 were annotated with known functional genes and contained only a single pure genic-SSR per locus. Pure trinucleotide genic-SSRs (52.66%) were the predominant repeats, followed by pure di- (35.32%), hexa- (6.48%), penta- (3.84%) and tetranucleotides (1.69%). Di- and trinucleotide microsatellites are therefore preponderant in the S. rebaudiana leaf transcriptome. The repeat motif AT/TA (50.28%) was the most abundant among the dinucleotides, and the repeat motif GAT/ATC (12.87%) was predominant among the trinucleotides.
Of the 3,069 annotated CCGS, 1,617 were mapped to proteins available in the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The pathways with the highest numbers of annotated CCGS mapped to them were the metabolic pathways, the secondary metabolite biosynthesis pathway, and the antibiotics biosynthesis pathway. Most studies on S. rebaudiana have focused on the biosynthesis of secondary metabolites, with a particular interest in the SGs that contribute to the natural sweetness of Stevia. In this study, a total of 14 genic-SSR loci associated with genes involved in the SG biosynthesis pathway were identified. In addition, 20 pairs of genic-SSR primers were designed and validated. Of the 20 primer pairs, 17 (85.00%) were successfully cross-amplified in three different varieties of S. rebaudiana (SweetStevia, UKMB408 and AKHL1 var.). Three of the 17 loci screened were found to be polymorphic, as revealed by polyacrylamide gel electrophoresis and confirmed by bidirectional amplicon sequencing of the PCR products. In conclusion, the transcriptome dataset has served as an excellent resource for the discovery of genic-SSRs in Stevia rebaudiana, and it shows promising potential for developing polymorphic genic-SSR markers. As the DNA markers available for this species are still very limited, the genic-SSR loci identified in this study will contribute substantially to the development of more DNA markers for the species, which may be applied in population and functional studies in the future. They may also be used as baseline data for developing DNA markers for selective breeding.
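
The in silico search for pure (perfect) genic-SSRs and the grouping of motifs such as AT/TA and GAT/ATC described in the summary can be illustrated with a minimal Python sketch. This is not the MISA tool and not the thesis workflow: the function names `find_pure_ssrs` and `canonical_motif`, the example sequence, and the minimum repeat counts in `MIN_REPEATS` are all assumptions made for illustration, since the record does not state the search parameters used.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (di- to hexanucleotide).
# These thresholds are illustrative assumptions, not values from the thesis.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_pure_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit in sorted(min_repeats):
        # One motif of length `unit`, followed by at least (minimum - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats[unit] - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" or "AA"),
            # so a dinucleotide run is not also reported as a tetranucleotide SSR.
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


def canonical_motif(motif):
    """Group a motif with its rotations and reverse complement (AT/TA, GAT/ATC)."""
    reverse_complement = motif.translate(_COMPLEMENT)[::-1]
    rotations = [s[i:] + s[:i]
                 for s in (motif, reverse_complement)
                 for i in range(len(s))]
    return min(rotations)


# Made-up example contig containing one dinucleotide and one trinucleotide SSR.
example = "GGC" + "AT" * 7 + "CCGTA" + "GAT" * 5 + "TTG"
ssrs = find_pure_ssrs(example)
motif_counts = Counter(canonical_motif(motif) for _, motif, _ in ssrs)
```

A regular expression with a back-reference keeps the sketch short. A dedicated tool would also handle compound and complex SSRs, which this example ignores, and because skipped matches still consume sequence, an SSR that overlaps a skipped run may be missed.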