Identification and functional annotation of genic simple sequence repeats from leaves tissue transcriptome dataset of Stevia rebaudiana Bertoni
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2021 |
Subjects: | |
Online Access: | http://psasir.upm.edu.my/id/eprint/104312/1/AZRUL%20AFIQ%20BIN%20AZMI%20MURAD%20-%20IR.pdf http://psasir.upm.edu.my/id/eprint/104312/ |
Summary: | Stevia rebaudiana is an important agricultural crop that yields diterpenoid steviol
glycosides (SGs) commonly used as sugar substitutes in food products and
nutraceuticals. Despite the increasing demand for Stevia leaf and Stevia-based
products, the genetic background of this crop remains poorly elucidated, and
genetic markers for the species are scarce. The current
study investigated an in-house leaf tissue transcriptome dataset of Stevia
rebaudiana and developed genic-SSR markers for the species using in silico
approaches. In total, 103,890 de novo assembled contig sequences were
analysed. Of these, 8,065 contigs containing 8,789 genic-SSR loci were
identified using the MIcroSAtellite identification (MISA) tool. Of the 8,065 contigs
containing genic-SSRs (CCGS), 7,400 contained a single genic-SSR
per locus, while 665 contained multiple SSRs per locus (ML). Furthermore,
amongst the 8,789 genic-SSR, 8,302 were identified as pure genic-SSRs, 105
were complex genic-SSRs and the remaining 382 were compound genic-SSRs.
Functional annotation assigned genes to 6,447 of the 8,065 CCGS, while the
remaining 1,618 CCGS were unannotated. Of the 6,447 annotated CCGS, 5,494
matched significantly to protein sequences of various plant species at an E-value
cut-off of 1.0E-15. Among these 5,494 CCGS, 3,069 were annotated with known
functional genes and contained only a single pure genic-SSR per locus. Pure trinucleotide
genic-SSRs (52.66%) were the predominant repeats. This was followed by pure
di- (35.32%), hexa- (6.48%), penta- (3.84%) and tetranucleotides (1.69%).
Di- and trinucleotide repeats therefore predominate in the S. rebaudiana leaf
transcriptome. The repeat motif AT/TA (50.28%) was the most abundant among the
dinucleotides, and the repeat motif GAT/ATC (12.87%) was predominant among
the trinucleotides. From the 3,069 annotated CCGS, 1,617 were mapped to
proteins available in the Kyoto Encyclopaedia of Genes and Genomes (KEGG)
database. The pathways with the highest numbers of mapped CCGS were
metabolic pathways, the secondary metabolite biosynthesis pathway, and the
antibiotics biosynthesis pathway. Most studies on S.
rebaudiana focused on the biosynthesis of secondary metabolites with a
particular interest in SGs that contribute to the natural sweetness of Stevia. In
this study, a total of 14 genic-SSR loci associated with genes involved
in the SG biosynthesis pathway were identified. In addition, 20 pairs of
genic-SSR primers were designed and validated. Of
the 20 primer pairs, 17 (85.00%) were successfully amplified across three
different varieties of S. rebaudiana (SweetStevia, UKMB408 and AKHL1 var.).
Three out of 17 loci screened were found to be polymorphic as revealed by
polyacrylamide gel electrophoresis and confirmed by bidirectional amplicon
sequencing of the PCR products. In conclusion, the transcriptome dataset has
served as an excellent resource for the discovery of genic-SSRs in Stevia
rebaudiana and shows promising potential for developing polymorphic genic-SSR
markers. As the DNA markers available for this species are still very limited, the
genic-SSR loci identified in this study will contribute substantially to the
development of more DNA markers for the species, which may be applied in
population and functional studies. They may also serve as baseline data for
developing DNA markers for selective breeding. |
---|
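
The summary reports pure di- to hexanucleotide genic-SSRs detected in assembled contigs with MISA. As a rough illustration of what such a scan does, the following is a minimal Python sketch of perfect-repeat detection using regular expressions. It is not MISA itself and not the thesis workflow: the function name `find_pure_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for the example, since the record does not state the thresholds used.

```python
import re

# Hypothetical minimum repeat counts per motif length (di- to hexanucleotide).
# These thresholds are illustrative assumptions, not values from the thesis.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_pure_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # A motif of `size` bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each repeat is reported once.
            if any(
                motif == motif[:k] * (size // k)
                for k in range(1, size)
                if size % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Toy contig: an (AT)7 dinucleotide repeat and a (GAT)5 trinucleotide repeat.
contig = "GGC" + "AT" * 7 + "CCGTA" + "GAT" * 5 + "TTC"
print(find_pure_ssrs(contig))  # [(3, 'AT', 7), (22, 'GAT', 5)]
```

The sketch reports only pure (perfect) repeats. Detecting the compound and complex SSRs mentioned in the summary would additionally require merging hits that lie within a chosen distance of one another, which is left out here.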