By Ms. Shaila P Chavan , Dr. Rajesh Gacche
Corresponding Author Ms. Shaila P Chavan
Department of Biotechnology, SRTMU, Nanded, 44, Yeshwant Nagar Extension - India 431602
Submitting Author Ms. Shaila P Chavan
Other Authors Dr. Rajesh Gacche
Dept. of Botany, SRTMU, Nanded - India

BIOTECHNOLOGY

Expressed Sequence Tags (ESTs), SSR, Markers, Microsatellite, Mungbean

Chavan SP, Gacche R. Identification and Characterization of EST-SSRs in Mungbean. WebmedCentral BIOTECHNOLOGY 2014;5(3):WMC004598
doi: 10.9754/journal.wmc.2014.004598

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC-BY), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Submitted on: 23 Mar 2014 07:27:55 PM GMT
Published on: 24 Mar 2014 05:23:51 AM GMT

Abstract


Background: Mungbean has high nutritional value compared with cereal crops and supplements dietary protein and carbohydrates. Mungbean crop improvement programmes based on marker-assisted selection and gene mapping require molecular markers. In this regard, simple sequence repeats (SSRs) are highly useful, mainly because of their high polymorphism, co-dominant nature, random distribution and abundance.

Methods: The conventional method of identifying SSRs, through enrichment libraries and sequencing, is labour intensive. Nucleotide sequence data deposited in public databases are a major alternative source for identifying SSRs.

Results: In the present study, nucleotide sequences available in the NCBI database for mungbean, including 829 ESTs (expressed sequence tags), 83 GSSs (genomic survey sequences) and 2903 nucleotide sequences, were searched for microsatellites. The SSRs identified were then characterised as di-, tri- and tetranucleotide repeats and as class I, class II and class III. The frequency distribution shows that ga/tc (518) and ca/tg (238) among dinucleotides, and caa/ttg (84) and gaa/ttc among trinucleotides, were the most abundant motifs in the nucleotide sequences.

Conclusions: EST-SSR markers are also useful for understanding population structure and evaluating genetic diversity, which is a prerequisite for the effective utilization of available genetic resources in Vigna species.

Introduction


Mungbean (Vigna radiata (L.) Wilczek var. radiata) is one of the most important pulse crops grown in South, East and Southeast Asia. It provides significant amounts of protein (240 g kg-1), carbohydrates (630 g kg-1) and a range of micronutrients in diets (Dahiya et al., 2013). Mungbean has easily digestible proteins and creates less flatulence than proteins derived from other legumes (Butt and Batoola, 2010). In addition, mungbean is lower in phytic acid (PA; 72% of total phosphorus content) than pigeon pea (Cajanus cajan L. Millsp.), soybean (Glycine max L.) and cereals; PA is commonly found in cereal and legume crops and has a negative impact on iron (Fe) and zinc (Zn) bioavailability in plant-based diets (Seneviratne et al., 2012). Due to its palatable taste and nutritional quality, mungbean has been used as an iron-rich whole food source for baby food (Dahiya et al., 2013). Despite its nutritional importance, very limited effort has been devoted to mungbean crop improvement, mostly because of the limited availability of genomic resources.

Molecular markers are preferred over isozyme and morphological markers because of their ease of use and abundance (Agarwal et al., 2008). They are based on variation in DNA sequences, for example base-pair deletions, substitutions, additions or sequence patterns. Several methods have been developed to detect such polymorphisms, for example restriction fragment length polymorphism (RFLP), random-amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP) and simple sequence repeats or microsatellites (SSR) (Botstein et al., 1980; Williams et al., 1990; Vos et al., 1995; Goldstein and Schlotterer, 1999).

SSRs have advantages over other molecular markers: they are more reproducible, highly polymorphic, co-dominantly inherited, abundant and uniformly distributed in the nuclear as well as organelle genomes (Sonah et al., 2011b). SSRs are tandemly repeated motifs of 1-6 bases found in all prokaryotic and eukaryotic genomes analyzed to date (Sonah et al., 2011a). They are present in both coding and non-coding regions and are usually characterized by a high degree of length polymorphism (Singh et al., 2010). The variation observed in SSRs mostly depends on whether they occur in a coding or a non-coding region of the genome. Compared with non-coding SSRs, coding SSRs (genic-SSRs) are less polymorphic, since coding regions are under strong selection pressure and so evolve slowly. On the other hand, variation in a genic-SSR may alter the function of the gene and therefore reflect functional variation (Varshney et al., 2005). Genic-SSRs are therefore of great importance for genomic research.

Identification of SSRs and subsequent marker development is labour intensive and is conventionally performed through enrichment libraries and sequencing (Tsukazaki et al., 2007; Nunome et al., 2009). However, nucleotide sequence data deposited in public databases provide an opportunity to identify and develop SSR markers in a cost-effective manner. In this regard, several efforts have been made to develop SSR markers from EST sequences. For instance, EST databases have been explored for the development and characterization of EST-SSR markers in castor bean (Qiu et al., 2010). Similarly, chickpea EST sequences have been used to develop SSR markers and to analyse allelic variation across related species (Choudhary et al., 2009). EST-SSRs have also been used for comparative genomic analysis, specifically to reveal the degree of relatedness and to establish syntenic relationships among related species (Bhati et al., 2010).

Mungbean is an orphan crop, since very few genomic resources have been developed for it. Earlier studies in mungbean mostly used RFLP and RAPD markers, since these do not require prior sequence information (Young et al., 1993; Lambrides et al., 2000; Chaitieng et al., 2002). Molecular marker studies have been performed in mungbean, but marker availability is still scarce (Gupta and Gopalkrishna 2013). In a recent study, a genetic linkage map was developed using 430 SSRs, but only 30 SSRs from mungbean were located on the map (unpublished data). There is therefore an urgent need to develop genomic and genetic resources to improve this orphan crop.

In the present study, EST sequences available in the NCBI database, together with other nucleotide sequences submitted to this database and the available GSS sequences, were used to identify SSRs in mungbean. The distribution of different SSR motifs and lengths was evaluated. Considering the need to develop more genomic and genetic resources for mungbean, the study was carried out with three main objectives:
1. Finding SSRs with the WEBSAT online software.
2. Characterization of these SSRs as di-, tri- and tetranucleotide repeats and by class I and class II classification.
3. Evaluation of the repeats found in the different sequence types: EST, nucleotide and GSS sequences.

Methods


Data retrieval and processing

All EST sequences for Vigna radiata (829 ESTs) were retrieved in FASTA format from the NCBI database (www.ncbi.nlm.nih.gov, 1st March 2014). Most of the ESTs in the NCBI database are not well processed and contain poly (A/T) signals, short reads and ambiguous sequence calls (Ns). Therefore, EST sequence cleaning was done using the script trim_fasta.pl written by Jennifer Meneghin (http://alrlab.research.pdx.edu). During EST cleaning, sequences shorter than 50 nucleotides were removed, and poly (A/T) stretches longer than 8 nucleotides at the 5’ or 3’ end were trimmed. In addition, sequence fragments other than ESTs available in the NCBI database for Vigna radiata were retrieved in FASTA format. These sequences are mostly genomic survey sequences (GSS) and cDNA sequences. The same cleaning process used for the ESTs was applied to these sequences.
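The cleaning rules described above can be illustrated with a minimal Python sketch. This is not the trim_fasta.pl script used in the study; the function names (read_fasta, trim_poly_at, clean_ests) are hypothetical, and the sketch assumes that "greater than 8" means a terminal run of at least 9 identical A or T bases and that a sequence of exactly 50 nucleotides is kept. Handling of ambiguous calls (Ns) is not shown because the text does not state a rule for it.

```python
import re

MIN_LENGTH = 50    # assumption: sequences shorter than 50 nt are discarded
MIN_POLY_RUN = 9   # assumption: "greater than 8" means 9 or more A/T bases


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def trim_poly_at(seq, min_run=MIN_POLY_RUN):
    """Remove a terminal run of A or T (min_run bases or more) from each end."""
    seq = re.sub(r"^(A{%d,}|T{%d,})" % (min_run, min_run), "", seq)
    seq = re.sub(r"(A{%d,}|T{%d,})$" % (min_run, min_run), "", seq)
    return seq


def clean_ests(records, min_length=MIN_LENGTH):
    """Trim poly(A/T) ends, then drop sequences shorter than min_length."""
    cleaned = []
    for header, seq in records:
        seq = trim_poly_at(seq)
        if len(seq) >= min_length:
            cleaned.append((header, seq))
    return cleaned
```

Trimming is applied before the length filter in this sketch, so a read that falls below 50 nucleotides only after trimming is also discarded; the order used in the study is not stated.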

Sequence assembling

The EST database is a primary database that gathers sequences from several independent experiments, so many copies of the same EST sequence may be present. To obtain a unique, non-redundant set, the ESTs were assembled using the SeqMan Pro (version 7.1.0) sequence assembly tool available in the DNASTAR software package (DNASTAR Inc., Madison, WI). The ESTs from Vigna radiata and the batch of other non-EST sequences were assembled independently to form primary sequence contigs.

Microsatellite detection

The FASTA files of the mungbean sequences were uploaded independently, for the ESTs, for the non-EST nucleotide sequences and for the GSS sequences, to the microsatellite identification tool WEBSAT, a program written by Martins W (http://wsmartins.net/websat/). The parameters were set so that only di-, tri- and tetranucleotide repeats repeated at least five times were selected. Compound SSRs were defined as two SSRs in the same sequence contig interrupted by 100 nucleotide bases.
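The selection criteria above (di-, tri- and tetranucleotide motifs repeated at least five times) can be illustrated with a short regular-expression sketch in Python. This is not WEBSAT's implementation; find_ssrs, is_primitive and compound_pairs are hypothetical names. The sketch assumes perfect repeats only, skips motifs that are themselves repeats of a shorter unit (for example ATAT or AAA), and reads the compound-SSR rule as "separated by at most 100 bases", which is an interpretation of the text.

```python
import re


def is_primitive(motif):
    """True if the motif is not a repeat of a shorter unit (e.g. ATAT, AAA)."""
    n = len(motif)
    return not any(
        motif == motif[:k] * (n // k) for k in range(1, n) if n % k == 0
    )


def find_ssrs(seq, min_repeats=5, motif_sizes=(2, 3, 4)):
    """Return sorted (start, end, motif, repeats) tuples; 0-based, end exclusive."""
    seq = seq.upper()
    hits = []
    for size in motif_sizes:
        # A motif of `size` bases followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_repeats - 1))
        pos = 0
        while True:
            match = pattern.search(seq, pos)
            if match is None:
                break
            motif = match.group(1)
            if is_primitive(motif):
                length = match.end() - match.start()
                hits.append((match.start(), match.end(), motif, length // size))
                pos = match.end()
            else:
                # Skipped hit: step one base so an overlapping SSR is not missed.
                pos = match.start() + 1
    return sorted(hits)


def compound_pairs(hits, max_gap=100):
    """Pair consecutive SSRs of one sequence separated by at most max_gap bases."""
    return [
        (a, b) for a, b in zip(hits, hits[1:]) if 0 <= b[0] - a[1] <= max_gap
    ]
```

For example, calling find_ssrs on a contig and passing the result to compound_pairs gives the candidate compound SSRs for that contig under the stated assumption.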

Primer designing for SSR markers

A previously described primer design pipeline for SSRs was used on the non-redundant (nr) sequence data of Vigna radiata (Sonah et al. 2010). In this pipeline, the positional information of the SSRs generated by WEBSAT was used to design primers flanking the repeats. Primer pairs were designed from the flanking sequences of the SSRs using primer3_core (www.broadinstitute.org/genome_software/other/primer3.html). For batch-mode operation of primer3_core, two Perl scripts, p3_in.pl and p3_out.pl, were used as interface modules for program-to-program data interchange between WEBSAT and the primer design software Primer3. The optimal primer design parameters were set to 60ºC annealing temperature, 20 bp primer length and 50% GC content, with an amplicon size of 100–280 bp.
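As an illustration of the batch input step, the following Python sketch writes one Primer3 input record for an SSR using the parameters stated above. It is not the p3_in.pl script; primer3_record is a hypothetical helper, and the tag names follow the Boulder-IO format of Primer3 2.x, which may differ from the version used in the study. SSR coordinates are assumed to be 0-based with an exclusive end.

```python
def primer3_record(seq_id, template, ssr_start, ssr_end):
    """Build one Boulder-IO record asking Primer3 for primers flanking an SSR.

    ssr_start and ssr_end are assumed to be 0-based, end exclusive.
    """
    lines = [
        "SEQUENCE_ID=%s" % seq_id,
        "SEQUENCE_TEMPLATE=%s" % template,
        # Target region given as start,length so primers flank the repeat.
        "SEQUENCE_TARGET=%d,%d" % (ssr_start, ssr_end - ssr_start),
        "PRIMER_FIRST_BASE_INDEX=0",
        "PRIMER_OPT_SIZE=20",                 # 20 bp primer length
        "PRIMER_OPT_TM=60.0",                 # 60 degrees C annealing temperature
        "PRIMER_OPT_GC_PERCENT=50.0",         # 50% GC content
        "PRIMER_PRODUCT_SIZE_RANGE=100-280",  # amplicon size range in bp
        "=",                                  # record terminator
    ]
    return "\n".join(lines) + "\n"
```

Records built this way for each SSR could be concatenated into a single file and passed to primer3_core on standard input.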

The results obtained from WEBSAT were then analyzed to characterize the SSRs. Initially, each SSR was considered unique according to the theoretically possible combinations and was subsequently classified (Zou et al. 2012). The reverse complements of microsatellite motifs were taken into account, and equivalent microsatellite motifs were grouped together. For example, AAG is equivalent to AGA and GAA in different reading frames and to CTT, TCT and TTC on the complementary strand. Thus, there are four possible combinations for di-nt repeats, ten for tri-nt repeats and 33 for tetra-nt repeats. Classification based on nucleotide length was also carried out as described by Temnykh et al. (2001).
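The grouping of equivalent motifs can be made concrete with a small Python sketch. It assigns each motif a canonical form, taken here (as an arbitrary convention) to be the alphabetically smallest string among all rotations of the motif and of its reverse complement, and then counts the distinct classes for each motif length. The function names are hypothetical; motifs that are repeats of a shorter unit, such as AA or ATAT, are excluded from the count.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def canonical_motif(motif):
    """Smallest rotation of the motif or of its reverse complement."""
    motif = motif.upper()
    strands = (motif, reverse_complement(motif))
    rotations = [s[i:] + s[:i] for s in strands for i in range(len(motif))]
    return min(rotations)


def is_primitive(motif):
    """True if the motif is not a repeat of a shorter unit (e.g. AA, ATAT)."""
    n = len(motif)
    return not any(
        motif == motif[:k] * (n // k) for k in range(1, n) if n % k == 0
    )


def count_classes(size):
    """Number of distinct motif classes of a given length."""
    motifs = ("".join(p) for p in product("ACGT", repeat=size))
    return len({canonical_motif(m) for m in motifs if is_primitive(m)})


# AAG, AGA, GAA, CTT, TCT and TTC all fall into the same class.
print({canonical_motif(m) for m in ["AAG", "AGA", "GAA", "CTT", "TCT", "TTC"]})
# Class counts for di-, tri- and tetranucleotide motifs.
print([count_classes(n) for n in (2, 3, 4)])
```

The two print statements give {'AAG'} and [4, 10, 33], matching the four, ten and 33 combinations stated in the text.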

Results


ESTs and assembled unique contigs in Vigna radiata

A total of 829 ESTs, 83 GSS and 2903 nucleotide sequences for Vigna radiata were available in NCBI (Table 1). However, the set of EST sequences was redundant, containing duplicated or overlapping sequences. Therefore, the nucleotide sequences were assembled into contigs for meaningful representation of the data, to reduce ambiguity and to increase the collective length of a single stretch of sequence. A total of 270 unique sequence contigs were formed from the 2903 nucleotide sequences (Table 1).

Assembly of ESTs is a crucial step for their effective utilisation in functional annotation, gene discovery and marker development. It is difficult to design primers for ESTs shorter than 100 nucleotides, even when they contain SSRs. Longer, assembled ESTs are therefore helpful for EST-SSR marker development. Several EST assembly programs have been used previously, mostly CAP3 and SeqMan (Pertea et al. 2003, Yu & Li 2008). SeqMan provides rapid assembly with minimal computational resources. Moreover, it facilitates removal of probable vector sequence contamination prior to assembly. SeqMan also allows detection of SNPs and other variation among the assembled sequences. However, the number of ESTs available for Vigna radiata was not sufficient to identify significant SNPs.

Distribution and Frequency of microsatellites in Vigna radiata ESTs

A total of 1036 SSRs were identified from the 2903 nucleotide sequences before contig creation. When all these sequences were assembled, a total of 272 contigs were obtained; this reduction may be due to repeated submissions of the same sequences. Before sequence assembly, dinucleotide repeats were the most frequent type, with 78% (807), followed by trinucleotide repeats with 22% (238) and tetranucleotide repeats with 0% (1) (Fig. 1). Class III repeats (< 12 nt) were the most frequent, followed by class I (> 20 nt) and then class II (12 to 20 nt) (Fig. 2). AG/CT among the dinucleotides and CAA/TTG among the trinucleotides were the most abundant motifs (Fig. 3 and 4). After contig assembly, dinucleotide repeats were again the most frequent type, with 80% (217), followed by trinucleotide repeats with 20% (53) (Fig. 1b).
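The length-based classes used here can be expressed as a short Python sketch. The function names are hypothetical, and the treatment of the boundary values is an assumption: an SSR of exactly 20 nt is counted as class I and one of exactly 12 nt as class II, which the text does not specify.

```python
from collections import Counter


def length_class(length_nt):
    """Assign an SSR to a length class from its total length in nucleotides."""
    if length_nt >= 20:   # assumption: exactly 20 nt counts as class I
        return "I"
    if length_nt >= 12:   # assumption: exactly 12 nt counts as class II
        return "II"
    return "III"          # shorter than 12 nt


def class_counts(lengths):
    """Tally SSRs per class from an iterable of SSR lengths."""
    return Counter(length_class(n) for n in lengths)
```

With a minimum of five repeats, a dinucleotide SSR can be as short as 10 nt, which is why class III (< 12 nt) is populated at all; the shortest tri- and tetranucleotide SSRs (15 nt and 20 nt) fall into class II and class I respectively under these assumptions.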

When the EST sequences (829) were uploaded to WEBSAT, a total of 31 SSRs were found. Among dinucleotides, AG/CT (12) was the most frequent, followed by AT/TA (6) and CA/TC (5). Trinucleotides were found at only 7 loci: GAA/TTC (2), GAG/CTC (2) and the others at a single locus each. A single tetranucleotide repeat was observed in these sequences. Only 83 GSS (genomic survey sequences) are available for the crop, and only 3 SSRs were found in these sequences (Table 1).

The frequency of these two motifs was similar to that previously observed in the coding DNA sequences of B. distachyon, S. bicolor, O. sativa, A. thaliana, M. truncatula and P. trichocarpa (Sonah et al. 2011). Previous reports suggest that the AG/CT and AAG/CTT repeats are favoured by both monocot and dicot species, whereas CG/CG repeats have been observed only in monocots (Sonah et al. 2011). Likewise, in the present study no CG/CG repeats were observed in any of the three Vigna species. The numbers of tetra-, penta- and hexanucleotide repeats identified were too small to compare frequencies on the basis of sequence. This is consistent with previous studies, in which a very low frequency of these repeats was reported (Bhati et al. 2010; Sonah et al. 2011).

Microsatellite marker database for Vigna radiata species

A local database of non-redundant Vigna radiata sequences was developed and subsequently used for microsatellite marker development. All the unique EST and non-EST sequence contigs were assembled into 1505 super contigs (nr sequence data). In this nr sequence set, 270 SSRs were identified (Fig. 1b). The average length of the nr contigs was 631 nucleotides, which facilitates the design of PCR primers in the flanking regions. Finally, a set of 65 primers was successfully designed (Additional Table 1). The primers were designed using the previously described pipeline with the same settings (Sonah et al. 2011). We expect this set of primers to give amplicons of the expected size, since we have previously demonstrated successful amplification of such primers by PCR in the wet lab.

Conclusions


The set of 31 EST-SSRs and 65 SSR markers developed from nucleotide sequences in the present study will be helpful for gene mapping and tagging experiments, development of genetic linkage maps and marker-assisted breeding in Vigna radiata. Moreover, EST-SSR markers are also useful for understanding population structure and evaluating genetic diversity, which is a prerequisite for the effective utilization of available genetic resources in Vigna species.

References


1. Agarwal, M., Shrivastava, N., and Padh, H. (2008). Advances in molecular marker techniques and their applications in plant sciences. Plant Cell Rep. 27, 617-631.

2. Bhati, J., Sonah, H., Jhang, T., Singh, N.K., and Sharma, T.R. (2010). Comparative analysis and EST mining reveals high degree of conservation among five Brassicaceae species. Comp. Funct. Genomics 2010.

3. Botstein, D., White, R.L., Skolnick, M., and Davis, R.W. (1980). Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 32, 314.

4. Butt, M.S., and Batoola, R. (2010). Nutritional and Functional Properties of Some Promising Legumes Protein Isolates. Pakistan Journal of Nutrition 9.

5. Chaitieng, B., Kaga, A., Han, O., Wang, X., Wongkaew, S., Laosuwan, P., Tomooka, N., and Vaughan, D. (2002). Mapping a new source of resistance to powdery mildew in mungbean. Plant breeding 121, 521-525.

6. Choudhary, S., Sethy, N.K., Shokeen, B., and Bhatia, S. (2009). Development of chickpea EST-SSR markers and analysis of allelic variation across related species. Theoretical and applied genetics 118, 591-608.

7. Dahiya, P., Linnemann, A., S van Boekel, M., Khetarpaul, N., Grewal, R., and Nout, M. (2013). Mung bean: technological and nutritional potential. Crit. Rev. Food Sci. Nutr.

8. Goldstein, D.B., and Schlotterer, C. (1999). Microsatellites: evolution and applications.

9. Lambrides, C., Lawn, R., Godwin, I., Manners, J., and Imrie, B. (2000). Two genetic linkage maps of mungbean using RFLP and RAPD markers. Crop and Pasture Science 51, 415-425.

10. Martins, W.S., Lucas, D.C.S., Neves, K.F.S., and Bertioli, D.J. (2009). WebSat - A Web Software for MicroSatellite Marker Development. Bioinformation 3(6), 282-283.

11. Nunome, T., Negoro, S., Kono, I., Kanamori, H., Miyatake, K., Yamaguchi, H., Ohyama, A., and Fukuoka, H. (2009). Development of SSR markers derived from SSR-enriched genomic library of eggplant (Solanum melongena L.). Theoretical and Applied Genetics 119, 1143-1153.

12. Pandiyan, M., Senthil, N., Ramamoorthi, N., and Ganesh, S.K. (2011). Present status and future perspectives of Glycine and Vigna in India. In: Tomooka, N., and Vaughan, D.A., editors. The 14th NIAS international workshop on genetic resources – Genetic resources and comparative genomics of legumes (Glycine and Vigna). Tsukuba: National Institute of Agrobiological Science. 53–60.

13. Qiu, L., Yang, C., Tian, B., Yang, J.-B., and Liu, A. (2010). Exploiting EST databases for the development and characterization of EST-SSR markers in castor bean (Ricinus communis L.). BMC plant biology 10, 278.

14. Seehalak, W., Somta, P., Sommanas, W., and Srinives, P. (2008). Microsatellite markers for mungbean developed from sequence database. Permanent genetic resources notes, 862-864.

15. Seneviratne, M., Gunasinghe, W.N., Gama-Arachchige, N., Dissanayake, N.B.U., and Karunaratne, A.M. (2012). Mineral bioavailability in three locally consumed pulses processed using popular methods: interpreted using molar ratios with phytic acid. Ceylon J. Sci. (Biol. Sci.) 41, 19-26.

16. Singh, H., Deshmukh, R.K., Singh, A., Singh, A.K., Gaikwad, K., Sharma, T.R., Mohapatra, T., and Singh, N.K. (2010). Highly variable SSR markers suitable for rice genotyping using agarose gels. Molecular breeding 25, 359-364.

17. Sonah, H., Deshmukh, R.K., Singh, V.P., Gupta, D.K., Singh, N.K., and Sharma, T.R. (2011a). Genomic resources in horticultural crops: status, utility and challenges. Biotechnology advances 29, 199-209.

18. Sonah, H., Deshmukh, R.K., Sharma, A., Singh, V.P., Gupta, D.K., Gacche, R.N., Rana, J.C., Singh, N.K., and Sharma, T.R. (2011b). Genome-wide distribution and organization of microsatellites in plants: an insight into marker development in Brachypodium. Plos one 6, e21298.

19. Tsukazaki, H., Nunome, T., Fukuoka, H., Kanamori, H., Kono, I., Yamashita, K.-i., Wako, T., and Kojima, A. (2007). Isolation of 1,796 SSR clones from SSR-enriched DNA libraries of bunching onion (Allium fistulosum). Euphytica 157, 83-94.

20. Varshney, R.K., Graner, A., and Sorrells, M.E. (2005). Genic microsatellite markers in plants: features and applications. Trends in biotechnology 23, 48-55.

21. Vos, P., Hogers, R., Bleeker, M., Reijans, M., van De Lee, T., Hornes, M., Friters, A., Pot, J., Paleman, J., and Kuiper, M. (1995). AFLP: a new technique for DNA fingerprinting. Nucleic acids research 23, 4407-4414.

22. Williams, J.G., Kubelik, A.R., Livak, K.J., Rafalski, J.A., and Tingey, S.V. (1990). DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic acids research 18, 6531-6535.

23. Young, N., Danesh, D., Menancio-Hautea, D., and Kumar, L. (1993). Mapping oligogenic resistance to powdery mildew in mungbean with RFLPs. Theoretical and Applied Genetics 87, 243-249.

Source(s) of Funding


No external funding

Competing Interests


The authors declare no competing interests.

Reviews
2 reviews posted so far

EST-SSRs in Mungbean
Posted by Dr. Rupesh K Deshmukh on 07 Apr 2014 04:13:04 AM GMT Reviewed by Author Invited Reviewers

Identification and Characterization of EST-SSRs in Mungbean
Posted by Dr. Jai Ghosh on 25 Mar 2014 05:19:57 PM GMT Reviewed by WMC Editors
