Abstract

Recent advances in sequencing technology have made DNA sequences easy to obtain. Identifying genes in the resulting large volume of DNA sequence data is therefore an important problem in bioinformatics, and it depends on predicting the complete gene structure. Accurate identification of DNA splice sites plays an important role in determining gene structure in eukaryotes. Thus, splice site recognition is the most important component of finding gene structures. Splice sites can be identified efficiently when the exon-intron boundaries of a given DNA sequence are recognized accurately. This identification problem can be treated as the classification of boundary sequence elements into ‘exon-intron’ (EI), ‘intron-exon’ (IE) or ‘None’ (N) classes. Fuzzy approaches are known to give better results in many pattern recognition problems, so a Fuzzy Weighted Approach (FWA) is proposed to solve this classification problem. The proposed method uses fuzzy membership values to construct the weights of the nucleotide positions. Moreover, the approach makes it possible to identify the optimal narrow window size of splice site sequences, which reduces the computational load. The proposed FWA is applied to a data set of 3190 sequences taken from the UCI Machine Learning Repository (http://www.ics.uci.edu/~mlearn/MLRepository.html) and achieves an accuracy rate greater than 0.97.
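The idea of weighting nucleotide positions by fuzzy membership values can be illustrated with a minimal sketch. The abstract does not specify the membership function, the scoring rule, or the window layout, so everything here is an assumption: a triangular membership centred on the boundary, a log-odds position specific scoring matrix per class, and a handful of invented 8-nucleotide toy windows (not taken from the UCI data set). The names `triangular_weights`, `build_pssm`, `score`, and `classify` are hypothetical.

```python
import math
from collections import Counter

NUCLEOTIDES = "ACGT"


def triangular_weights(window, center):
    """Assumed triangular fuzzy membership: 1.0 at `center`, falling linearly."""
    half_width = max(center, window - 1 - center) + 1
    return [1.0 - abs(i - center) / half_width for i in range(window)]


def build_pssm(sequences, pseudocount=1.0):
    """Per-position log-odds of each nucleotide against a uniform background."""
    total = len(sequences) + pseudocount * len(NUCLEOTIDES)
    pssm = []
    for pos in range(len(sequences[0])):
        counts = Counter(seq[pos] for seq in sequences)
        pssm.append({
            n: math.log(((counts[n] + pseudocount) / total) / 0.25)
            for n in NUCLEOTIDES
        })
    return pssm


def score(seq, pssm, weights):
    """Weighted sum of position scores; unknown symbols contribute 0."""
    return sum(w * col.get(nuc, 0.0) for nuc, col, w in zip(seq, pssm, weights))


def classify(seq, models, weights):
    """Pick the class whose weighted matrix scores the window highest."""
    return max(models, key=lambda label: score(seq, models[label], weights))


# Invented toy windows for illustration only (not real training data).
TOY_TRAINING = {
    "EI": ["CAGGTAAG", "AAGGTGAG", "CAGGTAAG"],
    "IE": ["TTTCAGGT", "CTTCAGGA", "TCCCAGGC"],
    "N": ["ACGTACGT", "TGCATGCA", "GATCGATC"],
}
MODELS = {label: build_pssm(seqs) for label, seqs in TOY_TRAINING.items()}
WEIGHTS = triangular_weights(window=8, center=3)

print(classify("CAGGTAAG", MODELS, WEIGHTS))
```

Under these assumptions, trying a narrower window would amount to slicing every sequence around the boundary before building the matrices and comparing the resulting accuracy across window sizes.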

Keywords

Splice Site Classification; Fuzzy Number; Optimal Window Size; Position Specific Scoring Matrix

Citation

Efendi Nasibov, Sezin Tunaboylu. (2021). Using a Fuzzy Weighted Approach in Determining the Optimal Window Size of Nucleotides for Classification of Splice Sites. Cithara Journal, 61(5). ISSN: 0009-7527