Abstract
Recent technological advances have made DNA sequences much easier to obtain. Identifying genes in the resulting large volume of DNA sequence data is therefore an important problem in bioinformatics, and it depends on predicting the complete gene structure. Accurate identification of DNA splice sites plays an important role in determining gene structure in eukaryotes, which makes splice site recognition the most important component of gene structure finding. Splice sites can be identified efficiently when the exon-intron boundaries of a given DNA sequence are recognized accurately. This identification problem can be treated as a classification of boundary sequence elements into "exon-intron" (EI), "intron-exon" (IE) or "None" (N) classes. Fuzzy approaches are known to give better results in many pattern recognition problems, so a Fuzzy Weighted Approach (FWA) is proposed to solve this classification problem. The proposed method uses fuzzy membership values to compose the weights of the nucleotide positions. The approach also makes it possible to identify the optimal narrow window size for splice site sequences, which reduces the computational load. The proposed FWA is applied to a data set of 3190 sequences taken from the UCI Machine Learning Repository (http://www.ics.uci.edu/~mlearn/MLRepository.html) and achieves an accuracy rate greater than 0.97.
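The abstract does not spell out how the membership values and position weights are computed, so the following is only a minimal illustrative sketch of the general idea, not the authors' FWA. It assumes that the membership of a nucleotide at a position is its smoothed relative frequency at that position within a class, and that a sequence is assigned to the class with the highest mean membership over a chosen window. The function names, the pseudocount, and the six-base toy sequences are all hypothetical.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def position_memberships(sequences, pseudocount=1.0):
    """Per-position membership table for one class.

    Assumption: membership = smoothed relative frequency of each
    nucleotide at each position among the class's training sequences.
    """
    length = len(sequences[0])
    table = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts[n] for n in NUCLEOTIDES) + pseudocount * len(NUCLEOTIDES)
        table.append({n: (counts[n] + pseudocount) / total for n in NUCLEOTIDES})
    return table


def train(labelled):
    """Build one membership table per class from (sequence, label) pairs."""
    by_class = {}
    for seq, label in labelled:
        by_class.setdefault(label, []).append(seq)
    return {label: position_memberships(seqs) for label, seqs in by_class.items()}


def classify(model, seq, window=None):
    """Return (best_label, scores).

    window=(start, stop) restricts scoring to a narrower span of positions,
    mirroring the idea of a reduced window size. Symbols outside ACGT
    contribute a membership of 0.
    """
    start, stop = window if window else (0, len(seq))
    scores = {}
    for label, table in model.items():
        total = sum(table[pos].get(seq[pos], 0.0) for pos in range(start, stop))
        scores[label] = total / (stop - start)
    return max(scores, key=scores.get), scores


# Toy, made-up training data (NOT taken from the UCI data set).
toy = [
    ("CAGGTA", "EI"), ("AAGGTG", "EI"), ("CAGGTA", "EI"),
    ("TTCAGG", "IE"), ("CTCAGA", "IE"), ("TTTAGG", "IE"),
    ("ACGTAC", "N"), ("TGCATG", "N"), ("GATCCA", "N"),
]
model = train(toy)
label_full, _ = classify(model, "CAGGTA")
label_narrow, _ = classify(model, "CAGGTA", window=(2, 5))
```

Scoring over a narrower `window` touches fewer positions per sequence, which is the sense in which a small window reduces computational load; how the optimal window is actually selected in the paper is not stated in the abstract.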
Keywords