
A Feature Extraction Method for FASTA-Format Sequences

Published: 2012-03-15  Views: 2467  Downloads: 1156
Title: Method of feature extraction of FASTA format sequence
Authors: CHU Yanshuo, WANG Qingli, WANG Moyang, LIU Yaqiu
Organization: College of Information and Computer Engineering, Northeast Forestry University; Alkali Soil Natural Environmental Science Center, Northeast Forestry University
Abstract: In bioinformatics, FASTA is a common text-based format for storing nucleotide or amino acid sequences. When analyzing Blast sequence alignment results entry by entry, researchers may need to examine a particular functional domain in a protein sequence, or the sites in a gene sequence that perform a specific function. To meet this need, the paper proposes an algorithm suited to large FASTA-format sequence files: the compressed index tree statistics algorithm. Experimental results show that both the time complexity and the space complexity of the algorithm meet practical requirements.
Key words: bioinformatics; compressed index tree; FASTA format; protein sequence
Issue: No. 5, March 2012
Citation: CHU Yanshuo, WANG Qingli, WANG Moyang, et al. Method of feature extraction of FASTA format sequence[J]. 中国科技论文在线精品论文, 2012, 5(5): 475-481.
 