TY - JOUR
T1 - A modified Henry gas solubility optimization for solving motif discovery problem
AU - Hashim, Fatma A.
AU - Houssein, Essam H.
AU - Hussain, Kashif
AU - Mabrouk, Mai S.
AU - Al-Atabany, Walid
PY - 2019/11/25
Y1 - 2019/11/25
N2 - The DNA motif discovery (MD) problem is a major challenge in genome biology, and its importance grows with advances in sequencing technologies. MD plays a vital role in identifying transcription factor binding sites, which helps in understanding the mechanisms that regulate gene expression. Metaheuristic algorithms are promising techniques for eliciting motifs from DNA genomic sequences, but they often fail to perform robustly because the inherent challenges of complex gene sequences make the search environment extremely non-convex for optimization methods. This paper proposes a novel modified Henry gas solubility optimization (MHGSO) algorithm for motif discovery, which elicits a functional motif in DNA genomic sequences. In our approach, a new stage that captures the main characteristics of motifs in DNA sequences is proposed, and MHGSO imitates these characteristics for accurate detection of the target motif. The performance of the MHGSO algorithm is validated using both synthetic and real datasets. Results confirm the stability and superiority of the proposed algorithm compared to state-of-the-art algorithms including MEME, DREME, XXmotif, PMbPSO, and MACS. Based on several evaluation metrics, MHGSO outperforms the competing techniques in terms of nucleotide-level correlation coefficient, recall, precision, F-score, Cohen’s Kappa, and statistical validation measures.
AB - The DNA motif discovery (MD) problem is a major challenge in genome biology, and its importance grows with advances in sequencing technologies. MD plays a vital role in identifying transcription factor binding sites, which helps in understanding the mechanisms that regulate gene expression. Metaheuristic algorithms are promising techniques for eliciting motifs from DNA genomic sequences, but they often fail to perform robustly because the inherent challenges of complex gene sequences make the search environment extremely non-convex for optimization methods. This paper proposes a novel modified Henry gas solubility optimization (MHGSO) algorithm for motif discovery, which elicits a functional motif in DNA genomic sequences. In our approach, a new stage that captures the main characteristics of motifs in DNA sequences is proposed, and MHGSO imitates these characteristics for accurate detection of the target motif. The performance of the MHGSO algorithm is validated using both synthetic and real datasets. Results confirm the stability and superiority of the proposed algorithm compared to state-of-the-art algorithms including MEME, DREME, XXmotif, PMbPSO, and MACS. Based on several evaluation metrics, MHGSO outperforms the competing techniques in terms of nucleotide-level correlation coefficient, recall, precision, F-score, Cohen’s Kappa, and statistical validation measures.
U2 - 10.1007/s00521-019-04611-0
DO - 10.1007/s00521-019-04611-0
M3 - Article
SN - 1433-3058
VL - 32
SP - 10759
EP - 10771
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 14
ER -