Pemberian Harakat Bahasa Arab Menggunakan Metode N-Gram dan C5.0 (Arabic Diacritization Using the N-Gram and C5.0 Methods)

Authors

  • Ajib Hanani, Program Magister Teknik Elektro, Fakultas Teknik, Universitas Brawijaya
  • Harry Soekotjo Dachlan, Jurusan Teknik Elektro, Fakultas Teknik, Universitas Brawijaya
  • Purnomo Budi Santoso, Jurusan Teknik Industri, Fakultas Teknik, Universitas Brawijaya

DOI:

https://doi.org/10.21776/jeeccis.v9i1.286

Abstract

This research presents an Arabic diacritizer built on N-Gram (Quad Gram) and C5.0. The application takes an undiacritized Arabic sentence as input. Quad Gram and matching of words against known patterns are used to diacritize the sentence. C5.0 is used to simplify the rules for assigning the final diacritic mark of each word. The output is an Arabic sentence diacritized according to morphological and syntactic rules. The diacritized sentence is then converted into speech and spoken by a talking avatar using a text-to-speech API. Quad Gram with pattern matching achieves an accuracy of 96% with respect to morphological rules, and C5.0 achieves an accuracy of 94% with respect to syntactic rules. Integrating N-Gram (Quad Gram) and C5.0 gives an accuracy of 90% with respect to both morphological and syntactic rules.

Index Terms: Arabic Diacritics, C5.0, N-Gram, Text to Speech.
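To make the Quad Gram step in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say whether the quad grams are built over characters or words, nor how they are combined with pattern matching, so this sketch assumes character-level quad grams (the current letter plus the three letters before it) and a simple most-frequent-mark lookup. The names `split_marks`, `quad_grams`, `train`, `diacritize`, and the padding symbol `PAD` are hypothetical. The C5.0 step for the final diacritic is not shown.

```python
from collections import Counter, defaultdict

# Arabic harakat (tanwin, short vowels, shadda, sukun): U+064B..U+0652.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}
PAD = "#"  # hypothetical padding symbol marking the start of a word


def split_marks(word):
    """Split a word into base letters and the marks attached to each letter."""
    letters, marks = [], []
    for ch in word:
        if ch in HARAKAT and letters:
            marks[-1] += ch          # attach the mark to the preceding letter
        elif ch not in HARAKAT:
            letters.append(ch)
            marks.append("")
        # a mark with no preceding letter is ignored
    return letters, marks


def quad_grams(letters):
    """Yield one 4-letter context per letter: three preceding letters plus the letter."""
    padded = [PAD] * 3 + list(letters)
    for i in range(len(letters)):
        yield tuple(padded[i:i + 4])


def train(diacritized_words):
    """Map each quad gram to the mark most often seen on its last letter."""
    counts = defaultdict(Counter)
    for word in diacritized_words:
        letters, marks = split_marks(word)
        for gram, mark in zip(quad_grams(letters), marks):
            counts[gram][mark] += 1
    return {gram: c.most_common(1)[0][0] for gram, c in counts.items()}


def diacritize(word, model):
    """Add marks to an undiacritized word; unseen contexts stay unmarked."""
    letters, _ = split_marks(word)
    grams = quad_grams(letters)
    return "".join(letter + model.get(gram, "") for letter, gram in zip(letters, grams))
```

Under these assumptions, a model trained on a list of diacritized words can be applied with `diacritize(word, model)`. Contexts that never occurred in training are left without a mark, which is where a real system would need a fallback such as the pattern matching mentioned in the abstract.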

How to Cite

[1]
A. Hanani, H. S. Dachlan, and P. B. Santoso, “Pemberian Harakat Bahasa Arab Menggunakan Metode N-Gram dan C5.0”, jeeccis, vol. 9, no. 1, pp. 73–78, Sep. 2015.

Section

Articles