Searching for an Arabic word in an Arabic text file

Date: 2022-09-13 09:39:42

I am working on a Quran application. I have a UTF-8 text file of the Quran (in Arabic), and I want to search it for an Arabic word. I want to type the word without Aarabs, e.g. Zaber, Zair, Shud, Mud and Paish, and still find it in the text. Aarabs are basically the Arabic vowel marks (diacritics).

The following code searches for an English word in my ArrayList called testingarray, but for Arabic it does not return the correct word.

if (testingarray.get(Index).toString().trim().toLowerCase().contains(word.trim().toLowerCase())) {

1 solution

#1


Look at the Arabic block in the Unicode code charts. It is easier to use a regex to filter such complex text.

This is an example of removing the short vowels in PHP (I'm not a Java programmer):

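// Remove the Arabic short-vowel marks U+064B..U+065F; the /u flag makes the pattern UTF-8 aware.
$text = preg_replace('/[\x{064B}-\x{065F}]/u', '', $text);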

The Noble Quran contains other vowel marks as well, so you may need to add their ranges.
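Since the question is about Java, here is a rough Java counterpart of the PHP snippet above. It is only a sketch: the class and method names are made up for illustration, and the extra ranges beyond U+064B..U+065F (superscript alef U+0670 and the combining Quranic annotation marks in U+06D6..U+06ED) are my assumption about what "other vowels" should cover, so check them against your own text file. The idea is to strip the marks from both the stored line and the typed word before calling contains.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class DiacriticSearch {
    // U+064B..U+065F is the range from the PHP example. The remaining entries
    // (U+0670 and the combining marks inside U+06D6..U+06ED) are an assumption.
    private static final Pattern MARKS = Pattern.compile(
        "[\\u064B-\\u065F\\u0670\\u06D6-\\u06DC\\u06DF-\\u06E4\\u06E7\\u06E8\\u06EA-\\u06ED]");

    // Returns the text with all diacritic marks matched by MARKS removed.
    static String stripDiacritics(String s) {
        return MARKS.matcher(s).replaceAll("");
    }

    // True if the line contains the word once marks are removed from both sides.
    static boolean containsIgnoringDiacritics(String line, String word) {
        return stripDiacritics(line).contains(stripDiacritics(word.trim()));
    }

    // Returns the indexes of all lines that contain the word.
    static List<Integer> search(List<String> lines, String word) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (containsIgnoringDiacritics(lines.get(i), word)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> testingarray = new ArrayList<>();
        // "bismi llahi" written with diacritics, given as Unicode escapes
        testingarray.add("\u0628\u0650\u0633\u0652\u0645\u0650 \u0627\u0644\u0644\u0651\u064E\u0647\u0650");
        // The query is "bsm" typed without any diacritics
        System.out.println(search(testingarray, "\u0628\u0633\u0645")); // prints [0]
    }
}
```

toLowerCase() is dropped here because Arabic letters have no case. For a large file it would likely be cheaper to strip each line once when loading it rather than on every search.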

To be more accurate, you may also need to normalize the Arabic text.
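What "normalize" should mean depends on the text, so the following is one possible reading in Java, not a definitive recipe. It uses the standard java.text.Normalizer with form NFKD, which decomposes letters such as alef with hamza (U+0623) into a plain alef plus a combining hamza and maps presentation-form ligatures back to base letters. The marks are then stripped. Folding alef wasla (U+0671) to plain alef and dropping tatweel (U+0640) are extra assumptions of mine, and the class and method names are hypothetical.

```java
import java.text.Normalizer;

public class ArabicNormalizer {
    // Normalizes text for diacritic-insensitive search.
    // Note: this also folds hamza-carrying letters to their bare base letter.
    static String normalizeForSearch(String s) {
        // 1. Compatibility decomposition: U+0623 -> U+0627 + U+0654,
        //    U+FEFB (lam-alef ligature) -> U+0644 + U+0627, and so on.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFKD);
        // 2. Drop the combining marks U+064B..U+065F and superscript alef U+0670.
        String stripped = decomposed.replaceAll("[\\u064B-\\u065F\\u0670]", "");
        // 3. Assumption: treat alef wasla as plain alef and remove tatweel.
        return stripped.replace('\u0671', '\u0627').replace("\u0640", "");
    }

    public static void main(String[] args) {
        String stored = "\u0623\u064E\u062D\u064E\u062F"; // alef-hamza, fatha, hah, fatha, dal
        String typed = "\u0627\u062D\u062F";              // alef, hah, dal
        System.out.println(normalizeForSearch(stored).equals(normalizeForSearch(typed)));
    }
}
```

Apply the same function to both the file contents and the typed word, otherwise the two sides will not match. If hamza distinctions matter for your search, skip the decomposition step and strip only the vowel marks.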
