Character joining/shaping for Semitic languages files (Arabic, Hebrew, Farsi...)
I want to translate XBMC for Arabic users,
but I have a problem.
I tried translating before and it worked well, but the letters do not join up in Arabic.
Here is an example:
(programs) in English = (البرامج) in Arabic
but in XBMC (programs) = (ا ل ب ر ا م ج)
There are gaps between the letters.
I'm using UTF-8 as the Arabic encoding,
and the font works with Arabic letters.
Please help me add support for the Arabic language.
I have translated the XML files, and the fonts support Arabic letters.
Gamester17
2008-02-23, 12:48
Sorry, but XBMC does not yet support character joining/shaping for Semitic languages (Arabic, Hebrew, Farsi...); see this other topic thread for more information:
http://xbmc.org/forum/showthread.php?t=12637
You are more than welcome to research the subject further, to maybe see if you can find an open source library (C or C++ code) that can enable character joining/shaping for Semitic language fonts.
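To make concrete what "joining/shaping" means here, the following is a minimal Python sketch, not XBMC code. It assumes the simplest possible approach: replacing each Arabic letter with one of its Unicode presentation forms (isolated, initial, medial, final) depending on its neighbours. The table is derived from the Unicode character database in Python's standard `unicodedata` module; the helper names (`build_form_table`, `shape`) are invented for this illustration. Diacritics, ligatures such as lam-alef, and letters without presentation forms are deliberately not handled.

```python
import unicodedata

FORM_TAGS = ("<isolated>", "<initial>", "<medial>", "<final>")


def build_form_table():
    """Map each base letter to {tag: presentation-form character}."""
    table = {}
    # Arabic Presentation Forms-A and -B blocks.
    for block in (range(0xFB50, 0xFE00), range(0xFE70, 0xFF00)):
        for cp in block:
            parts = unicodedata.decomposition(chr(cp)).split()
            # Keep single-letter forms only; ligatures have more parts,
            # unassigned code points have an empty decomposition.
            if len(parts) == 2 and parts[0] in FORM_TAGS:
                base = chr(int(parts[1], 16))
                table.setdefault(base, {}).setdefault(parts[0], chr(cp))
    return table


FORMS = build_form_table()


def joins_forward(ch):
    # Assumption: a letter can connect to the next one only if it has an initial form.
    return "<initial>" in FORMS.get(ch, {})


def joins_backward(ch):
    # Assumption: a letter can connect to the previous one only if it has a final form.
    return "<final>" in FORMS.get(ch, {})


def shape(text):
    """Return text (in logical order) with letters replaced by contextual forms."""
    out = []
    for i, ch in enumerate(text):
        forms = FORMS.get(ch)
        if not forms:
            out.append(ch)  # not a letter we know how to shape
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        to_prev = joins_forward(prev_ch) and joins_backward(ch)
        to_next = joins_forward(ch) and joins_backward(next_ch)
        if to_prev and to_next:
            tag = "<medial>"
        elif to_prev:
            tag = "<final>"
        elif to_next:
            tag = "<initial>"
        else:
            tag = "<isolated>"
        out.append(forms.get(tag, ch))
    return "".join(out)


# The word from the first post, written with escapes to avoid editor issues.
word = "\u0627\u0644\u0628\u0631\u0627\u0645\u062c"
shaped = shape(word)
```

The shaped string still has to be reordered for right-to-left display before rendering; that step is outside this sketch.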
Thank you so much for your help.
I'll try to do some good work for the XBMC project.
I have used it for 3 years and still use it.
It's a perfect project and multimedia player.
jmarshall
2008-02-24, 09:06
We use libfreetype for the font rendering.
It sounds as though for Arabic you need the kerning stuff (I presume that's how it's done, at least!). If so, you'll need to read the kerning info (pretty straightforward), and I suggest storing it in a cache so that you only need to read it once. Probably just reading it for every char is fine for a start.
Cheers,
Jonathan
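The caching idea suggested above can be sketched as follows. This is a hypothetical illustration in Python, not XBMC code: `lookup` stands in for whatever call reads the kerning value from the font (with FreeType that would presumably be something like `FT_Get_Kerning`), and the class and function names are invented. Note that kerning only adjusts the spacing between glyphs; it does not by itself select joined letter forms.

```python
class KerningCache:
    """Remember kerning values per glyph pair so the font is queried only once."""

    def __init__(self, lookup):
        self._lookup = lookup  # hypothetical callable: (left, right) -> adjustment
        self._cache = {}
        self.misses = 0        # how many times the font had to be consulted

    def get(self, left, right):
        key = (left, right)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._lookup(left, right)
        return self._cache[key]


def layout(glyphs, advances, kerning):
    """Return the pen x-position of each glyph, applying pair kerning."""
    x = 0
    positions = []
    prev = None
    for glyph in glyphs:
        if prev is not None:
            x += kerning.get(prev, glyph)
        positions.append(x)
        x += advances[glyph]
        prev = glyph
    return positions


# Made-up font data for the example: "A" followed by "V" is tightened by 1 unit.
fake_font = lambda left, right: -1 if (left, right) == ("A", "V") else 0
cache = KerningCache(fake_font)
positions = layout(["A", "V", "A", "V"], {"A": 10, "V": 10}, cache)
```

With this arrangement the second "A", "V" pair is served from the dictionary rather than from the font.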
Gamester17
2008-02-24, 12:50
"We use libfreetype for the font rendering."
I thought we used libfreetype2
http://freetype.sourceforge.net/index2.html
http://www.freetype.org/freetype2/index.html
I think Pango may be a likely LGPL open source C/C++ library which could be used?
http://www.pango.org
"Pango is a library for laying out and rendering of text, with an emphasis on internationalization. Pango can be used anywhere that text layout is needed, though most of the work on Pango so far has been done in the context of the GTK+ widget toolkit. Pango forms the core of text and font handling for GTK+-2.x."
Pango also uses FreeType and features joining and shaping for Semitic languages.
PS! It does not sound like eng.Ali is a C/C++ programmer himself?
eng.Ali: please see this thread http://xbmc.org/forum/showthread.php?p=173914#post173914
Hopefully the new patch will find its way to the source tree, but in the meanwhile, can you compile your own build of XBMC?
If you can, then please download the patch from https://sourceforge.net/tracker/download.php?group_id=87054&atid=581840&file_id=270072&aid=1912468 then apply and compile.
As for the encoding used in the language files (strings.xml and langinfo.xml), I used cp1256, the Arabic Windows charset, during the test and it seemed to work great. Just make sure to change the XML declaration to <?xml version="1.0" encoding="CP1256" standalone="no"?> and save the file in the corresponding charset.
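For anyone whose language file is currently saved as UTF-8, the conversion described above could be scripted roughly like this. It is a sketch under assumptions: the function name is invented, the sample `<strings>` content is made up for the example, and it relies only on Python's built-in `cp1256` codec. Characters that cp1256 cannot represent raise an error instead of being silently replaced.

```python
import re


def utf8_xml_to_cp1256(data: bytes) -> bytes:
    """Re-encode a UTF-8 XML document as CP1256 and update its declaration."""
    # "utf-8-sig" also drops a leading byte order mark, if present.
    text = data.decode("utf-8-sig")
    # Rewrite only the encoding named in the XML declaration.
    text, count = re.subn(
        r'(<\?xml[^>]*encoding=")[^"]*(")',
        r"\g<1>CP1256\g<2>",
        text,
        count=1,
    )
    if count == 0:
        raise ValueError("no XML declaration with an encoding attribute found")
    # Strict encoding: raises UnicodeEncodeError for unsupported characters.
    return text.encode("cp1256")


# Hypothetical file content, for illustration only.
source = (
    '<?xml version="1.0" encoding="utf-8" standalone="no"?>\n'
    '<strings><string id="0">\u0627\u0644\u0628\u0631\u0627\u0645\u062c</string></strings>'
).encode("utf-8")
converted = utf8_xml_to_cp1256(source)
```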
khaled Hosny
2008-03-29, 02:43
Hello all, this is my first post here.
I believe that pango is the *correct* way to achieve this; you shouldn't have your own BiDi or Arabic shaping code, but rather rely on a more sophisticated international text rendering library, and pango is the best in the free software world.
Not to mention that pango does more than simple Arabic shaping: it supports a good deal of OpenType font features, enabling support for more languages and features. For example, Pashto (spoken in Afghanistan and written in Arabic script) can't be rendered using the Arabic presentation forms method (as in the patch in the other thread), since Pashto-specific characters have no Unicode presentation forms, not to mention support for the Nasta`liq script used extensively in Iran and Pakistan.
Pango also supports many East Asian scripts that aren't supported by other libraries.
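The point about Pashto can be checked against the Unicode character database. The sketch below (Python, standard library only; the function names are invented for this illustration) collects every Arabic letter that has at least one single-letter presentation form and reports the letters in a string that have none. U+067C, the Pashto letter written as teh with a ring, is used as the example.

```python
import unicodedata

FORM_TAGS = ("<isolated>", "<initial>", "<medial>", "<final>")


def letters_with_forms():
    """Return the set of base letters that have a Unicode presentation form."""
    covered = set()
    # Arabic Presentation Forms-A and -B blocks.
    for block in (range(0xFB50, 0xFE00), range(0xFE70, 0xFF00)):
        for cp in block:
            parts = unicodedata.decomposition(chr(cp)).split()
            # Single-letter forms only; ligatures decompose to several letters.
            if len(parts) == 2 and parts[0] in FORM_TAGS:
                covered.add(chr(int(parts[1], 16)))
    return covered


COVERED = letters_with_forms()


def uncovered(text):
    """List the Arabic letters in text that have no presentation form."""
    missing = []
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC LETTER") and ch not in COVERED:
            missing.append("U+%04X %s" % (ord(ch), name))
    return missing


# Arabic beh (covered) next to Pashto teh with ring (not covered).
report = uncovered("\u0628\u067c")
```

A shaper based only on presentation forms has nothing to substitute for the letters this function reports, which is why a font-driven (OpenType) approach is needed for them.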
khaled Hosny
2008-03-29, 02:44
BTW, freetype2 doesn't do text layout but rather glyph rendering.
jmarshall
2008-03-29, 03:02
From a quick skim, it looks like Pango depends on glib, a dependency which we could do without.
khaled Hosny
2008-03-29, 03:44
But the gain is substantial: by using pango you ensure support for most writing systems, now and in the future. Supporting international scripts is a headache; IMHO doing it once, the correct way, will save you lots of trouble.
Pango is already used in GTK+2 (it was developed for it), GStreamer and many others.
jmarshall
2008-03-29, 04:44
I'm well aware of the benefits of going with a standard library, but it's not really a valid choice if it relies on dependencies that we do not wish to have. Has anyone ported glib to the xbox?
Perhaps you have time to investigate exactly how we could take advantage of pango, with the following knowledge:
1. We want to keep additional dependencies to a minimum.
2. All strings internally in XBMC are utf8. There is conversion (using iconv and fribidi as necessary) from native codesets into utf8 at the input of the string.
3. All strings for rendering expect WCHAR (16bit atm, but of course this can be extended if necessary) and expect any "shaping" of the string to have been already performed. This includes the current fribidi flipping for Arabic and Hebrew.
4. All rendering is done using freetype2, and all glyphs are cached to a GL or DirectX texture for speed purposes. There is currently no kerning performed, though this is easy enough via a lookup table.
Cheers,
Jonathan
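Points 2 and 3 above describe a hand-off from UTF-8 strings to 16-bit units for rendering. A rough Python sketch of just that conversion step is below; it is not XBMC code, the function name is invented, and the shaping and fribidi flipping that would sit in between are left out. It also shows why 16 bits per unit is a constraint: a character outside the Basic Multilingual Plane takes two units (a surrogate pair).

```python
import struct


def utf8_to_wchar16(data: bytes) -> list:
    """Convert UTF-8 bytes to a list of 16-bit code units (UTF-16)."""
    text = data.decode("utf-8")
    raw = text.encode("utf-16-le")  # little-endian, no byte order mark
    # Unpack the byte string as unsigned 16-bit integers.
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


# An Arabic base letter and a presentation form each fit in one unit.
beh = utf8_to_wchar16("\u0628".encode("utf-8"))
beh_initial = utf8_to_wchar16("\ufe91".encode("utf-8"))
# A character beyond U+FFFF needs two units.
astral = utf8_to_wchar16("\U0001F600".encode("utf-8"))
```

All Arabic letters and their presentation forms lie below U+FFFF, so a 16-bit glyph key is enough for the presentation-forms approach discussed in this thread.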
khaled Hosny
2008-03-29, 06:39
"I'm well aware of the benefits of going with a standard library, but it's not really a valid choice if it relies on dependencies that we do not wish to have. Has anyone ported glib to the xbox?"
I've no idea, actually :(
"Perhaps you have time to investigate exactly how we could take advantage of pango, with the following knowledge:
1. We want to keep additional dependencies to a minimum.
2. All strings internally in XBMC are utf8. There is conversion (using iconv and fribidi as necessary) from native codesets into utf8 at the input of the string."
Pango uses utf8 too.
"3. All strings for rendering expect WCHAR (16bit atm, but of course this can be extended if necessary) and expect any "shaping" of the string to have been already performed. This includes the current fribidi flipping for Arabic and Hebrew."
Pango *does* text shaping and mirroring; that is the whole point of it: you pass strings to it in logical order and get them rendered correctly, and you don't need to worry about the visual order of that text.
"4. All rendering is done using freetype2, and all glyphs are cached to a GL or DirectX texture for speed purposes. There is currently no kerning performed, though this is easy enough via a lookup table."
Pango has a freetype2 backend; you might want to take a look at its reference manual http://library.gnome.org/devel/pango/unstable/ .
Also, you might find the source code of libLASi a useful example http://lasi.svn.sourceforge.net/viewvc/lasi/trunk/src/ , as it is written in C++ and uses pango/freetype2 for its text layout.
Also, you might find the source code of libLASi a useful example http://lasi.svn.sourceforge.net/viewvc/lasi/trunk/src/, as it is written in c++ and using pango/freetype2 for its text layout.
A final note: I'm not arguing here, but I do believe that Arabic presentation forms based shaping is inadequate, and pango (OpenType based) is the only viable alternative.
Gamester17
2008-03-29, 15:10
"Also, you might find the source code of libLASi a useful example http://lasi.svn.sourceforge.net/viewvc/lasi/trunk/src/, as it is written in c++ and using pango/freetype2 for its text layout."
Are you only referring to the LASi library (libLASi) as an example of how to implement Pango?
http://unifont.org/lasi/index.html
http://sourceforge.net/projects/lasi
"libLASi is a library that provides a C++ stream output interface (with operator <<) for creating PostScript documents that can contain characters from any of the script and symbol blocks supported in Unicode and by the Pango layout engine.
LASi uses Owen Taylor's Pango (http://www.pango.org) text layout engine. Pango itself depends on the glib infrastructure library of the GTK+ toolkit (http://www.gtk.org) and on the FreeType 2 (http://www.freetype.org) font handling library."
Would glib (http://www.gtk.org) be one dependency too many for the Xbox?
khaled Hosny
2008-03-29, 22:29
"Are you only referring to the LASi library (libLASi) as an example of how to implement Pango?"
Yup, I don't know enough C/C++ to help, so I thought libLASi would be a good example.
"Hello all, this is my first post here.
I believe that pango is the *correct* way to achieve this; you shouldn't have your own BiDi or Arabic shaping code, but rather rely on a more sophisticated international text rendering library, and pango is the best in the free software world.
Not to mention that pango does more than simple Arabic shaping: it supports a good deal of OpenType font features, enabling support for more languages and features. For example, Pashto (spoken in Afghanistan and written in Arabic script) can't be rendered using the Arabic presentation forms method (as in the patch in the other thread), since Pashto-specific characters have no Unicode presentation forms, not to mention support for the Nasta`liq script used extensively in Iran and Pakistan.
Pango also supports many East Asian scripts that aren't supported by other libraries."
I did a search to read more about pango; it is promising and seems a great solution for complex text handling in most applications.
However, I think pango handles complex scripts by passing each script chunk to the corresponding script handling routine or module (ring a bell?), and yes, it handles most complex scripts, but I think (and I might be wrong) it will be slow for XBMC on the xbox, and it seems that pango is known for slowing down applications that use it.
Anyway, I came across what seems to be your article about pango vs fribidi (http://www.khaledhosny.org/node/119). I didn't like the fact that you demolish other people's work the way you do, especially since pango uses fribidi (I think you are missing the concept of fribidi)! If you think you are smart enough to do better, then by all means: fribidi and pango are open source, so show us your creativity, "champ."
khaled Hosny
2008-03-30, 14:34
I think you missed the point of my blog post (I don't consider it an article, and I admit it wasn't clear enough). I'm not demolishing other people's work; in fact, pango and fribidi are maintained by the same person (http://behdad.org/). I was discussing the concept of using Unicode Arabic presentation forms to do Arabic shaping vs. using smart font technologies like OpenType (http://en.wikipedia.org/wiki/OpenType) (which pango uses through the HarfBuzz (http://www.freedesktop.org/wiki/Software/HarfBuzz) library) and AAT fonts on Mac OS X.
Since my main interests are in typography, I find the first approach very poor: it provides only very basic features (think of Arabic diacritics, which are used frequently; languages written in Arabic script such as Pashto (http://en.wikipedia.org/wiki/Pashto_language), which can't be supported using the non-OpenType approach; vertical scripts (http://mces.blogspot.com/2006/08/vertical-pango.html); Indic scripts (http://en.wikipedia.org/wiki/Indic_scripts)). Presentation forms shaping is a piece of the past that we shouldn't use now, given the advances in text layout over the last 10 years.
Fribidi used to do only BiDi (and pango uses it for that), but the newly released 2.x branch has shaping code, which is what I was actually referring to.
I'm afraid that the thread is missing the point. I don't actually use xbmc :( or have an xbox; I found this page while googling for something else, and I thought I'd point to pango so that the developers would be aware of other solutions and the shortcomings of other approaches. By no means did I want to start a flame war or anything, so please ignore my comments if you find them useless.
"Sorry, but XBMC does not yet support character joining/shaping for Semitic languages (Arabic, Hebrew, Farsi...); see this other topic thread for more information:
http://xbmc.org/forum/showthread.php?t=12637
You are more than welcome to research the subject further, to maybe see if you can find an open source library (C or C++ code) that can enable character joining/shaping for Semitic language fonts."
Umm... Farsi isn't a Semitic language. It is Indo-European, like English. I guess it doesn't make a difference since its alphabet is borrowed from Arabic; however, the language is unrelated. I made an account on this site just to say that.
jmarshall
2008-11-18, 17:55
Thanks - we always appreciate knowledgeable contributions to the forums. The higher the signal to noise ratio the better!
Cheers,
Jonathan