Arab Techies Code Sprint 2009: Day 2 & 3

We started the second day with a Go-Round about our favorite meals. It's always interesting to learn about our slightly different cultures. For example, techies from the west are fond of eating Whale Kouskous ('Kouskousy bel 7oot'), which, after some explanation, turns out not to contain any whale at all: they use the word 'whale' to mean fish :)

Straight away, we had another Go-Round of report backs, with each person covering their progress the day before and their plan for the day:

Amr was almost done with the Yamli Firefox extension and wanted to move on to a new topic afterwards. Waheed was almost done with Yamli and wanted to work on language detection. Slim, who also worked on Yamli, was less optimistic than Amr and Waheed because they kept finding bugs. BooDy worked on incorporating the Ar-PHP library into a Drupal search module, while Hijouij worked on Ar-PHP integration with WordPress and wanted to work on social localization tools.

Khaled Al Shamaa of Ar-PHP took the feedback of the developers who wanted to integrate his library with other web applications and made some changes, then wanted to brief the rest of the team working on Ar-PHP integrations on the modifications he had introduced. Alaa and Khaled Hosny worked on search normalization by transforming the search string into a regular expression, for situations where indexing normalized text is not possible.
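To give a rough idea of the regular expression approach, here is a minimal Python sketch. It is not the code Alaa and Khaled Hosny wrote: the function name `query_to_regex` and the letter groups (alef with its hamza forms, ha and ta marbuta, ya and alef maqsura, plus optional diacritics and tatweel) are assumptions chosen for illustration.

```python
import re

# Assumed groups of letters that searchers often use interchangeably.
ALEF_FORMS = "\u0627\u0623\u0625\u0622"   # ا أ إ آ
HA_FORMS = "\u0647\u0629"                 # ه ة
YA_FORMS = "\u064A\u0649"                 # ي ى
VARIANTS = {ch: group
            for group in (ALEF_FORMS, HA_FORMS, YA_FORMS)
            for ch in group}

# Diacritics (U+064B..U+0652) and tatweel (U+0640) may follow any letter.
MARKS = "\u064B-\u0652\u0640"
OPTIONAL_MARKS = "[" + MARKS + "]*"
MARK_RE = re.compile("[" + MARKS + "]")


def query_to_regex(query):
    """Build a compiled pattern matching spelling variants of the query."""
    parts = []
    for ch in MARK_RE.sub("", query):      # ignore marks typed in the query
        if ch in VARIANTS:
            parts.append("[" + VARIANTS[ch] + "]")
        else:
            parts.append(re.escape(ch))
        parts.append(OPTIONAL_MARKS)
    return re.compile("".join(parts))


# A query typed without hamza or diacritics (احمد) against vocalized text (أَحْمَد).
pattern = query_to_regex("\u0627\u062D\u0645\u062F")
print(bool(pattern.search("\u0623\u064E\u062D\u0652\u0645\u064E\u062F")))
```

The trade-off is that the stored text stays untouched and all the tolerance lives in the query, which suits applications whose index cannot be rebuilt.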

Mostafa, who had just joined the Code Sprint, introduced himself and his work on Ar-Morph, and expressed both his frustration that no Arabs had contributed to its development and his excitement to join the Arab Techies Code Sprint. Alaa commented on Mostafa's frustration, saying that the lack of communication among Arab technology professionals is widespread: Arab Techies keep finding amazing people working creatively and developing great solutions that only a few people know about. We hope Arab Techies will contribute to solving this problem and work on bridging and connecting techies, users and academics, giving more exposure to the work they do.

After the report back, we split for 2 hours into the same teams. Mostafa made a quick tour of the 3 teams, then chose to collaborate with the Yamli team.

codesprint_day2_1.jpg

Riham (above) and BooDy (below) coding on the second day.

codesprint_day2_2.jpg

In the next session, we returned to the group and listed the tasks we wanted to work on next:
  • Ar-PHP Drupal integration
  • Search and content clustering
  • Ar-PHP WordPress integration
  • Social translation tools

And after some discussions we split into 3 teams that worked together until the end of the day:

  • Yamli: It seems that Slim was right, and it needed more work. Slim, Amr and Waheed agreed to continue working on that task.
  • Social translation tools: Alaa, Khaled Hosny, Hijouij, Djihed and BooDy decided to hold a discussion on features that should be introduced into localization tools such as Pootle to enable social translation, and I joined them in discussing the mock ups they came up with.
  • Search & content clustering: Khaled Al Shamaa, Riham, Taha and Mostafa decided to discuss strategies for content clustering, but they found they needed to improve search first, so they started working on stopwords.

codesprint_day2_3.jpg

Search team working on stopwords (above: from the right Taha, Riham, Khaled and Mostafa) and Yamli team (below: from the right Waheed, Slim and Amr)

codesprint_day2_4.jpg

We ended the day by taking the sprinters for a traditional Egyptian meal, followed by a visit to Zamalek's nice bookstores to check out Egyptian music and literature :)


The third day had the same format as the day before, starting with a funny Go-Round on which software project we would work on if we had unlimited resources. It seems many of us have lots of ideas to improve existing Open Source projects ;-)

Then we had report backs from each participant. The Yamli team reported finding a bug in the Yamli API that prevented them from finalizing their work. They had reported the bug and were in communication with the Yamli developers, who were working on a fix, and it actually got fixed by the end of the report back :)

The Ar-PHP Drupal integration team wrote a module for transliterating file names (BooDy) and another one for auto-tagging Arabic content (Alaa), but they reported finding some limitations in the library and wanted to work on refactoring it and improving its API, a suggestion that was welcomed by Khaled Al Shamaa.

The search team worked on stopwords and wanted to continue, particularly on categorizing them so that each category can be enabled or disabled per application (e.g. month names). From their discussions, they identified a missing piece in stopwords that will need further work in the future: generating stopwords by adding suffixes (e.g. إلى، إليه، إليها، إليهم، ...)
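As a rough illustration of both ideas, here is a small Python sketch. The category names, the sample words and the helper names `active_stopwords` and `with_suffixes` are hypothetical and are not the team's actual lists or code. Note that the stem used for attaching suffixes is passed explicitly, because the spelling can change when a suffix is added (إلى becomes إلي before ه).

```python
# Hypothetical categorized stopword lists; the entries are illustrative only.
CATEGORIES = {
    "prepositions": {"\u0645\u0646", "\u0641\u064A"},   # من، في
    "months": {"\u064A\u0646\u0627\u064A\u0631"},        # يناير
}

# Assumed sample of attached pronoun suffixes: ه، ها، هم
PRONOUN_SUFFIXES = ["\u0647", "\u0647\u0627", "\u0647\u0645"]


def active_stopwords(enabled, categories=CATEGORIES):
    """Union of the stopword sets for the enabled category names."""
    words = set()
    for name in enabled:
        words |= categories[name]
    return words


def with_suffixes(word, stem, suffixes=PRONOUN_SUFFIXES):
    """Return the word plus its suffixed forms, built on an explicit stem."""
    return {word} | {stem + suffix for suffix in suffixes}


ila = "\u0625\u0644\u0649"        # إلى
ila_stem = "\u0625\u0644\u064A"   # إلي (alef maqsura becomes ya)
print(sorted(with_suffixes(ila, ila_stem)))

# An application that wants to keep month names searchable
# enables only the categories it needs.
print(sorted(active_stopwords(["prepositions"])))
```

Keeping the stem explicit avoids guessing Arabic spelling rules in code; a fuller solution would need a table of stems or a morphological tool.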

The social translation tools team had a very interesting discussion and created some mock ups with suggested features and modifications that need to be discussed with Pootle developers.

After the report back, we welcomed the new sprinters: Bassem Jarkas from Ar-Wikipedia, Linuxawy from EGLUG and Omar Abdel Wahab. Alaa suggested to Bassem to incorporate the work done by the search team such as normalization and stopwords in MediaWiki for Ar-Wikipedia.

We split into 4 teams which worked on the following tasks until lunch:
  • Search and Stop Words: Walid, Taha, Riham and Linuxawy
  • Ar-PHP refactoring: Alaa, Khaled Al Shamaa, Slim, Omar, Djihed and BooDy
  • Yamli: Amr, Waheed and Mostafa
  • RTL design document, to document the best practices in implementing and/or porting RTL web designs: Hijouij, Khaled Hosny and Bassem, and I helped a bit with that :)

codesprint_day3_1.jpg

Different teams spread everywhere (above: from the right Bassem, Walid, Mostafa, Riham and Khaled Hosny) and on the balcony (below: from the right BooDy, Khaled Hosny, Djihed, Hijouij, Bassem and Alaa)

codesprint_day3_2.jpg

After lunch we had another round of report backs. The search and stopwords team was reviewing their lists and said they would be done by the end of the day.

The Ar-PHP team worked on deshaping (i.e. reverse shaping: some programs generate preshaped text, which is not proper Unicode, is difficult to search, etc.) and stemming, and some of them focused on writing more Drupal modules (to generate Arabic PDFs and to display dates in various Arabic formats) but were not done yet. Khaled Al Shamaa fixed the library loading problem and spent some time documenting some algorithms. Alaa, who worked with Khaled on documenting the algorithms, said that this exercise made them realise that the amount of refactoring required was much less than they had originally expected.
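For readers unfamiliar with deshaping, here is a minimal Python sketch of the idea. It is not how Ar-PHP does it: it simply relies on Unicode compatibility normalization (NFKC) from the standard `unicodedata` module, which maps characters in the Arabic Presentation Forms blocks back to their base letters. The function name `deshape` is hypothetical, and the sketch assumes the text is already in logical order; preshaped text stored in visual order would also need reordering, which is not handled here.

```python
import unicodedata


def is_presentation_form(ch):
    """True for Arabic Presentation Forms-A (U+FB50..U+FDFF) and -B (U+FE70..U+FEFF)."""
    return "\uFB50" <= ch <= "\uFDFF" or "\uFE70" <= ch <= "\uFEFF"


def deshape(text):
    """Replace preshaped Arabic glyph characters with their base letters."""
    out = []
    for ch in text:
        if is_presentation_form(ch):
            # NFKC applies the compatibility mapping, e.g. a lam-alef
            # ligature becomes the two letters lam + alef.
            out.append(unicodedata.normalize("NFKC", ch))
        else:
            out.append(ch)               # leave everything else untouched
    return "".join(out)


# Preshaped: seen initial form, lam-alef final form, meem isolated form.
preshaped = "\uFEB3\uFEFC\uFEE1"
plain = deshape(preshaped)
print(plain == "\u0633\u0644\u0627\u0645")   # compare with سلام
```

Restricting normalization to the presentation-form ranges keeps the rest of the text (Latin ligatures, full-width characters and so on) exactly as it was.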

The RTL documentation team were almost done, and Bassem and Khaled Hosny also worked on spellchecking.

Slim finished the Yamli bookmarklet and wanted to work on packaging it, and Amr released the Yamli Firefox plugin and sprinters started testing it. Congratulations everyone :)

We then welcomed our new guest who arrived at lunch, Dr. Noha Yousri, a young academic from the Computer Science Department in the Faculty of Engineering, University of Alexandria who worked on pattern recognition, clustering and document analysis.

Realizing that the Code Sprint was only one day away from ending, we decided, before splitting again, to list the deliverables that could feasibly be finished by the last day of the sprint and mark them as our focus for the last day.
  • Yamli Firefox extension
  • Yamli bookmarklet
  • Stopwords (a categorized list, we will need to discuss distribution channels afterwards).
  • Conjugator
  • Taha's stemmer
  • Process document, documenting the algorithms and data tables used for working on search, stop words, normalization, etc. throughout the code sprint.
  • New version of Ar-PHP including the new modifications, the stemmer and the wordlist.
  • New version of the WordPress plugin for the Ar-PHP library, for search improvement.
  • New version of the Firefox extension for the Ar-PHP library, for detecting and changing the characters when the wrong keyboard layout is used.
  • Ar-PHP Drupal module, containing all the Drupal modules that were written throughout the Code Sprint to incorporate many of the Ar-PHP functions.
  • Ruby character normalization plugin, for stripping diacritics and normalizing Hamazat and common spelling mistakes.
  • Python normalization code using regular expressions, covering Hamazat and common spelling mistakes.
  • RTL design document.

Were we able to deliver the above list?? Stay tuned for the next blogpost on the final day of Arab Techies Code Sprint ;-)

Amazing Job!! Wish I was there :(

Taha Zerrouki's picture

Image brightness

Thank you very much.
Just one note: please increase the brightness of the pictures a little.
