Arabic Sentence Segmentation

Shared Task 2026

The Fourth Arabic Natural Language Processing Conference (ArabicNLP 2026) @ EMNLP 2026

Budapest, Hungary

Shared Task Description

The Arabic sentence segmentation shared task focuses on segmenting Arabic documents into coherent sentences. Given an Arabic document as input, participating systems must identify sentence boundaries throughout the text. The task is formulated as a binary token classification problem, where models predict whether a sentence boundary follows each token.

Task 1: Paragraph-Aware Arabic Sentence Segmentation

Given an Arabic document with paragraph boundaries preserved, predict for each token whether a sentence boundary follows it.

Task 2: No-Punctuation Paragraph-Aware Arabic Sentence Segmentation

Given an Arabic document with paragraph boundaries preserved but with punctuation removed, predict for each token whether a sentence boundary follows it.

Task 3: No-Paragraph Arabic Sentence Segmentation

Given an Arabic document without paragraph boundary information, predict for each token whether a sentence boundary follows it.

Task 4: No-Punctuation No-Paragraph Arabic Sentence Segmentation

Given an Arabic document without paragraph boundaries and without punctuation, predict for each token whether a sentence boundary follows it.

Data

The shared task features the AraSeg dataset, a manually annotated corpus for Arabic sentence segmentation. The dataset is provided in four variants, one per subtask, which differ in the availability of paragraph boundaries and punctuation.

Paragraph-Aware: Documents include paragraph boundaries and punctuation.
No-Punctuation Paragraph-Aware: Documents include paragraph boundaries, but no punctuation marks.
No-Paragraph: Documents include punctuation marks, but no paragraph boundaries.
No-Punctuation No-Paragraph: Documents do not include paragraph boundaries or punctuation marks.
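
To picture how a punctuated variant relates to its no-punctuation counterpart, the following sketch drops punctuation-only tokens from a labeled token sequence and moves any boundary label they carried onto the preceding kept token. This is only an illustrative assumption, not the organizers' preprocessing: the function names are hypothetical, and punctuation is detected here through Unicode general categories starting with "P".

```python
import unicodedata


def is_punct(token):
    """True if every character in the token is Unicode punctuation."""
    return bool(token) and all(
        unicodedata.category(ch).startswith("P") for ch in token
    )


def strip_punctuation(tokens, labels):
    """Remove punctuation tokens while preserving sentence boundaries.

    If a removed token was followed by a boundary (label 1), the boundary
    is carried back to the last kept token.
    """
    out_tokens, out_labels = [], []
    for token, label in zip(tokens, labels):
        if is_punct(token):
            if label == 1 and out_labels:
                out_labels[-1] = 1
            continue
        out_tokens.append(token)
        out_labels.append(label)
    return out_tokens, out_labels


if __name__ == "__main__":
    tokens = ["ذهب", "الولد", ".", "ثم", "عاد", "؟"]
    labels = [0, 0, 1, 0, 0, 1]
    print(strip_punctuation(tokens, labels))
```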

The download links are provided above for each dataset.

Shared Task Tracks

Participants may compete in any combination of subtasks and tracks, resulting in eight possible evaluation settings. The subtasks differ in input conditions, and the tracks differ in the allowed training resources.

Closed Track: Models must be trained exclusively on the training set of the AraSeg Corpus.
- Paragraph-Aware: CodaBench Link
- No-Punctuation Paragraph-Aware: CodaBench Link
- No-Paragraph: CodaBench Link
- No-Punctuation No-Paragraph: CodaBench Link

Open Track: No restrictions on external resources, allowing the use of any publicly available data.
- Paragraph-Aware: CodaBench Link
- No-Punctuation Paragraph-Aware: CodaBench Link
- No-Paragraph: CodaBench Link
- No-Punctuation No-Paragraph: CodaBench Link

Refer to our GitHub repository if you'd like to set up the evaluation locally.

Shared Task Phases

Development Phase: This phase will run until July 20, 2026. Participants will build their models and submit predictions on the AraSeg Test set, which is publicly available (i.e., Open Test). Participants must submit their predictions using the respective CodaBench competition for each track. Submitting predictions in this phase does not require registration for the shared task, but it also does not make you an official participant. To be officially considered, you must register and submit your predictions during the Testing Phase.
Testing Phase: This phase will run from July 20, 2026 to July 25, 2026. Participants will upload their predictions on the Official Blind Test set (henceforth Blind Test). The Blind Test set will only be available to participants who registered to participate in the shared task. Participants must submit their predictions using the respective CodaBench competition for each track.

By registering to participate in the shared task and receiving access to the Official Blind Test set, you commit to submitting a description paper. Participants who register but fail to submit a paper may be disqualified from future shared tasks.

Metrics

We define sentence segmentation as a binary token classification task. We use the following metrics:

Precision: The percentage of predicted sentence boundaries that match a reference sentence boundary. Higher precision means fewer false boundary insertions.
Recall: The percentage of reference sentence boundaries that are correctly predicted. Higher recall means fewer missed sentence boundaries.
F1: The harmonic mean of boundary precision and boundary recall. It provides a single score that balances over-segmentation and under-segmentation errors.

These metrics are computed per document and then averaged across all documents.
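
The sketch below shows one way these definitions could be computed from gold and predicted label sequences, with scores averaged over documents. It is not the official evaluation script: the function names are hypothetical, scores are returned as fractions in [0, 1] rather than percentages, and returning 0.0 when a denominator is zero is an assumption that the official scripts may handle differently.

```python
def boundary_prf(gold, pred):
    """Boundary precision, recall, and F1 for a single document.

    gold and pred are equal-length sequences of 0/1 labels, where 1 means
    a sentence boundary follows the token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    true_pos = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    pred_pos = sum(pred)
    gold_pos = sum(gold)
    # Assumption: undefined ratios are scored as 0.0.
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def average_over_documents(documents):
    """Average per-document (precision, recall, F1) over all documents.

    documents is a non-empty list of (gold, pred) label-sequence pairs.
    """
    scores = [boundary_prf(gold, pred) for gold, pred in documents]
    n = len(scores)
    return tuple(sum(score[i] for score in scores) / n for i in range(3))


if __name__ == "__main__":
    docs = [
        ([0, 1, 0, 1], [0, 1, 1, 0]),  # one correct boundary, one spurious, one missed
        ([0, 0, 1], [0, 0, 1]),        # perfect prediction
    ]
    print(average_over_documents(docs))
```

Averaging per document gives every document equal weight regardless of its length, unlike pooling all boundaries into a single count.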

Awards

Top-performing Systems:

We will recognize the top-performing system in each task and track combination (4 tasks × 2 tracks), with a $100 prize per winning team.

Best System Description Papers:

We will also present a $200 Best System Description Paper Award recognizing clarity, technical quality, reproducibility, and insight, independent of leaderboard ranking.

Shared Task Registration

Important Dates

May 16, 2026: Release of training, dev and open test data, and evaluation scripts.
July 20, 2026: Registration deadline and release of test data.
July 25, 2026: End of evaluation cycle (test set submission closes).
August 8, 2026: System description paper submissions due.
August 15, 2026: Notification of acceptance.
August 15, 2026: Final results released.
August 22, 2026: Camera-ready versions due.

Shared Task Paper Submission

Coming soon.

Organizers

Mohammed Elkholy: Mohamed bin Zayed University of Artificial Intelligence

Khalid N. Elmadani: New York University Abu Dhabi

Nizar Habash: New York University Abu Dhabi

Bashar Alhafni: Mohamed bin Zayed University of Artificial Intelligence

Contact

For any questions related to this task, check out the FAQs. Feel free to post your questions on our Slack workspace. You are also welcome to contact the organizers directly at this email address: araseg26.organizers@aramlab.ai.
