MLSP 2024 : Multilingual Lexical Simplification Pipeline (MLSP) Shared Task @ 19th Workshop on Innovative Use of NLP for Building Educational Applications

Mexico City
Event Date: June 21, 2024 - June 21, 2024
Submission Deadline: March 25, 2024




Call for Papers

The organisers are pleased to announce a new shared task, inviting participants to contribute novel systems for a Multilingual Lexical Simplification Pipeline. This task comprises lexical complexity prediction and lexical simplification, uniting these two core simplification tasks into a single pipeline. We invite participants to develop new lexical simplification systems for these two tasks in a variety of high- and low-resource languages (listed below).

This shared task will be hosted at the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024), which will be co-located with the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024) in Mexico City, June 21-22.

Lexical complexity prediction was previously explored in the LCP 2021 shared task, hosted as part of SemEval 2021 (Shardlow et al. 2021). Participants were presented with a given word in a sentence and asked to evaluate its complexity. The task requires judging the difficulty of the target word within its context on a continuous scale from 0 (easy to understand) to 1 (hard to understand).

Lexical simplification was also recently explored at the TSAR 2022 shared task (Saggion et al. 2022), hosted as part of the Text Simplification, Accessibility and Readability Workshop at EMNLP 2022. In this task, systems must provide easier-to-understand alternatives for an identified complex word in its context.

The lexical simplification pipeline unites these two tasks. Given a sentence with a marked token, the system must first make a prediction regarding the complexity of that token and secondly provide potential simpler alternatives for the token, or none if the token is judged to not require simplification. By co-developing systems to jointly perform these tasks, participants will create a working lexical simplification pipeline system that can be applied in settings such as education to improve the readability of texts for learners.
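
The two-stage flow described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the callable signatures, the exclusive `end` offset, and the `threshold` cut-off are all hypothetical and are not prescribed by the task.

```python
from typing import Callable, List, Tuple

# Hypothetical component signatures; the task does not define these.
ComplexityModel = Callable[[str, int, int], float]      # (context, begin, end) -> score in [0, 1]
SubstituteModel = Callable[[str, int, int], List[str]]  # (context, begin, end) -> ranked candidates


def simplify_token(
    context: str,
    begin: int,
    end: int,
    predict_complexity: ComplexityModel,
    generate_substitutes: SubstituteModel,
    threshold: float = 0.5,  # assumed cut-off, not part of the task definition
    max_substitutes: int = 10,
) -> Tuple[float, List[str]]:
    """Run both pipeline stages for one marked token."""
    score = predict_complexity(context, begin, end)
    if score < threshold:
        # Token judged not to need simplification: return no substitutes.
        return score, []
    return score, generate_substitutes(context, begin, end)[:max_substitutes]


# Toy stand-ins so the sketch runs end to end; real systems would plug in models here.
def toy_complexity(context: str, begin: int, end: int) -> float:
    # Treat longer tokens as harder (purely illustrative).
    return min(len(context[begin:end]) / 10.0, 1.0)


def toy_substitutes(context: str, begin: int, end: int) -> List[str]:
    lookup = {"ubiquitous": ["common", "widespread"]}
    return lookup.get(context[begin:end], [])
```

Keeping the two stages as separate callables means a complexity model and a substitute generator can be developed and swapped independently while still being run as one pipeline.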

**Languages**

We will provide evaluation data for the following languages:

- English (en)
- French (fr)
- Brazilian Portuguese (pt-br)
- Bengali (bn)
- Sinhala (si)
- Filipino (fil)
- Japanese (jp)
- Italian (it)

We also hope to announce at least three further languages for participation.

Participants are free to submit to one or multiple languages. We strongly encourage submissions from multilingual systems that are capable of handling the languages that we have released and further languages beyond the scope of the task. We will provide a separate ranking for multilingual systems that participate in all languages.

**Dataset Format**

There is now a glut of available resources for simplification tasks such as lexical complexity prediction and lexical simplification. As such, for each language we will provide an unlabelled **test set only**, comprising 570 instances. Labelled trial data comprising 30 instances per language will also be released for the purpose of calibrating systems for the evaluation phase. **We will not release new training data for this task.** Participants are encouraged to make use of the many existing resources for lexical complexity prediction and lexical simplification to train their systems. A list of available resources will be hosted on the shared task website.

Each data instance in the trial data will comprise the following fields: *language, token, begin, end, context, complexity, substitutions*. These are described below:

- Language: The language code for this instance
- Token: The identified (whole-word) token to be evaluated
- Begin: the begin-offset of the token in the context
- End: the end-offset of the token in the context
- Context: the context in which this token appeared. Typically, but not limited to the enclosing sentence boundaries.
- Complexity: A complexity score bounded in the range 0-1 derived from asking 10 annotators to judge the token in its context on a scale of 1 (easy) to 5 (difficult).
- Substitutions: A list of no more than 10 substitutions ranked by frequency of suggestion by the annotators.
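
As a rough illustration of the fields listed above, the sketch below models one instance and shows one plausible way of turning 1-5 judgements into a 0-1 score. Two points are assumptions rather than facts from this call: that `end` is an exclusive offset, and that the score is the mean rating rescaled as (mean - 1) / 4. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class Instance:
    language: str
    token: str
    begin: int
    end: int
    context: str
    complexity: Optional[float] = None  # absent in the unlabelled test data
    substitutions: List[str] = field(default_factory=list)

    def offsets_match(self) -> bool:
        # Assumes `end` is exclusive (Python slice convention); the call does not say.
        return self.context[self.begin:self.end] == self.token


def normalise_ratings(ratings: List[int]) -> float:
    """Map 1-5 judgements to [0, 1] by averaging and rescaling (assumed mapping)."""
    if not ratings or any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be non-empty and lie in 1..5")
    return (mean(ratings) - 1) / 4
```

A check such as `offsets_match()` is a cheap way to catch off-by-one offset conventions before running a system on the data.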

Each data instance in the test data will comprise the following fields: *language, token, begin, end, context*. Participant systems will provide the ‘complexity’ and ‘substitutions’ fields in the same format as the trial data.

**Evaluation**

For Lexical Complexity Prediction, we will evaluate using:

**Root Mean Squared Error** calculated between the system outputs for lexical complexity and the values returned by the annotators. See Shardlow et al. (2021) for details.
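
For reference, a minimal sketch of root mean squared error over paired complexity scores is given below. The function name `rmse` is illustrative, and this is the standard formula rather than the organisers' official scoring script.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], gold: Sequence[float]) -> float:
    """Square root of the mean squared difference between paired scores."""
    if len(predicted) != len(gold) or len(gold) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((p - g) ** 2 for p, g in zip(predicted, gold))
    return math.sqrt(sum(squared_errors) / len(gold))
```

Lower values are better; a system that reproduces the annotator scores exactly obtains 0.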

For Lexical Simplification, we will use two metrics defined in Saggion et al. (2022) as follows:

**MAP@k** compares a ranked list of system-generated substitutes against the set of gold-standard substitutes. MAP@k takes into account the position of the relevant substitutes among the first k generated candidates.

**Accuracy@k@top1**, which is the percentage of instances where at least one of the k top ranked substitutes matches the most frequently suggested synonym in the gold data.
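
The two lexical simplification metrics can be sketched as below. This is not the official scorer: the function names are hypothetical, the average precision is normalised by k (an assumption; see Saggion et al. (2022) for the exact definition), matching is by exact string equality, and the values are returned as fractions in [0, 1] rather than percentages.

```python
from typing import List, Sequence, Set


def average_precision_at_k(predicted: Sequence[str], gold: Set[str], k: int) -> float:
    """Precision accumulated at each relevant rank within the top k, divided by k (assumed normaliser)."""
    hits = 0
    running = 0.0
    for rank, candidate in enumerate(predicted[:k], start=1):
        if candidate in gold:
            hits += 1
            running += hits / rank
    return running / k


def map_at_k(all_predicted: List[Sequence[str]], all_gold: List[Set[str]], k: int) -> float:
    """Mean of the per-instance average precision at k."""
    if len(all_predicted) != len(all_gold) or not all_gold:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(average_precision_at_k(p, g, k) for p, g in zip(all_predicted, all_gold))
    return total / len(all_gold)


def accuracy_at_k_top1(all_predicted: List[Sequence[str]], all_top1: List[str], k: int) -> float:
    """Fraction of instances whose top-k candidates include the most frequent gold substitute."""
    if len(all_predicted) != len(all_top1) or not all_top1:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for pred, top in zip(all_predicted, all_top1) if top in pred[:k])
    return matches / len(all_top1)
```

Both functions assume each prediction list is already ranked best-first and contains no duplicate candidates.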

We will also provide **Human End-to-End Evaluation** for:

**Simplicity**,

**Fluency** and

**Meaning Preservation**.

Human evaluation will take place for the top 5 ranking systems according to the automated metrics. Availability of human evaluation will depend on the recruitment of evaluators from the task participants.

**Participant Registration**

Interested parties can register prior to the Trial Data Release via our [participant registration Google Form](https://sites.google.com/d/151nOTm4Lwla2MXolnTgNSk6hNQoCaruX/p/1BWd0x4Q2v8nBJZSslymvPUd2vkzpwCWO/edit)

Further information will be released through [the MLSP shared task website](https://sites.google.com/view/mlsp-sharedtask-2024/home)

**Timeline**

| Date | Milestone |
| --- | --- |
| Fri Feb 16, 2024 | Trial Data Release |
| Fri Mar 15, 2024 | Test Data Release |
| Mon Mar 25, 2024 | Final Submissions |
| Fri Apr 12, 2024 | System Papers Due |
| Fri Jun 21, 2024 | BEA Workshop |

**Organisers**

| Name | Affiliation |
| --- | --- |
| Matthew Shardlow | Manchester Metropolitan University |
| Marcos Zampieri | George Mason University |
| Kai North | George Mason University |
| Fernando Alva-Manchego | Cardiff University |
| Thomas François | UCLouvain |
| Remi Cardon | UCLouvain |
| Nishat Raihan | George Mason University |
| Tharindu Ranasinghe | Aston University |
| Joseph Imperial | University of Bath |
| Riza Batista-Navarro | University of Manchester |
| Adam Nohejl | NAIST |
| Yusuke Ide | NAIST |
| Akio Hayakawa | Universitat Pompeu Fabra |
| Laura Occhipinti | University of Bologna |
| Horacio Saggion | Universitat Pompeu Fabra |

**References**

Saggion, H., Štajner, S., Ferrés, D., Sheang, K.C., Shardlow, M., North, K. and Zampieri, M., 2022, December. Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification. In *Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)* (pp. 271-283).

Shardlow, M., Evans, R., Paetzold, G. and Zampieri, M., 2021, August. SemEval-2021 Task 1: Lexical Complexity Prediction. In *Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)* (pp. 1-16).


Summary

MLSP 2024: Multilingual Lexical Simplification Pipeline (MLSP) Shared Task @ 19th Workshop on Innovative Use of NLP for Building Educational Applications will take place in Mexico City. It is a one-day event held on Friday, Jun 21, 2024.

MLSP 2024 falls under the following areas: NATURAL LANGUAGE PROCESSING, ARTIFICIAL INTELLIGENCE, etc. Submissions for this Workshop can be made by Mar 25, 2024.

Please check the official event website for possible changes before you make any travelling arrangements. Generally, events are strict with their deadlines. It is advisable to check the official website for all the deadlines.

Other Details of the MLSP 2024

  • Short Name: MLSP 2024
  • Full Name: Multilingual Lexical Simplification Pipeline (MLSP) Shared Task @ 19th Workshop on Innovative Use of NLP for Building Educational Applications
  • Timing: 09:00 AM-06:00 PM (expected)
  • Fees: Check the official website of MLSP 2024
  • Event Type: Workshop
  • Website Link: https://sites.google.com/view/mlsp-sharedtask-2024/home
  • Location/Address: Mexico City

