BUCC 2023 : 16th Workshop on Building and Using Comparable Corpora

Varna, Bulgaria
Event Date: September 07, 2023 - September 08, 2023
Submission Deadline: July 18, 2023
Notification of Acceptance: July 31, 2023
Camera Ready Version Due: August 25, 2023




Call for Papers

16th Workshop on Building and Using Comparable Corpora (BUCC)
with Shared Task on Multilingual Terminology Extraction
from Comparable Specialized Corpora

Co-located with RANLP 2023

September 7 or 8, 2023

Workshop website: https://comparable.limsi.fr/bucc2023/


Workshop proceedings to be published in ACL Anthology

Invited speaker: Sida I. Wang, Meta AI (FAIR)

**************************************************************

MOTIVATION

In the language engineering and the linguistics communities, research in
comparable corpora has been motivated by two main reasons. In language
engineering, on the one hand, it is chiefly motivated by the need to use
comparable corpora as training data for statistical NLP applications
such as statistical and neural machine translation or cross-lingual
retrieval. In linguistics, on the other hand, comparable corpora are of
interest because they enable cross-language discoveries and comparisons.
It is generally accepted in both communities that comparable corpora
consist of documents that are comparable in content and form in various
degrees and dimensions across several languages. Parallel corpora are at
one end of this spectrum, unrelated corpora at the other.

Comparable corpora have been used in a range of applications, including
Information Retrieval, Machine Translation, Cross-lingual text
classification, etc. The linguistic definitions and observations
related to comparable corpora can improve methods to mine such corpora
for applications of statistical NLP, for example to extract parallel
corpora from comparable corpora for neural MT. As such, it is of great
interest to bring together builders and users of such corpora.


TOPICS

We solicit contributions on all topics related to comparable (and
parallel) corpora, including but not limited to the following:

Building Comparable Corpora:

* Automatic and semi-automatic methods
* Methods to mine parallel and non-parallel corpora from the web
* Tools and criteria to evaluate the comparability of corpora
* Parallel vs non-parallel corpora, monolingual corpora
* Rare and minority languages, across language families
* Multi-media/multi-modal comparable corpora

Applications of comparable corpora:

* Human translation
* Language learning
* Cross-language information retrieval & document categorization
* Bilingual and multilingual projections
* (Unsupervised) Machine translation
* Writing assistance
* Machine learning techniques using comparable corpora

Mining from Comparable Corpora:

* Cross-language distributional semantics, word embeddings and
pre-trained multilingual transformer models
* Extraction of parallel segments or paraphrases from comparable corpora
* Methods to derive parallel from non-parallel corpora (e.g. to provide
for low-resource languages in neural machine translation)
* Extraction of bilingual and multilingual translations of single words,
multi-word expressions, proper names, named entities, sentences,
paraphrases etc. from comparable corpora
* Induction of morphological, grammatical, and translation rules from
comparable corpora
* Induction of multilingual word classes from comparable corpora

Comparable Corpora in the Humanities:

* Comparing linguistic phenomena across languages in contrastive linguistics
* Analyzing properties of translated language in translation studies
* Studying language change over time in diachronic linguistics
* Assigning texts to authors via authors' corpora in forensic linguistics
* Comparing rhetorical features in discourse analysis
* Studying cultural differences in sociolinguistics
* Analyzing language universals in typological research


IMPORTANT DATES

July 18, 2023: Paper submission deadline
July 31, 2023: Notification of acceptance
August 25, 2023: Camera ready final papers
September 7 or 8, 2023: Workshop date

For updates see the workshop website


PRACTICAL INFORMATION

Workshop registration is via the main conference registration site

The workshop proceedings will be published in the ACL Anthology.


SUBMISSION GUIDELINES

* Please follow the style sheet and templates (for LaTeX, Overleaf and
MS-Word) provided for the main conference.
* Papers should be submitted as a PDF file using the START conference
manager.
* Submissions must describe original and unpublished work and range
from 4 to 8 pages plus unlimited references.
* Reviewing will be double blind, so the papers should not reveal the
authors' identity. Accepted papers will be published in the workshop
proceedings, which will be included in the ACL Anthology.

Double submission policy: Parallel submission to other meetings or
publications is possible, but the workshop organizers must be notified
by e-mail immediately (i.e. as soon as it is known to the authors).

For further information and updates see the BUCC 2023 website


BUCC 2023 SHARED TASK
Bilingual Term Alignment in Comparable Specialized Corpora

The BUCC 2023 shared task is on multilingual terminology alignment in
comparable corpora. Many research groups are working on this problem
using a wide variety of approaches. However, as there is no standard way
to measure the performance of the systems, the published results are not
comparable and the pros and cons of the various approaches are not
clear. The shared task aims at solving these problems by organizing a
fair comparison of systems. This is accomplished by providing corpora
and evaluation datasets for a number of language pairs and domains.

Moreover, the importance of dealing with multi-word expressions in
Natural Language Processing applications has been recognized for a long
time. In particular, multi-word expressions pose serious challenges for
machine translation systems because of their syntactic and semantic
properties. Furthermore, multi-word expressions tend to be more
frequent in domain-specific text, hence the need to handle them in tasks
with specialized-domain corpora.

Through the 2023 BUCC shared task, we seek to evaluate methods that
detect pairs of terms that are translations of each other in two
comparable corpora, with an emphasis on multi-word terms in specialized
domains.

For the schedule and further details see the shared task website


WORKSHOP ORGANIZERS

* Reinhard Rapp (University of Mainz and Magdeburg-Stendal University of
Applied Sciences, Germany)
* Pierre Zweigenbaum (Université Paris-Saclay, CNRS, LISN, Orsay, France)
* Serge Sharoff (University of Leeds, United Kingdom)

Contact workshop: reinhardrapp (at) gmx (dot) de
Contact shared task: pz (at) lisn (dot) fr


PROGRAMME COMMITTEE

* Ebrahim Ansari (Institute for Advanced Studies in Basic Sciences, Iran)
* Thierry Etchegoyhen (Vicomtech, Spain)
* Philippe Langlais (Université de Montréal, Canada)
* Yves Lepage (Waseda University, Japan)
* Shervin Malmasi (Amazon, USA)
* Emmanuel Morin (Université de Nantes, France)
* Dragos Stefan Munteanu (RWS, USA)
* Reinhard Rapp (University of Mainz and Magdeburg-Stendal University of
Applied Sciences, Germany)
* Nasredine Semmar (CEA LIST, Paris, France)
* Serge Sharoff (University of Leeds, UK)
* Richard Sproat (OGI School of Science & Technology, USA)
* Tim Van de Cruys (KU Leuven, Belgium)
* Pierre Zweigenbaum (Université Paris-Saclay, CNRS, LISN, Orsay, France)


Summary

BUCC 2023 : 16th Workshop on Building and Using Comparable Corpora will take place in Varna, Bulgaria. It is a two-day event starting on Sep 7, 2023 (Thursday) and ending on Sep 8, 2023 (Friday).

BUCC 2023 falls under the following areas: NLP, COMPUTATIONAL LINGUISTICS, etc. Submissions for this workshop are due by Jul 18, 2023. Authors can expect notification of acceptance by Jul 31, 2023. Upon acceptance, authors should submit the final version of the manuscript on or before Aug 25, 2023 to the official website of the workshop.

Please check the official event website for possible changes before you make any travel arrangements. Deadlines are generally strict, so it is advisable to confirm all of them on the official website.

Other Details of the BUCC 2023

  • Short Name: BUCC 2023
  • Full Name: 16th Workshop on Building and Using Comparable Corpora
  • Timing: 09:00 AM-06:00 PM (expected)
  • Fees: Check the official website of BUCC 2023
  • Event Type: Workshop
  • Website Link: https://comparable.limsi.fr/bucc2023/
  • Location/Address: Varna, Bulgaria



