Tokenization 2025 : Tokenization Workshop

Vancouver, BC, Canada
Event Date: July 18, 2025
Submission Deadline: May 30, 2025
Notification of Acceptance: June 09, 2025




Call for Papers



TokShop: Tokenization Workshop (ICML 2025)

Submissions to the Tokenization Workshop open on April 14, 2025, via OpenReview. The submission deadline is May 30, 2025, at 11:59 pm (Anywhere on Earth). Notifications of acceptance will be sent out on June 9, 2025, and camera-ready papers will be due shortly afterward at 11:59 pm (Anywhere on Earth). The workshop will take place on July 18, 2025.

Workshop Description



The Tokenization Workshop (TokShop) at ICML aims to bring together researchers and practitioners from all corners of machine learning to explore tokenization in its broadest sense. We will discuss innovations, challenges, and future directions for tokenization across diverse data types and modalities.

Call for Papers

Topics of interest include:


Subword Tokenization in NLP: Analysis of techniques such as BPE, WordPiece, and UnigramLM, as well as improvements for efficiency, interpretability, and adaptability.


Multimodal Tokenization: Tokenization strategies for images, audio, video, and other modalities, including methods to align representations across different types of data.


Multilingual Tokenization: Development of tokenizers that work robustly across languages and scripts, and investigation into failure modes tied to tokenization.


Tokenizer Modification Post-Training: Methods for updating tokenizers after model training to boost performance and/or efficiency without retraining from scratch.


Alternative Input Representations: Exploration of non-traditional tokenization approaches, such as byte-level, pixel-level, or patch-based representations.


Statistical Perspectives on Tokenization: Empirical analysis of token distributions, compression properties, and correlations with model behavior.

By broadening the scope of tokenization research beyond language, this workshop seeks to foster cross-disciplinary dialogue and inspire new advances at the intersection of representation learning, data efficiency, and model design.

Submission guidelines

Our author guidelines follow the ICML requirements unless otherwise specified.


Paper submission is hosted on OpenReview.


Each submission may contain up to 9 pages, excluding references and appendices (shorter submissions are also welcome).



Please use the provided LaTeX template (Style Files) for your submission. Please follow the paper formatting guidelines general to ICML as specified in the style files. Authors may not modify the style files or use templates designed for other conferences.



The paper should be anonymized and uploaded to OpenReview as a single PDF.



You may use as many pages of references and appendix as you wish, but reviewers are not required to read the appendix.



Posting papers on preprint servers such as arXiv is permitted.



We encourage each submission to discuss the limitations as well as the ethical and societal implications of the work, wherever applicable (neither is required). These sections do not count towards the page limit.




This workshop offers both archival and non-archival options for submissions. Archival papers will be indexed with proceedings, while non-archival submissions will not.


The review process will be double-blind.


Read more: https://tokenization-workshop.github.io/




Summary

Tokenization 2025 : Tokenization Workshop will take place in Vancouver, BC, Canada. It is a one-day event held on Friday, Jul 18, 2025.

Tokenization 2025 falls under the following areas: NLP, COMPUTATIONAL LINGUISTICS, ARTIFICIAL INTELLIGENCE, etc. Submissions for this workshop can be made until May 30, 2025. Authors can expect a decision on their submission by Jun 9, 2025.

Please check the official event website for possible changes before you make any travel arrangements. Events are generally strict about their deadlines, so it is advisable to confirm all deadlines on the official website.

Other Details of Tokenization 2025

  • Short Name: Tokenization 2025
  • Full Name: Tokenization Workshop
  • Timing: 09:00 AM-06:00 PM (expected)
  • Fees: Check the official website of Tokenization 2025
  • Event Type: Workshop
  • Website Link: https://tokenization-workshop.github.io/
  • Location/Address: Vancouver, BC, Canada


