
FinSBD-2 Shared Task 2020 : Sentence Boundary Detection in PDF Noisy Text in the Financial Domain

Yokohama, Japan
Event Date: Mar 13, 2020 - May 08, 2020
Submission Deadline: May 15, 2020


Sentences are basic units of the written language. Detecting the beginning and end of sentences, or sentence boundary detection (SBD), is the foundational first step in many Natural Language Processing (NLP) applications such as POS tagging; syntactic, semantic, and discourse parsing; information extraction; or machine translation.

Despite its important role in NLP, sentence boundary detection has so far not received enough attention. Previous research in the area has been confined to formal texts (news, European Parliament proceedings, etc.), where existing rule-based and machine learning approaches are extremely accurate so long as the data is perfectly clean. No sentence boundary detection research to date has addressed the problem in noisy texts extracted automatically from machine-readable files (generally in PDF format) such as financial documents.

One type of financial document is the prospectus. Financial prospectuses are official PDF documents in which investment funds precisely describe their characteristics and investment modalities. The most important step in extracting any information from these files is to parse them to obtain noisy unstructured text, clean the text, format the information (by adding several tags), and finally transform it into semi-structured text in which sentence and list boundaries are well marked.
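As a rough illustration of the last step described above, the sketch below wraps token spans in boundary tags. The function name `mark_boundaries`, the inclusive `(start, end)` token-index span format, and the `<sentence>` tag are assumptions made for this example only; they are not the annotation format of the shared task.

```python
from typing import List, Tuple


def mark_boundaries(tokens: List[str], spans: List[Tuple[int, int]], label: str) -> str:
    """Wrap each inclusive (start, end) token span in <label>...</label> tags.

    Hypothetical helper: the tag name and span format are illustrative assumptions.
    """
    starts = {start for start, _ in spans}
    ends = {end for _, end in spans}
    out = []
    for i, tok in enumerate(tokens):
        if i in starts:
            out.append(f"<{label}>")   # open a unit before its first token
        out.append(tok)
        if i in ends:
            out.append(f"</{label}>")  # close the unit after its last token
    return " ".join(out)


# Toy input: ten whitespace-separated tokens forming two sentences.
tokens = "The Fund may invest in equities . Fees apply .".split()
marked = mark_boundaries(tokens, [(0, 6), (7, 9)], "sentence")
print(marked)
```

The same helper could be called a second time with a different label (for example a list or list-item tag) if list boundaries were available as spans.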

These prospectuses also contain many visual demarcations, including bullets and numbering, that indicate a hierarchy of sections. They contain many sentence fragments and titles, not just complete sentences, and more often than not they contain punctuation errors. Lists are often used to structure the dense information in a more readable format.
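To make the difficulty concrete, here is a minimal sketch of a naive punctuation-based splitter applied to a clean sentence pair and to a made-up prospectus-like fragment. The rule, the function name `naive_split`, and both sample strings are illustrative assumptions, not a baseline or data from the shared task.

```python
import re
from typing import List

# Naive rule (an assumption for illustration): a sentence ends at '.', '!' or '?'
# when followed by whitespace and an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def naive_split(text: str) -> List[str]:
    """Split text into candidate sentences using the punctuation rule above."""
    return [part.strip() for part in _BOUNDARY.split(text) if part.strip()]


clean = "The fund invests in equities. It may also hold bonds."
# Invented fragment: a numbered title, a list introduction, two bullet items,
# and a sentence with no final period, as can happen after PDF extraction.
noisy = (
    "3.1 Investment Policy The Fund may invest in:\n"
    "• equities\n"
    "• bonds Fees are set out in Section 4"
)

clean_sentences = naive_split(clean)
noisy_sentences = naive_split(noisy)
print(len(clean_sentences), len(noisy_sentences))
```

On the clean input the rule finds both sentences. On the noisy fragment it finds no boundary at all, so the title, the list items, and the trailing sentence come back as a single unit.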

Call for Papers

We invite submissions of research papers on all topics related to NLP for Financial Technology (FinTech) applications. One goal of this workshop is to foster collaboration between researchers and developers from computational linguistics, finance, and economics. Original studies reporting joint work are therefore especially encouraged. Topics of interest include, but are not limited to:

  • Text-based Market Provisioning
  • NLP-based Investment Management
  • Crowdfunding Analysis with Text Data
  • Text-oriented Customer Preference Analysis
  • Insurance Application with Textual Information
  • NLP-based Know Your Customer (KYC) Approach
  • Applications or Systems for FinTech with NLP Methods

