
FinSBD-2 Shared Task 2020 : Sentence Boundary Detection in PDF Noisy Text in the Financial Domain

Yokohama, Japan
Event Date: Mar 13, 2020 - May 08, 2020
Submission Deadline: May 15, 2020


Sentences are basic units of the written language. Detecting the beginning and end of sentences, or sentence boundary detection (SBD), is the foundational first step in many Natural Language Processing (NLP) applications such as POS tagging; syntactic, semantic, and discourse parsing; information extraction; or machine translation.

Despite its important role in NLP, sentence boundary detection has not yet received enough attention. Previous research in the area has been confined to formal texts (news, European Parliament proceedings, etc.), where existing rule-based and machine learning approaches are extremely accurate so long as the data is perfectly clean. No sentence boundary detection research to date has addressed the problem in noisy text extracted automatically from machine-readable files (generally in PDF format), such as financial documents.

One type of financial document is the prospectus. Financial prospectuses are official PDF documents in which investment funds precisely describe their characteristics and investment modalities. The most important step in extracting any information from these files is to parse them into noisy unstructured text, clean the text, format the information (by adding several tags), and finally transform it into semi-structured text in which sentence and list boundaries are clearly marked.

These prospectuses also contain many visual demarcations, such as bullets and numbering, that indicate a hierarchy of sections. They include many sentence fragments and titles, not just complete sentences, and more often than not contain punctuation errors. Lists are often used to structure the dense information in a more readable format.
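To illustrate why such text is hard for rule-based approaches, here is a minimal sketch of a naive punctuation-based splitter. It is not a method from the shared task; the function name `split_sentences`, the rule itself, and both sample strings are assumptions made up for this example.

```python
import re
from typing import List

# Hypothetical naive rule: a sentence ends at '.', '!' or '?' when it is
# followed by whitespace and an uppercase letter. Illustrative baseline only.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(text: str) -> List[str]:
    """Split text at end punctuation followed by whitespace and a capital."""
    return [part.strip() for part in _BOUNDARY.split(text) if part.strip()]


# Clean, formal text: the rule finds the boundary between the two sentences.
clean = "The fund invests in bonds. It may also hold cash."

# Made-up noisy text resembling PDF extraction: a bulleted list, missing
# end punctuation, and a new sentence that starts right after a list item.
noisy = "The fund may invest in:\n• bonds\n• equities Fees are charged annually"

print(split_sentences(clean))  # two segments
print(split_sentences(noisy))  # one segment: no boundary is detected
```

On the clean string the rule recovers both sentences. On the noisy string it returns a single segment, because the list items and the final sentence carry no end punctuation for the rule to key on.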

Call for Papers

We invite submissions of research papers on all topics related to NLP for Financial Technology (FinTech) applications. One of the goals of this workshop is to foster collaboration between researchers and developers from computational linguistics, finance, and economics. Original studies reporting joint work are therefore especially encouraged. Topics of interest include, but are not limited to:

  • Text-based Market Provisioning
  • NLP-based Investment Management
  • Crowdfunding Analysis with Text Data
  • Text-oriented Customer Preference Analysis
  • Insurance Application with Textual Information
  • NLP-based Know Your Customer (KYC) Approach
  • Applications or Systems for FinTech with NLP Methods

