FinSBD-2 Shared Task 2020 : Sentence Boundary Detection in PDF Noisy Text in the Financial Domain

Yokohama, Japan
Event Date: March 13, 2020 - May 08, 2020
Submission Deadline: May 15, 2020




About

Sentences are the basic units of written language. Detecting where sentences begin and end, known as sentence boundary detection (SBD), is the foundational first step in many Natural Language Processing (NLP) applications, such as part-of-speech (POS) tagging; syntactic, semantic, and discourse parsing; information extraction; and machine translation.

Despite its important role in NLP, sentence boundary detection has so far not received enough attention. Previous research in the area has been confined to formal texts (news, European Parliament proceedings, etc.), where existing rule-based and machine learning approaches are extremely accurate as long as the data is perfectly clean. No sentence boundary detection research to date has addressed the problem in noisy text extracted automatically from machine-readable files (generally in PDF format), such as financial documents.

One type of financial document is the prospectus. Financial prospectuses are official PDF documents in which investment funds precisely describe their characteristics and investment modalities. The most important step in extracting any information from these files is to parse them into noisy unstructured text, clean that text, format the information by adding tags, and finally transform it into semi-structured text in which sentence and list boundaries are clearly marked.

These prospectuses also contain many visual demarcations, including bullets and numbering, that indicate a hierarchy of sections. They contain many sentence fragments and titles, not just complete sentences, and more often than not they contain punctuation errors. Lists are often used to present the dense information in a more readable format.
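
To make the difficulty concrete, here is a minimal sketch of a naive rule-based splitter of the kind that works on clean text. It is only an illustration: the function name `sentence_spans`, the boundary rule, and the sample strings are assumptions, not the shared task's official baseline or data format.

```python
import re

# Hypothetical rule: a sentence ends at '.', '!' or '?' when it is followed by
# whitespace and an uppercase letter, or by the end of the text.
_BOUNDARY = re.compile(r"[.!?](?=\s+[A-Z]|\s*$)")


def sentence_spans(text):
    """Return (start, end) character offsets of naively detected sentences."""
    spans = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        end = match.end()
        # Skip leading whitespace so each span starts on a visible character.
        while start < end and text[start].isspace():
            start += 1
        if start < end:
            spans.append((start, end))
        start = end
    # Keep trailing text that has no final punctuation (titles, fragments).
    while start < len(text) and text[start].isspace():
        start += 1
    if start < len(text):
        spans.append((start, len(text.rstrip())))
    return spans


# Clean text: the rule finds both sentences.
clean = "The fund invests in bonds. Fees are charged annually."
clean_parts = [clean[s:e] for s, e in sentence_spans(clean)]

# Noisy text (invented sample): a title, a sentence without a final period,
# and a bulleted list containing an abbreviation.
noisy = (
    "Investment Objective\n"
    "The fund invests in bonds\n"
    "\u2022 U.S. Treasury notes\n"
    "\u2022 corporate debt."
)
noisy_parts = [noisy[s:e] for s, e in sentence_spans(noisy)]
```

On the noisy sample, the rule merges the title, the unpunctuated sentence, and the first bullet into one span, and it splits after the abbreviation "U.S.", which is the kind of error the prose above describes.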


Call for Papers

We invite submissions of research papers on all topics related to NLP for Financial Technology (FinTech) applications. One of the goals of this workshop is to foster collaboration between researchers and developers from computational linguistics and from finance and economics. Original studies reporting joint work are therefore especially encouraged. Topics of interest include, but are not limited to:

  • Text-based Market Provisioning
  • NLP-based Investment Management
  • Crowdfunding Analysis with Text Data
  • Text-oriented Customer Preference Analysis
  • Insurance Application with Textual Information
  • NLP-based Know Your Customer (KYC) Approach
  • Applications or Systems for FinTech with NLP Methods



