Fortia organized two shared tasks, FinSim and FinSBD, in the context of the FinNLP Workshop, taking place as part of IJCAI-PRICAI in January 2021 (originally planned for July 2020) in Yokohama, Japan.
Both tasks addressed NLP and Information Extraction-related topics in the financial domain.
- FinSim’s topic is learning semantic representations. You can find more about this task here
- FinSBD 2020 was the second edition of a task aiming to extract sentence and list boundaries and hierarchy from PDF documents. You can find more about this year’s edition here and last year’s here
Both shared tasks ran from March through May 2020. In total, 37 teams registered and 11 teams submitted solutions. The participating teams are preparing articles describing their systems, to appear in the ACL Anthology and to be presented during the IJCAI-PRICAI conference. Our own articles detailing shared task information and results will also be published soon.
In this post you can see a summary of the results as well as an analysis regarding participation in these two tasks.
Overall, 30% of the teams that registered for the challenge (11 of 37) actually participated, that is to say, submitted results for evaluation. We tried to give a few extra days to all teams that had started working on the task and needed a short extension. We therefore suppose that the teams that didn’t send results hadn’t reached a sufficient point, either because their results were unsatisfactory or because they registered but didn’t follow through with working on the data.
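As a quick check of the overall figure, the participation rate is simply the number of submitting teams divided by the number of registered teams. The sketch below uses the two counts reported in this post (37 registered, 11 submitted); the function name `participation_rate` is our own illustrative choice and not part of any shared-task tooling.

```python
def participation_rate(submitted: int, registered: int) -> float:
    """Return the share of registered teams that submitted results, in percent."""
    if registered <= 0:
        raise ValueError("registered must be a positive number of teams")
    return 100 * submitted / registered


# Counts reported in this post: 37 registered teams, 11 submitting teams.
overall = participation_rate(submitted=11, registered=37)
print(f"Overall participation: {overall:.1f}%")  # 29.7%, reported as 30%
```

The same helper could be applied to any subgroup (registration period, affiliation, team size, country), provided the per-group counts are available.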
Let’s have a look at four characteristics of our participating and registered teams:
- Registration time
- Affiliation
- Country
- Team size
So when do motivated teams sign up?
Early birds registered before any dataset was made available, and showed interest from the very first calls for participation. Mid-period participants registered when only training data was available. Late birds registered once all datasets had been released.
Here’s a breakdown of the distribution of these categories in our two groups.
Our analysis shows that teams that sign up late participate the most:
- 50% of our late registrations managed to send results, compared to
- 31% for the early birds, which is close to the average (30%), and
- 22% for those in between.
Late birds had the highest participation rate, possibly because the information and data were already available, which made the process more focused.
Both tasks gathered interest from Industry and Academia, as well as independent researchers.
While academics were the ones that registered the most (51% of all registrations), teams with an industry affiliation had a higher completion rate: 50%, compared to 20% for academia. A small number of independent researchers showed interest, but didn’t manage to submit results.
A possible interpretation for these statistics could be that academics are in general more active in the world of scientific challenges, so they register more often. However, Industry participants might be benefiting from a selection bias, where only people from labs that are really interested in the tasks register and subsequently it is more meaningful for them to participate and compare.
Note. The team size is calculated based on the registration form data. The participants’ articles may give a more up to date version.
Most of the teams that responded were registered as consisting of a single researcher (54%), followed by teams of two (27-30%).
Team size doesn’t appear to have a significant impact on participation. While teams of three have a 50% participation rate, they were much rarer. The even rarer larger teams did not submit results. The more numerous solo and two-member teams tend to be close to the average of 30% participation.
Finally, the shared tasks gathered interest from teams in nine different countries: India, USA, China, France, Japan, South Korea, Australia, Hungary and Singapore (in descending order of registrations).
Unsurprisingly, the final participation rate does not depend on geography. In all the countries where there were enough registrations, the participation rate was close to the average, while the other, rarer cases were either at 0% or 100%.
We looked at four aspects characterizing the teams that registered and participated in our two shared tasks: FinSim and FinSBD.
- Late registrations had the highest participation rate.
- Academics registered more, but industry participants managed to submit results more often.
- Based on these data, we couldn’t find any significant impact of team size on participation, but more than half of the registered teams were actually solo researchers.
- The 37 registered teams came from nine countries around the globe.
If you would like to know more about the tasks, their results and the participants’ systems, stay tuned! You will be able to read more on our Fortia blog, or on the Workshop and Conference websites.
Curious about how you would score in an academic challenge?
You can still participate in our third challenge FinTOC’2 until the end of June.
FinTOC already has over 50 registered teams, so we are really curious to see whether its participation statistics confirm our current conclusions.