Projects in Legal Informatics

DAPRECO (Feb 2017 - Aug 2019) - forthcoming

DAPRECO (DAta Protection REgulation COmpliance) is a CORE project that was retained for funding in Nov 2016. Of the 18 senior CORE projects submitted by SnT in 2016, DAPRECO is the only one that was retained for funding. The project will start in Feb 2017 and last two and a half years. It aims to develop a methodology for building legal ontologies, as well as a knowledge base for representing and correlating the norms in the upcoming General Data Protection Regulation and in several ISO standards.

No further information about DAPRECO is currently available.

MIREL (Jan 2016 - Dec 2019) - ongoing

MIREL (MIning and REasoning on Legal texts) is a Marie Skłodowska-Curie Research and Innovation Staff Exchange project. It was retained for funding with an overall score of 97.2%. MIREL involves 16 international partners, including 3 industrial partners. I coordinated the writing of the MIREL proposal and I am currently managing its activities. MIREL promotes mobility and staff exchange in order to create an intercontinental, interdisciplinary consortium in Law and Artificial Intelligence, covering areas including NLP, Computational Ontologies, Argumentation, and Logic & Reasoning.

ProLeMAS (Jun 2015 - May 2017) - ongoing

ProLeMAS (PROcessing LEgal language in normative Multi-Agent Systems) is supported by a Marie Skłodowska-Curie Individual Fellowship. It was retained for funding with an overall score of 96.4%. ProLeMAS aims at (1) using reification to fill the gap between the current formalizations in deontic logics and the richness of natural language semantics and (2) implementing tools for (semi-)automatically building machine-readable representations from legal texts via NLP. In the context of ProLeMAS, a six-month secondment will be carried out at the private company APIS Hristovich EOOD in Sofia.

EUCases (Oct 2013 - Oct 2015) - ended

EUCases (Linking Legal Open Data in Europe) was a collaborative research project funded under the Seventh Framework Programme (FP7) and involving five partners, among which Nomotika S.R.L. and APIS Hristovich EOOD. I joined the project one year after it started, and it was my first significant experience in legal informatics. I was responsible for the NLP toolkit for the Italian language. All prototypes and techniques developed in these activities have been integrated in the MenslegiS system. The ProLeMAS project built on the EUCases results. Nomotika S.R.L. and APIS Hristovich EOOD were subsequently involved in the projects ProLeMAS and MIREL.

ICT4LAW (Mar 2009 - Feb 2012) - ended

ICT4LAW was a large interdisciplinary research project involving twelve partners: six academic and six industrial. The goal was to create novel services for citizens, enterprises, public administration and policy makers. My role in the project was minimal, but it helped me gain general expertise in legal informatics. I developed rule-based systems for recognizing modificatory provisions and I carried out dependency parsing to feed statistical classifiers. The ICT4LAW project led to the creation of the spin-off Nomotika S.R.L., founded by the Department of Computer Science of the University of Turin and the company Augeos S.P.A., both partners in the project.

Past research

NL Quantifiers

In the context of my PhD thesis, I worked on the formal representation of NL quantifiers. In particular, I devised a new logical framework to properly represent scopeless readings, such as cumulative and collective readings. I authored six journal publications and several conference and workshop papers on the topic, and I am the single author of four of these journal publications. My last publication on the topic was (Robaldo, Szymanik, Meijering, 2014), which was coauthored with Jakub Szymanik and Ben Meijering. After that publication, I stopped working on NL Quantifiers. Jakub, on the other hand, was later awarded an ERC Starting Grant on related topics (developing cognitive semantics of generalized quantifiers). Congrats Jakub!

Penn Discourse Treebank (PDTB)

The PDTB is a corpus developed at the University of Pennsylvania (UPenn). The PDTB is, to date, the largest annotation effort at the discourse level, providing annotations of the argument structure, attribution and semantics of discourse connectives. After my PhD thesis, I visited the University of Pennsylvania for five months (and again in 2009 for two months), where I started working with the PDTB research group. I contributed to the writing of the PDTB 2.0 annotation manual and to the sense annotation in release 2.0 of the corpus. During that period, I also started working with reification-based semantics, specifically the approach of Jerry R. Hobbs, and I used it to model concessive relations found in the PDTB.

Sentiment Analysis

In 2013, I defined, together with Luigi Di Caro, an XML formalism called OpinionMining-ML for tagging users' opinions on products and services, and I built a corpus of 1000 comments about restaurants taken from www.2spaghi.it, one of the biggest Web 2.0 sites about Italian restaurants and pizzerias. Afterwards, I won the Working Capital Accelerator 2014, a Telecom Italia grant to support new startups and innovative research projects, with the project SentiTagger, which aims at automatically tagging comments in OpinionMining-ML. The selection was highly competitive: only 40 of about 1,300 submitted projects were selected. Each selected project was granted 25,000 euros by Telecom Italia.

Gamification

In 2013, I worked on gamification-based approaches to corpus building, pioneered by Massimo Poesio. I was specifically involved in the Phrase Detective game-with-a-purpose, which aims at creating anaphorically annotated resources through Web cooperation. I was an expert annotator of the game and I developed a converter from Italian texts to the input format of the game via dependency parsing, in order to allow annotations in Italian. Massimo Poesio was later awarded an ERC Advanced Grant for the project "DALI - Disagreements and Language Interpretation", which proposes more advanced games, derived from Phrase Detective, to collect massive amounts of data about anaphora from the people playing them. Congrats Massimo!