ICAIL 2017
DESI VII Workshop on
Using Advanced Data Analysis in eDiscovery & Related Disciplines
to Identify and Protect Sensitive Information in Large Collections

June 12, 2017, Strand Campus, King's College London, UK

Purpose

The DESI VII workshop will provide a platform for discussion of best practices and innovations in the use of advanced search technology, text classification, language processing, data organization, visualization and related techniques for the purposes of accessing and managing electronically stored information. One focus of the DESI VII workshop will be on emerging protocols and novel techniques for identifying and protecting sensitive information in large collections. The workshop will also welcome contributions on other topics that are within the workshop’s broader scope. We expect the refined focus on protecting sensitive content this year to be directly relevant to at least four application contexts:

eDiscovery in complex litigation
European Union (EU) privacy policies
Audits and internal investigations
Public access to government records

We expect to address the following open questions:

In eDiscovery: What techniques are currently being used to classify information found in email or other data sources as privileged, confidential, or otherwise protected by law? How widespread is the use of technology for this type of information identification? How well do current technologies perform with respect to the classification of sensitive information?

In EU privacy policies: To what degree can current algorithmic techniques adequately characterize content that individuals might wish to have blocked from certain types of access in adherence with “right to be forgotten” laws? To what extent can the process of adjudicating such requests reasonably be automated? How well do algorithmic techniques perform in identifying sensitive data that may need to be blocked from cross-border transfers? To what extent can these capabilities satisfy requirements for algorithmic accountability?

In audits and investigations: What tools and techniques are available to find and protect well-defined categories of sensitive content? Examples from the US and Canada might include protected health information, student education records, customer record information, card holder data, or proprietary or confidential information (e.g., trade secrets). To what extent can taxonomies be constructed for information that is routinely the focus of internal audits to facilitate automatic detection of those categories of information? To what extent can technical support for investigations be designed to protect sensitive content that is not material to the investigation?

In public access requests: How well can current procedures and automated techniques identify and protect personal, political, proprietary or otherwise confidential content? To what extent can automated techniques reliably detect specific types of personally identifiable information which, if released, would constitute an unwarranted invasion of privacy?

The workshop discussion will be grounded in the results of original research, including work reported in interdisciplinary venues (e.g., ICAIL), law reviews, technical conferences in specific disciplines (e.g., KDD, ICWSM, ACL, SIGIR), and shared task evaluations (e.g., TREC, CLEF, NTCIR).

Participation is invited from all interested parties, including those with backgrounds in:

Archives and records management
Artificial intelligence and law
Cognitive science
Content analytics
Corpus analysis
Computational linguistics
Digital forensics
eDiscovery
Human-computer interaction
Human language technology
Information governance
Information retrieval
Knowledge management
Legal informatics
Legal sensemaking
Litigation support
Natural language processing
Machine learning
Privacy
Sentiment analysis
Text mining and classification
Visual analytics

Important Dates

April 1, 2017: Refereed paper submissions due
May 1, 2017: Decisions on refereed papers returned
May 1, 2017: Unrefereed position papers requested [late submissions will be accepted]
May 14, 2017: Early registration deadline (for lower fees)
May 15, 2017: Draft program posted
June 12, 2017: DESI VII Workshop

Submissions

Two types of written contributions are invited:

Research & Operational Practice Papers. Original papers (limited to 4-10 pages) describing current research results, experimental or emerging practices, or current best practices. Research and operational practice papers will be peer reviewed separately. After peer review, accepted papers will be posted on the DESI VII website. Authors of accepted operational practice or research papers will be invited to present their work as either an oral or a poster presentation. These papers are due on April 1, 2017; decisions will be returned by May 1, 2017.
Position papers (limited to 1-5 pages) describing individual interests, for inclusion on the DESI VII website and distribution to workshop participants. Submissions of this type are particularly valuable when bringing together diverse research communities. Additionally, these papers can help with our selection of discussion leaders and panelists. Position papers are not peer reviewed, but there is an editorial review to ensure that they satisfy the 5-page length limit and that they address one or more topics within the broad scope of the workshop. Position papers are requested by May 1, 2017. Participation in the workshop is open, so while prior submission of position papers is strongly encouraged, it is not strictly required.

Please note that because of the workshop’s focus on research interchange, we are not able to accept commercial white papers or similar corporate materials.

Submissions should be sent by email to Jack Conrad (jack.g.conrad (put at sign here with no spaces on either side) tr.com) with the subject line DESI VII RESEARCH/OPERATIONALPRACTICE PAPER or DESI VII POSITION PAPER. All submissions received will be acknowledged within 3 days.

A PDF of the second Call for Submissions is also available.

DESI History

DESI VII follows six successful prior DESI (Discovery of Electronically Stored Information) workshops: five held at ICAIL, in 2007 (DESI I, Palo Alto), 2009 (DESI III, Barcelona), 2011 (DESI IV, Pittsburgh), 2013 (DESI V, Rome), and 2015 (DESI VI, San Diego), and an intermediate workshop (DESI II) at University College London in 2008. In DESI I, a wide array of individuals came together for perhaps the first time to foster engagement between e-discovery practitioners and a broad range of research communities who might contribute to the development of new technologies to support the e-discovery process. The DESI II and III workshops broadened the scope of this discussion to include comparisons of requirements between differing national settings and legal environments. DESI IV built on these efforts by holding a first-of-its-kind general discussion of standard-setting for the legal profession through contemplation of ISO 9001 frameworks as well as capability maturity models. DESI V then extended the discussion of standards to include the question of what standards could and should be made applicable to the use of predictive coding and other advanced techniques, which were at the time beginning to be cited in U.S. case law. Most recently, the DESI VI workshop in San Diego aimed to broaden the scope of legal issues to which advanced data analysis and classification technologies might credibly be applied, beyond e-discovery to a fuller range of information governance applications.

Papers

Abstracts
- Keynote address: Maura Grossman and Gordon Cormack (University of Waterloo) will speak on "Selective Digital Amnesia." An abstract and slides are available.
- Invited speakers: Tim Gollins (National Records of Scotland) and Craig Macdonald (University of Glasgow) will speak on "Assisting Digital Sensitivity Review of Government Records." An abstract is available.
Peer-Reviewed Research Papers
- Eoghan Casey, Maria Angela Biasiotti and Fabrizio Turchi, Using Standardization and Ontology to Enhance Data Protection and Intelligent Analysis of Electronic Evidence
- William Dimm, Confirming Recall Adequacy With Unbiased Multi-Stage Acceptance Testing (slides)
- Douglas Oard, Jyothi Vinjumur and Fabrizio Sebastiani, When is it Rational to Review for Privilege? (slides)
- Marijn Schraagen, Matthieu Brinkhuis and Floris Bex, Evaluation of Named Entity Recognition in Dutch Online Criminal Complaints (slides)
Position Papers
- Joseph Bartolo, Jr. and Sandra Serkes, Understanding the General Data Protection Regulation (GDPR)
- Ben Ferko and Josh Rattan, Exploring Technology Implications of Advanced Sensitive Information Discovery and Analysis Methods in Large Organizations
- Maureen O'Neill, Phil Richards and Eric Willis, “Out-of-the-Box” Scans for Sensitive Data: Easy Solution to a Difficult Problem? (slides)
- Mithileysh Sathiyanarayanan and Cagatay Turkay, Challenges and Opportunities in using Analytics Combined with Visualisation Techniques for Finding Anomalies in Digital Communications
- James Sherer, When is a Chair not a Chair? Big Data Algorithms, Disparate Impact, and Considerations of Modular Programming
- Anna Wennakoski, Algorithmic Accountability in the Area of Consumer Privacy: Is it Make Believe or for Real?

Program

The program is available as a PDF file.

Registration

DESI is a part of the International Conference on Artificial Intelligence and Law, and DESI participants must therefore register using the ICAIL Registration Site.

Organizing Committee

Jason R. Baron, Drinker Biddle & Reath LLP
Jack G. Conrad, Thomson Reuters
Hans Henseler, University of Applied Sciences Leiden
Amanda Jones, H5
Douglas W. Oard, University of Maryland

Program Committee

Simon Attfield, Middlesex University (UK)
Bennett B. Borden, Drinker Biddle LLP (US)
Chris Dale, eDisclosure Information Project (UK)
Murat Kantarcioglu, University of Texas at Dallas (US)
Fei Liu, University of Central Florida (US)
Debra Logan, Gartner (UK)
Craig Macdonald, University of Glasgow (UK)
Jeremy Pickens, Catalyst (US)
Maarten de Rijke, University of Amsterdam (NL)
Fabrizio Sebastiani, CNR-ISTI (IT)
David Willcox, Serious Fraud Office (UK)
Grace Hui Yang, Georgetown University (US)

Doug Oard

Last modified: Mon Jul 24 18:48:14 2017
