NSF IIS Program
Project # 1618695

Safely Searching Among Sensitive Content


News

     
  • All good things must end, and we've now come to the end of this project. It's been a fantastic opportunity to explore an important new research problem, and to work with some fantastic people on that! One thing we are doing here at the end of the project is working with the Linguistic Data Consortium to deposit the annotations that we and others have created for the Avocado Email Research Collection.
  • Ph.D. Student Xin Qian has led the work on two new papers, written with Professor Doug Oard, on managing conversational interaction for search, with implications for search among sensitive content. The first paper is for TREC 2021, where Xin explored alternative ranking techniques in the Conversational Assistance Track. The second, on the use of conversational representations of historical figures in museum exhibits, will be presented at the 2022 iConference. Because this work involved searching informal content, privacy protection issues will surely arise.
  • We're celebrating the graduation of Ph.D. student Mahmoud Sayed, whose dissertation topic was Search Among Sensitive Content!
  • Our article on using machine learning to support review for Freedom of Information Act exemptions has been accepted for a special issue on Computational Archival Science of the ACM Journal on Computing and Cultural Heritage that will likely be published in 2022. This is joint work between Professor Jason Baron, Ph.D. student Mahmoud Sayed, and Professor Doug Oard.
  • Professor Doug Oard presented a Tutorial with Graham McDonald from the University of Glasgow on Search Among Sensitive Content at the 2021 European Conference on Information Retrieval.
  • Professor Doug Oard presented a keynote address on The Importance of Domain Knowledge for Computational Law at the SIGIR 2020 workshop on Legal Intelligence.
  • Professor Doug Oard and Ph.D. student Jyothi Vinjumur will be presenting our 2018 TOIS paper on "Jointly Minimizing the Expected Costs of Review for Responsiveness and Privilege" on video at the 2020 ACM SIGIR conference and discussing that paper with the participants. SIGIR offers TOIS authors this opportunity as a way of increasing the exposure of articles that are published in TOIS, but because of the annual schedule of SIGIR there's a bit of a lag between publication and the presentation.
  • Ph.D. student Mahmoud Sayed will be presenting a paper from pretty much the entire team on "A Test Collection for Relevance and Sensitivity" at SIGIR 2020, which was originally planned for Xi'an, China, but will now be held online. Professor Doug Oard and REU Student Will Cox plan to join him "there" to discuss the paper with online participants.
  • We're celebrating the graduation of REU student Jonah Rivera! Will Cox is continuing with us for the summer, working on automated content analysis for our new test collection.
  • We've finished the initial annotations for our new relevance and sensitivity test collection!
  • Our full paper on "We Could, But Should We? Ethical Considerations for Providing Access to GeoCities and Other Historical Digital Collections" led by Professor Jimmy Lin had been scheduled for presentation at the ACM CHIIR conference in Vancouver in mid-March of 2020, but the COVID-19 coronavirus had other ideas. So that paper is available from ACM, but we didn't get the chance to present it.
  • Professor Doug Oard and Ph.D. student Mahmoud Sayed contributed to the SIGIR 2019 workshop report on "FACTS-IR: Fairness, Accountability, Confidentiality, Transparency, and Safety in Information Retrieval" that was published in the December 2019 issue of SIGIR Forum.
  • We welcomed two new Research Experiences for Undergraduates (REU) students to the project in August, 2019: Will Cox and Jonah Rivera. Will and Jonah are working with us on creation of a new email test collection for relevance and sensitivity.
  • Ph.D. student Mahmoud Sayed presented a talk on "Detecting Sensitive Content" at the SIGIR Workshop on Fairness, Accountability, Confidentiality, Transparency and Safety in Information Retrieval (FACTS-IR) in Paris, France in July 2019.
  • Ph.D. student Mahmoud Sayed presented a full paper on "Jointly Modeling Relevance and Sensitivity for Search Among Sensitive Content" at SIGIR 2019 in Paris, France in July 2019.
  • Professor Doug Oard gave a talk on "The Future of Ubiquitous Spoken Content" at the Good Systems Seminar at the University of Texas at Austin in April, 2019.
  • Professor Doug Oard is helping out as the Program Committee chair for the First ACM SIGIR/SIGKDD Africa Summer School on Machine Learning for Data Mining and Search (AFIRM) in Cape Town, South Africa in January, 2019.
  • Professor Doug Oard gave a talk on "Searching Intermixed Shareable and Sensitive Content" at Wadhwani AI in Mumbai, India in December, 2018.
  • Professor Doug Oard and former Ph.D. student Dr. Jyothi Vinjumur gave three talks at a full-day seminar on E-Discovery at the National Law School of India in Bangalore, including one on "Jointly Minimizing the Expected Costs of Review for Responsiveness and Privilege" and one on "An Information Scientist's Perspective on U.S. E-Discovery Practice", in December, 2019.
  • Professor Doug Oard gave an invited talk on "When You Can't Review Everything, What Then?" at the Towards Assistive Digital Sensitivity Review (TASER) Workshop at the University of Glasgow in the UK in December, 2018.
  • We organized an Open Forum on Safe Search Among Sensitive Content at the 2018 Society of American Archivists Conference in August, 2018.
  • Professor Doug Oard gave a talk on this project at Kyushu University in Japan in July, 2018.
  • iSchool Ph.D. student Jyothi Vinjumur successfully defended her dissertation in April, 2018!
  • We went live on the Web with a demo version of our interactive system for training classifiers to recognize sensitive email in March, 2018. Check it out at http://clip-sasc.umiacs.umd.edu/. The demo version works with the Enron email collection, but you can also download it from here and run it on your own email. Unzip, start it as described in the readme, index an mbox collection (e.g., as a "takeout" file from Gmail), and you should be off and running.
  • Professor Doug Oard attended the NTCIR Workshop in Tokyo, Japan in December, 2017 to explore the design of a privacy task for the Lifelogging track of NTCIR 14, one of the world's major information retrieval evaluation venues.
  • Ph.D. Student Yulu Wang successfully defended her dissertation in November, 2017!
  • Professor Doug Oard presented a talk on this project at the University of New Hampshire in September, 2017.
  • Professor Katie Shilton presented a paper at the Digital Humanities Conference's First International Workshop on Privacy Sensitive Collections for Digital Scholarship in Montreal, Canada in August, 2017.
  • Professor Doug Oard gave a public presentation on "Creating Search Engines that Know What Not to Find" at Search Engines Amsterdam in the Netherlands in June, 2017.
  • Professor Doug Oard presented a paper on "When is it Rational to Review for Privilege?" at the ICAIL 2017 DESI Workshop in London in June, 2017.
  • Professor Doug Oard presented a talk on this project at the University of Glasgow (UK) in June, 2017.
  • Ph.D. Student Mossaab Bagdouri created our first system for detecting sensitive content in email in June, 2017.
  • Computer Science Ph.D. Student Mossaab Bagdouri successfully defended his dissertation in May, 2017!
  • Professor Doug Oard gave a talk on this project at the National Institute of Informatics in Tokyo, Japan in December, 2016.
  • Professor Doug Oard gave an invited talk on "The Other Side of the Coin: Proactive Language Technologies for Protecting Sensitive Content" at the AAAI Fall Symposium on Privacy and Language Technologies in Arlington, VA in November, 2016.
  • Professor Doug Oard gave a talk on Effectively Searching Among Sensitive Content at the University of Florida in September, 2016.
  • Professor Doug Oard inaugurated the new project with a talk on Effectively Searching Among Sensitive Content at the University of Central Florida in September, 2016.

Page created: December 11, 2017

This material is based upon work supported by the National Science Foundation under Grant No. 1618695. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.