trust, privacy (Breakout session 8)
secretary: Benjamin Grosof; moderator: Leora Morgenstern

Richard Anderson: apparently a topic of research?:
- rights/shareability for data that OFR will collect
- San Cannon: what public-use subsets / obfuscations for researchers
- Benjamin Grosof: need to operationalize the

Benjamin: privacy "inference problem" (San Cannon: a.k.a. "mosaic")
- e.g. opportunity to reverse engineer a firm's investment strategies from trades data,
  e.g. OFR will get data by trader desk, not just by firm
- basic approach is to actually try extracting answers to queries of interest over an integrated repository

Louiqa: previous efforts to learn from, that have dealt with similar issues:
- Census
- health care, HIPAA

Louiqa: research topic: new kinds of data sources
- e.g. crowdsourced mortgage data (Zillow-ish?)

Pete Kyle: issue: property rights to data, e.g. by exchanges or dealers
- Leonard Nakamura: there is murky law, e.g. wrt use of URIs; "sweat of brow" case precedents

Leonard Nakamura: topic for research: economic analysis of the value of these property rights

Charles Taylor: OFR has broad powers to define the relevant scope of data it may demand/collect
- but it cannot collect from financial infrastructure, e.g. DTCC -- whose members, however, have agreed to it since they have to provide the data individually
- bank examiners already work with very restricted rules
- unclear what powers OFR will have to share its confidential data more broadly with other regulators
- unclear whether, if OFR collects directly, it will cut out private data providers (in a way that's prohibited or legally actionable)

Louiqa: research topic: ramifications of the fact that pro forma definitions -- therefore ontologies -- can themselves be confidential
- Pete: or a new (kind of) derivative, with pro forma valuation
- Benj: distinguish the voluntary portion of the disclosure/reporting

Andrei Kirilenko: research topic: map of data feeds
  construct a map/"geography" of who provides data, where they get it from and how, who owns it;
  incl. aggregation aspect
- very immediate issue in connection with data repositories, data formats
  . upcoming event: rules to be issued within 90-180 days by CFTC (Commodity Futures Trading Commission); similar process by SEC in progress

Michael Wellman: research topic: incentives
  incentives of value-added info products that there might be a market for
- Chester Spatt: this is important, e.g. use automated (econ) mechanism design
  . also: incentivize disclosures in a form that's more useful to others
- Michael Donnelly: e.g. data is binding in some sense
- Louiqa: can approach via multiple sources of data
- San Cannon: research topic: study correlations of accuracy / hiding to when info is submitted, e.g. liars report early vs. late

Michael Wellman: research topic: control and audit of how info is used, so that it's used only for its intended/permitted purpose(s)
- more about usages of data than authority to collect
- it's about accountability
- e.g. a bookie discloses illegal income to the IRS, knowing that the IRS is institutionally constrained from sharing that with the FBI

Leonard Nakamura:
- stress tests in banks/etc. -- need to be able to drill down to compare regulator #s to the bank #s, not just take the bank #s

Pete Kyle: research topic: need to have meta-data about trust level of accuracy of the data, in the sense of correspondence to reality
- e.g. liar loans without proper documentation
- e.g. need provenance on prices that come from a dealer valuation model, or where the dealer is suspect in some other way
- Leora M: research topic: how things can go wrong -- is a big area
- Louiqa: can borrow from life sciences' work on multiple ways to categorize a prediction, e.g. about biological function
- Andrei Kirilenko: timestamps are often kinda meaningless or bogus
  . e.g. some from backend clearance providers have been aggregated, and are off by say 30-40 sec
  . e.g. lack of synchronization of clocks among order receipt and execution steps
  . Louiqa: can approach lags via comp sci techniques
- Leonard Nakamura: inaccurate data is of prime importance to regulators
  . bad data often indicates their processes are broken/failing
  . ... or that it's suspicious, e.g. when it departs a lot from some average
  . research topic: how to help data entry be more accurate, e.g. detect inconsistencies at that time

Charles Taylor: research topic: extend accuracy and incentives to the rest of how the financial system works, not just capital markets
- e.g. very important: the decline in underwriting standards underlay much of the credit crisis
  . residence appraisals, mortgage-backed ratings
- what can one do to detect; delays; broken processes
- how to get beyond impressionistic and anecdotal
- Benjamin: trust is the other side of leverage

% recap of research topics:

confidentiality
- masking and aggregation
- privacy inference problem
  . investment strategies
  . sandbox for investigations
- policy representation
  . for rights/trust management
  . e.g. use rule-based with deontic, similar to representation of regulation
- ontologies themselves can be confidential -- what are the implications?
  . e.g. pro forma, valuation models, new kinds of derivatives

accuracy
- incentives for accuracy
- types of inaccuracy
- examples
  . underwriting; e.g. liar loans, rater corruption
  . stress tests
  . timestamping of trades
- forensics
  . data mining -- learn patterns of the deceivers or the troubled
  . e.g. wash trades to create an apparent bogus price to report to risk managers

map of data feeds
- provenance
- aggregation

property rights
- value of info
- incentives for usefulness of info
- economics-heavy more than KR-heavy

%%
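
illustrative sketch (not from the session): the "masking and aggregation" item in the recap, and the earlier question about public-use subsets for researchers, could be prototyped with a minimum-contributor threshold rule, a standard disclosure-control technique. The session did not name a method, so the technique, the field names (firm, asset_class, notional), the threshold of 3, and the function name are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical threshold: a cell is published only if at least this many
# distinct firms contribute to it. The value 3 is an assumption.
MIN_CONTRIBUTORS = 3


def aggregate_with_suppression(records, min_contributors=MIN_CONTRIBUTORS):
    """Sum notional per asset class and mask cells with too few distinct firms.

    `records` is an iterable of dicts with the hypothetical keys
    "firm", "asset_class", and "notional". Masked cells map to None.
    """
    totals = defaultdict(float)
    firms = defaultdict(set)
    for rec in records:
        totals[rec["asset_class"]] += rec["notional"]
        firms[rec["asset_class"]].add(rec["firm"])
    return {
        cell: (totals[cell] if len(firms[cell]) >= min_contributors else None)
        for cell in totals
    }


# Made-up records, for illustration only.
records = [
    {"firm": "A", "asset_class": "rates", "notional": 10.0},
    {"firm": "B", "asset_class": "rates", "notional": 20.0},
    {"firm": "C", "asset_class": "rates", "notional": 30.0},
    {"firm": "A", "asset_class": "credit", "notional": 5.0},
    {"firm": "A", "asset_class": "credit", "notional": 7.0},
]

public_view = aggregate_with_suppression(records)
```

Here the "rates" cell is published because three firms contribute, while "credit" is masked because only firm A does. A threshold rule by itself does not address the inference ("mosaic") problem raised above: comparing several published aggregates can still reveal a masked cell, which is why the notes propose actually trying queries of interest against an integrated repository.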