Title: Supporting Scientific Analytics under Data Uncertainty and Query Uncertainty
Location: PCRI (https://www.lri.fr/info.pratiques.php), room 455
Date and time: January 16, 2015, 10 am
Abstract:
Data management is becoming increasingly important in large-scale
scientific applications such as computational astrophysics, severe weather
monitoring, and genomics. In this talk, I present our recent work addressing two
major challenges raised by these scientific applications. The first challenge
concerns “data uncertainty”, which arises because scientific measurements are
inherently noisy and uncertain. In particular, we address uncertain data
management under the array model, which has gained popularity for large-scale
scientific data processing due to performance benefits. We propose a suite of
storage and evaluation strategies to support array operations under data uncertainty.
Results from Sloan Digital Sky Survey (SDSS) datasets show that our techniques
outperform state-of-the-art methods by 1.7x to 4.3x for the Subarray operation
and by 1 to 2 orders of magnitude for Structure-Join.
The second challenge is “query uncertainty”: as scientific data continues to
grow in size and diversity, it is becoming harder for the user to express her
data interests precisely in a formal language like SQL. This leads to a strong
need for “interactive data exploration,” a service that efficiently navigates the
user through a large data space to identify the objects of interest. We present
our initial work on interactive data exploration, with results suggesting that it
is possible to predict user interests modeled by conjunctive queries with a
small number of samples, while providing interactive performance.
Bio:
Yanlei Diao is Associate Professor of Computer
Science at the University of Massachusetts Amherst. Her research interests are
in information architectures and data management systems, with a focus on big
data analytics, scientific analytics, data streams, uncertain data management, and
RFID and sensor data management. She received her PhD in Computer Science from
the University of California, Berkeley in 2005, her M.S. in Computer Science
from the Hong Kong University of Science and Technology in 2000, and her B.S.
in Computer Science from Fudan University in 1998.
Yanlei Diao was a recipient of the 2013 CRA-W Borg Early Career Award (one female
computer scientist selected each year), the IBM Scalable Innovation Faculty Award,
and the NSF CAREER Award, and she was a finalist for the Microsoft Research New
Faculty Award. She spoke at the Distinguished Faculty Lecture Series at the
University of Texas at Austin. Her PhD dissertation “Query Processing for
Large-Scale XML Message Brokering” won the 2006 ACM-SIGMOD Dissertation Award
Honorable Mention. She is currently Editor-in-Chief of the ACM SIGMOD Record,
Associate Editor of ACM TODS, Area Chair of SIGMOD 2015, and member of the SIGMOD
Executive Committee and SIGMOD Software Systems Award Committee. In the past,
she has served as Associate Editor of PVLDB, as an organizing committee member of
SIGMOD, CIDR, DMSN, and the New England Database Summit, and on the
program committees of many international conferences and workshops. Her
research has been strongly supported by industry with awards from Google, IBM,
Cisco, NEC Labs, and the Advanced Cybersecurity Center.