| Date: | 14/01/2015, 14h |
| Room: | Télécom ParisTech, LINCS (23 avenue d'Italie, Paris), Salle du Conseil |
| Speaker: | Yanlei Diao (University of Massachusetts, Amherst) |
| Talk: | Platforms and Applications for "Big and Fast" Data Analytics |
| Abstract: | Recently there has been significant interest in building big data systems that can handle not only "big data" but also "fast data" for analytics. Our work is strongly motivated by recent real-world case studies that point to the need for a general, unified data processing framework to support analytical queries with different latency requirements. Towards this goal, our project is designed to transform the popular MapReduce computation model, originally proposed for batch processing, into distributed (near) real-time processing. In this talk, I start by examining the widely used Hadoop system and presenting a thorough analysis to understand the causes of high latency in Hadoop. I then present a number of necessary architectural changes, as well as new resource configuration and optimization techniques to meet user-specified latency requirements while maximizing throughput. Experiments using typical workloads in click stream analysis and Twitter feed analysis show that our techniques reduce the latency from tens or hundreds of seconds in Hadoop to sub-second in our system, with a 2x-7x increase in throughput. Our system also outperforms state-of-the-art distributed stream systems, Twitter Storm and Spark Streaming, by a wide margin. Finally, I will show some initial results and challenges of supporting big and fast data analytics in the emerging domain of genomics. |
| Biography: | Yanlei Diao is Associate Professor of Computer Science at the University of Massachusetts Amherst. Her research interests are in information architectures and data management systems, with a focus on big data analytics, data streams, uncertain data management, and RFID and sensor data management. She received her PhD in Computer Science from the University of California, Berkeley in 2005, her M.S. in Computer Science from the Hong Kong University of Science and Technology in 2000, and her B.S. in Computer Science from Fudan University in 1998. Yanlei Diao was a recipient of the 2013 CRA-W Borg Early Career Award (one female computer scientist selected each year), IBM Scalable Innovation Faculty Award, and NSF Career Award, and she was a finalist of the Microsoft Research New Faculty Award. She spoke at the Distinguished Faculty Lecture Series at the University of Texas at Austin. Her PhD dissertation "Query Processing for Large-Scale XML Message Brokering" won the 2006 ACM-SIGMOD Dissertation Award Honorable Mention. She is currently Editor-in-Chief of the ACM SIGMOD Record, Associate Editor of ACM TODS, Area Chair of SIGMOD 2015, and member of the SIGMOD Executive Committee and SIGMOD Software Systems Award Committee. In the past, she has served as Associate Editor of PVLDB, organizing committee member of SIGMOD, CIDR, DMSN, and the New England Database Summit, as well as on the program committees of many international conferences and workshops. Her research has been strongly supported by industry with awards from Google, IBM, Cisco, NEC labs, and the Advanced Cybersecurity Center. |
Thursday, December 18, 2014
Yanlei Diao: Platforms and Applications for "Big and Fast" Data Analytics, Télécom ParisTech (INFRES) and LIX Seminar, January 14, 2015
Monday, December 8, 2014
Puya-Hossein Vahabi: "Social media and blogging", PCRI, January 16, 2015
Puya - Hossein Vahabi will give a seminar on January 16, 2015, at 14.00. It will take place in the PCRI building, room 445.
When: Friday, January 16, at 14.00.
Where: PCRI building, room 445
Who: Puya - Hossein Vahabi
Title: Social media and Blogging
Abstract:
The talk will be focused on Social Media and Blogging. In particular, I'll
present three different works: 1. A novel approach for as-you-type
network-aware top-k keyword search over social media; 2. A novel
approach to harness the social community information to discover and
model the evolution of topics in social networks using matrix
co-factorization; 3. A novel approach to enhance user engagement in
online social networks, micro-blogging, and other online platforms.
Short Bio:
Puya - Hossein Vahabi is a researcher at Yahoo Labs (2014-now) working on
social media, stream data, and advertising. He received his Ph.D. in Computer
Science and Engineering (2009-2012) from IMT Lucca, Italy. He
graduated with a thesis on "Recommendation Techniques for Web Search and
Social Media". The thesis was supervised by Prof. Ricardo Baeza-Yates
(Vice President of Research for Europe and Latin America, leading the
Yahoo Labs), and Dr. Fabrizio Silvestri (Senior Researcher at Yahoo Labs
and Researcher at National Research Council of Italy). Before joining
Yahoo, he was involved in several startups on social blogging and social
video streaming. In 2010, during his Ph.D., he worked on recommender
systems (and massive query log analysis) at Yahoo Labs for a year, and
he was also a research associate at the National Research Council of
Italy for three years.