The goal of the 1st International Workshop on Next Generation Clouds for Extreme Data (XtremeCLOUD) is to bring together researchers and practitioners from academia and industry to explore, discuss and possibly redefine the state of the art in Cloud Computing with respect to heterogeneity, resource management, scalability, and methods and tools applicable to any part of the algorithms and computing infrastructures involved, as well as use cases and applications related to extreme data analytics. The workshop solicits original research on fundamental aspects of Cloud Computing that enable extreme-scale data processing, as well as on the design, implementation and evaluation of novel tools and methods for optimizing Big Data applications and workflows.
Authors are invited to submit papers containing unpublished, original work (not under review elsewhere) of up to 6 pages of double-column text, single-spaced in 10-point font on 8.5 x 11 inch pages, as per the IEEE 8.5 x 11 manuscript guidelines. Templates are available from the IEEE website. Authors should submit a PDF file. Papers conforming to the above guidelines can be submitted through the workshop’s EasyChair submission system.
Accepted papers must be presented at the workshop: at least one author of each accepted submission must attend, and all workshop participants must pay at least the CloudCom 2018 workshop registration fee. All papers will be subject to a peer review process, and all accepted papers will be published by the IEEE.
This talk will outline my current understanding of analytical data management in the cloud.
I will start with a short overview of the state of the art in cloud systems for analytics, focusing on systems used for ETL,
warehousing, BI and machine learning. I will also describe some of my experiences as scientific
advisor of Databricks, the company behind Spark, which hosts data science software as-a-service
in both AWS and Azure. These experiences will include architectural trends for analytical data
management systems, and also cover some recent scientific research at CWI. The final part
of the talk will focus on exploiting future heterogeneous hardware in the cloud. Beyond machine learning,
it is currently unclear how data science pipelines, specifically those for ETL, warehousing and BI,
will be able to use novel hardware elements in both storage and processing. Rather than
providing answers, I will be outlining questions to be addressed by future research in this area.
Peter Boncz holds appointments as tenured researcher at CWI and professor at VU University
Amsterdam. His academic background is in core database architecture, with the architecture
of MonetDB as the main topic of his PhD thesis; MonetDB won the 2016 ACM SIGMOD Systems
Award. This work focused on architecture-conscious database research, which studies the interaction
between computer architecture and data management techniques. His specific contributions are in
cache-conscious join methods, query and transaction processing in columnar database systems,
and vectorized query execution. He has a strong track record in bridging the gap between academia
and commercial application, receiving the Dutch ICT Regie Award 2006 for his role in the CWI
spin-off company Data Distilleries. In 2008 he founded a new CWI spin-off company called Vectorwise,
dedicated to state-of-the-art business intelligence technology. He is also a co-recipient of the
2009 VLDB 10 Years Best Paper Award, and in 2013 received the Humboldt Research Award.
His current interests are data systems architectures covering various angles such as database-as-a-service
in the cloud, graph and network databases, and databases that can take advantage of heterogeneous
processors and modern storage media. He is a scientific advisor to Databricks, the Berkeley
spin-off company that operates a Spark cloud service with R&D offices in San Francisco and Amsterdam.
9:00-10:30 Welcome – Keynote
Peter Boncz: Systems for cloud data analytics: present and future
10:30-11:00 Coffee break
15:15-16:00 Coffee break