Proposal ID: 780245
Role: Partner
Acronym: E2DATA
Topic: ICT-16-2017
Type of action: RIA
Call identifier: H2020-ICT-2016-2017

E2DATA: European Extreme Performing Big Data Stacks

Duration in months: 36
Fixed keyword 1: Scalability
Fixed keyword 2: Real time data analytics
Fixed keyword 3: Data stream analysis
Free keywords: Heterogeneous computing, Elastic resource provisioning, Big Data Applications, Energy Efficiency

Imagine a Big Data application with the following characteristics: (i) it has to process large amounts of complex streaming data, (ii) the application logic that processes the incoming data must execute and complete within a strict time limit, and (iii) there is a limited budget for infrastructure resources. In today’s world, the data would be streamed from the local network or edge devices to a cloud provider, from which the customer rents resources to process the data. The Big Data software stack, in an application- and hardware-agnostic manner, splits the execution stream into multiple tasks and sends them for processing on the nodes the customer has paid for. If the outcome does not meet the strict time limit (say, a three-second business requirement), the customer has three options: 1) scale up (by upgrading processors at node level), 2) scale out (by adding nodes to their clusters), or 3) manually implement code optimizations specific to the underlying hardware. E2Data proposes an end-to-end solution for Big Data deployments that will fully exploit and advance the state of the art in infrastructure services by delivering a performance increase of up to 10x while using up to 50% fewer cloud resources. E2Data will provide a new Big Data software paradigm that achieves maximum resource utilization for heterogeneous cloud deployments without affecting current Big Data programming norms (i.e., no code changes in the original source).

Lab URL: