I-BiDaaS increases its impact in the research community and contributes to innovation capacity.
I-BiDaaS develops a set of complementary data processing tools, applicable to both batch and streaming data. The applicability of these tools is verified through (i) three complementary, large-scale pilots within the project, drawn from three different industries (automobile manufacturing, finance/banking and telecoms); and (ii) the engagement and involvement of third parties, based on existing links and collaborations of the I-BiDaaS partners.
With regard to (i), I-BiDaaS aims to demonstrate the applicability of its innovative, distributed big data analytics approach, which is based on fully asynchronous Innovative Distributed Solvers and their key ability to solve optimization problems, in three real-world settings: the automotive industry, the banking sector and the telecommunications sector. Applicability will also be demonstrated through application-layer tools that enable both developers (e.g., e-banking system developers) and users (e.g., e-services users) to manage batch and streaming data, and through advanced interactive visualization tools that support both batch and streaming processing modules at the service of industrial decision support processes in the three pilot scenarios. Finally, I-BiDaaS will demonstrate its approach in the context of real-time complex event processing over extremely large numbers of high-volume streams of possibly noisy, possibly incomplete data, by generating synthetic data or combining synthetic and real data in these three real-world settings.
With regard to (ii), data processing tools and services will be made available through a number of infrastructure frameworks provided by the I-BiDaaS consortium: an incubator provided by Telefonica will be used to host and communicate the tools and services developed; a potential incorporation of the tools and services into TID’s 4th cognitive platform (AURA) will also be explored; and a big data training environment will be made available by CRF (Campus Melfi) to the same end.
I-BiDaaS demonstrates a significant increase in the speed of data throughput and access, as measured against relevant, industry-validated benchmarks.
I-BiDaaS draws on a number of tools and technologies, as well as the capacity of its partners, to demonstrate a significant increase in the speed of data throughput and access, as measured against relevant, industry-validated benchmarks. For streaming analytics, the I-BiDaaS system will use FORTH’s solution, a real-time, high-speed stream processing and pattern matching engine that is tailored for continuous stream data and uses GPGPUs to accelerate computation. The solution currently deployed at FORTH is able to process more than 60 Gbps of real-time network traffic and perform string pattern matching on top of it. By exploiting the parallelism of GPUs, high-performance stream processing with a throughput in the order of tens of Gbps can be achieved. FORTH’s stream processing and pattern matching engine will be offered as an API that provides data analytics and offloads computationally intensive tasks to GPUs for acceleration.
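To make the idea of string pattern matching over a continuous stream concrete, the following is a minimal CPU-only sketch in Python. It is not FORTH's engine or its API and involves no GPU offloading; the function name `match_stream` and the sample patterns are assumptions chosen for illustration. It shows one detail any streaming matcher has to handle: a pattern may straddle the boundary between two consecutive chunks of the stream.

```python
def match_stream(chunks, patterns):
    """Yield (absolute_offset, pattern) for each occurrence in a byte stream.

    `chunks` is any iterable of bytes objects arriving in order. This is a
    toy illustration, not a high-throughput implementation.
    """
    # Keep just enough trailing bytes to detect a match spanning two chunks.
    overlap = max(len(p) for p in patterns) - 1
    tail = b""
    consumed = 0  # stream bytes contained in all previously seen chunks
    for chunk in chunks:
        window = tail + chunk
        base = consumed - len(tail)  # absolute stream offset of window[0]
        for p in patterns:
            start = window.find(p)
            while start != -1:
                # Matches lying entirely inside the tail were reported earlier.
                if start + len(p) > len(tail):
                    yield (base + start, p)
                start = window.find(p, start + 1)
        consumed += len(chunk)
        tail = window[-overlap:] if overlap else b""


# Hypothetical traffic split into three chunks; "/admin" straddles a boundary.
traffic = [b"GET /adm", b"in HTTP/1.1 evil", b"payload evil"]
hits = sorted(match_stream(traffic, [b"/admin", b"evil"]))
print(hits)
```

A GPU-accelerated engine would typically apply the same logic to many chunks and patterns in parallel; the sketch only fixes the semantics of what is being matched.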
I-BiDaaS adopts distributed multi-agent optimization algorithms and aims to make significant contributions to these methods. Existing distributed multi-agent algorithms are designed either for fully distributed architectures or for a star-network, master-worker architecture. I-BiDaaS builds on these existing methods and develops hybrid methods that simultaneously exploit: 1) message passing across neighbouring (worker) agents; and 2) globally available information across groups of agents through aggregation. By exploiting this structure, the project could achieve significant improvements in solution speed and significantly better scalability of the methods.
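The combination of neighbour-to-neighbour message passing and group-level aggregation can be sketched on a toy problem. The example below is an illustrative assumption, not the project's solvers: each agent i holds a private value a_i, and the agents jointly minimise the sum of (x - a_i)^2, whose solution is the average of all a_i. The ring topology, the grouping and all function names are hypothetical.

```python
def neighbour_step(x):
    """Message passing: each agent averages with its two ring neighbours."""
    n = len(x)
    return [(x[(i - 1) % n] + x[i] + x[(i + 1) % n]) / 3.0 for i in range(n)]


def group_step(x, groups):
    """Aggregation: every agent in a group receives the group mean."""
    out = list(x)
    for members in groups:
        mean = sum(x[i] for i in members) / len(members)
        for i in members:
            out[i] = mean
    return out


def hybrid_consensus(local_values, groups, iters=200, aggregate_every=5):
    """Alternate local message passing with periodic group aggregation.

    Both steps preserve the sum of all estimates, so every agent's
    estimate converges to the global average.
    """
    x = list(local_values)
    for k in range(1, iters + 1):
        x = neighbour_step(x)
        if k % aggregate_every == 0:
            x = group_step(x, groups)
    return x


# Eight hypothetical agents arranged on a ring and split into two groups.
local_data = [4.0, 8.0, 15.0, 16.0, 23.0, 42.0, 1.0, 3.0]
groups = [[0, 1, 2, 3], [4, 5, 6, 7]]
estimates = hybrid_consensus(local_data, groups)
target = sum(local_data) / len(local_data)
print(estimates[0], target)
```

No single master collects all the data in this sketch: agents exchange values only with ring neighbours, and the group mean stands in for the information that is globally available within a group.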
I-BiDaaS will use COMPSs as the programming framework for batch-processing applications and algorithms. This will improve execution speed by leveraging the framework's ability to transform a sequential application into a parallel and distributed one. An increase of at least 30% is expected in the number of stakeholders that have direct access to big data and the relevant analytics tools, thus laying the ground for a significant increase in data throughput as well.
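The idea of turning a sequential batch application into a parallel one can be illustrated with a small stand-in. The sketch below deliberately does not use the project's programming framework or its API; it relies only on Python's standard `concurrent.futures` module, and the function names and sample data are assumptions. It shows that independent per-chunk computations in a sequential loop can be executed as concurrent tasks without changing the result.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(chunk):
    """Task: count the words in one chunk of a batch."""
    counts = {}
    for word in chunk.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def merge(a, b):
    """Combine two partial word counts into a new dictionary."""
    merged = dict(a)
    for word, count in b.items():
        merged[word] = merged.get(word, 0) + count
    return merged


def sequential(chunks):
    """Plain sequential version: one chunk after another."""
    total = {}
    for chunk in chunks:
        total = merge(total, word_count(chunk))
    return total


def parallel(chunks, workers=4):
    """Same computation, with the per-chunk tasks submitted to a pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(word_count, chunks))
    total = {}
    for partial in partials:
        total = merge(total, partial)
    return total


chunks = ["big data as a service", "data as a self service", "big batch data"]
print(parallel(chunks) == sequential(chunks))
```

Threads are used here only to keep the sketch self-contained. A task-based framework would additionally detect data dependencies between tasks and distribute them across the nodes of a cluster.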
I-BiDaaS substantially increases the definition and uptake of standards fostering data sharing, exchange and interoperability.
I-BiDaaS increases the definition and uptake of a number of different standards related to Big Data Analytics: (i) standard programming paradigms such as Hadoop; (ii) standards related to Big Data Analytics Maturity Models, such as the ones developed by TIBCO, the SAS Analytics Assessment, and the IDC Big Data and Analytics (BDA) MaturityScape Framework; and (iii) the Data Science Code of Professional Conduct standards, developed by the Data Science Association.
In the same framework, I-BiDaaS addresses specific challenges within the priorities defined by the BDV SRIA and takes a step towards new and advanced standardisation procedures. These challenges include: (i) semantic annotation of unstructured and semi-structured data, (ii) data quality and (iii) Data-as-a-Service, in the data management priority; (iv) heterogeneity and scalability, in the data processing architectures priority; (v) analytics frameworks and processing and (vi) predictive and prescriptive analytics, in the data analytics priority; and (vii) interactive visual analytics of multiple-scale data and (viii) interactive visual data exploration and querying in a multi-device context, in the data visualisation and user interaction priority.
Additionally, I-BiDaaS contributes to distributed data and process mining, predictive analytics and visualization at the service of industrial decision support processes by designing a runtime environment for predictive analytics. This environment enables data scientists to develop prediction models based on the open standard Predictive Model Markup Language (PMML) and deploy them to the runtime engine. It is complemented by the design and development of the necessary visualisation tools for DS processes. Through the exploitation strategy of its partners, I-BiDaaS contributes to this effort by (a) developing a demonstrator for business continuity, proving the potential for integration into existing company processes, and (b) developing a business case, based on the big data analytics tools to be developed, that demonstrates the added value of the approach with respect to existing standards.
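As an illustration of what deploying a PMML model to a runtime engine involves, the following minimal sketch evaluates a tiny PMML linear regression model using only Python's standard library. It is not the I-BiDaaS runtime environment: the model, the field names (`load`, `temperature`, `risk`) and the `predict` helper are hypothetical, and only the `RegressionModel` element with numeric predictors is handled.

```python
import xml.etree.ElementTree as ET

# A hypothetical PMML document: risk = 1.5 + 2.0 * load - 0.5 * temperature
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header/>
  <DataDictionary numberOfFields="3">
    <DataField name="load" optype="continuous" dataType="double"/>
    <DataField name="temperature" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="load"/>
      <MiningField name="temperature"/>
      <MiningField name="risk" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="load" coefficient="2.0"/>
      <NumericPredictor name="temperature" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"p": "http://www.dmg.org/PMML-4_4"}


def predict(pmml_text, record):
    """Evaluate the first RegressionTable of a PMML RegressionModel."""
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    result = float(table.get("intercept"))
    for term in table.findall("p:NumericPredictor", NS):
        exponent = int(term.get("exponent", "1"))  # PMML default is 1
        value = record[term.get("name")]
        result += float(term.get("coefficient")) * value ** exponent
    return result


score = predict(PMML_DOC, {"load": 3.0, "temperature": 4.0})
print(score)
```

Because the model is plain XML, it can be produced by one tool and scored by another, which is the interoperability benefit of relying on an open standard.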
The Data Economy and the link to I-BiDaaS
The data economy is an integral part of the Digital Single Market strategy of the European Commission, and data is considered an essential resource for growth, competitiveness, innovation, job creation and societal progress in general. It is, however, widely acknowledged by the industry that many companies continue to struggle to turn the opportunity of big data into realized gains. There is little actual knowledge of how organisations translate the potential of big data into actual economic and social value.
Moreover, relevant to I-BiDaaS is the debate on algorithmic and human-based intelligence, and in particular the acknowledgement that, when processing and interpreting data, human actors can be influenced by, for example, time constraints and scepticism with regard to relying on data; team compositions; visualizations of input and output; relational versus analytic and evidence-based mindsets; and historical insights. To mitigate such influences, scholars and practitioners have begun to explore the potential of algorithms that are able to process big data at ever-increasing speeds.
Within this logic, the I-BiDaaS consortium reflected on the 4 steps proposed by the European Commission to leverage the potential of Big Data, as well as on the motivation behind the targeted call under H2020-ICT-2016-2017: to develop technologies that increase the efficiency of all EU companies and organisations that need to manage vast and complex amounts of data, and in particular the competitiveness of EU enterprises. In this context, the emphasis is on rigorously measured increases in performance in data processing at a very large scale.
In this respect, I-BiDaaS is expected to produce services and tools that enhance big data processing performance both for non-IT users (a comprehensive, multiple-option user interface; advanced visualizations; fabrication of high-quality Big Data for testing) and for IT users and developers (programming interfaces; a sequential programming paradigm; an open source software repository for Big Data processing tasks).
These tools are expected to increase performance at three levels:
1. the speed of data analysis procedures;
2. the usability and applicability of big data analytics tools and processes;
3. the amount of data that can be processed.
The proposed solutions will be made available through existing incubators, thus enabling SMEs, start-ups and entrepreneurs to exploit them and further accelerate their development. In a nutshell, I-BiDaaS will deliver a full array of big data business analytics solutions for structured, unstructured and noisy data, for companies in multiple industries (finance, telecom and automotive). These solutions will be more accessible, cost-effective and employee-empowering than existing ones, giving companies the confidence to deploy Big Data Self-Service solutions across the organisation, from consumer-facing employees with little IT experience or expertise to top management.