Model-Based Testing of Big Data Systems (MBT4BID)

Big Data has become an important driver of innovation, competition, and growth in industries such as health, administration, agriculture, defense, and education. Developing Big Data systems, however, is not trivial and relies on disruptive technologies such as Cloud Computing, the Internet of Things, and Data Analytics. Big Data systems are inherently distributed systems consisting of multiple heterogeneous nodes. As such, architecting Big Data systems requires explicitly dealing with concurrency, reliability, consistency, time performance, and replication concerns. As systems grow to include thousands of processing nodes and disks, geographically distributed across data centres, designing scalable Big Data systems that meet these quality concerns becomes a serious problem.

An important challenge for Big Data systems is ensuring that they conform to their defined specifications. A popular approach for checking whether a system meets its specification and intended purpose is software testing. Testing a system requires executing test cases that can detect potential defects in the program. In general, exhaustive testing is neither possible nor practical for most real programs because of the large number of possible inputs and sequences of operations, and this holds in particular for Big Data systems. Because the set of possible tests is so large, only a selected subset can be executed within feasible time limits. As such, the key challenge of testing is how to select the tests that are most likely to expose failures in the system. In this project we aim to address the challenges of testing Big Data systems. In particular, we will focus on providing a systematic approach for architecture-driven, model-based testing of Big Data systems.