RP1: Distributed and Real-Time Machine Learning for Financial Data Analysis (WP1)


Big data has both high volume and high velocity. In banks, one way this manifests is as silos of in-situ departmental data that are very difficult to move and integrate into a single coherent customer view. Further, the ability to analyse rapidly changing customer and market data, dynamically and in near real time, is increasingly critical for competitiveness. Taking into account the distributed nature of financial data storage and the velocity of financial markets, the objective of this RP is to develop distributed and real-time machine learning methods to identify decentralised and dynamic models for financial analysis, prediction, and risk management.

This project will develop (i) methods to identify cross-effects between different data resources, regions, sectors, and markets, (ii) distributed versions of methods to identify decentralised models that include individual local model components learned from local resources and cross-impact model components learned from data resources in other regions/sectors/markets, and (iii) real-time learning methods to update decentralised models and address financial market velocity.
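
As a rough illustration of points (ii) and (iii) only, and not the project's actual method, the sketch below shows one site holding a model with a local component (weights on features observed locally) and a cross-impact component (weights on summary signals received from other sites), updated one observation at a time. The linear form, the least-mean-squares update rule, the learning rate, the synthetic data, and the names `NodeModel` and `run_demo` are all assumptions made for the example.

```python
import random


class NodeModel:
    """Hypothetical model for one site: local part plus cross-impact part."""

    def __init__(self, n_local, n_cross, lr=0.05):
        self.lr = lr                      # assumed learning rate
        self.w_local = [0.0] * n_local    # weights on locally held features
        self.w_cross = [0.0] * n_cross    # weights on signals from other sites

    def predict(self, x_local, x_cross):
        local_part = sum(w * x for w, x in zip(self.w_local, x_local))
        cross_part = sum(w * x for w, x in zip(self.w_cross, x_cross))
        return local_part + cross_part

    def update(self, x_local, x_cross, y):
        """One online least-mean-squares step on a single new observation."""
        err = y - self.predict(x_local, x_cross)
        self.w_local = [w + self.lr * err * x
                        for w, x in zip(self.w_local, x_local)]
        self.w_cross = [w + self.lr * err * x
                        for w, x in zip(self.w_cross, x_cross)]
        return err


def run_demo(steps=2000, seed=0):
    """Stream synthetic observations through one node and return it."""
    rng = random.Random(seed)
    node = NodeModel(n_local=1, n_cross=1)
    for _ in range(steps):
        x_local = [rng.uniform(-1, 1)]  # feature observed at this site
        x_cross = [rng.uniform(-1, 1)]  # summary received from another site
        # Synthetic target: local effect 2.0, cross-impact effect 0.5.
        y = 2.0 * x_local[0] + 0.5 * x_cross[0]
        node.update(x_local, x_cross, y)
    return node


if __name__ == "__main__":
    node = run_demo()
    print(node.w_local, node.w_cross)
```

In this toy setting only the summary signal crosses site boundaries, so raw local data never has to be moved, and each new observation adjusts the model immediately rather than waiting for an off-line retraining pass.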

Built on distributed and cloud computing infrastructure, this approach should address the weakness of existing data-centralised and off-line machine learning methods, which fail to account for the cost of data transport and storage and for the fast time-varying characteristics of financial markets. The originality of this approach lies in its dynamic integration, through distributed and real-time mining, to maximise the effectiveness and efficiency of big data analysis.

Early Stage Researcher working on the project: Sergio Garcia Vega

Supervisor: Professor John Keane, University of Manchester / john.keane(at)manchester.ac.uk