RP2: Divide and Conquer Deep Learning for Big Data in Finance (WP1)

03.11.2016

 

In the era of Finance Big Data, how can one conquer something so big and so vast? Management and learning in Finance Big Data should follow the holistic “Divide & Conquer” philosophy. We will develop a novel platform that supports all aspects of this philosophy, including workflow-based tools for content ingest and description; distributed storage, management and preservation; self-data organisation (SDO) for partitioning large categories of finance data; user interaction models and visualisation; large, evolving and self-adapting clouds of evolutionary feature synthesis (EFS) and classifier networks that learn hidden patterns in finance data; and services for applications that can inherently use Cloud computing environments, such as anomaly detection, detection and prediction of new trends, and risk management. Furthermore, hashing and quantisation encoding methods are common tools that exploit the divide-and-conquer approach.
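As an illustration of how hashing can serve the divide-and-conquer idea, the following minimal sketch partitions feature vectors into buckets using random-hyperplane (sign) hashing, so that each bucket could then be processed independently. This is a generic textbook technique chosen for illustration, not the project's own method; the function names `make_hyperplanes`, `hash_code` and `partition` and all parameters are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals of dimension dim (illustrative helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """Return one bit per hyperplane: 1 if the dot product is non-negative, else 0."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def partition(vectors, hyperplanes):
    """Group vector indices by hash code; each bucket is one 'divide' unit."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[hash_code(vec, hyperplanes)].append(idx)
    return dict(buckets)


if __name__ == "__main__":
    planes = make_hyperplanes(n_bits=4, dim=3)
    # Toy data standing in for finance feature vectors (assumed, not real data).
    data = [[1.0, 0.5, -0.2], [2.0, 1.0, -0.4], [-1.0, 0.3, 0.9]]
    print(partition(data, planes))
```

Vectors that point in similar directions tend to share a code, so a bucket can be handed to a separate worker; the number of bits controls how finely the data is divided.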

We aim to design novel vector quantisation and hashing techniques that surpass the state of the art, and to use them for the first time in Big Data for finance. Our current solution ranked 2nd in the 2014 MSR-Bing Image Retrieval Challenge. Smart subsampling in Big Data is naturally suited to distributed computing environments. We will study several subsampling methods by inferring the properties of finance Big Data. The main goal is to achieve highly scalable algorithms.
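To make the notion of scalable subsampling concrete, here is a minimal sketch of reservoir sampling, a standard baseline that draws a uniform sample of fixed size from a stream in a single pass without holding the stream in memory. It is offered only as a simple reference point under that assumption; the "smart" subsampling methods to be studied in the project are not specified here, and the function name `reservoir_sample` is hypothetical.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return up to k items drawn uniformly at random from an iterable, in one pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Toy stream standing in for a sequence of records (assumed, not real data).
    sample = reservoir_sample(range(1_000_000), k=5, seed=42)
    print(sample)
```

Because each worker only needs its own reservoir, the same routine can run independently on separate partitions of the data.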

The expected results comprise a novel platform that supports all aspects of this philosophy, including workflow-based tools for finance data analysis; distributed storage; management of an expanding and evolving body of finance data; use of large, evolving and self-adapting clouds of evolutionary feature synthesizers and classifier networks capable of “learning” important and relevant aspects of finance Big Data; development of novel hashing- and vector-quantisation-based techniques for data encoding and smart data sampling; and implementation of these techniques in a Grid/Cloud computing environment.

Early Stage Researcher working on the project: Adamantios Ntakaris

Supervisor: Professor Moncef Gabbouj, Tampere University of Technology / moncef.gabbouj(at)tut.fi
