The modeling of the order book is certainly one of the major challenges
for contemporary econometrics. The constant and fast worldwide growth of
electronic trading platforms, and the associated increase in the number of
traders, has led to a market of extreme complexity. Traditional trading based
on human traders has rapidly been replaced by algorithms, able to respond
and adapt to changes in the market and trade accordingly at an extraordinary
speed. As a consequence, a primary problem relates to the sheer volume of
transactions, in the sense of the number of events that take place and the amount of
information that needs to be processed. On trading platforms such as Nasdaq,
the size of the compressed raw market data (the so-called ITCH feed) is likely
to be around 25 GB for a single trading day. Obviously, the complexity of order
book modeling goes beyond theoretical financial-econometric issues but
begins with the problem of feasible and efficient data management (data
accessibility, information extraction, and fast computation of variables that
may require looking up millions of records in the files). Consider that
algorithms commonly trade at a millisecond frequency and that we very likely
observe several events per millisecond, throughout the market opening hours as
well as the pre- and post-opening hours. At first glance, order book data looks
like a massive list of events with some corresponding features, of a highly
stochastic nature: in short, chaos.
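
To make the "several events per millisecond" remark concrete, the following is
a minimal sketch of how such a count could be computed in a single streaming
pass, so that the full file never has to sit in memory. The function name and
the assumption that each event carries an integer timestamp in nanoseconds are
illustrative choices, not details given in the text.

```python
from collections import Counter


def events_per_millisecond(timestamps_ns):
    """Count events per millisecond bucket.

    `timestamps_ns` is any iterable of integer timestamps, assumed (for this
    sketch) to be expressed in nanoseconds. A generator works as well as a
    list, so records can be streamed from disk one at a time.
    """
    counts = Counter()
    for ts in timestamps_ns:
        counts[ts // 1_000_000] += 1  # 1 ms = 1,000,000 ns
    return counts


# Toy input: three events fall in millisecond 1, one in 2, one in 5.
toy_timestamps = [1_000_000, 1_200_000, 1_999_999, 2_000_000, 5_500_000]
counts = events_per_millisecond(toy_timestamps)
busiest = max(counts.values())  # largest number of events in one millisecond
```

On real data the same loop would be fed by a reader that decodes the feed
record by record; only the dictionary of bucket counts is kept in memory.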
Traders (whether natural persons or not) access the market by submitting
orders. An order expresses the willingness of the trader to buy or sell a
given quantity of a share at a given price. Orders can be submitted against
the best available price or not: we speak of market and limit orders,
respectively. The purpose of market orders is immediate trading (e.g. a buy
market order gets matched with the outstanding best sell orders, leading to a
trade). Limit orders, on the other hand, are not matched automatically, since
they are submitted at an arbitrary price that is not the best one. The trader
therefore holds the (rational) belief that the market will move to the point
where the submitted limit order reaches the current best price level and gets
traded against a market order. Clearly, the mechanism of order submission is
highly stochastic, since each trader places orders based on his own beliefs
(e.g. different algorithms): prices, quantities, bid or ask side and frequency
of submission are completely up to the trader. A submitted order can also be
canceled, and the cancellation can be either total (all the submitted quantity
is removed) or partial. The same applies to trades: the market order quantity
does not necessarily equal the quantity available to match, so a trade occurs
but the remaining quantity stays in the queue. Order submissions, trades and
cancellations are the three types of events that affect the order book. The
time-ordered list of events corresponds to the so-called message book. The
snapshot of the outstanding limit orders at a given moment instead depicts the
so-called order book state. The algorithms are driven by sophisticated
machine-learning techniques to submit the three types of messages mentioned
above, competing with each other to get a trade and submitting orders and
cancellations based on some optimality criteria.
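
The relation between the message book and the order book state can be
illustrated with a small sketch that replays a time-ordered list of messages
and reports the resulting snapshot. Everything here is a simplifying
assumption made for illustration: the class and function names, the tuple
layout of the messages, and the matching rule (best price first, then oldest
order first). Real exchange feeds and matching engines are considerably
richer.

```python
class OrderBook:
    def __init__(self):
        # order_id -> [side, price, remaining_qty]; dict insertion order
        # stands in for time priority among resting limit orders.
        self.orders = {}

    def submit(self, order_id, side, price, qty):
        """Add a limit order to the book."""
        self.orders[order_id] = [side, price, qty]

    def cancel(self, order_id, qty=None):
        """Total cancel if qty is None (or covers the rest), else partial."""
        order = self.orders[order_id]
        if qty is None or qty >= order[2]:
            del self.orders[order_id]
        else:
            order[2] -= qty

    def best(self, side):
        """Best price on a side: highest bid or lowest ask (None if empty)."""
        prices = [p for s, p, _ in self.orders.values() if s == side]
        if not prices:
            return None
        return max(prices) if side == "buy" else min(prices)

    def market(self, side, qty):
        """Match a market order against the opposite side; return the fills."""
        opposite = "sell" if side == "buy" else "buy"
        fills = []
        while qty > 0:
            price = self.best(opposite)
            if price is None:
                break  # nothing left to trade against
            # Oldest resting order at the best price.
            oid = next(i for i, (s, p, _) in self.orders.items()
                       if s == opposite and p == price)
            resting = self.orders[oid]
            traded = min(qty, resting[2])
            fills.append((price, traded))
            qty -= traded
            resting[2] -= traded
            if resting[2] == 0:
                del self.orders[oid]  # fully executed
        return fills

    def state(self):
        """Snapshot: outstanding quantity per price level on each side."""
        levels = {"buy": {}, "sell": {}}
        for side, price, qty in self.orders.values():
            levels[side][price] = levels[side].get(price, 0) + qty
        return levels


def replay(messages):
    """Apply a time-ordered message list; return the book and all fills."""
    book, trades = OrderBook(), []
    for kind, *args in messages:
        if kind == "submit":
            book.submit(*args)
        elif kind == "trade":
            trades += book.market(*args)
        elif kind == "cancel":
            book.cancel(*args)
    return book, trades


messages = [
    ("submit", 1, "sell", 10.2, 100),
    ("submit", 2, "sell", 10.1, 50),
    ("submit", 3, "buy", 10.0, 80),
    ("trade", "buy", 70),  # consumes order 2, then part of order 1
    ("cancel", 3, 30),     # partial cancel: 50 remain on order 3
    ("cancel", 1),         # total cancel of what is left of order 1
]
book, trades = replay(messages)
# trades       -> [(10.1, 50), (10.2, 20)]
# book.state() -> {"buy": {10.0: 50}, "sell": {}}
```

The linear scans keep the sketch short; an implementation meant for millions
of messages would index orders by price level instead.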
In this framework it is clear that the mathematical modeling of the dynamics of
the order book states and features is a non-trivial problem. There is no unique
approach to tackling this problem. There are analytic models relying on some
market-based assumptions, as well as entirely data-driven models that exploit
machine learning techniques to predict the short-term dynamics of the book.
Interestingly, there are also models that rely on chaos theory (using
techniques commonly applied in the natural sciences to describe phenomena
ruled by randomness only, such as particle dynamics in gases) to characterize
the chaotic and apparently illogical behavior of the book, giving up the search
for logical patterns and for financially justified hypotheses on which to build
these models.
While the number of agents trading in the order book implies a large
population of algorithm developers who attempt, almost exclusively with
machine learning techniques and more or less successfully, to predict its
future dynamics, financial-econometric modeling remains a non-saturated
research field that gives researchers many directions to explore.
Being part of the BigData finance group and specifically working on such a
contemporary and challenging topic, rather than making me feel alone in a
desolate desert of high complexity and apparently irrational randomness, makes
me feel excited about the opportunities this research area provides. Since I
started this journey I have had to develop a number of cross-disciplinary
skills (e.g. cloud computing, big data analytics, machine learning techniques,
econometric theory for point processes) that contemporary researchers must
master to deal with this kind of problem. Although far from being an expert, I
sense that this set of multidisciplinary skills will be required of the future
data analyst and scientist, and I am glad to have the opportunity to work in
this direction within the BigData finance network.
Martin Magris is based at Tampere University of Technology (2016-2019). His research project is "Order Books Dynamics and Announcement Effects during Financial Crisis" (WP3).