Machine Learning

Machine learning has evolved from pattern recognition and matching to the self-adaptation of models to new data; related, but different, paradigms are data mining and Bayesian analysis. In essence, what is learned from previous computations is used to generate well-founded and repeatable results. Specifically, it is possible to cheaply and almost automatically produce models that can analyze large datasets and provide accurate information for decision-making.

Types of Learning

Machine learning aims at getting better at some task through practice; algorithms and paradigms can be classified as:

  • Supervised learning: the algorithm is trained with a set of examples of correct responses (i.e., inputs and their expected outputs).
    • K-Nearest Neighbours
    • Naive Bayes Classifier
    • Regression
    • Support Vector Machines
    • Decision Trees
    • Neural Networks
  • Unsupervised learning: the algorithm attempts to identify similarities between the inputs (i.e., clustering).
    • K-Means Clustering
    • Self-Organizing Maps
    • Neural Networks
  • Reinforcement learning: the algorithm is told when its answers are wrong but not how to correct them; instead, punishments and rewards are employed to orient the algorithm.
    • Monte Carlo
    • Deep Neural Networks
    • Value Function
  • Evolutionary learning: candidate solutions evolve as in biological evolution and are selected according to their fitness on the data; thus, encoding the problem in a suitable biological model is the key.
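
To make the supervised case concrete, the following is a minimal sketch of a K-Nearest Neighbours classifier in plain Python. The toy data, the labels, and the function name knn_predict are made up for illustration; a real application would normally rely on a tested library implementation.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the labelled examples by Euclidean distance to the query point.
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Keep the labels of the k closest examples and return the most common one.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy labelled data (hypothetical): inputs paired with their correct responses.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (5.1, 5.0)))
```

The "training" here amounts to storing the labelled examples, which is what distinguishes this setting from the unsupervised one, where no labels are available.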

A nice and free introduction is available here.

Differences with Statistical Modeling

In general, machine learning relies on automatic and dynamic model selection with unstructured data for which the data generation process is unknown; it is data-driven and focuses on finding structure. Classic statistical approaches, instead, are more suitable for data that can be assumed to be generated by a model.

                | Statistical Inference                | Machine Learning
Goal            | Causal models with explanatory power | Prediction performance, often with limited explanatory power
Data            | The data is generated by a model     | The data generation process is unknown
Framework       | Probabilistic                        | Algorithmic and Probabilistic
Expressibility  | Typically linear                     | Non-linear
Model Selection | Based on information criteria        | Numerical Optimization
Scalability     | Limited to lower-dimensional data    | Scales to high-dimensional input data
Robustness      | Prone to over-fitting                | Designed for out-of-sample performance

The Economic Impact of Machine Learning

Artificial Intelligence (AI) and machine learning do not target only robotics and automation; indeed, the Industry 4.0 standard implies intertwined roles for humans and AI systems. The business applications are numerous, ranging from healthcare (e.g., clinical, analytical, and technical solutions) and insurance (e.g., information about a population’s health, quality and cost assessments) to manufacturing, government, and financial markets. For instance, binary logistic regression makes it possible to identify predictors and develop risk-scoring algorithms that determine which course of action is best for the individual considered (patient, customer, or both). In some instances, machine learning is employed to obtain a solution based only on data and mathematical models, without the bias introduced by heuristic rules derived from human experience. Other relevant applications include recommendation engines (e.g., Netflix, Amazon), sentiment analysis for marketing, fraud detection, demand forecasting for flexible pricing, and churn prediction.
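
As an illustration of the risk-scoring idea, here is a minimal sketch of binary logistic regression fitted by batch gradient descent. The two features (age divided by 100 and a smoker flag), the eight records, and the function names are hypothetical and chosen only to keep the example small.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi                      # derivative of the log-loss w.r.t. the score
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def risk_score(w, x):
    """Predicted probability of the adverse outcome for one individual."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical records: (age / 100, smoker flag) and outcome (1 = adverse event).
X = [(0.2, 0), (0.3, 0), (0.4, 0), (0.6, 0), (0.3, 1), (0.5, 1), (0.6, 1), (0.7, 1)]
y = [0, 0, 0, 1, 0, 1, 1, 1]

w = fit_logistic(X, y)
print("young non-smoker:", round(risk_score(w, (0.2, 0)), 3))
print("older smoker:    ", round(risk_score(w, (0.7, 1)), 3))
```

The fitted weights play the role of the predictors' contributions, and the score can then be thresholded or ranked to choose a course of action.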

The Growth-Optimal Portfolio

In general, automated algorithms (e.g., scanners) based on machine learning allow for sequential investment strategies that rest on traditional principles (e.g., the Kelly criterion, fundamental ratios, factors) but are more powerful in maximizing the available capital. With respect to portfolio management, common theoretical results involve models of a single asset over a single period, with the underlying modeled as a stochastic process. However, the most effective models are more complex (i.e., multi-factor and multi-asset over multiple periods) and can profit from optimizations (e.g., rebalancing between assets without excessive fees) achievable with machine learning techniques. In other terms, machine learning makes it possible to find the growth-optimal portfolio (GOP), that is, the portfolio with the maximum expected growth rate over any horizon. Moreover, machine learning does not require any assumption about the probability distributions of the assets involved.
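
The growth-optimal idea can be illustrated with the simplest textbook case of the Kelly criterion: a repeated binary bet with known win probability p and net odds b, where the fraction f of capital staked is chosen to maximize the expected log growth p·log(1 + b·f) + (1 − p)·log(1 − f). The sketch below is only an assumption-laden toy (the probability and odds are hypothetical and taken as known, unlike the data-driven setting described above); it compares the closed-form fraction with a brute-force search.

```python
import math

def expected_log_growth(f, p, b):
    """Expected log growth per bet when staking fraction f of capital."""
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)

def kelly_fraction(p, b):
    """Closed-form Kelly fraction for a binary bet (clipped at 0: do not bet)."""
    return max(0.0, p - (1.0 - p) / b)

def grid_search_fraction(p, b, steps=1000):
    """Numerically maximize the expected log growth over f in [0, 1)."""
    candidates = [i / steps for i in range(steps)]  # f = 1 excluded (log of 0)
    return max(candidates, key=lambda f: expected_log_growth(f, p, b))

p, b = 0.55, 1.0  # hypothetical: 55% win probability at even odds
f_closed = kelly_fraction(p, b)
f_grid = grid_search_fraction(p, b)
print("closed form:", f_closed, "grid search:", f_grid)
print("growth at f*:", expected_log_growth(f_closed, p, b))
print("growth at 2f*:", expected_log_growth(2 * f_closed, p, b))
```

Staking more than the optimal fraction lowers the expected growth rate, which is why over-betting a favourable game can still erode capital in the long run.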

Algorithmic Trading

Before 2009, algorithmic trading (AT) was not so common; indeed, automated systems were mainly used by large firms like Goldman Sachs, Morgan Stanley, Citicorp, Credit Suisse, and UBS to manage their orders, minimize market impact, optimize execution, slice large orders, etc. Today, however, computerized algorithmic trading strategies are commonly employed and offered to customers with different objectives. For instance, Credit Suisse CrossFinder+, Deutsche Bank’s Autobahn, and Fidelity’s FCM were liquidity-seeking strategies. Similarly, hedge funds deployed various algorithmic techniques to optimize their strategies and achieve some form of competitive advantage. As a result, algorithmic trading overlapped with quantitative trading and incorporated technical analysis and fundamental data to automate procedures and process large amounts of information. Machine learning is (sometimes) successfully employed to calibrate the parameters of trading strategies and optimize their results. An example of a retail platform providing semi-automated strategies that can be modified and customized is MetaTrader.

Another approach requires standard programming languages (C, C++, R, or Python) and customized libraries to develop ad hoc frameworks and software. Below, the same strategy (i.e., trading the gold ETF) is implemented using Decision Tree and KNN as algorithms; a simple class represents the strategy, whereas the remaining code is dedicated to the diagnostics and analysis of the behavior of the machines. While this approach is more flexible and extensible, it is also expensive in terms of dedication, maintenance, and skills; nevertheless, most of the code can be applied to multiple domains and problems.
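
What follows is a minimal sketch of what such a strategy class might look like; it is not the original implementation. It assumes synthetic random-walk prices in place of gold ETF data, covers only the KNN variant (hand-written, without any library), and uses hypothetical names such as KNNDirectionStrategy and synthetic_prices.

```python
import math
import random
from collections import Counter

class KNNDirectionStrategy:
    """Predict next-day direction (+1 up, -1 down) from the last `lags` returns."""

    def __init__(self, lags=3, k=5):
        self.lags = lags
        self.k = k            # odd k avoids ties between the two labels
        self.features = []
        self.labels = []

    @staticmethod
    def returns(prices):
        """Simple one-period returns of a price series."""
        return [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]

    def fit(self, prices):
        """Store (lagged returns, next direction) pairs from the training prices."""
        r = self.returns(prices)
        self.features, self.labels = [], []
        for t in range(self.lags, len(r)):
            self.features.append(tuple(r[t - self.lags:t]))
            self.labels.append(1 if r[t] > 0 else -1)
        return self

    def predict(self, recent_returns):
        """Majority vote among the k stored patterns closest to the recent returns."""
        x = tuple(recent_returns[-self.lags:])
        nearest = sorted(zip(self.features, self.labels),
                         key=lambda fl: math.dist(fl[0], x))[:self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]

    def backtest(self, prices):
        """Trade the predicted direction each day; return the cumulative return."""
        r = self.returns(prices)
        equity = 1.0
        for t in range(self.lags, len(r)):
            signal = self.predict(r[t - self.lags:t])
            equity *= 1.0 + signal * r[t]   # long if +1, short if -1; no costs
        return equity - 1.0

def synthetic_prices(n, seed):
    """Hypothetical price path: geometric random walk with small drift."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0002, 0.01)))
    return prices

prices = synthetic_prices(600, seed=42)
train, test = prices[:400], prices[400:]        # fit on the past, evaluate afterwards
strategy = KNNDirectionStrategy(lags=3, k=5).fit(train)
print("out-of-sample return:", strategy.backtest(test))
```

Since the prices are a random walk, no persistent edge should be expected from this run; the point is only the structure (fit on a training window, evaluate out of sample), into which a decision tree or another classifier could be swapped by replacing the predict method.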



Relevant Posts and Pages

Sentiment Analysis

In finance, sentiment refers to the measurement of investors’ excessive confidence, either positive or negative, in a specific situation. It is studied mainly because psychologists have identified sentiment as a relevant factor affecting the judgement and decisions of investors.