Leveraging big data in equity markets

July 2020

By Peter Sumner, Portfolio Manager – Australian Equities & AREITs

One of MLC’s defining features is our belief in the importance of diversification in all its dimensions. This means diversification by asset classes, geographies, risk categories, investment approaches, listed and unlisted assets, investment managers and more.

Diversification across many dimensions means that our portfolios aren’t hostage to a single investment idea or theme for return success. Our Index Plus portfolios1 reflect this as they combine active asset allocation, active fixed income and currency management, alongside enhanced equities.

Enhanced equity investing is higher on the risk/return spectrum than index equities, and smart beta or factor investing (relative to index), but lower than fundamental equities or high alpha seeking strategies.

The enhanced equities strategy in the Index Plus portfolios has a tracking error2 of about 30-50 basis points, compared with the 200-300 basis points usually associated with fundamental active equity strategies (or even more for high alpha strategies), and aims to outperform the S&P/ASX 200 Accumulation Index by 20-30 basis points.
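To make the tracking error figures concrete, the sketch below computes an annualised tracking error in basis points. It assumes the common convention that tracking error is the standard deviation of active returns (portfolio return minus benchmark return), which this article does not spell out. The function name, the monthly frequency and the return series are illustrative assumptions, not MLC's method or data.

```python
import math
import statistics

def tracking_error_bps(portfolio_returns, benchmark_returns, periods_per_year=12):
    """Annualised tracking error in basis points.

    Assumption: tracking error is the sample standard deviation of active
    returns (portfolio minus benchmark), scaled by sqrt(periods per year).
    """
    if len(portfolio_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    active = [p - b for p, b in zip(portfolio_returns, benchmark_returns)]
    per_period = statistics.stdev(active)  # needs at least two observations
    return per_period * math.sqrt(periods_per_year) * 10_000

# Made-up monthly returns, for illustration only.
benchmark = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020]
portfolio = [0.011, -0.019, 0.014, 0.006, -0.011, 0.021]
te = tracking_error_bps(portfolio, benchmark)
```

With these made-up numbers the portfolio never strays more than 0.1% from the benchmark in any month, and the result falls inside the 30-50 basis point range quoted above.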

Some of the world’s most renowned quantitative managers have become associated with enhanced equities. However, the way these 21st century quants invest is light years away from 20th century quants.

Old-school quant managers rely on well-known value ratios such as low price-to-earnings, price-to-book and price-to-cash flow, as well as “success factors” such as price momentum, earnings revisions, earnings growth and return-on-equity.

This approach essentially assumes that the future is an extension of the recent past. This kind of extrapolation sounds more like a behavioural bias founded on human nature than science. We think that the drivers of stock prices are much more varied, complex and subtle than can be explained by well-known investment metrics.  

Think about this: over the past decade, it would have been a losing strategy to invest simply on the basis of low price-to-earnings, price-to-book and price-to-cash flow ratios. Sticking by factors because they have previously been pointers to future return success smacks of dogmatism, even naivety.

Breadth to uncover information advantages

By contrast, “systematic investing” by 21st century quants aims to uncover small information advantages by analysing oceans of new data generated each day by the Internet, smartphones, satellites and other innovations.

In his influential 1989 paper, “The fundamental law of active management,” 3 Richard Grinold related three variables: an investor’s skill in forecasting exceptional returns, the breadth of their investment strategy, and the value added by that strategy.

Rather than attempting to pick individual stocks or forecasting industry returns, 21st century quants concentrate, through research processes, on applying “breadth” in understanding how the marketplace processes new information and sets prices for different types of companies.

By focusing on a vast number of stocks in a market, they aim to capture systematic return effects that are undiscovered or ignored by many investors due to their subtlety, complexity, and size of net opportunity after transaction costs.
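The role of breadth can be illustrated with the form in which Grinold's law is usually summarised: information ratio ≈ information coefficient × √breadth, where the information coefficient measures forecasting skill and breadth is the number of independent bets per year. The formula is not quoted in this article, and the inputs below are purely illustrative assumptions rather than figures for any real manager.

```python
import math

def information_ratio(information_coefficient, breadth):
    """Approximate information ratio under the fundamental law: IR = IC * sqrt(breadth)."""
    if breadth < 0:
        raise ValueError("breadth must be non-negative")
    return information_coefficient * math.sqrt(breadth)

# Illustrative inputs only: a concentrated stock picker with higher skill per bet
# versus a broad systematic process with a much smaller edge on each bet.
concentrated = information_ratio(information_coefficient=0.10, breadth=25)
broad = information_ratio(information_coefficient=0.02, breadth=2500)
```

In this sketch the broad process ends up with the higher information ratio despite having one fifth of the skill per bet, because it applies that small edge across a hundred times as many independent bets.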

The term “systematic” is important as it conveys the idea that while investment insights (or “fundamental ideas”) originate from people (researchers and analysts), scientific and quantitative techniques are applied to determine which insights are economically valid and can be profitably exploited in a risk-controlled and cost-effective manner.

It’s a far cry from “black-box” investing or “set and forget” investing. The human element is ever-present as successful monitoring, validation and implementation require the constant attention and input of people.

Finally, this way of investing argues that technology-based, disciplined investment processes have a comparative advantage over people in consistently capturing and implementing investment insight. It allows more information to be analysed more quickly, removes emotion from decision-making, and is superior at managing the complex nature of risk and the components of transaction costs.

No need to wait for information

For 21st century quants, the world is a treasure trove of information that can be surfaced every day. It doesn’t require waiting for company announcements, official economic statistics, high profile private sector surveys, or pounding pavements on company visits.

Advances in machine learning and artificial intelligence (AI) now enable cutting-edge systematic investors to capture and measure thousands of relevant attributes in real time (Chart 1). These include consumer behaviours, product developments, company trends, differences across regions and more.

They combine all these insights to forecast stock prices and build portfolios.

Chart 1: Traditional and time-consuming research inputs vs constantly “on” information availability

How fundamental investors traditionally gain insights | How systematic investors gain insights

News, broker reports, ASX announcements, blogs | Natural language processing

Meetings with experts | Big Crowd

Field visits | Satellite and GPS

Government statistics (eg GDP, inflation, unemployment, capex intentions) and closely watched private sector surveys (eg NAB Monthly Business Survey, Westpac-Melbourne Institute Index of Consumer Sentiment) | Intentions and emotions; search trends

Source: BlackRock and MLC Investments Limited

Let’s unpack this a little more. Beyond financial data lies “big data”, which refers to the large, diverse sets of information that grow at ever-increasing rates. It encompasses the volume of information, the velocity or speed at which it is created and collected, and the variety or scope of the data points being covered.

Systematic investors are big data miners who draw on technologies like natural language processing, image recognition and machine learning to analyse that data and uncover new investment insights.

Five general categories of data — text, search, social media, images, and video4 — are especially rich sources of information.

Let’s touch on each in turn. 

1. Text: uncovering truth in language

Company executives tend to be very careful in their written and spoken communication about financial performance, business strategy and ESG issues, including in the question and answer sessions with shareholders and other market participants that follow earnings announcements, knowing that a stray word can move share prices.

At the same time, research analysts and financial journalists in Sherlock Holmes mode try to get behind executives’ every word and gesture in a “what do they really mean” hunt.

Systematic investors believe they’ve found a way of finding out what companies really mean by drawing on natural language processing (NLP) techniques. Initially their work focused on measuring sentiment through counting the use of positive versus negative words.
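The early word-counting approach can be sketched in a few lines. The word lists, the scoring formula and the two sample sentences below are hypothetical and chosen only for illustration; practical systems rely on far larger, finance-specific lexicons and more sophisticated models.

```python
import re

# Tiny hypothetical word lists, for illustration only.
POSITIVE = {"growth", "strong", "improved", "record", "confident"}
NEGATIVE = {"decline", "weak", "impairment", "loss", "uncertain"}

def net_sentiment(text):
    """Return (positive - negative) / (positive + negative) word counts.

    The score ranges from -1.0 (only negative words) to 1.0 (only positive
    words); it is 0.0 when no listed word appears.
    """
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0

# Made-up sentences, not quotes from any real company.
upbeat = "We delivered record revenue and strong growth and remain confident."
cautious = "Margins were weak, and the outlook is uncertain after the impairment."
scores = {"upbeat": net_sentiment(upbeat), "cautious": net_sentiment(cautious)}
```

A score like this ignores negation, context and who is speaking, which is one reason the approach was later refined.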

As time went on, they began to focus on the source and target audience of the text. One finding: remarks by CEOs tended to be more positive and scripted, so focusing on the CFO’s remarks or the Q&A sections of an earnings call proved consistently more useful.5

Comparing sentiment from one source (such as information given to the market, eg a press release) with another source (such as information given to a regulator, eg the US Securities and Exchange Commission, or SEC) can help to identify instances of companies’ careful public language versus hard facts (Chart 2).6

This example reveals two things. First, even very disciplined companies can inadvertently reveal truths that they may not want exposed. Sometimes, all it takes is a different context, as the SEC filing in Chart 2 makes plain.

Secondly, markets are continuously changing, and that means investment opportunities have a lifecycle.7 To keep up with new opportunities as previous information advantages fade, an investment team needs to be relentless about finding new data sources and innovating the technologies used to analyse that data.8

Chart 2: Hard truths vs careful public statements

Source: SEC EDGAR. Paul Ma, Information or Spin? Evidence from Language Differences Between 8-Ks and Press Releases, 2012.

Text: technology as a time and labour saver9

In a typical research process of a long-only investor and without the help of machine learning, a portfolio manager needs to travel to company sites to examine their operations.

They analyse the whole value chain by speaking with suppliers, competitors and customers, some of whom may be unlisted companies.  They listen to earnings calls and read research reports, which tend to be very time-consuming.

However, with the help of machine learning, the asset manager can use satellite images to examine companies’ on-the-ground activities. They can also save time by not having to go through so many texts or listen to management calls.

Instead, the machines can read the text, in multiple languages, and generate summary reports. The same logic applies to other important themes that may influence the market including consumer trends, healthcare and social governance.

Text: being efficient with sell-side analysis

Sell-side research from brokers remains important to asset managers. Fundamental investors can read through entire reports and consider the implications for stocks they own or are thinking of buying.

Sell-side research is useful for quant managers too. However, quants have traditionally only been able to use analyst information that arrives in the form of numbers — the earnings forecasts and the recommendations, which were easy to convert to a numerical scale.

It meant that the rest of the sell-side document was ignored. Now, though, they can process and interpret the entire analyst report and, for example, understand more about the analyst’s sentiment, as well as the nuances around the earnings forecast.

Analyst reports are unstructured in that each analyst writes their own view of a company without trying to fit into any industry-wide template. As a further complication, analyst reports include legal disclaimers. Although these are easy for humans to identify, they represent a bigger challenge for computers.

The computer’s edge, though, is in reading the thousands of analyst reports generated globally every day and analysing them consistently.

2. Search

For investors, one example of interesting search activity relates to people researching online before buying items ranging from smartphones to cars, TVs and clothes.

Internet search activity “data scraping” can help to predict sales. Investors have long been predicting future sales, so monitoring internet search activity provides 21st century quants with new data to improve that effort.

Want to know how many iPhones Apple will sell next quarter? You could wait for the next company announcement, but that means the information becomes a trailing indicator.

Alternatively, you can do what cutting-edge quants do and count the number of iPhone searches made on Google to get ahead of the company announcement.
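One very simple way to turn search counts into a sales estimate is a straight-line fit of past sales against past search interest. The sketch below is an assumption about how such a relationship might be modelled, not a description of any manager's actual process, and every number in it is made up. Real search data is far noisier, and the relationship can break down.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up history: a search-interest index for a product in each past quarter,
# and the units sold (in millions) later reported for that quarter.
search_index = [40, 55, 50, 70, 65]
units_sold = [20.0, 27.5, 25.0, 35.0, 32.5]

slope, intercept = fit_line(search_index, units_sold)

# Hypothetical reading for the current quarter, observed before the company reports.
latest_search_index = 80
estimated_units = intercept + slope * latest_search_index
```

The point of the design is timing: the search index is available while the quarter is still under way, so the estimate exists before the company announcement does.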

3. Social media

The third big data category of interest to investors is social media: Twitter, Facebook, LinkedIn, and so forth.

Social media is varied, as are the potential uses of social media data. Websites such as LinkedIn include data on who works for which companies, who is leaving, and who the new hires are.

Systematic investors can estimate employee sentiment from employee movements. They can estimate whether labour costs are increasing or decreasing based on the number, level, and quality of the new hires and departures.

Once again, employee sentiment and labour costs are of long-standing interest to investors. Social media simply represents a new source of data to help predict those quantities.

4 and 5. Images and video

The fourth and fifth categories are images and video. Investors currently make less use of these categories than the other three, but this will change over the next few years, especially as images and video make up an increasingly large fraction of all data.

Fundamental investors currently try to judge body language at in-person meetings with senior management.

Computers can analyse videos of senior management presentations for the same purpose and will be able to analyse all such presentations across the entire investment universe.

Systematic investing is another element of diversification

The way 21st century quants gather and analyse information and deploy it is a far cry from quantitative investing, as it’s been traditionally understood. That said, we don’t think of it as the “be-all and end-all” of investing.

Instead, we regard this way of investing as a valuable component of the diversification we offer our clients in our portfolios, including the enhanced equity part of our Index Plus portfolios.

It complements other investment approaches and leverages technology in a cost-efficient way to potentially source a modest amount of alpha in a highly risk-controlled way.

1 Index Plus portfolios means the MLC Wholesale Index Plus Conservative Growth Portfolio; MLC Wholesale Index Plus Balanced Portfolio; and MLC Wholesale Index Plus Growth Portfolio.
2 Tracking error is the relative risk of a portfolio compared to its benchmark. Tracking error can also be a comment on the performance of a portfolio manager.
3 The fundamental law of active management. Richard C. Grinold. The Journal of Portfolio Management Spring 1989, 15 (3) 30-37; DOI: Accessed 17 June 2020.
4 The future of investment management. Ronald N. Kahn. CFA Institute Research Foundation, 26 May 2018. Accessed 19 June 2020.
5 New technologies changing asset management. Jeff Shen, Raffaele Savi, Richard Mathieson, BlackRock, 28 October, 2019. Accessed 19 June 2020
6 Ibid
7 Ibid
8 Ibid
9 Examples from this point are drawn from The future of investment management. Ronald N. Kahn. CFA Institute Research Foundation, 26 May 2018.

Important information

This communication is provided by MLC Investments Limited (ABN 30 002 641 661, AFSL 230705) (MLC), Responsible Entity of the MLC Wholesale Index Plus Conservative Growth Portfolio; MLC Wholesale Index Plus Balanced Portfolio; and MLC Wholesale Index Plus Growth Portfolio and a member of the group of companies comprising National Australia Bank Limited, its related companies, associated entities and any officer, employee, agent, adviser or contractor (‘NAB Group’). An investment with MLCI does not represent a deposit or liability of, and is not guaranteed by, the NAB Group.

This information has been prepared for general information and is not intended to constitute a recommendation or advice. It has been prepared without taking account of any investor’s objectives, financial situation or needs and because of that investors should, before acting on the advice, consider the appropriateness of the advice having regard to their personal objectives, financial situation and needs.

The Product Disclosure Statement (PDS) for the MLC Wholesale Index Plus Conservative Growth Portfolio; MLC Wholesale Index Plus Balanced Portfolio; and MLC Wholesale Index Plus Growth Portfolio is available upon request by phoning 1300 738 355 or on our website at

Any opinions expressed in this communication constitute our judgement at the time of issue and are subject to change. We believe that the information contained in this communication is correct and that any estimates, opinions, conclusions or recommendations are reasonably held or made as at the time of compilation. However, no warranty is made as to their accuracy or reliability (which may change without notice) or other information contained in this communication.

This information is directed to and prepared for Australian residents only.