EDGAR MARKEVICIUS

The future of threat detection is security machine learning

2023-06-01

Data is the new oil.
Clive Humby

At the most fundamental level, security is just risk reduction. Security leaders use people, process, and technology levers to prioritize and reduce the risks facing their organizations. It is important to note that it is not feasible to eliminate risk entirely. The only sensible path forward is for organizations to adopt an assume-breach mentality.
Given enough time and resources, even the most secure organizations will fall, and their perimeter will be breached. How they detect and respond when that happens will be the difference between a data breach (and the publicity associated with it, not to mention lost customer trust) and a thwarted attack that earns the right to fight another day.

Naturally, the question arises about the effort and energy exerted to protect the organization. If we cannot prevent the breach, is it all for nothing, then?
Thankfully, all the hard work of securing applications and the associated infrastructure (cloud, enterprise footprint) is not lost in time like tears in the rain. Protective security controls and hardening measures introduce significant friction for attackers and give the organization time during the breach to detect activity (at various stages of the attack life cycle) and respond to the ongoing attack.

Companies generate more transactional data and logs on their systems than ever before. However, making sense of all this data and interconnecting data sets on distinct threats is becoming more difficult as volume, velocity, and variety expand.
Threat detection engineering faces an enormous challenge: building the predictive and analytical power to generate actionable threat insights from all this data. Traditional rule-based (tripwire) detection workflows are no longer adequate for the massive, ever-increasing volume of logs and data generated by events across the organization.

To address this growing data problem, it is critical to fundamentally shift the detection engineering mindset (and training) from rule-based, mainly stateless event detections to stateful, risk- and behavior-based detections. This shift will challenge traditional security engineers to learn and integrate new concepts from the data science, data engineering, and machine learning domains.
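To make the stateless versus stateful distinction concrete, here is a minimal sketch in Python. It is an illustration only, not the approach any particular team used: the event shape (a user and an hourly count of sensitive-record reads), the names `stateless_rule` and `BehaviorBaseline`, and the thresholds are all assumptions made up for this example. The tripwire looks at one event in isolation; the behavior-based detector remembers each user's history and flags a sharp deviation from it using a simple z-score.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical event: (user, number of sensitive-record reads in one hour).


def stateless_rule(reads: int, threshold: int = 100) -> bool:
    """Tripwire: fires on a fixed threshold and knows nothing about the user."""
    return reads > threshold


class BehaviorBaseline:
    """Stateful detector: keeps per-user history and scores deviation from it."""

    def __init__(self, min_history: int = 5, z_threshold: float = 3.0):
        self.history = defaultdict(list)  # user -> past hourly read counts
        self.min_history = min_history    # observations needed before scoring
        self.z_threshold = z_threshold    # illustrative cut-off, not tuned

    def observe(self, user: str, reads: int) -> bool:
        """Return True if this observation is anomalous for this user."""
        past = self.history[user]
        anomalous = False
        if len(past) >= self.min_history:
            mu = mean(past)
            sigma = pstdev(past) or 1.0  # avoid dividing by zero on flat history
            anomalous = (reads - mu) / sigma > self.z_threshold
        # Simplification: every observation joins the baseline, even anomalous
        # ones. A real pipeline would likely treat flagged events differently.
        past.append(reads)
        return anomalous


if __name__ == "__main__":
    detector = BehaviorBaseline()
    for reads in [4, 6, 5, 7, 5, 6]:
        detector.observe("alice", reads)
    print("tripwire fires:", stateless_rule(60))
    print("baseline fires:", detector.observe("alice", 60))
```

In this toy run, 60 reads in an hour stays under the fixed tripwire threshold, yet it is far outside this user's usual handful of reads, so only the stateful detector flags it. Production detections would use richer features and models, but the shift in mindset is the same: the signal comes from accumulated state, not from a single event.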

This is the future. I saw it firsthand recently at Twitter, where I embedded security, data science, and machine learning domain experts into the Information Security organization to develop novel threat insights for detecting threat actors targeting sensitive assets. The team shipped data pipelines, wrangled raw data and logs stored in cloud warehouses, and experimented with and deployed machine learning models to predict threat actors at scale. The ability to process and extract the right amount of information from events, interconnect and correlate them, and predict behavior-based activity is hard to achieve by relying only on third-party tooling (not to mention the privacy implications of data leaving the boundaries of your organization).

The detection fidelity, speed, and sophistication this approach achieved are unmatched by traditional rule-based alerts. The experiment was an overwhelming success, and this approach will be the future of next-generation security detection engineering at scale.
Statistical techniques, applied machine learning, data science, and data engineering will be the domain knowledge that security engineers need to succeed in the detection engineering space.

If data is the new oil, security machine learning is the new oil refinery for the future!

P.S. For those who are not scared away by the time commitment and rigor, I recommend taking a look at MIT's Statistics and Data Science MicroMasters sequence to learn the fundamentals first:

1. Probability - The Science of Uncertainty and Data (MITx)
2. Machine Learning with Python - From Linear Models to Deep Learning (MITx)
3. Fundamentals of Statistics (MITx)