How Machine Learning Is Used In Cyber Security

Last Updated on 7 June 2023 by admin

As cyber criminals utilise new technology in their attacks, information security professionals must also adapt and implement new methods in their cyber defence. This game of cat and mouse means that cyber security is always at the cutting edge of technology.

In recent years, machine learning has been used in cyber security to predict and identify attacks as they happen. However, cybercriminals are also utilising machine learning to hide their malware or launch the most convincing phishing campaigns. In this two-part blog, we will look at how machine learning plays a crucial role in cyber security and cyber-attacks.

In this first part, we explore ways in which machine learning is used in cyber defence. Although machine learning should play a crucial role in any cyber defence strategy, recent reports suggest that GCHQ is not utilising it enough.

What is machine learning?

Machine learning models learn underlying patterns from sets of data so that they can predict what future data should look like or classify data into groups. In the context of cyber security, this could mean classifying network traffic as malicious or normal.
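To make the idea of classification concrete, here is a minimal sketch of a nearest-centroid classifier written with the Python standard library. The two features (connections per minute and mean bytes per packet) and all of the training values are invented for illustration; a real system would use many more features, scale them, and train on far more data.

```python
from math import dist
from statistics import mean

# Hypothetical labelled examples: (connections per minute, mean bytes per packet).
TRAINING = {
    "normal": [(12, 540), (9, 610), (15, 480), (11, 575)],
    "malicious": [(220, 64), (310, 70), (180, 58), (260, 66)],
}

def train(samples_by_label):
    """Learn one centroid (the mean feature vector) for each class."""
    return {
        label: tuple(mean(column) for column in zip(*samples))
        for label, samples in samples_by_label.items()
    }

def classify(centroids, sample):
    """Assign the sample to the class whose centroid is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], sample))

centroids = train(TRAINING)
print(classify(centroids, (240, 60)))  # resembles the malicious examples
print(classify(centroids, (10, 600)))  # resembles the normal examples
```

The "learning" here is simply averaging the examples for each class, but the shape is the same as in larger models: fit on labelled data, then predict a label for traffic that has not been seen before.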

How is machine learning used in cyber security?

Networks generate a lot of traffic, too much for any single team of security professionals to analyse meaningfully. Machine learning models can learn the underlying trends in this data, allowing predictions to be made about the future, such as how malware is likely to change. These models are also excellent at classifying threats and differentiating between malicious and normal network traffic.

Machine learning can also be used to detect whether a person has clicked on a malicious link that may launch a phishing attack, or whether a website is hosting malicious content, by analysing the content on the site.

Below we provide some important examples of how machine learning can support your cyber security strategy.

Threat detection of malware

As a particular piece of malware becomes more familiar to the security industry, antivirus software becomes aware of it and can easily identify its signature. This heavy reliance on previous knowledge means that small variations in the malware can enable it to avoid detection. Machine learning models can take previously known malware and learn its underlying characteristics. From this, a model can detect the malware even after it has been altered to avoid detection. Many antivirus solutions implement this kind of approach, often described as heuristic detection. Researchers have reported accuracy of over 85% using this method.

The power of machine learning comes from the fact that it can detect new types of malware that have not been seen before. It can also detect known threats with a higher degree of accuracy than previous tools were capable of. This is because machine learning utilizes huge amounts of data to analyse the behaviour of a file, rather than relying on human experts to recognise malicious code.
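The difference between signature matching and feature-based detection can be sketched in a few lines. The example below is a simplified stand-in rather than a real antivirus engine: the "samples" are made-up byte strings, the signature is a SHA-256 hash, and similarity over byte 4-grams stands in for the features a trained model would learn from a large corpus. The threshold of 0.5 is an arbitrary assumption.

```python
import hashlib

def signature(payload: bytes) -> str:
    """Exact-match signature: any change to the payload changes the hash."""
    return hashlib.sha256(payload).hexdigest()

def ngrams(payload: bytes, n: int = 4) -> set:
    """Set of overlapping byte n-grams, used here as simple features."""
    return {payload[i:i + n] for i in range(len(payload) - n + 1)}

def similarity(a: bytes, b: bytes) -> float:
    """Jaccard similarity between the n-gram sets of two payloads."""
    grams_a, grams_b = ngrams(a), ngrams(b)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0

# Made-up stand-ins for file contents.
KNOWN_SAMPLE = b"connect(c2.example);download(payload);encrypt(files);demand(ransom)"
VARIANT = b"connect(c2.example);download(payload);sleep(5);encrypt(files);demand(ransom)"
BENIGN = b"open(report.docx);spellcheck();save(report.docx);close()"

KNOWN_SIGNATURES = {signature(KNOWN_SAMPLE)}
THRESHOLD = 0.5  # arbitrary cut-off chosen for this illustration

def signature_match(payload: bytes) -> bool:
    return signature(payload) in KNOWN_SIGNATURES

def similarity_match(payload: bytes) -> bool:
    return similarity(payload, KNOWN_SAMPLE) >= THRESHOLD

for name, payload in [("variant", VARIANT), ("benign", BENIGN)]:
    print(name, signature_match(payload), similarity_match(payload))
```

The altered variant no longer matches the stored signature, but it still shares most of its features with the known sample, so the similarity check flags it while leaving the benign file alone.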

Phishing page and URL detection

Phishing attacks are an extremely common and successful way of stealing a victim’s credentials. A website is crafted to look like the target site, such as a fake banking application, with the aim of tricking a user into entering their credentials. URLs leading to phishing sites are often embedded in web applications, waiting for a user to click on them and enter their sensitive data. Machine learning algorithms are able to analyse the URL and classify it as malicious or benign. Other attributes such as geolocation, website contents and word analysis can improve the accuracy of the prediction.
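As a rough sketch of what analysing a URL can look like, the example below extracts a handful of features and combines them with a logistic function. The feature set, the keyword list, the weights and the bias are all hypothetical placeholders chosen by hand; in a real classifier they would be learned from labelled URLs.

```python
import ipaddress
import math
from urllib.parse import urlsplit

# Illustrative keyword list, not a vetted indicator set.
SUSPICIOUS_WORDS = ("login", "verify", "secure", "update", "account")

def is_ip_address(host: str) -> bool:
    """True if the host is a raw IP address rather than a domain name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False

def extract_features(url: str) -> dict:
    """Turn a URL into numeric features a model could consume."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    ip_host = is_ip_address(host)
    return {
        "long_url": float(len(url) > 75),
        "ip_host": float(ip_host),
        "many_subdomains": float(not ip_host and host.count(".") >= 3),
        "has_at_sign": float("@" in url),
        "no_https": float(parts.scheme != "https"),
        "suspicious_words": float(sum(w in url.lower() for w in SUSPICIOUS_WORDS)),
    }

# Hand-picked placeholder weights; a trained model would learn these.
WEIGHTS = {
    "long_url": 0.5,
    "ip_host": 2.0,
    "many_subdomains": 1.0,
    "has_at_sign": 1.5,
    "no_https": 1.0,
    "suspicious_words": 0.8,
}
BIAS = -2.0

def phishing_probability(url: str) -> float:
    """Logistic score between 0 and 1; higher means more suspicious."""
    features = extract_features(url)
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1 / (1 + math.exp(-z))

for url in ("https://www.example.com/",
            "http://192.0.2.10/secure-login/verify-account"):
    print(f"{phishing_probability(url):.2f}  {url}")
```

Attributes such as geolocation or page content would simply become additional entries in the feature dictionary.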

There are two main challenges for machine learning algorithms when detecting phishing pages:

  • Firstly, if a page looks genuine, it is difficult to identify it as malicious. Subtle cues matter here, such as the lack of “https” in the URL. A human usually sees this as a clear indicator that the website may not be genuine or safe, whereas an algorithmic approach may overlook it.
  • Secondly, while certain patterns and indicators can be used to distinguish a genuine page from a compromised page, they are context-specific and may not apply to every case. For example, a logo at the top of a genuine website might appear slightly different on a malicious page. This can make it hard for automated detection algorithms to identify which page is genuine and which is fake.

Bot detection

One of the most notorious and devastating DDoS attacks was due to the Mirai botnet, which used thousands of devices to perform large-scale attacks. One way of preventing such an attack from happening again is to analyse the traffic of all devices on a network. When a botnet attack occurs, the traffic will deviate from standard use. Some machine learning models have reportedly achieved over 99.9% accuracy with this anomaly-based detection.
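A minimal sketch of anomaly-based detection is shown below, assuming we track a single number per device (requests per minute) and flag values that sit far from that device's historical baseline. The history values and the threshold of three standard deviations are assumptions made for illustration; production systems typically model many signals at once.

```python
from statistics import mean, stdev

def fit_baseline(history):
    """Summarise normal behaviour as a mean and standard deviation."""
    return mean(history), stdev(history)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical requests per minute for one device during normal operation.
history = [42, 38, 45, 40, 39, 44, 41, 43]
baseline = fit_baseline(history)

print(is_anomalous(43, baseline))   # within the usual range
print(is_anomalous(900, baseline))  # sudden flood of requests
```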

User behaviour analysis

Similar to bot detection, the day-to-day traffic of a user can be monitored. If there’s a large anomaly in a user’s activity, this may indicate a compromised account. A standard user’s network footprint will be varied and complex due to the wide range of applications in use. As such, the false positive rate may be high. Even so, this allows security teams to be notified and to respond to the potential threat.

A frequently raised issue is the ethics of mass surveillance of employees. Students in Australia have voiced privacy concerns over software intended to analyse their actions during examinations from home. Such software should strike a balance between user privacy and effective detection of malicious activity. Data anonymization may help alleviate some of the concerns.
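One common building block is to replace user identifiers with keyed hashes before the data reaches the analytics system. Strictly speaking this is pseudonymisation rather than full anonymisation, since whoever holds the key can recompute the mapping. The sketch below assumes a hypothetical event format and a placeholder key; it is not a complete privacy solution.

```python
import hashlib
import hmac

# Placeholder key for illustration; in practice it would be stored securely
# and kept separate from the analytics system.
SECRET_KEY = b"replace-with-a-key-kept-outside-the-analytics-system"

def pseudonymise(user_id: str) -> str:
    """Map a user identifier to a stable token using HMAC-SHA256."""
    digest = hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

# Hypothetical activity events.
events = [
    {"user": "alice@example.com", "action": "login"},
    {"user": "bob@example.com", "action": "file_download"},
    {"user": "alice@example.com", "action": "file_upload"},
]

pseudonymised = [{**event, "user": pseudonymise(event["user"])} for event in events]
for event in pseudonymised:
    print(event)
```

Because the same user always maps to the same token, behavioural models can still build a per-user baseline without seeing the real identity.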

Optimising the human analysis

Machine learning is not removing the need for security analysts. Instead, it is completing the easier tasks and empowering security analysts to draw from the highest quality data they can. For example, machine learning models are able to generalise trends from logs and highlight points of interest for the security analyst. Another common issue for security analysts is alarm fatigue. This phenomenon results from repeatedly seeing false positive threats, meaning that when a legitimate threat arises the analysts are not mentally prepared to deal with it. Giving the analyst higher quality data and reducing the noise can help to reduce this fatigue.
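A simple way to picture noise reduction is a triage step that collapses repeated alerts and ranks what remains. In the sketch below, the alert fields, the scores (imagined as the output of a model) and the minimum score of 0.3 are all hypothetical.

```python
from collections import defaultdict

# Hypothetical alerts; "score" stands in for a model's estimate of severity.
alerts = [
    {"rule": "port-scan", "source": "10.0.0.5", "score": 0.35},
    {"rule": "port-scan", "source": "10.0.0.5", "score": 0.40},
    {"rule": "port-scan", "source": "10.0.0.5", "score": 0.30},
    {"rule": "credential-stuffing", "source": "10.0.0.9", "score": 0.92},
    {"rule": "dns-tunnel", "source": "10.0.0.7", "score": 0.15},
]

def triage(alerts, min_score=0.3):
    """Merge duplicate alerts, drop low-scoring ones, and rank the rest."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert["rule"], alert["source"])].append(alert["score"])
    summary = [
        {"rule": rule, "source": source, "count": len(scores), "score": max(scores)}
        for (rule, source), scores in groups.items()
        if max(scores) >= min_score
    ]
    return sorted(summary, key=lambda item: item["score"], reverse=True)

for item in triage(alerts):
    print(item)
```

Five raw alerts become two items for the analyst, with the highest-scoring one first.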


It is evident that machine learning is being used effectively in cyber security in a wide range of applications. But how are criminals using the same technology to their advantage? In the next article, we will explore how machine learning is used in malicious contexts.

Frequently Asked Questions

Can machine learning algorithms detect zero-day attacks?

Yes, machine learning algorithms can help detect zero-day attacks. While traditional signature-based approaches struggle to identify new and unknown threats, machine learning models can analyze large datasets and identify patterns and characteristics that may indicate a zero-day attack. By continuously learning from new data, they can flag previously unseen attacks so that these can be mitigated.

Is machine learning only used for detecting threats?

No, machine learning is not only used for detecting threats. While threat detection is a significant application of machine learning in cyber security, machine learning algorithms are also employed in areas such as fraud detection, user behavior analytics, and vulnerability management. Machine learning has a wide range of applications that enhance overall cyber security posture.

Are there any limitations to using machine learning in cyber security?

Yes, there are some limitations to using machine learning in cyber security. One common challenge is the potential for false positives and false negatives. Machine learning models may occasionally misclassify benign activities as malicious or fail to detect sophisticated attacks. Additionally, machine learning models require regular updates and fine-tuning to adapt to evolving threats, which can be time-consuming and resource-intensive.
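False positives and false negatives are usually tracked with a few standard metrics. The sketch below computes them from a confusion matrix; the counts are invented for illustration and do not come from any real detector.

```python
def detection_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-matrix counts.

    tp: malicious events correctly flagged
    fp: benign events wrongly flagged (false positives)
    fn: malicious events missed (false negatives)
    tn: benign events correctly ignored
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }

# Invented counts for 10,000 events, of which 50 are truly malicious.
metrics = detection_metrics(tp=40, fp=60, fn=10, tn=9890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example the accuracy is 99.3%, yet only 40% of the alerts raised are genuine threats and one in five attacks is missed. This is why a single accuracy figure says little about how a model performs when malicious events are rare.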

How can organizations integrate machine learning into their existing cyber security infrastructure?

Organizations can integrate machine learning into their existing cyber security infrastructure by leveraging specialized machine learning tools and platforms. These tools often provide pre-built models and algorithms that can be customized to the organization’s specific needs. Additionally, organizations can collaborate with data scientists and machine learning experts to develop custom models tailored to their unique security requirements.

What are the future prospects of machine learning in cyber security?

The future prospects of machine learning in cyber security are highly promising. As cyber threats continue to evolve, organizations need advanced tools and techniques to combat them effectively. Machine learning, with its ability to analyze vast amounts of data and learn from it, will play a crucial role in future cyber security strategies. Moreover, advancements in areas such as deep learning and natural language processing are likely to further enhance the capabilities of machine learning in cyber security.