Artificial Intelligence vs. Machine Learning in Cybersecurity

What’s the first thing you think of when you hear “artificial intelligence,” “machine learning” or “deep learning”? If you’re thinking about sentient robots that overtake cities or super cyborgs that surpass human intelligence, you may have a few misconceptions about what artificial intelligence (AI), machine learning (ML) and deep learning (DL) are and how they’re used.

AI, ML and DL are already used in a variety of capacities and have the potential to increase productivity across fields. In cybersecurity, ML is the term most commonly used for the practical applications of AI. In this piece, we explore these current and potential applications within the cybersecurity sphere: how they work as well as their pros and cons. We’ll also briefly cover what sets AI, ML and DL apart from one another.

What Are the Differences Between Artificial Intelligence and Machine Learning?

AI, ML and DL are often confused with each other. They share many traits because deep learning is a subset of machine learning, which is in turn a subset of artificial intelligence. Therefore, whatever is DL is also AI, but what’s AI is not necessarily DL. Not confusing at all, right?

The overarching theme or goal of artificial intelligence is to create computer programs that have the ability to perform intelligent, human-like functions. There are elements that differentiate ML and DL from AI that we will explore further in the sections below. Bear in mind that there are varying opinions across the tech and science communities. We did our best to synthesize these theories and beliefs to provide a high-level (not too in-depth) view of the topic.

How Does Artificial Intelligence Work?

As mentioned above, artificial intelligence is computer programming with the ability to work as a human brain does. This is an incredibly difficult feat and most would agree that we’re far from reaching that level of programming. Due to the foggy boundaries that define AI, it’s hard to say when the first AI programming was conceived, but the interest in finding the connection between human and machine intelligence can be traced back to the late 1940s into the early 1950s.

One big development in AI was John McCarthy’s creation of the LISP (list processing) language in 1958. This high-level language is still used today by some who work with AI. Thus far, one of the best-known attempts to embody the idea of a programmed humanoid is Sophia, the AI robot who made waves when “she” debuted in 2016.

There are still parts of the human brain that we’re trying to understand, making perfect replication impossible at this time. The more useful and everyday occurrences of AI include things like:

Speech recognition and understanding languages
Smart home devices that can be set to function independently
Travel assistance with updates to mapping systems that use AI to scan roads and create the most efficient route possible

How Does Machine Learning Work?

Machine learning is an approach to the science of artificial intelligence. What sets ML apart from other types of programming is its ability to learn from large amounts of data, using human-built algorithms, in order to accomplish tasks. These algorithms help a program learn and adapt to new data so that the machine can think and act more like a human. ML can also be thought of as a type of data mining since it processes large amounts of data.

The various algorithms that are used can be grouped under:

Supervised learning — the computer is given parameters or examples to compare data inputs against.
Unsupervised learning — the computer is fed data and finds the relationships between the data on its own.
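
To make the two categories concrete, here is a minimal sketch in plain Python. The failed-login numbers, the function names, and the choice of a nearest-centroid classifier and a two-group k-means are illustrative assumptions, not the method of any particular product.

```python
from statistics import mean

# Hypothetical feature: failed logins per hour, labeled by a human analyst.
labeled = [(1, "normal"), (2, "normal"), (3, "normal"),
           (40, "attack"), (55, "attack"), (60, "attack")]


def train_centroids(examples):
    """Supervised: average the feature value for each human-provided label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def classify(value, centroids):
    """Assign the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def two_means(values, rounds=10):
    """Unsupervised: split unlabeled values into two groups (1-D k-means).

    Assumes the input contains at least two distinct values.
    """
    low, high = min(values), max(values)  # initial cluster centers
    for _ in range(rounds):
        groups = ([], [])
        for v in values:
            groups[0 if abs(v - low) <= abs(v - high) else 1].append(v)
        low, high = mean(groups[0]), mean(groups[1])
    return sorted(groups[0]), sorted(groups[1])


centroids = train_centroids(labeled)
print(classify(48, centroids))           # label for a value not seen in training
print(two_means([1, 2, 3, 40, 55, 60]))  # groups found without any labels
```

The supervised half needs the analyst’s labels to learn from, while the unsupervised half is only handed the raw numbers and finds the two groups itself.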

One of the trailblazers in ML, Arthur Samuel, stated in his 1959 IBM paper that “programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort.” ML programs improve independently based on new data inputs and past data patterns but still require some human guidance to keep improving.

Some examples of machine learning include:

Cybersecurity algorithms that flag unnecessary access and other security risks
Music-recommendation algorithms that predict the best song choice for a user based on their preferences
GE’s power program that predicts maintenance needs and other optimization metrics to help businesses run more smoothly

Python and Machine Learning

To better understand machine learning, it’s also important to understand which programming languages, such as Python and C++, are used in conjunction with ML. Python is one of the most popular programming languages due in large part to its role in machine learning, and C++ is often used in machine learning projects as well.

Python is a great choice due to its widespread popularity, available resources, simple syntax and how intuitive it is compared to other languages. Once you understand the basic syntax and inner workings of Python or C++, they can be used in various ML projects in your organization. Python also has a large collection of libraries that make the creation process easier. Much of the practical ML work revolves around libraries that are developed and released by different groups.

Machine Learning vs. Statistics

Some people question whether ML processes are revolutionary or are just using basic math and statistics. There’s no doubt that machine learning and data science go hand-in-hand, but what sets ML apart from statistics if they both use data patterns to show the probability or improbability of different situations?

Marketing hype outran the practical reality of artificial intelligence for literally decades. As faster hardware, better models, and a more robust understanding of what machine learning systems are suitable for have come together, we have seen a shift in the hype cycle. Mundane software applications using nothing but Statistics 101 level math are rebranding themselves as “Artificial Intelligence-Driven” as a means of differentiating themselves in the marketplace.

This muddying of what systems are doing makes reasoning about their impact and how to strategically approach the broader topic of machine learning much more difficult.

Machine learning and artificial intelligence systems should be a last resort to be applied only when traditional methods of organization, pattern matching, and statistics have failed.
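
As an illustration of what a traditional statistical method can look like before any ML is involved, the sketch below flags outliers with a simple z-score. The daily file-access counts, the function names and the threshold of three standard deviations are made-up assumptions for this example.

```python
from statistics import mean, stdev

# Hypothetical baseline: files one user opened per day over two weeks.
baseline = [42, 38, 45, 40, 39, 44, 41, 43, 37, 40, 42, 39, 41, 44]


def z_score(value, history):
    """How many standard deviations a value sits from the historical mean."""
    return (value - mean(history)) / stdev(history)


def is_anomalous(value, history, threshold=3.0):
    """Flag values far from the baseline; the threshold is an assumption."""
    return abs(z_score(value, history)) > threshold


for todays_count in (43, 400):
    print(todays_count, is_anomalous(todays_count, baseline))
```

If a check this small catches the behavior you care about, it is easier to explain and maintain than a trained model.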

How Does Deep Learning Work?

Deep learning is a specific subset of machine learning, or a set of techniques used to implement ML. It functions in nearly the same way as ML but is better able to correct itself, whereas other forms of ML require more human intervention to make certain adjustments. Some would say it’s becoming a bit outdated in cybersecurity applications, but understanding how it works can help add context to AI and ML.

DL is based on neural networks, which DeepAI defines as, “a computational learning system that uses a network of functions to understand and translate a data input of one form into a desired output, usually in another form. [This] concept… was inspired by human biology and the way neurons of the human brain function together to understand inputs from human senses.”
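
The “network of functions” idea can be sketched in a few lines of Python. The tiny network below has two hidden neurons, and its weights are hand-picked for illustration so that it computes XOR. In real deep learning the weights are learned from data during training, and the networks are far larger. The function names are our own.

```python
def relu(x):
    """Rectified linear unit, a common activation function."""
    return max(0.0, x)


def layer(inputs, weights, biases, activation):
    """One layer: a weighted sum per neuron, passed through an activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def tiny_network(x1, x2):
    """Two inputs -> two hidden neurons -> one output (computes XOR)."""
    hidden = layer([x1, x2], [[1, 1], [1, 1]], [0, -1], relu)
    output = layer(hidden, [[1, -2]], [0], lambda v: v)
    return output[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, tiny_network(a, b))
```

Each layer transforms its input and hands the result to the next, which is what lets stacked layers represent patterns that a single weighted sum cannot.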

Some examples of deep learning include:

Image, video and pattern recognition
Self-driving cars and their paths
Medical diagnosis using image recognition to determine certain pathogenic cells

The Role of AI and ML in Cybersecurity

When it comes to cybersecurity, machine learning is both the most common approach within artificial intelligence and the term most often used to describe AI’s application to security. Although there are some deep learning techniques being used under the umbrella of ML as well, many would say DL is becoming outdated in cybersecurity applications. Machine learning shows great promise in cybersecurity, although it does have some drawbacks. We’ll explore what this technology can achieve in cybersecurity, its pros and cons, as well as future possibilities.

Machine Learning and Cybersecurity

It’s important to understand that machine learning is not a panacea, but it can provide a great deal of support to a cybersecurity or IT team. While ML may have a long way to go before it can be used for threat detection on its own without human intervention, there are many tasks it can handle to level up security. Some benefits of ML include:

Classification – Programs classify data based on predetermined parameters.
Clustering – For data that doesn’t fit preset parameters, ML groups data based on their similarities or anomalies.
Recommendations – Programs learn from past choices, inputs and associations to recommend approaches and decisions.
Generative frameworks – Based on past data inputs, programs generate new possibilities, including inputs that a system has not encountered before.
Predictions – Programs forecast based on data sets and past outcomes.

How Your Organization’s Security Can Benefit

Machine learning can benefit your cybersecurity practices, which should be among every organization’s top priorities. Data breaches, thefts and other attacks are becoming incredibly common and can cause huge financial strains and loss of business. Depending on the size and needs of your organization, ML-based security software could be a great use of your cybersecurity budget. If your security team feels stretched thin and has trouble maintaining internal data governance and your security perimeter, these types of solutions could be great options.

Varonis’s permissions recommendations feature is a great example of machine learning in data security. The software builds a behavioral baseline for every user in an organization by mapping and monitoring the types of data they access. After a period of time, the system automatically identifies when someone no longer needs access to sensitive information and recommends revocation. It develops these recommendations not simply because the user hasn’t been using that data, but because the user no longer resembles the other users in the company who do.
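
The general idea of comparing a user with their peers can be sketched as follows. This is not Varonis’s actual algorithm, which is not described here. It is a hypothetical illustration that assumes Jaccard similarity over recently accessed folders, invented user data and an arbitrary threshold.

```python
def jaccard(a, b):
    """Overlap between two sets: 0.0 means disjoint, 1.0 means identical."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_revocation(user, group, activity, min_similarity=0.2):
    """Suggest revocation when a user's recent activity no longer resembles
    the activity of the other members of a permission group."""
    peers = [m for m in group if m != user]
    if not peers:
        return False
    avg = sum(jaccard(activity[user], activity[p]) for p in peers) / len(peers)
    return avg < min_similarity


# Hypothetical data: folders each user actually touched recently.
activity = {
    "alice": {"finance/q1", "finance/q2", "finance/payroll"},
    "bob": {"finance/q1", "finance/q2", "finance/audit"},
    "dave": {"finance/q1", "finance/q2", "finance/payroll"},
    "carol": {"marketing/ads", "marketing/brand"},  # has moved teams
}
finance_group = ["alice", "bob", "dave", "carol"]

for member in finance_group:
    print(member, recommend_revocation(member, finance_group, activity))
```

In this toy data only carol is recommended for revocation, because her recent activity has nothing in common with the rest of the finance group.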

Microsoft’s Azure Machine Learning is another great example of machine learning that’s currently being applied to cybersecurity. It gives organizations the opportunity to build, train and manage their own ML models.

If you aren’t equipped or don’t have the time to manage your own ML-based platforms, there are externally managed solutions for data classification, data security, security audits and threat detection and response that are personalized to fit your needs.

Here are some examples of how to apply the benefits laid out above to cybersecurity-related tasks:

Phishing and spam filtering with classification. ML can help your team classify internal data, spam and malicious activity using predetermined parameters.
Forensic analysis with clustering. Clustering groups data without needing to know the classes in advance, which sheds light on the method of an attack and what was compromised so your team can piece together the elements of a breach.
Incident response and risk management with recommendation methods based on association rule learning. The program learns from the actions taken in response to different types of incidents. As it picks up patterns and associations, it can recommend ways to mitigate risk and respond to incidents.
Pentesting with generative frameworks. Similar to how a pentester would check for vulnerabilities, programs can generate inputs to test for weaknesses.
Prevention and threat modeling with predictions. ML uses gathered data to help predict fraudulent activity, which can help your security team address the risk before it evolves into a costly breach or data theft.
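
To give a feel for the first item in the list, here is a minimal sketch of a phishing classifier built with a naive Bayes model in plain Python. The six training subject lines, the labels and the function names are invented for illustration. A production filter would train on far more data and many more signals.

```python
import math
from collections import Counter

# Hypothetical, tiny training set of labeled subject lines.
training = [
    ("verify your account password now", "phish"),
    ("urgent password reset required click here", "phish"),
    ("your invoice payment failed click to verify", "phish"),
    ("team lunch on friday", "ham"),
    ("quarterly report draft attached", "ham"),
    ("meeting notes and action items", "ham"),
]


def train(examples):
    """Count words per label and messages per label."""
    word_counts = {"phish": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Pick the label with the highest log-probability (Laplace smoothing)."""
    vocab = set().union(*word_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in text.split():
            score += math.log((counts[word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(training)
print(classify("click here to verify your password", word_counts, label_counts))
print(classify("friday team meeting notes", word_counts, label_counts))
```

The labeled examples play the role of the “predetermined parameters”: the model only knows what phishing looks like because someone labeled examples of it first.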

Pros

As seen above, there are many cybersecurity tasks that can be made easier or more efficient with the implementation of ML algorithms. When used optimally, this technology can lighten the weight of a heavy cybersecurity workload and reduce human error and oversights. Another bonus of these adaptive approaches and techniques is that they can be modified to the specific requirements of a company or organization.

Cuts down the workload for the security team
Reduces human error and oversights
Can be molded to fit the specific needs of your organization

Cons

Most methods in use today require human intervention because there are hackers who try (and succeed at) taking down ML programs through deceptive data inputs. Through algorithms, machines are taught what constitutes a malicious file or piece of code, which can usually snuff out attacks. Some hackers find ways to disguise their malware and trick the ML into thinking it’s a normal file or folder. Other hackers use ML to carry out ill-willed tasks of their own.

ML and AI are not invincible
Some algorithms can be deceived
This technology can be used maliciously
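
A toy example shows why a naive model can be deceived. The token weights, threshold and function names below are hypothetical, and the scoring rule (an average of per-token weights) is deliberately simplistic. Real detectors are more robust, but the underlying weakness is similar in kind.

```python
# Hypothetical per-token weights a model might have learned:
# positive values look malicious, negative values look benign.
weights = {"powershell": 2.0, "encoded": 1.5, "download": 1.0,
           "report": -1.0, "meeting": -1.0, "invoice": -0.5}


def malicious_score(tokens):
    """Average learned weight of the tokens; unknown tokens count as 0."""
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)


def is_flagged(tokens, threshold=0.5):
    """Flag the sample when its average score exceeds the threshold."""
    return malicious_score(tokens) > threshold


original = ["powershell", "encoded", "download"]
padded = original + ["report", "meeting"] * 3  # same content plus benign filler

print(is_flagged(original), is_flagged(padded))
```

The suspicious tokens are still present in the padded sample, yet the benign filler drags the average below the threshold. This is one reason human review and regular retraining stay in the loop.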

Future of AI and Machine Learning in Cybersecurity

With more legislation like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) being introduced and signed into law, it’s imperative to nail down your security and data governance. This will help organizations adhere to compliance laws and help prevent them from incurring costly fines and reputational damage. GDPR takeaways from its first year showed that early detection and reporting are key, something that ML can assist with.

One aspect of GDPR, CCPA and other compliance legislation that clashes with AI, ML and DL is the right to be forgotten. Under the users’ rights to privacy and data erasure, data on that user must be removed from the controller’s (organization’s) databases, including AI, ML and DL programming. This seems to be an issue that will grow as data privacy legislation and artificial intelligence become more prevalent.

The key to setting yourself up for the future of compliance in conjunction with AI is solid data governance, meaning the ways in which you classify and manage your data. In the event that user privacy laws become more stringent, you need to have a way to sort out sensitive and “private” data from your database so that it can be removed upon request. The more time you put into classifying data now, the more agile you will be in the future, and ML classification techniques can support this.
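
As a rough sketch of what classifying data and honoring an erasure request might look like in code, consider the example below. It uses simple rule-based patterns rather than ML, and the record layout, field names, labels and regular expressions are all simplified assumptions for illustration.

```python
import re

# Simplified patterns; real classifiers need far more robust rules.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify_record(record):
    """Return the sorted sensitive-data labels found in a record's text fields."""
    found = set()
    for value in record.values():
        for label, pattern in PATTERNS.items():
            if isinstance(value, str) and pattern.search(value):
                found.add(label)
    return sorted(found)


def erase_subject(records, subject_id):
    """Drop every record tied to a data subject who requested erasure."""
    return [r for r in records if r["subject_id"] != subject_id]


# Hypothetical records keyed by the person (data subject) they describe.
records = [
    {"subject_id": 1, "note": "Contact jane@example.com about renewal"},
    {"subject_id": 2, "note": "SSN on file: 123-45-6789"},
    {"subject_id": 2, "note": "Prefers phone calls"},
]

for record in records:
    print(record["subject_id"], classify_record(record))
print(len(erase_subject(records, 2)))
```

Erasure is only this simple when every record is already tied to a data subject and labeled, which is the point of investing in classification early.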

AI and ML are becoming integral to cybersecurity, and already are in many ways. While they can be used at different levels and capacities, there are algorithms and techniques that can make your organization’s security run more smoothly and free up your security team’s time for other important tasks.

If you’re interested in seeing where your cybersecurity practices stand, try out a free data risk assessment and then see how your security compares to others in your industry through our data risk report study. This assessment will help uncover overexposed data, access issues, stale data, inconsistent permissions and more risks to your security. Be prepared for the future of cybersecurity by taking action today.

Additional Sources: BuiltIn | Forbes | Towards Data Science

Rob Sobers is a software engineer specializing in web security and is the co-author of the book Learn Ruby the Hard Way.