Sharath Honnaiah
Apr 28, 2022

An Approach to Prioritizing Network Security Vulnerabilities

Abstract

This paper proposes a machine learning approach to prioritizing vulnerabilities in systems. Vulnerability management involves the detection, prioritization, and closure of points through which a system can be compromised. The use of machine learning allows the entire vulnerability management process to be automated. The machine learning model proposed in this paper identifies vulnerabilities through automated penetration testing. These vulnerabilities are then prioritized by comparing them on selected factors, including age, threat level, frequency, and historical impact. The vulnerabilities are then addressed in the proposed order, with the aim of closing every known point of weakness and reducing the system's exposure to cyberattacks.

Keywords: Vulnerability, prioritization, machine learning, penetration testing, cyberattacks

1. Introduction

As the world’s digital landscape and technology continue to advance, cybersecurity becomes more important because cybercrime rates rise alongside them. A main reason behind the increase in global cyberattacks is that more vulnerabilities are exposed as companies grow. Vulnerabilities are weaknesses in information technology systems that cybercriminals can use to gain unauthorized access (Tunggal, 2021). Cyberattacks are often executed by installing malware or malicious code and retrieving sensitive, private data (Tunggal, 2021). The more vulnerabilities a system has, the greater the risk of data breaches in its software and websites.

Thousands of vulnerabilities are identified in IT systems worldwide every year. For instance, a total of 14,238 vulnerabilities were identified in 2018, meaning nearly 300 vulnerabilities were reported every week (Pompon, 2020). These high numbers mean that vulnerability teams are very busy identifying, examining, prioritizing, and patching these points of weakness to enhance cybersecurity. Since vulnerabilities vary in size and threat level, prioritization must ensure that the most dangerous ones are addressed first. This task can be daunting for human teams since they have to go through each vulnerability, and the time lost leaves systems exposed to attack for longer. An automated vulnerability identification and prioritization system would be ideal for solving this problem and reducing the risk of malicious attacks on organizations. Pompon (2020) states that approximately 55% of all vulnerabilities are considered high-risk. Projecting this figure onto the 2018 data shows that high-risk vulnerabilities occur every 8–9 hours throughout the year (Pompon, 2020). This frequency is too high for human vulnerability teams to manage, which leaves many vulnerabilities open to attack.

This paper describes the use of machine learning to identify and prioritize vulnerabilities in systems to enhance cybersecurity.

2. Preliminaries: Concepts and Methods

In vulnerability management, it is ideal to identify and prioritize the weak points with the highest risks (Pompon, 2020). Patching the most exploitable vulnerabilities first makes the best use of limited time, since effort spent on inconsequential weak points is wasted. The main problem is analyzing vulnerabilities and ranking them from the most to the least exploitable (Tunggal, 2021). The most popular ranking system is the Common Vulnerability Scoring System (CVSS), which scores vulnerabilities on a 0–10 scale (Pompon, 2020). However, separate analyses have shown that the CVSS system is not always accurate, and a score awarded by the system does not truly indicate the exploitability of a vulnerability (Pompon, 2020). These findings led to the emergence of new vulnerability scoring systems, such as the Exploit Prediction Scoring System (EPSS) (2021). However, these systems still do not represent exploitability accurately because they do not consider the causative factors behind those points of weakness. According to Tunggal (2021), the main causes of vulnerability include:

(i) System complexity

There is a direct relationship between the complexity of a system and the probability of a flaw. Complex systems have a greater risk of developing flaws than simpler configurations.

(ii) Familiarity with the system

Some software or websites are developed from common code, such as code from the Internet. Malicious attackers familiar with the code can take advantage of it to find loopholes and invade the system.

(iii) The level of connectivity

Internet connectivity determines the level of exposure for a device to external attacks. For instance, attackers can infiltrate systems through WiFi connections to retrieve sensitive information.

(iv) Password management

The use of common or repeated passwords exposes a system to attack. Experienced attackers can also crack simple passwords.

(v) Flaws in operating systems

Most operating systems are designed to detect and expel any form of malware from a device. If the OS has flaws of its own, malware can easily bypass its firewall and invade information systems.

(vi) The Internet

The Internet hosts thousands of malware programs disguised as links, images, or software. Following these links or downloading this software allows malware such as ransomware to attack a device and retrieve vital information.

(vii) Software bugs

The source code of software can contain bugs that are accidentally left in during development. Bugs often represent weak points in the code, and they can be exploited to inflict damage on the system.

(viii) User input

Software users can intentionally or unintentionally input commands that cause damage to the system. For instance, software that does not properly validate or isolate user input can expose its database to potentially damaging commands. Tunggal (2021) states that SQL databases are the most affected by this form of vulnerability.
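The user-input cause in item (viii) can be illustrated with a small sketch. It contrasts a query built by string concatenation with a parameterized query, using Python's built-in sqlite3 module. The table name, column names, and function names are assumptions made for illustration and do not come from any of the cited sources.

```python
import sqlite3


def find_user_unsafe(conn, username):
    """Vulnerable: user input is pasted directly into the SQL text."""
    query = "SELECT id, name FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user(conn, username):
    """Safer: the driver passes the input as data, never as SQL."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchall()


# Hypothetical in-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# A classic injection payload: it turns the WHERE clause into a tautology.
payload = "x' OR '1'='1"
leaked = find_user_unsafe(conn, payload)  # every row in the table
blocked = find_user(conn, payload)        # no rows: nobody has this name
```

With the concatenated query the payload returns every row in the table, while the parameterized version treats the same string as an ordinary (non-matching) user name.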

Vulnerability detection is the first and most important process in cybersecurity management. The three main detection methods include vulnerability scanning, penetration testing, and Google hacking (Tunggal, 2021). The choice of a detection method depends on the type of system being analyzed.

Vulnerability Scanning

Vulnerability scanners are software designed to assess information networks and computers to determine exploitable points of weakness in the system (Tunggal, 2021). Modern vulnerability scanners are integrated with artificial intelligence and machine learning for greater efficiency and accuracy (SecPoint AI Machine Learning vulnerability scanner, n.d.). These components automate processes such as report checking and the analysis of results, activities that would take a human much more time to complete (Oriol & Paquette, 2021). The AI component also contains modules that identify false-positive reports and remove them from the system for more accurate results. This process ensures that only exploitable vulnerabilities are identified for patching. An example of such a system is the SecPoint Penetrator (SecPoint AI Machine Learning vulnerability scanner, n.d.). Machine learning is integrated to ensure that the vulnerability sorting process is as accurate and quick as possible (Montuno, 2018). The machine learning module creates a classification system to categorize each vulnerability according to its risk level (SecPoint AI Machine Learning vulnerability scanner, n.d.). Taking the potential cause of each vulnerability into account gives higher accuracy in estimating how exploitable a weak point is.
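The two scanner functions described above, discarding likely false positives and grouping the remaining reports by risk level, can be sketched as follows. This is a rule-based stand-in rather than a trained classifier, and it does not reflect how SecPoint or any other product works internally. The field names (`confidence`, `risk`) and the thresholds are assumptions chosen for illustration.

```python
def triage(reports, min_confidence=0.5):
    """Drop low-confidence reports, then bucket the rest by risk level.

    Each report is a dict with hypothetical keys:
      id         - identifier of the finding
      confidence - 0.0-1.0, how sure the scanner is that the finding is real
      risk       - 0.0-10.0, estimated risk of the finding
    """
    tiers = {"high": [], "medium": [], "low": []}
    for report in reports:
        if report["confidence"] < min_confidence:
            continue  # treated as a likely false positive
        if report["risk"] >= 7.0:
            tier = "high"
        elif report["risk"] >= 4.0:
            tier = "medium"
        else:
            tier = "low"
        tiers[tier].append(report["id"])
    return tiers


sample_reports = [
    {"id": "F-1", "confidence": 0.9, "risk": 8.5},
    {"id": "F-2", "confidence": 0.2, "risk": 9.0},  # filtered out
    {"id": "F-3", "confidence": 0.7, "risk": 5.0},
    {"id": "F-4", "confidence": 0.8, "risk": 1.0},
]
tiers = triage(sample_reports)
```

In a learned system the fixed thresholds would be replaced by a model's predictions, but the shape of the output (a small set of risk tiers) stays the same.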

Penetration Testing

Also known as ethical hacking, penetration testing involves attacking a system the same way an attacker or malware would, in order to identify vulnerabilities that may be exploited (Tunggal, 2021). This method allows vulnerabilities to be secured before an actual cyberattack happens. Many systems have built-in software that routinely runs penetration testing protocols to provide all-around protection, while other organizations prefer to test manually during scheduled checks and maintenance sessions (Tunggal, 2021). To maximize security, it is ideal to carry out ethical hacking exercises whenever there is a significant change to a company's IT infrastructure, such as during software updates and data migration.

The frequency of penetration testing also depends on other factors such as company size, the amount and sensitivity of data, and the firm’s regulations (Rosencrance, 2021). Various strategies can be adopted in penetration testing, including targeted testing, internal testing, blind testing, black-box testing, and white-box testing (Rosencrance, 2021). However, since most modern IT systems are cloud-based, the best fit is Pen Testing as a Service (PTaaS), which provides consistent feedback on vulnerabilities in systems. According to Rosencrance (2021), PTaaS is used in conjunction with other tools such as Wireshark, John the Ripper, Metasploit, and Nmap for effective penetration testing and real-time cybersecurity.
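As a rough illustration of the reconnaissance step that tools such as Nmap automate, the sketch below checks which TCP ports on a host accept connections, using only Python's standard socket module. It is a minimal stand-in, not a replacement for a real scanner; the function name and the default timeout are assumptions. It should only be run against hosts that one is authorized to test.

```python
import socket


def open_ports(host, ports, timeout=0.5):
    """Return the ports in `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise,
            # so a closed port does not raise an exception.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found
```

A call such as `open_ports("127.0.0.1", [22, 80, 443])` returns the subset of those ports that are listening on the local machine. An open port is not a vulnerability in itself, but it shows which services a penetration test needs to examine next.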

Google Hacking

This method involves the use of search engines to detect vulnerabilities in a system. Google and Microsoft Bing are the most commonly used search engines for this purpose (Tunggal, 2021). Google hacking uses advanced search operators to find information that has been exposed through cloud misconfigurations or was improperly hidden (Tunggal, 2021). Search engines are public spaces that can easily expose important data to attackers; consequently, any exposure surfaced through Google, Yahoo, or Bing must be closed by fixing the underlying misconfigurations (Tunggal, 2021). However, Google hacking does not match the effectiveness and universality of penetration testing and vulnerability scanning.
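To make the idea of search operators concrete, the sketch below assembles a query string from the widely documented `site:`, `filetype:`, and `intitle:` operators. The helper name and its parameters are assumptions made for illustration; the function only builds text and does not contact any search engine. Such queries are meant for auditing a domain that one owns or is authorized to assess.

```python
def build_query(domain, filetype=None, intitle=None):
    """Build a search query restricted to one domain.

    domain   - domain to audit, e.g. "example.com"
    filetype - optional file extension to look for, e.g. "log"
    intitle  - optional phrase that must appear in the page title
    """
    parts = [f"site:{domain}"]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    return " ".join(parts)


# Look for log files indexed under the audited domain.
log_query = build_query("example.com", filetype="log")
# Look for open directory listings under the audited domain.
listing_query = build_query("example.com", intitle="index of")
```

Any results returned for such a query point to content that is publicly indexed and may need to be removed or access-controlled.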

3. Proposed Approach

Model Overview

The proposed approach uses a machine learning module to identify vulnerabilities in a system and prioritize them in order of significance and potential impact. This method does not take CVSS into consideration because of its previously discussed shortcomings; instead, it focuses on several factors that are more important but often ignored. The machine learning module considers the following aspects when prioritizing vulnerabilities:

(i) The age of the vulnerability in question.

A vulnerability’s age is the amount of time that it has existed within the system. Older vulnerabilities are more susceptible to attack since they may have grown more noticeable over time. Additionally, such vulnerabilities may have spread to affect multiple parts of the system, making them a greater threat than others. Therefore, once an old vulnerability is detected, it should be prioritized for closure before it becomes an avenue for a cyberattack. New vulnerabilities are generally undetectable, but they can grow in magnitude and threat level quickly. Therefore, while old vulnerabilities are prioritized over new ones, the latter should not be completely ignored.

(ii) The number and types of products or information that are threatened by the vulnerability.

Some vulnerabilities affect a few aspects of the system, while others are much larger and threaten a broader spectrum than the former. Vulnerabilities that have a high threat level are considered dangerous, and they can become catastrophic to the system in the event of a cyberattack. For instance, a vulnerability that exposes the personal details and earnings of individuals in an organization carries a higher level of threat than one that reveals some product listings. Therefore, solving vulnerabilities in order of their level of danger is vital in IT systems.

(iii) The rate of occurrence of a specific type of vulnerability.

The machine learning module prioritizes recurring vulnerabilities that attack the system. Such vulnerabilities are also highlighted so that they can be addressed to find lasting solutions against them. However, new vulnerabilities are not completely ignored since they may carry a new form of threat that can be potentially damaging. Such vulnerabilities are subjected to additional filters depending on the other factors.

(iv) Historical impacts of specific types of vulnerabilities.

The previous impacts of recurring vulnerabilities are used to estimate their potential threat to the system and how fast they should be eliminated. For new vulnerabilities, their effects on other systems can be determined by scraping information from the Internet. The retrieved data is then used for vulnerability prioritization. The ML model is equipped with these functionalities to ensure that it works adequately against all vulnerabilities in the system.
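The four factors above can be captured in a single record per vulnerability. The following is a minimal sketch of such a record. The class name and every field name are assumptions made for illustration, since the proposal does not prescribe a data format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Finding:
    """One detected vulnerability, described by the four factors."""
    identifier: str
    first_seen: date              # (i) used to derive the age
    assets_affected: int          # (ii) how much of the system is threatened
    exposes_sensitive_data: bool  # (ii) type of information at risk
    occurrences: int              # (iii) how often this type has recurred
    past_incidents: int           # (iv) known historical impacts

    def age_days(self, today):
        """Age of the vulnerability in days as of `today`."""
        return (today - self.first_seen).days


example = Finding(
    identifier="F-1",
    first_seen=date(2022, 1, 1),
    assets_affected=3,
    exposes_sensitive_data=True,
    occurrences=2,
    past_incidents=1,
)
age = example.age_days(date(2022, 1, 31))
```

Keeping the raw observations (dates and counts) in the record, rather than pre-computed scores, leaves the weighting decisions to the evaluation stage.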

Data Collection

The proposed ML model must collect unbiased data, ensure that all information used is accurate, adapt to the system being checked, be resource-efficient, and provide explainable results. The model must collect data relevant to all the factors stated above. Data collection is achieved through the following actions:

(a) The model performs penetration testing on the system to detect vulnerabilities and their scopes.

(b) Information on the vulnerabilities is obtained through scanning and web scraping. The ML model shall perform both of these functions preemptively.

(c) All information is stored under the correct labels for further analysis by different parts of the model.

Specific algorithms control all the data collection activities included in the machine learning model. This functionality ensures that the ML model works automatically and as accurately as possible.
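The three collection actions can be sketched as a small merge step that combines penetration-test findings with scraped background information and stores the result under fixed labels. The input shapes, label names, and function name below are assumptions made for illustration; real scanner and scraper outputs will differ.

```python
def label_findings(scan_results, scraped_history):
    """Merge pen-test findings with scraped history into labelled records.

    scan_results    - list of dicts with hypothetical keys "id", "type", "scope"
    scraped_history - dict mapping a vulnerability type to a dict with a
                      hypothetical "incidents" count gathered from the web
    """
    records = []
    for item in scan_results:
        history = scraped_history.get(item["type"], {})
        records.append({
            "id": item["id"],
            "type": item["type"],
            "scope": item.get("scope", []),
            "known_incidents": history.get("incidents", 0),
            # Record where the data came from so results stay explainable.
            "source": "pentest+scrape" if history else "pentest",
        })
    return records


scan = [
    {"id": "F-1", "type": "sql_injection", "scope": ["orders_db"]},
    {"id": "F-2", "type": "weak_password"},
]
history = {"sql_injection": {"incidents": 12}}
records = label_findings(scan, history)
```

Recording the source of each field supports the requirement that the model's results be explainable.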

Data Evaluation

The data is analyzed with respect to the main factors that have been specified for the machine learning model. An algorithm-controlled scoring system is then used to rank all the vulnerabilities for prioritization. Vulnerabilities that receive the highest scores are given the highest priority, while the others follow in descending order.
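One simple way to realize such a scoring system is a weighted sum over the four factors, followed by a descending sort. The sketch below assumes each factor has already been normalized to the range 0 to 1. The weights are arbitrary placeholders, not values proposed in this paper, and a trained model could replace the fixed weights.

```python
# Hypothetical weights; they sum to 1 so scores also fall between 0 and 1.
WEIGHTS = {"age": 0.2, "threat": 0.4, "frequency": 0.2, "history": 0.2}


def score(factors, weights=WEIGHTS):
    """Weighted sum of normalized factor values (each between 0 and 1)."""
    return sum(weights[name] * factors[name] for name in weights)


def prioritize(vulnerabilities):
    """Return vulnerabilities sorted from highest to lowest score."""
    return sorted(vulnerabilities, key=lambda v: score(v["factors"]), reverse=True)


vulns = [
    {"id": "old-but-minor",
     "factors": {"age": 0.9, "threat": 0.2, "frequency": 0.1, "history": 0.1}},
    {"id": "new-but-severe",
     "factors": {"age": 0.1, "threat": 0.9, "frequency": 0.5, "history": 0.5}},
]
ranked_ids = [v["id"] for v in prioritize(vulns)]
```

With these placeholder weights the severe vulnerability ranks first despite being newer, because threat level carries the largest weight. Changing the weights changes the ordering, which is why the weights themselves need to be justified or learned from data.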

4. Expected Results

The primary output of this machine learning model is a list of all vulnerabilities in the system, ordered by their level of priority. This result starts the secondary phase of the model, which involves addressing these vulnerabilities to make the system safer. Vulnerabilities are addressed in the following major steps:

(i) Preventive protection

This functionality involves detecting weak spots in the system and subsequently fortifying them to prevent future breaches.

(ii) Breach detection

Vulnerabilities with the highest risk and those that have already facilitated attacks are rapidly blocked to keep the system safe. Any lost data is retrieved through the ML model’s data security functionality.

(iii) Automated investigation

The model automatically analyzes the parts of the system that are likely to be exploited during a cyberattack. Exploitable vulnerabilities are also identified this way.

(iv) Attack surface reduction

This method involves fortifying the entire system’s infrastructure to reduce the number of spaces that can give rise to vulnerabilities and be exploited by outside attackers.

The machine learning model is expected to run checks through the system at specified intervals and solve all problems automatically. All activities are kept in logs for assessment purposes.
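A single check cycle with logging could look like the sketch below, which walks the prioritized list in order, applies a remediation step to each entry, and records the outcome. The `remediate` callback is a hypothetical placeholder for whatever fix applies to a given finding, and the scheduling itself (for example, a cron job that calls this function at the chosen interval) is left outside the sketch.

```python
import logging

logger = logging.getLogger("vuln_manager")


def run_cycle(ranked_findings, remediate):
    """Address findings in priority order and log every outcome.

    ranked_findings - findings already sorted from highest to lowest priority
    remediate       - hypothetical callback returning True if the fix succeeded
    """
    results = []
    for finding in ranked_findings:
        succeeded = remediate(finding)
        logger.info("finding=%s remediated=%s", finding, succeeded)
        results.append((finding, succeeded))
    return results
```

For example, `run_cycle(["F-1", "F-2"], lambda f: f == "F-1")` returns `[("F-1", True), ("F-2", False)]`, and the failed entry can be carried over to the next cycle or escalated to a human team.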

5. Conclusion

A machine learning model is proposed to aid in the prioritization of vulnerabilities in information systems. The model detects vulnerabilities through penetration testing before ranking them depending on age, history, frequency, and threat level. Those with the highest priority are addressed before the rest. The ML model provides a way of automatically enhancing cybersecurity.

Acknowledgments

I want to express my appreciation to my family for their support during the research phase.

References

Montuno, D. (2018). Machine Learning in vulnerability assessment. Defence Research and Development Canada. https://cradpdf.drdc-rddc.gc.ca/PDFS/unc336/p810098_A1b.pdf

Oriol, P., & Paquette, S. (2021). Leveraging AI to modernize vulnerability management and remediation. Secureworks. https://www.secureworks.com/blog/leveraging-ai-to-modernize-vulnerability-management-and-remediation

Pompon, R. (2020). Prioritizing vulnerability management using machine learning. F5 Labs. https://www.f5.com/labs/articles/cisotociso/prioritizing-vulnerability-management-using-machine-learning

Rosencrance, L. (2018). Pen test (penetration testing). SearchSecurity. https://searchsecurity.techtarget.com/definition/penetration-testing#:~:text=Penetration%20testing%2C%20also%20called%20pen,software%20applications%20or%20performed%20manually.

SecPoint AI Machine Learning vulnerability scanner. (n.d.). Retrieved May 3, 2021, from https://www.secpoint.com/ai-machine-learning-vulnerability-scan.html

Tunggal, A. (2021). What is a vulnerability? UpGuard. https://www.upguard.com/blog/vulnerability

Sharath Honnaiah

I am an Engineering Leader. I do have many years of Software Professional Experience. I use this medium to express my random thoughts.