Confusion Matrix and Cyber Attack

Harshal Kondhalkar
6 min read · Jun 5, 2021

Cybercrime is defined as a crime where a computer is the object of the crime or is used as a tool to commit an offense. A cybercriminal may use a device to access a user’s personal information, confidential business information, government information, or disable a device. It is also a cybercrime to sell or elicit the above information online. New technologies create new criminal opportunities but few new types of crime. What distinguishes cybercrime from traditional criminal activity? Obviously, one difference is the use of the digital computer, but technology alone is insufficient for any distinction that might exist between different realms of criminal activity. Criminals do not need a computer to commit fraud, traffic in child pornography and intellectual property, steal an identity, or violate someone’s privacy. All those activities existed before the “cyber” prefix became ubiquitous. Cybercrime, especially involving the Internet, represents an extension of existing criminal behavior alongside some novel illegal activities.

Most cybercrime is an attack on information about individuals, corporations, or governments. Although the attacks do not take place on a physical body, they do take place on the personal or corporate virtual body, which is the set of informational attributes that define people and institutions on the Internet. In other words, in the digital age our virtual identities are essential elements of everyday life: we are a bundle of numbers and identifiers in multiple computer databases owned by governments and corporations. Cybercrime highlights the centrality of networked computers in our lives, as well as the fragility of such seemingly solid facts as individual identity.

One way to defend against such attacks is to use machine learning algorithms.

MACHINE LEARNING IN CYBERSECURITY

Machine learning has become a vital technology for cybersecurity. It can help detect cyber threats early and bolster security infrastructure through pattern detection, real-time cyber crime mapping and thorough penetration testing.

As algorithms increasingly make decisions about human affairs, it is important that these algorithms and the data they rely on be fair and unbiased. One of the diagnostics for algorithmic bias is the Confusion Matrix. The Confusion Matrix is a table that shows what kinds of errors are made in predictions. While everyone who works with data knows what a Confusion Matrix is, it is a more subtle matter to gain intuition for how it behaves under different kinds of distributions of predictions and outcomes and the range of possible decision thresholds.

Once we have the data, and after cleaning, pre-processing and wrangling it, the first step is to feed it to a model and get the output as probabilities. But wait! How can we measure the effectiveness of our model? The more effective the model, the better its performance, and that is exactly what we want. This is where the Confusion Matrix comes into the picture. The Confusion Matrix is a performance measurement for machine learning classification.
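Because the model's output is a probability, a decision threshold has to be chosen before any prediction can be counted as right or wrong. The snippet below is a minimal sketch of that step, not a definitive implementation: the scores, the labels (1 = attack, 0 = no attack) and the function name `classify` are all made up for illustration. It shows how moving the threshold trades missed attacks against false alarms.

```python
def classify(probabilities, threshold=0.5):
    """Turn predicted probabilities into 0/1 labels (hypothetical helper)."""
    return [1 if p >= threshold else 0 for p in probabilities]

# Made-up model scores and ground truth: 1 = attack, 0 = no attack.
scores = [0.95, 0.80, 0.40, 0.30, 0.65, 0.10, 0.55, 0.20]
actual = [1, 1, 1, 0, 0, 0, 0, 0]

results = {}
for threshold in (0.3, 0.5, 0.7):
    predicted = classify(scores, threshold)
    # An attack the model labelled as "no attack".
    missed = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    # Normal activity the model labelled as an attack.
    false_alarms = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    results[threshold] = (missed, false_alarms)
    print(f"threshold={threshold}: missed attacks={missed}, false alarms={false_alarms}")
```

On this toy data, a lower threshold catches every attack but raises more false alarms, while a higher threshold does the opposite.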

What is a Confusion Matrix and why do you need it?

Well, it is a performance measurement for machine learning classification problems where the output can be two or more classes. For a two-class problem, it is a table with 4 different combinations of predicted and actual values.
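For more than two classes the table simply grows to one row and one column per class. Here is a rough sketch of how such a table could be built; the class names, the sample labels and the function name `confusion_table` are assumptions made for illustration only.

```python
from collections import Counter

def confusion_table(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes (hypothetical helper)."""
    pairs = Counter(zip(actual, predicted))
    return [[pairs[(a, p)] for p in labels] for a in labels]

# Made-up traffic labels, for illustration only.
labels = ["benign", "phishing", "malware"]
actual = ["benign", "benign", "phishing", "malware", "phishing", "benign", "malware"]
predicted = ["benign", "phishing", "phishing", "malware", "benign", "benign", "malware"]

table = confusion_table(actual, predicted, labels)
for label, row in zip(labels, table):
    print(f"{label:>8}: {row}")
```

The diagonal holds the correct predictions; every off-diagonal cell is one specific kind of mistake.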

Let’s understand TP, FP, FN and TN in terms of a pregnancy analogy.

True Positive:

Interpretation: You predicted positive and it’s true.

You predicted that a woman is pregnant and she actually is.

True Negative:

Interpretation: You predicted negative and it’s true.

You predicted that a man is not pregnant and he actually is not.

False Positive: (Type 1 Error)

Interpretation: You predicted positive and it’s false.

You predicted that a man is pregnant but he actually is not.

False Negative: (Type 2 Error)

Interpretation: You predicted negative and it’s false.

You predicted that a woman is not pregnant but she actually is.

Just remember: Positive and Negative describe the predicted value, while True and False describe whether that prediction matches the actual value.
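The four cells can be counted directly from a list of actual values and a list of predicted values. The following is a small sketch under stated assumptions: the labels are invented (1 = pregnant, 0 = not pregnant) and `confusion_counts` is a hypothetical helper name, not a library function.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, FN and TN for a binary problem (hypothetical helper)."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, and it is true
        elif p == positive and a != positive:
            fp += 1  # predicted positive, but it is false
        elif p != positive and a == positive:
            fn += 1  # predicted negative, but it is false
        else:
            tn += 1  # predicted negative, and it is true
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

# Made-up labels: 1 = pregnant, 0 = not pregnant.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
print(counts)
```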

Type I error:

This type of error is not very dangerous: our system is safe in reality, but the model predicts an attack. The security team gets notified and can check for any malicious activity, so no direct harm is done. These errors can be termed False Alarms. The False Positive falls in this category.

Type II error:

This type of error can prove to be very dangerous. Our model or system predicts that there is no attack when there actually is one. In this case no notification reaches the security team and nothing can be done to prevent the attack. The False Negative falls in this category, and one of the aims of the model is to minimize or avoid this error.

We can use the confusion matrix to calculate various metrics:

1. Accuracy: The ratio of all correct predictions to all predictions (total values).

Accuracy = (TP + TN) / (TP + TN + FP + FN)

2. Precision: (True positives / predicted positives) = TP / (TP + FP)

3. Recall: (True positives / all actual positives) = TP / (TP + FN)

4. Specificity: (True negatives / all actual negatives) = TN / (TN + FP)

5. Misclassification: (All incorrect / all) = (FP + FN) / (TP + TN + FP + FN)

It can also be calculated as 1 - Accuracy.
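The five formulas can be sketched in a few lines of code. The counts used here are invented for illustration, and `safe_ratio` and `confusion_metrics` are hypothetical helper names. Returning `None` for a zero denominator is one possible design choice, not the only one.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None

def confusion_metrics(tp, fp, fn, tn):
    """Compute the five metrics from the four confusion matrix cells."""
    total = tp + fp + fn + tn
    return {
        "accuracy": safe_ratio(tp + tn, total),
        "precision": safe_ratio(tp, tp + fp),
        "recall": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "misclassification": safe_ratio(fp + fn, total),
    }

# Made-up counts for an attack detector evaluated on 100 events.
example = confusion_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With these counts accuracy is 0.85 and misclassification is 0.15, which matches the 1 - Accuracy shortcut.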

Case Study:

Bhubaneshwar: The Covid-induced lockdown and the increased use of online platforms to purchase essentials and other commodities during these difficult times have prompted cyber criminals to strike with impunity.

At least 256 people were duped of nearly Rs 70.78 lakh by unknown cyber criminals in the capital city during lockdown 1.0 and 2.0 in May this year.

The cyber help desk (7440006709) of the Commissionerate Police received complaints from the 256 people last month. “We found that most of the complainants lost their money while placing online orders for food, liquor, household articles or home appliances. Altogether 63 people lost their money through UPI fraud and 48 people were cheated while making transactions through debit and credit cards. Our help desk personnel promptly intervened and recovered around Rs 12 lakh,” Deputy Commissioner of Police Uma Shankar Dash said.

Police sources said most victims found fraudulent withdrawals of money from their accounts after they clicked on phishing and suspicious mails or links on cash-back offers. In some cases the victims, who were not conversant with online transactions, shared their passwords after being tricked by the cyber crooks. “We have been sensitizing people not to share their account details and PIN with strangers. Banks too have been sending awareness messages to their customers,” Dash said.

General steps to stay safe from cyber attacks:

Be vigilant when browsing websites.

Flag and report suspicious emails.

Never click on unfamiliar links or ads.

Use a VPN whenever possible.

Ensure websites are safe before entering credentials.

Keep antivirus software and applications up to date.

Use strong passwords with 14+ characters.

Thank you!
