Tuesday, December 14, 2021

Basic Git Commands cheat sheet

 

git config

Syntax: git config --global user.name "[name]"

Syntax: git config --global user.email "[email address]"

These commands set the author name and email address, respectively, that are recorded with your commits.

git init

Syntax: git init [repository name]

This command is used to start a new repository.

git add

Syntax: git add [file]

This command adds a file to the staging area.

Syntax: git add *

This command adds one or more files to the staging area.

git clone

Syntax: git clone [url]

This command is used to obtain a repository from an existing URL.

git commit

Syntax: git commit -m "[Type any commit message of your choice]"

This command records a snapshot of the staged files permanently in the version history.

Syntax: git commit -a

This command commits any files you've added with the git add command, along with any tracked files you've changed since then. New, untracked files are not included.

git reset

Syntax: git reset [file]

This command unstages the file, but it saves/preserves the file contents.

Syntax: git reset [commit]

This command undoes all the commits after the specified commit and preserves the changes locally.

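Syntax: git reset --hard [commit]

This command removes all commits after the specified commit from the current branch's history, discards any uncommitted changes to tracked files, and goes back to the specified commit.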

git status

Syntax: git status

This command lists the files with changes that have yet to be committed, including staged, unstaged, and untracked files.

git show

Syntax: git show [commit]

This command shows the metadata and content changes of the specified commit.

git branch

Syntax: git branch

This command lists all the local branches in the current repository.

Syntax: git branch [branch name]

This command creates a new branch.

Syntax: git branch -d [branch name]

This command deletes the specified branch.

git checkout

Syntax: git checkout [branch name]

This command is used to switch from one branch to another.

Syntax: git checkout -b [branch name]

This command creates a new branch and also switches to it.

git remote

Syntax: git remote add [variable name] [Remote Server Link]

This command is used to connect your local repository to the remote server.

git push

Syntax: git push [variable name] master

This command sends the committed changes of master branch to your remote repository.

Syntax: git push [variable name] [branch]

This command sends the branch commits to your remote repository.

Syntax: git push --all [variable name]

This command pushes all branches to your remote repository.

Syntax: git push [variable name] :[branch name]

This command deletes a branch on your remote repository.

git pull

Syntax: git pull [Repository Link]

This command fetches and merges changes on the remote server to your working directory.

git tag

Syntax: git tag [tag name] [commitID]

This command is used to give a tag to the specified commit.

git merge

Syntax: git merge [branch name]

This command merges the specified branch’s history into the current branch.

Types of Error in Statistics


A statistical error, in simple words, is the difference between a measured value and the actual value of the data that was gathered.

A hypothesis test can result in two types of errors.

1. Type 1 error

2. Type 2 error


Type 1 Error: A Type-I error occurs when the sample results lead us to reject the null hypothesis even though it is true.

Type 2 Error: A Type-II error occurs when the null hypothesis is not rejected when it is false.


In other words, a Type I error is a false positive conclusion, while a Type II error is a false negative conclusion.

The significance level, or alpha (α), determines the likelihood of a Type I error, whereas beta (β) determines the likelihood of a Type II error. These risks can be reduced by carefully designing the layout of your study.
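To see how alpha relates to the Type I error rate, here is a minimal simulation sketch. It is only an illustration under stated assumptions: a two-sided z-test with known standard deviation, a sample size of 30, 4000 repeated experiments, and a fixed random seed are all choices made here, not something the text above prescribes. The null hypothesis is true in every experiment, so every rejection is a Type I error.

```python
import random
from statistics import NormalDist, fmean


def type_one_error_rate(alpha=0.05, n=30, trials=4000, seed=0):
    """Estimate how often a two-sided z-test rejects a true null hypothesis."""
    rng = random.Random(seed)
    # Critical value for a two-sided test at significance level alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis (mean = 0, sd = 1) is true by construction.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z statistic: (sample mean - 0) / (1 / sqrt(n))
        z = fmean(sample) * n ** 0.5
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true null is a Type I error
    return rejections / trials


rate = type_one_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen alpha, which is the sense in which the significance level controls the Type I error.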

Example: Type I vs Type II error:

You have mild symptoms of COVID-19 and your doctor advised you to go for a test. The following two errors could potentially occur:

Type I error (false positive): the test result says you are COVID positive, but you are actually not infected.

Type II error (false negative): the test result says you are COVID negative, but you are actually infected.

Thursday, December 9, 2021

What is Confusion Matrix?

A confusion matrix is an N x N matrix used to assess the performance of a classification model, where N is the total number of target classes. The matrix compares the actual target values with those predicted by the machine learning model. This gives us a comprehensive view of how well the classification model performs and what kinds of errors it makes.

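For a binary classification problem, we would have a 2 x 2 matrix with 4 values:

                     Actual Positive        Actual Negative
Predicted Positive   True Positive (TP)     False Positive (FP)
Predicted Negative   False Negative (FN)    True Negative (TN)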

Now let's interpret the matrix:

• The target variable has two values: Positive or Negative.

• The columns show the actual values of the target variable, while the rows show its predicted values.

True Positive (TP): The predicted value matches the actual value. The actual value was positive and the model predicted a positive value.
True Negative (TN): The predicted value matches the actual value. The actual value was negative and the model predicted a negative value.

Type 1 error: False Positive (FP)
The predicted value does not match the actual value. The actual value was negative, but the model predicted a positive value.

Type 2 error: False Negative (FN)

The predicted value does not match the actual value. The actual value was positive, but the model predicted a negative value.
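The four outcomes can be counted directly from a list of actual labels and a list of predicted labels. The sketch below is a minimal illustration: the function name `confusion_counts`, the use of 1 for the positive class, and the small label lists are all made up for this example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP and FN for a binary classification problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p != positive and a != positive:
            tn += 1  # predicted negative, actually negative
        elif p == positive and a != positive:
            fp += 1  # Type 1 error: predicted positive, actually negative
        else:
            fn += 1  # Type 2 error: predicted negative, actually positive
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels, purely for illustration.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```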

To make this concrete, let's use an example. Consider a classification dataset that contains 10000 data points. We apply a classifier to it and obtain a confusion matrix with the following values:

True Positive (TP) = 6500, indicating that the model accurately categorised 6500 positive class data points.

True Negative (TN) = 2300, indicating that the model properly identified 2300 data points in the negative class.

False Positive (FP) = 700, which means that the model misclassified 700 negative class data points as being in the positive class.

False Negative (FN) = 500, which means that the model misclassified 500 positive class data points as being in the negative class.

Evaluation parameters

The following are the evaluation parameters considered:

  1. Accuracy
  2. Precision
  3. Recall
  4. F1-Score

Accuracy: Accuracy (ACC) is the number of correct predictions divided by the total size of the dataset. It ranges from 0.0 to 1.0, with 1.0 being the best. It can also be calculated as 1 - ERR, where ERR is the error rate.
Technically, accuracy is the sum of the two kinds of correct predictions (TP + TN) divided by the total number of data points in the dataset (P + N).

Accuracy=(TP+TN)/(TP+FP+FN+TN)

Precision: Precision is the fraction of the data points predicted as positive that are actually positive. It is mostly used when false positive predictions are the main concern.
Precision (PPV) = TP/(TP+FP)

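Recall: Recall is the fraction of the actual positive data points that the model correctly identifies. It is mostly used when false negative predictions are the main concern.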

Recall (Sensitivity or TPR) = TP/(TP+FN)

F1-Score: F1-Score is the harmonic mean of precision and recall, so it maintains a balance between the two.

F1-Score = 2 * (Precision*Recall)/(Precision+Recall)
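The four formulas above can be checked against the example values given earlier (TP = 6500, TN = 2300, FP = 700, FN = 500). This is a minimal sketch: the function name `classification_metrics` is made up here, and the code assumes none of the denominators is zero, which holds for these values.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1-Score from the four counts.

    Assumes every denominator is non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * (precision * recall) / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts taken from the 10000-point example in the text.
metrics = classification_metrics(tp=6500, tn=2300, fp=700, fn=500)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For these counts the accuracy is 0.8800, the precision about 0.9028, the recall about 0.9286, and the F1-Score about 0.9155.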


What is the Purpose of a Confusion Matrix?

Let's consider a classification problem before we answer this question.

Suppose you want to separate people who are infected with a contagious virus from the healthy population before they begin to show symptoms. Our target variable would have the following two values: Sick and Not Sick.

You're probably wondering why we need a confusion matrix when we already have Accuracy, our go-to companion in any situation. Let's see where accuracy falls short.

Let's take an example of an unbalanced dataset. The negative class has 947 data points, while the positive class has only three. We'll calculate the accuracy as follows:
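The class counts (947 negative, 3 positive) come from the text, but the classifier's predictions are not given. As a purely hypothetical illustration, the sketch below assumes a model that labels every data point as negative, which is the simplest way to show how accuracy can mislead on such a dataset.

```python
# Class counts from the example above.
negatives, positives = 947, 3

# Hypothetical classifier (an assumption for illustration): it predicts
# "negative" for every data point, so it never finds a positive case.
tp, fp = 0, 0
tn, fn = negatives, positives

accuracy = (tp + tn) / (tp + fp + fn + tn)
recall = tp / (tp + fn)

print(f"Accuracy: {accuracy:.2%}")
print(f"Recall:   {recall:.2%}")
```

Under this assumption the accuracy is about 99.68%, yet the recall is 0%: the model misses all three positive cases. The confusion matrix shows this gap, while accuracy alone hides it.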




Clustering in Machine Learning

Clustering is a type of unsupervised learning in machine learning where the goal is to group a set of objects in such a way that objects in...