ML Strategy for Machine Learning Projects

This post is inspired by Andrew Ng's Deep Learning Specialization, so if you are really interested in this topic, do check out the course.

So here comes the question: why ML strategy? Well, the answer can be pretty motivating. Suppose you are working on a criminal-identification app via deep learning and you are getting on average 90% accuracy, which is a really good result, but not good enough for a real-world system at an airport. So you try to improve the accuracy via a lot of different ideas, like collecting more data, training the algorithm longer, trying a different optimization algorithm, trying a bigger or smaller network, adding dropout or L2 regularization, or changing the network architecture.

Implementing these ideas could take anything from 6 months to 2 years, so wouldn't it be nice to have a quick and effective way to decide which of them is worth pursuing? That is exactly what ML strategy is about.

Orthogonalization:

One of the challenges with building machine learning systems is that there are so many things you could try and so many things you could change, including, for example, many hyperparameters you could tune to reach the required level of performance.

One of the things I've noticed about the most effective machine learning people is that they're very clear-eyed about what to tune in order to achieve one effect at a time. This is a process we call orthogonalization: each knob should affect exactly one thing, e.g. one set of knobs to fit the training set better, another to fix dev-set performance, and so on.

Why does a single-number evaluation metric work better?

Machine learning projects consist of a loop of idea, code, and experiment. A single-number evaluation metric lets you look at a result and quickly tell whether the latest idea helped, which speeds up this loop.

Now suppose we are doing classification and comparing the results of two classifiers, say A with 95% precision and 90% recall, and B with 98% precision and 85% recall (illustrative numbers).

We are torn between recall and precision and can't decide which classifier to proceed with, so we use a metric like the F1 score, which combines both into a single number: the harmonic mean, F1 = 2PR / (P + R).

As the F1 score suggests (A ≈ 92.4%, B ≈ 91.0%), we can go ahead with A and drop B.
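Here is a minimal Python sketch of using F1 as a single-number metric; the precision and recall values are the illustrative numbers from above, not results from a real experiment.

```python
# F1 = 2PR / (P + R): the harmonic mean of precision and recall,
# used as a single-number evaluation metric.

def f1_score(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

classifiers = {
    "A": {"precision": 0.95, "recall": 0.90},
    "B": {"precision": 0.98, "recall": 0.85},
}

for name, m in classifiers.items():
    print(name, round(f1_score(m["precision"], m["recall"]), 4))
# A -> 0.9243, B -> 0.9104, so A wins on the single-number metric.
```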

Satisficing and Optimizing Metrics?

Suppose we choose a classifier that gives good accuracy, and we tune hyperparameters that improve or degrade that accuracy. But now, what matters most to you apart from accuracy is the running time, so your goal is to build a cost that will maximize the accuracy and at the same time keep the running time low.

Here accuracy is the optimizing metric and running time is the satisficing metric. Suppose within 90 ms you classify a picture with 93% accuracy, while with 1000 ms you get 94% accuracy; that extra 1% at a cost of 900 ms is not worth it in a real-world application.
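A minimal sketch of this selection rule, assuming accuracy is the optimizing metric and a 100 ms runtime budget is the satisficing constraint; the candidate models and their numbers are illustrative.

```python
# Maximize the optimizing metric (accuracy) subject to the
# satisficing metric (runtime <= budget).

candidates = [
    {"name": "A", "accuracy": 0.93, "runtime_ms": 90},
    {"name": "B", "accuracy": 0.94, "runtime_ms": 1000},
    {"name": "C", "accuracy": 0.92, "runtime_ms": 33},
]

RUNTIME_BUDGET_MS = 100  # the satisficing threshold

feasible = [c for c in candidates if c["runtime_ms"] <= RUNTIME_BUDGET_MS]
best = max(feasible, key=lambda c: c["accuracy"])
print(best["name"])  # -> "A": best accuracy among models within budget
```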

Train/Dev/Test Set Distribution?

The way you set up your training, dev (development), and test sets can have a huge impact on how rapidly you or your team can make progress on building a machine learning application. Some teams, even in very large companies, set up these data sets in ways that really slow down, rather than speed up, their progress. Let's take a look at how you can set up these data sets to maximize your team's efficiency.

For example, when you are dealing with a classifier across multiple regions and you are not confident about your data, don't assign different regions to different sets; always shuffle the data randomly so that the train, dev, and test sets all come from the same distribution.

Sizes will vary: for a typical machine learning project with modest data the split could be on the order of 60%/40%, while for a project with millions of examples it could be 99%/1%, e.g. a 98/1/1 train/dev/test split.
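As a concrete sketch, here is a shuffled 98/1/1 split in plain Python; the dataset here is a stand-in for your own list of examples.

```python
# Shuffle before splitting so train, dev, and test all come from the
# same distribution, then cut at 98% and 99%.
import random

examples = list(range(1_000_000))  # stand-in for a large dataset
random.seed(0)
random.shuffle(examples)           # mix regions/sources before splitting

n = len(examples)
train = examples[: int(0.98 * n)]
dev   = examples[int(0.98 * n): int(0.99 * n)]
test  = examples[int(0.99 * n):]
print(len(train), len(dev), len(test))  # 980000 10000 10000
```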

Your target should be the dev (validation) set error: if it stops decreasing, you should stop your training (early stopping), because our aim in any ML project is to classify the unknown, not the known.
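Below is a minimal early-stopping sketch; the per-epoch dev errors are simulated data standing in for real evaluations of your own model.

```python
# Stop when the dev-set error has not improved for `patience`
# consecutive epochs.

def early_stopping_epoch(dev_errors, patience: int = 3) -> int:
    """Return the epoch at which training should stop."""
    best_err = float("inf")
    epochs_without_improvement = 0
    for epoch, err in enumerate(dev_errors):
        if err < best_err:
            best_err = err
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epoch  # dev error stopped decreasing: stop here
    return len(dev_errors) - 1

# Dev error improves, then plateaus and rises: stop at epoch 6.
simulated = [0.30, 0.22, 0.18, 0.15, 0.15, 0.16, 0.17, 0.18, 0.19]
print(early_stopping_epoch(simulated))  # -> 6
```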

Avoidable Bias?

Now, can you identify a cat correctly? Obviously you can; even when you are completely drunk you can still identify a cat, so human-level error on clear cat pictures is close to 0%. Suppose a classifier reaches 8% training error on this task. So what is avoidable error in ML?

But now let's imagine the pictures have brightness issues, and even human experts' error rises to 7.5%. What is this number? It is basically nothing but a human proxy for the Bayes error, the lowest error any classifier could in principle achieve.

So the difference between the training error and the human-level (a.k.a. Bayes) error is the bias, and the difference between the dev error and the training error is the variance.

So here the actual avoidable bias is not 8% but 0.5% (8% − 7.5%), because according to the theory we can't dig below the Bayes error; hence it is what we call avoidable bias.
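Putting numbers on it, here is the arithmetic as a small sketch; the 8% training error and 7.5% human-level error come from the text above, while the 10% dev error is an assumed figure for illustration.

```python
# Avoidable bias = train error - human-level (Bayes proxy) error.
# Variance       = dev error   - train error.

human_error = 0.075   # proxy for Bayes error
train_error = 0.08
dev_error   = 0.10    # assumed for illustration

avoidable_bias = train_error - human_error  # 0.005 -> 0.5%
variance       = dev_error - train_error    # 0.02  -> 2%

# Work on whichever gap is larger.
focus = "reduce bias" if avoidable_bias > variance else "reduce variance"
print(focus)  # -> "reduce variance"
```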

Now here is a question: how would you define human-level error? Consider the medical image classification case study from the specialization: a typical human achieves 3% error, a typical doctor 1%, an experienced doctor 0.7%, and a team of experienced doctors 0.5%.

Think for a moment: the more expert the people, the lower their error, and a team of doctors may reduce the error even further. That's why, since human-level error is our proxy for Bayes error, we should define human-level error as the best achievable, i.e. less than or equal to 0.5%.

So the answer to the question above: take the best performance achievable by humans, here 0.5%, as your estimate of human-level error, and use it as the Bayes-error proxy when computing avoidable bias.
