Machine Learning Guide and Tutorial for Software Engineers

This article was written by Nam Vu on GitHub. 

What is it? 

This is my multi-month study plan for going from mobile developer (self-taught, no CS degree) to machine learning engineer. My main goal was to find an approach to studying machine learning that is mainly hands-on and abstracts away most of the math for the beginner. This approach is unconventional because it is top-down and results-first, designed for software engineers.

Please, feel free to make any contributions you feel will make it better.

Table of Contents

To check out all this information, click here. For other articles about machine learning, click here

Top DSC Resources

Follow us on Twitter: @DataScienceCtrl | @AnalyticBridge


6 Easy Steps to Learn Naive Bayes Algorithm (with code in Python)

This article was posted by Sunil Ray. Sunil is a Business Analytics and BI professional.

Introduction

Here’s a situation you’ve got into:

You are working on a classification problem and you have generated your set of hypotheses, created features and discussed the importance of variables. Within an hour, stakeholders want to see the first cut of the model.

What will you do? You have hundreds of thousands of data points and quite a few variables in your training data set. In such a situation, if I were in your place, I would use ‘Naive Bayes’, which can be extremely fast relative to other classification algorithms. It relies on Bayes’ theorem of probability to predict the class of unknown data.

In this article, I’ll explain the basics of this algorithm, so that the next time you come across large data sets, you can put it into action. In addition, if you are new to Python, this article includes plenty of code to work through.
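To make the idea concrete, here is a minimal sketch of a categorical Naive Bayes classifier in plain Python. It is not the code from the full article: the function names, the add-one smoothing, and the toy weather data are all assumptions chosen for illustration. It scores each class with the log of its prior plus the log-likelihood of each feature value, treating the features as independent given the class.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(feature_index, label)][value] -> number of training rows
    value_counts = defaultdict(Counter)
    # distinct values seen for each feature, needed for add-one smoothing
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the class with the highest posterior score for one row."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # log P(class) + sum of log P(feature value | class)
        score = math.log(count / total)
        for i, value in enumerate(row):
            seen = value_counts[(i, label)][value]
            # add-one smoothing so an unseen value does not zero the product
            score += math.log((seen + 1) / (count + len(feature_values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (weather, wind) -> whether a game is played
rows = [("sunny", "calm"), ("sunny", "windy"), ("rainy", "windy"),
        ("rainy", "calm"), ("overcast", "calm"), ("rainy", "windy")]
labels = ["yes", "yes", "no", "no", "yes", "no"]

model = train_naive_bayes(rows, labels)
prediction = predict(model, ("sunny", "calm"))
```

Training is a single counting pass over the data, which is why the algorithm stays fast on large data sets. Working in log space avoids numerical underflow when many small probabilities are multiplied.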

Table of Contents

  1. What is Naive Bayes algorithm?
  2. How does the Naive Bayes algorithm work?
  3. What are the Pros and Cons of using Naive Bayes?
  4. 4 Applications of Naive Bayes Algorithm
  5. Steps to build a basic Naive Bayes Model in Python
  6. Tips to improve the power of Naive Bayes Model

To check out all this information, click here

New Approaches to Unsupervised Domain Adaptation

This article was contributed by Nikita Johnson. 


The cost of large scale data collection and annotation often makes the application of machine learning algorithms to new tasks or datasets prohibitively expensive. One approach circumventing this cost is training models on synthetic data where annotations are provided automatically. 

However, despite their appeal, such models often fail to generalize from synthetic images to real images, necessitating domain adaptation algorithms to manipulate these models before they can be successfully applied. Dilip Krishnan, Research Scientist at Google, is working on two approaches to the problem of unsupervised visual domain adaptation, both of which outperform current state-of-the-art methods.

What you can find in the full article: 

  • Tell us more about your work, and give us a short teaser to your session?
  • What started your work in deep learning?
  • What are the key factors that have enabled recent advancements in deep learning?
  • Which industries do you think deep learning will benefit the most and why?
  • What advancements in deep learning would you hope to see in the next 3 years?

To read the original article, click here.

How do I compute document similarity using Python?

This presentation combines a video with Python code. It was written by Jonathan Mugan. Dr. Mugan specializes in artificial intelligence and machine learning.

How do I find documents similar to a particular document?

We will use a library in Python called gensim.

Let’s create some documents. 

We will use NLTK to tokenize.

A document will now be a list of tokens.

We will create a dictionary from a list of documents.

A dictionary maps every word to a number.
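The steps so far can be sketched with the standard library alone. This is an illustration of the idea, not the presentation's code: the regex tokenizer stands in for NLTK, the `build_dictionary` helper stands in for gensim's dictionary, and the sample sentences are made up. The ids that gensim assigns may differ from the first-seen order used here.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_dictionary(tokenized_docs):
    """Assign each distinct token an integer id in first-seen order."""
    token_to_id = {}
    for tokens in tokenized_docs:
        for token in tokens:
            if token not in token_to_id:
                token_to_id[token] = len(token_to_id)
    return token_to_id


# Hypothetical sample documents
raw_documents = [
    "I like machine learning.",
    "Machine learning is fun.",
]

# Each document becomes a list of tokens
tokenized_docs = [tokenize(doc) for doc in raw_documents]

# The dictionary maps every distinct word to a number
dictionary = build_dictionary(tokenized_docs)
```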

What you will find in the full presentation:

  • Create corpus
  • Create tf-idf model
  • Similarity measure object
  • Convert query document
  • Similar documents
  • Exercises
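As a rough picture of how these steps fit together, the sketch below builds tf-idf vectors and ranks documents by cosine similarity using only the standard library. It does not use gensim's API. The helper names (`tfidf_vectors`, `weigh`, `cosine`), the plain `log(N / df)` weighting, and the sample documents are assumptions for illustration, and gensim's defaults may weight and normalize differently.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_docs):
    """Return one {token: weight} dict per document plus the idf table."""
    n_docs = len(tokenized_docs)
    # document frequency: in how many documents each token appears
    doc_freq = Counter(t for tokens in tokenized_docs for t in set(tokens))
    idf = {t: math.log(n_docs / df) for t, df in doc_freq.items()}
    vectors = [weigh(tokens, idf) for tokens in tokenized_docs]
    return vectors, idf


def weigh(tokens, idf):
    """Weight raw term counts by idf; tokens missing from idf are dropped."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical, already tokenized documents
docs = [
    ["cats", "chase", "mice"],
    ["dogs", "chase", "cats"],
    ["stocks", "fell", "today"],
]
vectors, idf = tfidf_vectors(docs)

# Convert the query document with the same idf table, then compare
query = weigh(["cats", "and", "mice"], idf)
scores = [cosine(query, vec) for vec in vectors]
best_match = max(range(len(docs)), key=scores.__getitem__)
```

Rare words such as "mice" get a higher weight than words shared across documents, so the first document ranks above the second even though both mention cats.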

To check out all this information, click here. For more articles about Python, click here

How well do facial recognition algorithms cope with a million strangers?

This article was written by . Co-authors include UW computer science and engineering professor Steve Seitz, undergraduate student and web developer Evan Brossard and former student Daniel Miller.

The MegaFace dataset contains 1 million images representing more than 690,000 unique people. It is the first benchmark that tests facial recognition algorithms at a million scale. University of Washington

In the last few years, several groups have announced that their facial recognition systems have achieved near-perfect accuracy rates, performing better than humans at picking the same face out of the crowd.

But those tests were performed on a dataset with only 13,000 images, fewer people than attend an average professional U.S. soccer game. What happens to their performance as those crowds grow to the size of a major U.S. city?

University of Washington researchers answered that question with the MegaFace Challenge, the world’s first competition aimed at evaluating and improving the performance of face recognition algorithms at the million person scale. All of the algorithms suffered in accuracy when confronted with more distractions, but some fared much better than others.

“We need to test facial recognition on a planetary scale to enable practical applications — testing on a larger scale lets you discover the flaws and successes of recognition algorithms,” said Ira Kemelmacher-Shlizerman, a UW assistant professor of computer science and the project’s principal investigator. “We can’t just test it on a very small scale and say it works perfectly.”

The UW team first developed a dataset with one million Flickr images from around the world that are publicly available under a Creative Commons license, representing 690,572 unique individuals. Then they challenged facial recognition teams to download the database and see how their algorithms performed when they had to distinguish between a million possible matches.
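One way to picture why a larger pool of possible matches is harder is a toy nearest-neighbor lookup. The sketch below is not the MegaFace evaluation protocol. The two-dimensional feature vectors, the identity labels, and the `rank1_identity` helper are hypothetical, and real systems compare learned, high-dimensional face embeddings.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank1_identity(probe, gallery):
    """Return the identity of the gallery entry closest to the probe."""
    return min(gallery, key=lambda entry: distance(probe, entry[1]))[0]


# Hypothetical 2-D "face features" for a new photo of person "A"
probe = (1.0, 1.0)

# A small gallery: the correct match is easy to find
small_gallery = [("A", (1.3, 1.0)), ("B", (4.0, 4.0))]

# Distractors: strangers whose features may happen to fall near the probe
distractors = [("stranger_1", (1.1, 1.0)), ("stranger_2", (0.0, 3.0))]
large_gallery = small_gallery + distractors

match_small = rank1_identity(probe, small_gallery)
match_large = rank1_identity(probe, large_gallery)
```

In this toy setup the small gallery returns "A", while the larger one returns "stranger_1" because a stranger sits closer to the probe than the true match. The more strangers in the gallery, the more likely such a near miss becomes.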

Read more in the original article. For more articles about recognition algorithms, click here.
