27 Great Articles About Machine Learning Algorithms


This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, Hadoop, decision trees, ensembles, correlation, outliers, Python, R, TensorFlow, SVM, data reduction, feature selection, experimental design, time series, cross-validation, model fitting, dataviz, AI and many more. To keep receiving these articles, sign up on DSC.



Deep Learning Cheat Sheet for Beginners


This article was written by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. It consists of summaries, dozens of formulas, and numerous small sections that will help the beginner quickly grasp the essentials of deep learning. The presentation style is very similar to a cheat sheet.

Example of bad data science: over-fitting


  • Machine Learning
  • Generalization and Overfitting
  • Feedforward Networks
  • Designing the Output Layer
  • Finding θ
  • Choosing the Cost Function
  • Regularization
  • Deep Feedforward Networks
  • Designing Hidden Layers
  • Optimization Methods
  • Simplifying the Network
  • Convolution Networks
  • Pooling
  • Recurrent Networks
  • Useful Data Sets
  • Autoencoders
  • Representation Learning
  • Practical Advice
  • Appendix: Probability

To read the full original article, click here. For more deep-learning-related articles on DSC, click here.


Introduction to ggplot2 — the grammar


There are several thousand languages in the world, and they all have in common that they are defined and explained by some set of rules. This set of rules is called a grammar, and with its help individual words (like nouns and verbs) are combined into correct and meaningful sentences. A similar methodology can be applied to creating graphics. If we imagine that each graph is built from basic parts, its components, we can define a set of rules by which we combine those components into correct and meaningful visualizations (graphs). This set of rules is called the grammar of graphics, and in this article we'll explain the methodology and syntax for one of the most famous graphics packages in R: ggplot2.

Graph as a composition of its individual components

The main idea behind the grammar of graphics is that each plot can be built from the same few components. Those components are:

  • input data
  • aesthetics
  • geometrical shapes
  • statistical transformations
  • scaling
  • faceting
  • themes

Each component has its own set of rules and specific syntax (the so-called grammar of components), and together they form one single entity called a graph.

In the next section we will introduce the set of rules on two levels:

  • set of rules and ggplot2 syntax on component level
  • set of rules and ggplot2 syntax on graph level

Set of rules on component level

The first step in mastering the grammar is understanding the individual components and the set of rules needed to define and control each of them properly. Below we present short explanations for each component.


Data source

                                                         Data frame object as input data source

Description: Through this component we define the input data set that will be used in the visualization.

Syntax: the data set name is defined inside the ggplot() function. ggplot() initializes a new graph, and the data set name is one of its necessary arguments:

          #graph initialization and data source
ggplot(data_set_name)

Set of rules: ggplot2 requires that data are stored in a tidy data frame object. It is the most popular R data type object used for storing tabular data. You can imagine a data frame as a table with variables in the columns and observations in the rows. Other objects such as matrices or lists are not accepted by ggplot2.
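As a quick illustration of this rule, a minimal sketch (the matrix values and the column names age and amount are made up for this example):

```r
# ggplot2 accepts data frames, not matrices, so convert first
library(ggplot2)

m <- matrix(c(25, 40, 33, 120, 300, 210), ncol = 2)   # hypothetical data
df <- as.data.frame(m)
names(df) <- c("age", "amount")

# the tidy data frame can now be passed to ggplot()
p <- ggplot(df, aes(x = age, y = amount)) + geom_point()
```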


Aesthetics

                                    Mapping x,y to age and amount variable. Third variable gender is shown through color

Description: This component represents the mapping between variables and visual properties like axes, size, shape and color. What will the axes of my plot represent? Besides the variables that represent the axes, do we want to see some additional information? Through this component two variables are mapped to the horizontal and vertical axes. Additional information (variables) can be added through color, shape and size.

Syntax: aesthetics are defined inside aes() function.

Set of rules: the aes() function can be defined inside ggplot(), or inside other components like geometrical shapes and statistics. If aes() is defined inside ggplot(), its definition is common to all components (for example, the x and y axes will be the same for all geometrical shapes on the graph). Otherwise, its definition is recognized only inside the specific component.

          #aesthetics common to all components - points and text
ggplot(df1, aes(x=col1, y=col2, label=col3)) + geom_point() + geom_text()
#aesthetics specified only for the points
ggplot(df1, aes(x=col1, y=col2, label=col3)) + geom_point(aes(colour=col4)) + geom_text()

Geometrical shapes


Description: Plot type definition. More precisely, we define how our observations (points) will be displayed on the graph. There are many different types (like bar chart, histogram, scatter plot, box plot, …) and each type is defined for a specific number and types of variables.

Syntax: Syntax starts with geom_*; among the most frequently used shapes are geom_point(), geom_line(), geom_bar(), geom_histogram() and geom_boxplot().

Set of rules:

  • Each geom shape can be defined with its own dataset and aesthetics that are valid only for that shape. In that case the data frame and aes() are defined inside the geom_*() function.
  • Each shape has its own specific aesthetic arguments. For example, hjust and vjust are specific to geom_text(), and linetype is specific to line graphs.
  • It is possible to combine geometrical shapes, which means each graph can have one or more geom shapes. For example, it is sometimes useful to show a bar plot and a line plot, or a scatter plot and a line plot, on one graph.
          #two geom shapes - geom_line and geom_point used on one graph
ggplot(GOT, aes(x=Episode, y=Number_of_viewers, colour=Season, group=Season)) + geom_line() + geom_point()

Statistical transformations

                                                Famous statistical transformation — smoothing

Description: This component is used to transform the data (summarize it in some manner) before visualization. Many of these transformations happen "behind the scenes" when geometrical shapes are created. Often we don't define them directly; ggplot2 does that for us.

Syntax: Syntax depends on the transformation used; frequently used statistics include stat_count(), stat_bin(), stat_smooth() and stat_summary().

Set of rules:

  • Statistical components create additional variables (usually aggregate values or similar). To visualize those data we need to use some geom_*() function; otherwise the newly created variables will not be visible on the screen.
  • There are two ways to use statistical functions. The first is to use a stat_*() function and define the geom shape as an argument inside that function (the geom argument). The second is to use geom_*() and define the statistical transformation as an argument inside that function (the stat argument). Here is an example:
          #define stat_*() function and the geom shape as its argument
ggplot(df1, aes(x=col1)) + stat_count(geom="bar")
#define geom_*() function and the stat as its argument
ggplot(df1, aes(x=col1)) + geom_bar(stat="count")


Scaling

                                                                   Controlling the colors with scaling

Description: With aesthetics we define what we want to see on the graph, and with scaling we define how we want to see those aesthetics. We can control colors, sizes, shapes and positions. Scales also provide the tools that let us read the plot: the axes and legends (we can customize axis titles, labels and their positions). ggplot2 automatically creates default scales for each aesthetic that we define. However, if we want to customize the scales, we can modify each scale component ourselves.

Syntax: The basic syntax follows the pattern scale_<aesthetic name>_<variable type>(). Here are example scales for different types of aesthetics: scale_x_continuous(), scale_y_discrete(), scale_colour_manual(), scale_size_continuous() and scale_shape_discrete().

Set of rules:

  • There are no specific rules; the appropriate function name just needs to be chosen. Scaling syntax is a little more complex because for each aesthetic we need to know the aesthetic's name (x, y, color, size, shape), the type of the variable (continuous, discrete) and the arguments specific to each scale function. Keep in mind that you will use these functions only when you are not satisfied with the predefined scheme (the default scaling created by ggplot2).
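The rule above can be sketched as follows; the data frame df1 and its columns (age, amount, gender) are hypothetical, and the default colour and x scales are replaced explicitly:

```r
library(ggplot2)

# hypothetical input data
df1 <- data.frame(age    = c(25, 40, 33, 51),
                  amount = c(120, 300, 210, 260),
                  gender = c("male", "female", "male", "female"))

p <- ggplot(df1, aes(x = age, y = amount, colour = gender)) +
  geom_point() +
  # discrete colour aesthetic -> scale_colour_manual()
  scale_colour_manual(values = c(male = "steelblue", female = "tomato")) +
  # continuous x aesthetic -> scale_x_continuous()
  scale_x_continuous(name = "Age (years)", limits = c(18, 65))
```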


Faceting

                                                    Sub-plotting histogram

Description: With faceting we divide the data into subsets by some discrete variable and display the same type of graph for each data subset.

Syntax: The facet_wrap() or facet_grid() function is used for displaying subsets of data.

                               Faceting — sub-plotting by col1 variable

          ggplot(data_set, aes(col1,col2)) + geom_point() +
facet_wrap(~col1)

Set of rules:

  • Faceting can be used in combination with different geom shapes; there is no restriction at all. The main idea is that once you make a graph you can easily split the data (by some criterion) and display sub-graphs on the screen.
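A minimal faceting sketch, assuming a hypothetical data frame with a numeric column col1 and a discrete column gender:

```r
library(ggplot2)

set.seed(1)
df1 <- data.frame(col1   = rnorm(200),
                  gender = rep(c("male", "female"), each = 100))

# one histogram per level of the discrete variable gender
p <- ggplot(df1, aes(x = col1)) +
  geom_histogram(bins = 20) +
  facet_wrap(~ gender)
```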


Themes

                                                  Changing background color of the plot

Description: With themes it is possible to control the non-data elements of the graph. This component does not change the type of graph, the scaling definition or the aesthetics used. Instead, we change things like fonts, ticks, panel strips and background colors.

Syntax: There are several predefined themes, for example theme_grey(), theme_bw(), theme_minimal(), theme_classic() and theme_dark().

Each of these themes changes all theme elements to values designed to work together harmoniously (the complete theme is changed, not just individual elements). However, if we want to change individual elements (for example just the background color or just the font of the title), we can use the theme() function and specify the exact element we want to change.

                                                                        Theme and element function

Set of rules: Each theme element (controlled via theme() arguments) is associated with an element function, which describes the visual properties of that element. For example, to set the background color you define the fill argument inside the element_rect() function; to change axis labels you define new properties inside the element_text() function. Each argument in the theme() function needs to be defined with the help of one of these element_*() functions.

There are four basic element functions, each used in combination with specific theme arguments: element_rect() for borders and backgrounds, element_line() for lines, element_text() for text, and element_blank() to draw nothing (removing the element).


Here is an example how we combine arguments with element functions:

          ggplot(data_set_name, aes(col1,col2)) + geom_point() + 
theme_bw() +
#panel background is used with element_rect()
theme(panel.background = element_rect(fill = "white",colour = "grey"))

Usually you’ll use predefined themes, but it is useful to know that you can change each individual element using the theme() function.

With that said, we have explained the basic rules related to each component of the graph. The next question we ask ourselves is: "How are these components combined into one single entity called a graph?"

Set of rules on graph level

                                                      Combining the components

After defining each component separately, we need to combine them together to create a proper and meaningful composition called a graph.

Basic set of rules for combining:

  • Each new graph is initialized with the ggplot() function.
  • ggplot() is used to declare the input data frame and to specify the plot aesthetics intended to be common to all geometrical shapes used in the graph.
  • Any component used in building the graph is added with the '+' sign.
  • Each component has its own function name and arguments that relate only to that component.
  • We can combine different components; we are not limited to certain combinations.
  • Each component uses the input data frame and aesthetics defined inside the ggplot() function (unless otherwise stated).
  • Aesthetics can be defined inside the ggplot() function or inside any geometrical shape. If defined inside ggplot(), they are common to all shapes; otherwise they apply to one specific component/shape.
  • Each component has its own special arguments, rules and syntax. In some cases, two components define special arguments unique to that combination. For example, if geom_text() is used, the special arguments inside the aes() function are hjust and vjust. They are typical just for geom_text() (we don't use those arguments with other shapes).
  • A stat_*() component needs to be combined with a geom_* component. The reason is that statistical transformations only create new variables; for them to be visible on the screen we must define the corresponding geom_* type, which will visualize the new data.

Pseudo code is presented below:

          ggplot(data_frame_name, aes()) +
component_for_geom1_*() +
component_for_geom2_*() +
#optional components
component_for_scaling_*() +
component_for_faceting_*() +
component_for_themes_*() + ...

To conclude, here is one real example:

                                                                 ggplot2 — sub-plotting bar-charts

Result is a graph that looks like this:

                                                                                  ggplot2 — faceting bar-charts
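The example above is shown as an image in the original post; a sketch of a comparable faceted bar chart (the sales data frame and its columns are invented for illustration) could look like this:

```r
library(ggplot2)

# hypothetical data: revenue per quarter, split by region
sales <- data.frame(
  quarter = rep(c("Q1", "Q2", "Q3", "Q4"), times = 2),
  revenue = c(10, 14, 9, 16, 12, 11, 15, 13),
  region  = rep(c("North", "South"), each = 4)
)

# components combined: data + aesthetics + geom + faceting + theme
p <- ggplot(sales, aes(x = quarter, y = revenue)) +
  geom_col() +             # bars with heights taken from the data
  facet_wrap(~ region) +   # one bar chart per region
  theme_bw()
```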


In this article we showed how ggplot2 relies on the grammar of graphics. It may seem complex at the beginning because there are a lot of rules and topics to master. First you need to understand each component separately: the meaning, syntax and rules for each of them independently. After that, you need to learn how to properly combine those components into a single entity called a graph. There is a lot of theory behind the scenes, but once you master it you can control and modify anything you like on your plot, so that nothing is left to chance. After mastering the grammar, the distance from mind to "paper" becomes really short: almost every idea can be accurately transposed to the screen.

To read the original blog, click here.


The Role of AI in Assisting Customer Experience


From being the plot of sci-fi thrillers to being seen as a threat by the working populace, Artificial Intelligence (AI) has jumped into the headlines during the last few years as it has become part of reality.

People are confused: AI is either considered a giant leap towards modernization or a leap towards massive unemployment. It can turn out to be a sign of prosperity for many, thought of as a loyal servant, or it can turn the tide and become the master. The future of AI, and what it transpires into, lies in the hands of those controlling the characteristics that bind this innovation.

With AI, the horizons have greatly broadened. We can now entertain ideas that might have made us a laughing stock a few years ago. From a computer system playing chess with the masters to driverless cars, the possibilities associated with AI are many.

How Machine Learning (ML) and AI Augment Humans

The debate regarding the role of AI in augmenting human capabilities has been around for quite a while now. There are numerous theories regarding the myriad of opportunities that are presented by AI when it comes to augmenting humans.

Machines are considered better than the human brain at numerous tasks. The superiority of machines over humans can be specifically understood through their handling of work with regard to speed and scale of computation. The completion of most major tasks involves several components, such as prediction, data gathering, judgment, and then the action taken. Humans are still better than machines at judgment.

Considering the high skill levels of machines with AI, the technology can be used in numerous fields to expand human capabilities, to optimize the use of resources, and to enhance productivity. A few of the industries which benefit and can benefit even more with the use of AI are:

Everyday Living

The use of Amazon Alexa comes to mind when you think of how AI can assist and automate our everyday life. Alexa, Amazon's voice assistant that takes commands and enacts them, is considered well ahead of its time. A few of the demands Alexa can follow are setting a timer, playing music, and answering general questions, among many other interactive features.

The more you use Alexa, the more it adjusts to your speech patterns and tone. All of these analytics and data are stored by Alexa in the cloud to build a better understanding of your preferences.

Moreover, smart sensors, in collaboration with the Internet of Things (IoT), are also making everyday living easier for humans. These smart sensors take input from the environment around them and then use built-in resources to carry out predefined functions. They also enable better collection of data and are used in numerous analytics centers.


Manufacturing

Manufacturing businesses can largely benefit from the help of intelligent, collaborative robots that safely work with humans and handle tasks normally considered repetitive, unsafe, or difficult. This collaboration with robots is already in its implementation phase and can be further improved in the near future.


Robotic technology that assists in managing and automating repetitive and mundane tasks is appearing across numerous functions and is expected to penetrate deeper during the next decade. Wearables with Augmented Reality will help accomplish tasks that are expensive or unsafe today.

Transportation and Logistics

There is a fierce race between various technology companies to achieve fully automated vehicles. The race has been very public, and technologists have not shied away from the possibility of innovations in this regard.

While fully automated vehicles are expected in the future, there is already an opportunity to reduce the driving load in mundane situations such as long highway drives. This assistance will help reduce error rates and result in better fuel efficiency and traffic flow.

Over time, numerous advancements will affect the fabric of urban life, and these are bound to create opportunities for people open to them.


Healthcare

Machines with inherent abilities and knowledge can help medical professionals and doctors come up with broader, more accurate, and personalized diagnosis techniques. With the help of AI, humans will better manage the flow of patients and of senior citizens requiring a routine medicine regimen.


Agriculture

Agriculture has been boosted by numerous farming robots, automated irrigation systems, pest warning systems, and crop optimization methods. These practices will help reduce the burden on humans and will also help increase agricultural productivity.

The Change in Retail

Imagine if a device on your coffee table could tell you just how much of your supplies is left and where you could get more at the cheapest rates. Once you confirm your order, the device would let you know when you can pick it up from the retailer. While AI has useful applications in many other areas as well, its suitability for retail is such that it looks like it was crafted for this purpose. Giants such as Amazon, Apple, and Google are exceeding all consumer expectations by delivering the unexpected with the aid of AI.

The responsiveness of retail is at a peak, and people like every bit of it. Predicting the need for a product or service just when you require it is what makes AI perfect for retail. The use of such technology, though, depends on predictive retail. Research and authentic data suggest that predictive retail has been used by retail firms for the last 2-3 years. The smart assistants of the next generation will predict information based on the habits of the user and make a smart guess as to what will be required, when, and where.

Humans Need to Augment Algorithms


The one criticism of old-school AI was that it did not adapt to changing times or follow its environment. AI could not understand its environment, and there was a serious lack of adjustment to trends.

With the addition of Machine Learning (ML), AI can now adapt to a changing environment and predict how user preferences will vary because of it. Examples of ML can be found in healthcare, where patients can be differentiated on the basis of the disease they are suffering from. ML, through its advancements in judging the environment, can predict the symptoms of a disease and form a solution based on its knowledge.

However, it is imperative that the algorithms and the control of AI be democratized and provided to all stakeholders. The algorithms will work best when they are augmented and controlled by humans. This assistance will pave the way for better control and management of the algorithms that define Artificial Intelligence. While the future holds a lot, it can safely be said that AI, once implemented, will open doors to endless opportunities and possibilities.


 About the Author

Ronald van Loon is an Advisory Board Member and Big Data & Analytics course advisor for Simplilearn. He contributes his expertise towards the rapid growth of Simplilearn’s popular Big Data & Analytics category.

If you would like to read more from Ronald van Loon on the possibilities of Big Data and Artificial Intelligence please click “Follow” and connect on LinkedIn and Twitter.





Big Data for FinTech & InsureTech


Abstract – Last Sunday I was at a big retail store in Harare, and it was a very busy day because it was month end and people had been paid. Grocery shopping was in full swing, and I also bought some groceries for myself. While I was in the queue for payment and collection, I saw almost everyone paying either by swiping the magic plastic card or by punching a few numbers into their mobile handset. The electronic payment queue was moving fast compared to the cash payment queue, where I saw only a handful of people with just one or two small items. The thought that came to my mind out of this whole picture was: "What's happening here besides the payments through mobile and plastic?" Data, more data, lots of data, the so-called BIG DATA, was being generated.

Introduction – Without the right security and encryption solution in place, big data is a very big problem. A smart big data factory should take a smart approach to maintaining and managing this costly, sensitive and critical asset. Before we go further, let me explain in short what big data is; I am sure most of us know the answer already: big data is a term that means a huge amount of digital data. This data is unorganized and unstructured because it is captured from different sources, so it is difficult to analyze. For instance, cardholder data should be managed in a highly secured data vault, using multiple encryption keys with split knowledge and dual/triple control. Today, with the help of Artificial Intelligence, data security has taken another angle: actual data is mapped to dummy data, and the actual data never gets into the internet black hole, because its store cannot be connected to or via the internet but remains in the back seat. A data thief would not be able to make use of information stolen from a database without also having multiple levels of keys.

Main Story – Big data presents a tremendous opportunity for enterprises across multiple industries, especially in the tsunami-like data flow of the payments industry. FinTech, InsureTech and MedTech are major data-generating industries, i.e. a massive group of factories. According to some data from Google, technology-based innovative insurance companies pay out $0.60-$0.65 of each dollar in claims, with the rest covering the costs of admin, marketing and reinsurance. The next questions were: "Who owns this data?", "What is the use of this data?" and "How secure is this data?" Is my payment data, with all my sensitive information, secured and in safe hands? What about the privacy of my sensitive information? Thousands of questions started spinning in my head. There is massive scope for big data security, and this presents a significant opportunity for disruption. Improvements in technology, which happen every day anyway without demand, will bring a reduction in each of these cost items.

To read the original article in full, click here

About the Author

Read about the author at: About Me

Thank you all for spending your time reading this post. Please share your feedback, comments, critiques, agreements or disagreements. For more details about posts, subjects and relevance, please read the disclaimer.

