IOTA – The potential to drive Data Science for IoT


I have a close circle of clued-in, tech-savvy friends whose views I take seriously. For the last few weeks, one of these friends has been sending me emails extolling the merits of something called IOTA – which calls itself the next-generation Blockchain. At first, I thought of IOTA as yet another cryptocurrency. A whole flock of people are rebranding themselves as Bitcoin/Blockchain/ICO experts and spamming me! So, I was initially sceptical of something claiming to be the ‘next generation blockchain’. But some more investigation over the holiday season convinced me that IOTA could be a game changer. In this post, I explore the significance of IOTA and its implications for IoT analytics.


I explore such concepts in my course Implementing Enterprise AI using TensorFlow and Keras.


What is (and what is not) IOTA

Before we proceed, this discussion is not about Bitcoin. I am not an expert on Bitcoin. For a discussion on Bitcoin, see what is bitcoin and why it matters from MIT Tech Review. IOTA is a cryptocurrency, but I am not an expert on cryptocurrencies either (e.g. the factors driving the price of the currency). I am more interested in the problem IOTA solves and its disruptive potential – especially for IoT. To understand that potential, we first need to understand what IOTA is (and what it is not).

Like blockchain, IOTA is a distributed ledger technology – but it aims to go beyond blockchain. Bitcoin uses blockchain technology and a distributed ledger system to conduct transactions. IOTA, however, does not use a Blockchain. Instead it uses something called the Tangle, a distributed ledger based on a directed acyclic graph (DAG). The Tangle is propagated by the activity of the users of the system. This needs no fees, no miners and no creation of new tokens. In contrast, to propagate the Bitcoin ledger, miners need to perform computational work (or pay for previously mined Bitcoins).

Also, Bitcoin has a scalability problem: as more people use the system, it gets slower and it costs more to process a transaction. In contrast, the cost of using the IOTA ledger (the Tangle) is the user’s computational effort to verify two randomly selected prior transactions on the Tangle.

Source: IOTA Whitepaper. The main point is: “Every new transaction must approve two other transactions.” In this sense, IOTA is an attempt to create a potentially superior cryptocurrency platform by overcoming the limitations of Blockchain.
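To make the “approve two transactions” idea concrete, here is a toy sketch of a Tangle-like DAG in Python. The class and the tip-selection logic are illustrative assumptions on my part, not the IOTA reference implementation (which also involves proof of work and weighted tip selection):

```python
import random

class Tangle:
    """Toy sketch of a Tangle: a DAG in which every new transaction
    approves two previously existing transactions."""

    def __init__(self):
        self.approvals = {0: []}  # genesis transaction approves nothing

    def tips(self):
        """Transactions not yet approved by any other transaction."""
        approved = {a for approvers in self.approvals.values() for a in approvers}
        return [tx for tx in self.approvals if tx not in approved]

    def add_transaction(self):
        """Attach a new transaction by approving two prior ones,
        preferring unapproved tips when enough exist."""
        existing = list(self.approvals)
        tips = self.tips()
        pool = tips if len(tips) >= 2 else existing
        chosen = random.sample(pool, min(2, len(existing)))
        tx_id = len(self.approvals)
        self.approvals[tx_id] = chosen
        return tx_id

tangle = Tangle()
for _ in range(10):
    tangle.add_transaction()
# every non-genesis transaction references earlier transactions
assert all(a < tx for tx, refs in tangle.approvals.items() for a in refs)
```

Note there is no miner anywhere in this sketch: the act of adding a transaction is also the act of validating two others, which is the structural difference from a blockchain.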

What problem does IOTA solve and why does that matter

If all IOTA created was a ‘better blockchain’, it would be interesting but not disruptive. But IOTA could be truly disruptive for IoT. For example, an IoT sensor in a car could retrieve data from the factory automatically. IoT devices can connect and transact with each other in peer-to-peer mode. Potentially, it could help 50 billion devices to connect. IOTA is getting traction (despite some recent hiccups). The IOTA Foundation announced that Robert Bosch Venture Capital GmbH (RBVC) — the corporate venture capital company of the Bosch Group — has purchased a significant number of IOTA tokens. Dr. Hongquan Jiang, partner at RBVC, will also join the IOTA Foundation’s advisory board. The core feature of IOTA is the ability for devices to transfer data through the Tangle. With recent extensions to the core, IOTA can even operate in ‘one to many’ mode, i.e. broadcast messages to devices.

What is the implication of IOTA for Data Science for IoT

The ability to manage and share data securely has profound implications for next-generation IoT applications such as self-driving cars, drones etc. Such devices would need to collaborate with their peers. A leasing model for devices could arise instead of an ownership model. That leasing/collaboration model could also extend to the data arising from IoT devices. Furthermore, interaction between devices could happen autonomously. IOTA could thus be the backbone for IoT applications.

If “Data is the next Oil” makes you cringe .. this is for you

The term ‘Data is the next Oil’ often makes me cringe .. because it’s your data and their Oil! But for a long time, there was not much you could do about it. At least for sensor data, IOTA offers a potentially disruptive way out while still helping to foster an ecosystem. If you want to work with me on ideas such as this, I explore such concepts in my course Implementing Enterprise AI using TensorFlow and Keras.

Image source: IOTA

Comment and Disclosure

a)  The post is narrowly confined to the potential of IOTA for IoT and DataScience for IoT

b)  I do not claim any expertise or knowledge of IOTA as a cryptocurrency

c)  The cryptographic security discussions re IOTA are also not in scope (and I am not an expert on this)

d)  I do not hold any IOTA currency at the time of writing


Enterprise AI: Learning from the evolution of Robotic Process Automation


In 2017, the Robotic Process Automation (RPA) market matured.


Learning from the evolution of RPA, in this post we explore the wider implications for Enterprise AI, i.e. the deployment of Artificial Intelligence in the Enterprise.



The post is based on my Implementing Enterprise Artificial Intelligence (AI) course, where we explore these ideas in detail.


For this article, we consider AI to be based on Deep Learning technologies. In contrast to Machine Learning, Deep Learning implies the automatic detection of features. Features could be either a large number of possible impacting characteristics or a hierarchical set of features. As an application, RPA uses various Deep Learning technologies (such as Natural Language Processing).


For many of us, AI implies machines taking over jobs.


Dilbert explains this best!



 In the case of RPA, the current technology is designed to take over specific tasks (as opposed to jobs). Initially, these are mundane tasks. But increasingly, with the deployment of AI – we can also automate complex tasks – potentially full job functions.


Thus, leaving aside the moral arguments, we can consider RPA as a foundation set of technologies to understand the wider deployment of AI to the Enterprise.


The current definition of ‘RPA’ as per the Institute of Robotics Processing Automation emphasizes the status quo: “the application of technology that allows employees in a company to configure computer software or a ‘robot’ to capture and interpret existing applications for processing a transaction, manipulating data, triggering responses and communicating with other digital systems.”


However, if we consider AI with RPA we get a more complex picture because we can interject decision making into existing processes. This allows us to automate higher order tasks which previously needed perception and judgement from humans.




Here are three things we can infer from the evolution of RPA about the wider deployment of AI in the Enterprise:


1) What can AI emulate by observation?

If a process can be observed or simulated, it can be duplicated.

So, the logical question follows: how can processes in an Enterprise be observed?

Here are some of the ways:


  • Natural Language Processing and automation of forms – NLP can automate many forms in an Enterprise. This allows us to learn a process based on the automation of forms
  • Identity: KYC++ and Blockchain to understand workflow – both KYC and Blockchain allow for better identity, and hence richer data and a greater understanding of the process flow interlinking multiple identities
  • Testing as learning: automated software testing products like Selenium can help to identify process flow
  • Observing processes and workflows through APIs: APIs and integration software like MuleSoft could provide a greater understanding of workflows
  • Observing processes and workflows in the front office through unstructured data and chatbots: in communication-intensive industries like Financial Services, Healthcare and Insurance, the deployment of chatbots can be the first step to understanding workflow
  • Observing systems as they are being built: finally, it could be possible to observe systems as they are being built – leading to cheaper support costs


2) How could deeply coupled AI systems impact workflow?

The current success of RPA is due to loosely coupled systems – leading to quicker deployment. The ‘robot’ essentially logs on as a traditional user and mimics the behaviour of that user. This means existing systems do not have to change. As these systems start to incorporate AI, RPA could become more deeply integrated. In theory, it is possible for an RPA system to learn different subsystems and then mimic a higher-level process through deeper integration.


3) How could we use anomaly detection to drive more inductive processing?

To create complex use cases based on reasoning, we need to incorporate inductive processes. Anomaly detection is a driver of greater inductive processing in the Enterprise.


To recap, Deductive reasoning starts with a hypothesis/theory. Based on a theory, we then try to predict observations. If the observations validate the theory – then the theory is valid. In contrast, inductive reasoning starts with specific observations from which we aim to build the broad generalizations.  Thus, there is an interplay between inductive and deductive reasoning. Inductive reasoning allows scientists to form hypotheses. In turn, Deductive reasoning allows scientists to apply the theories to specific situations.


Anomaly detection techniques (in both machine learning and deep learning) provide opportunities to interject more reasoning into processes.
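As a minimal illustration (a simple statistical sketch, not any specific RPA product's detector), observations can be scored by how far they deviate from a baseline of normal behaviour; outliers then become candidates for a human decision or an inductive reasoning step. The transaction amounts and the threshold below are made-up illustrative values:

```python
import statistics

def anomaly_scores(baseline, observations):
    """Score observations by their distance (in standard deviations)
    from a baseline of normal behaviour."""
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline) or 1.0  # guard against zero variance
    return [abs(x - mean) / stdev for x in observations]

baseline = [102, 98, 101, 99, 100, 97, 103, 101]  # normal transaction amounts
scores = anomaly_scores(baseline, [100, 99, 500])
flagged = [s > 3.0 for s in scores]
print(flagged)  # → [False, False, True]; the 500 transaction stands out
```

The flagged cases are exactly the "specific observations" from which an inductive process could start building a broader generalization, as described above.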




In a report (“Where machines could replace humans—and where they can’t (yet)”), McKinsey states that almost 33% of the time spent across all US occupations is related to data processing. The report also indicates that we can automate more than 60% of this work with currently available technologies. They categorize work into the following classes: managing others, applying expertise, stakeholder interactions, unpredictable physical work, data collection, data processing and predictable physical work.


As RPA becomes more integrated and can observe more of the Enterprise processes, it can potentially mimic and emulate complex processes which require reasoning covering many of the categories of work listed above.


Once RPA ROI has been demonstrated at one level, it will be scaled and integrated deeper into the Enterprise. This will lead to continuous improvement and self-healing processes. It would also radically change the nature of work itself. The first group of workers to feel the impact of RPA and AI as described above will be offshore workers (including offshore developers). Ironically, these same companies are the largest adopters of RPA today.


Ultimately, we could undertake a radical form of process engineering. We cover these ideas in the Implementing Enterprise Artificial Intelligence (AI) course


Non-traditional strategies for mid-career switch to #Datascience and #AI


In this post, I explore strategies to switch to Data Science mid-career. This switch is not easy but, based on the experience of many people I have taught/mentored/recruited, it is possible. Most people consider a PhD, MOOCs etc. for switching their career to Data Science. But here, I will explore some non-traditional/unorthodox ways of switching to Data Science. I draw upon my personal experience as a teacher, data scientist and recruiter of data scientists – especially in creating personalized AI/Data Science courses.


So, here are my insights


1)    Consider Data Engineering instead of Data Science: Data Engineers are the lesser-known cousins of Data Scientists, but they are rapidly growing in importance as Data Science matures. More importantly, depending on your experience, a transition to data engineering may be easier (e.g. if you have prior ETL/SQL experience).

2)    Draw on your business knowledge: Business knowledge will be valuable in Data Science, especially in areas like feature engineering. Also, most algorithms improve on previous benchmarks – but the task itself remains the same. For example, churn prevention and fraud detection are well-defined industry problems. AI/Machine Learning simply improves the previous benchmarks, but the domain knowledge is still valuable.

3)    GitHub: Probably the best way you can differentiate yourself. People study for MOOCs or even PhDs but cannot demonstrate that they can build anything. A GitHub repo will put you far ahead of many.

4)    Niche: Focus on a niche in Data Science. For example, I am working with TensorFlow Mobile. Considering the current success of TensorFlow, it’s a no-brainer that TensorFlow Mobile will be interesting. Apple is following a similar strategy with Core ML for AI on iPhone devices.

5)    Focus on AI: This may sound unusual, but let me explain. I take a boring definition of AI: AI is (mostly) based on Deep Learning. Deep Learning is a set of complex (and maths-based) techniques used for automatic feature engineering. AI will become increasingly pervasive. In doing so, many companies will come forward to simplify AI. Therein lies the opportunity. We see this already in products like Driverless AI. This means that, at some point in the near future, you will be able to implement AI without knowing Deep Learning in detail.

6)    Look for tangential algorithm applications: I can explain this best with examples from my personal experience. I started off with IoT (which I still work with). However, I have also worked on fintech and healthcare applications. I did not have a substantial background in healthcare or fintech – however, IoT is mostly based on time series, as are parts of fintech and healthcare.

7)    Choose the right books: If you are learning Data Science, there are broadly two types of books. An example of the first type is the book by Hastie et al. (a large, freely available PDF). An example of the second is Deep Learning with Keras by Antonio Gulli and Sujit Pal. The former is heavy on concepts and maths; the latter is very pragmatic, with each chapter based on code and with a GitHub repository. You need both types, but you definitely need the latter.

8)    Give yourself a year (at least): This switch will not be easy. In my view, it needs a year – but it’s worth it!

9)    Keras: One word .. Keras. DL/ML are hard enough as it is. You need the best strategy to make your life simple while still covering depth. Hence Keras. PS: I note Gluon from Microsoft and Amazon, which sounds like a similar approach to Keras, but I am not personally familiar with it yet.

10)  Develop end-to-end problem-solving skills: Ultimately, tools don’t matter as much as the ability to use data and algorithms to solve problems. This great post by Vincent Granville on forecasting meteorite hits shows the end-to-end skills needed for problem solving in data science. I believe many people work on specifics (e.g. an algorithm) but miss how to solve problems end to end.


I hope you found these strategies useful. If you want to know more, please see my work in creating personalized #DataScience and #AI courses.


Image shutterstock


Behavioural Biometrics, IoT and AI



Biometrics is the science of establishing the identity of an individual based on the physical, chemical or behavioural attributes of the person. We see the deployment of Biometrics in many industries such as smart homes, automotive, banking, healthcare etc. According to Gartner, biometric sensors such as premise security entry consoles will total at least 500 million IoT connections in 2018. Acuity Market Intelligence forecasts that within three years, biometrics will become a standard feature on smartphones as well as other mobile devices. IoT (Internet of Things) connects the physical world to the virtual world – and in doing so provides elements of Biometric Data. To discuss these issues, please join a new meetup group in London: Behavioural Biometrics IoT and AI.


Behavioural Biometrics

While physical biometric techniques (like fingerprint recognition, iris scans etc.) are well established, behavioural biometric systems are still emerging. According to the IBIA white paper, behavioural biometrics provides a new generation of user security solutions that identify individuals based on the unique way they interact with computer devices like smartphones, tablets or mouse-screen-and-keyboard. By measuring everything from how the user holds the phone or how they swipe the screen, to which keyboard or gestural shortcuts they use, software algorithms build a unique user profile, which can then be used to confirm the user’s identity on subsequent interactions.


Currently, behavioural biometrics is deployed as an additional layer to enhance identity authentication and fraud detection systems, but it provides a number of advantages over traditional biometric technologies.

  • They can be collected non-obtrusively or even without the knowledge of the user
  • They do not need any specialized hardware
  • Behavioural biometrics is completely frictionless because users can be enrolled in the background during normal interactions – it does not slow, interrupt or interfere with the user experience.
  • Because there are dozens and dozens of data points collected, and any combination of them can be used to identify a user, identification is accurate and precise and users cannot practicably be impersonated.
  • Because authentication happens throughout the entire course of the transaction, behavioural biometrics provides powerful protection against insider threats and account takeover, as well as identity theft.
  • Behavioural biometrics does not replace the password or other legacy forms of identity authentication, but it does reduce the burden placed on them to protect sensitive data.


Behavioural Biometrics techniques

In Behavioural biometrics: a survey and classification, Yampolskiy & Govindaraju provide a survey of behavioural biometric techniques.  They classify Behavioural biometrics into five categories based on the type of information about the user being collected.

  • Category one is made up of authorship-based biometrics (e.g. examining a piece of text produced by a person).
  • Category two consists of human-computer interaction (HCI)-based biometrics, e.g. the use of keystroke biometrics.
  • Category three involves events that can be obtained by monitoring a user’s HCI behaviour indirectly via observable low-level actions of computer software (for example, audit logs).
  • Category four involves tracking the motor skills of users in performing certain tasks.
  • Finally, category five involves purely behavioural biometrics, such as the way an individual walks.

The authors also present a generalized algorithm for implementing behavioural biometrics with the following steps:

  • Pick behaviour
  • Break-up behaviour into component actions
  • Determine frequencies of component actions for each user
  • Combine results into a feature vector profile
  • Apply similarity measure function to the stored template and current behaviour
  • Experimentally determine a threshold value
  • Verify or reject user based on the similarity score comparison to the threshold value.
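The steps above can be sketched in Python. In this sketch the feature vector is simply the relative frequency of component actions, and cosine similarity plays the role of the similarity measure; the action names and the threshold are illustrative assumptions, not values from the survey:

```python
import math
from collections import Counter

def profile(actions):
    """Build a frequency feature vector from observed component actions."""
    counts = Counter(actions)
    total = sum(counts.values())
    return {action: c / total for action, c in counts.items()}

def similarity(p1, p2):
    """Cosine similarity between two behaviour profiles."""
    keys = set(p1) | set(p2)
    dot = sum(p1.get(k, 0) * p2.get(k, 0) for k in keys)
    norm1 = math.sqrt(sum(v * v for v in p1.values()))
    norm2 = math.sqrt(sum(v * v for v in p2.values()))
    return dot / (norm1 * norm2)

def verify(stored_template, current_actions, threshold=0.9):
    """Verify or reject the user by comparing current behaviour
    to the stored template against an experimentally chosen threshold."""
    return similarity(stored_template, profile(current_actions)) >= threshold

stored = profile(["scroll", "scroll", "tap", "swipe", "scroll"])
print(verify(stored, ["scroll", "tap", "scroll", "swipe"]))  # True
print(verify(stored, ["paste", "paste", "paste"]))           # False
```

In practice the threshold would be determined experimentally, as the algorithm's penultimate step says, by trading off false accepts against false rejects.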


Behavioural Biometrics – IoT and AI

So, with this background, what is the relationship between Behavioural biometrics, IoT and AI?

  • Behavioural biometrics relies on increasingly ubiquitous mobile and IoT devices to capture the data points that will authenticate the user.
  • Increasingly, IoT and mobile devices provide continuous authentication over the session.
  • The individual pattern/profile is hard to spoof because it is tied to your unique behaviour – comprising physiology as well as social, psychological and health factors.
  • Rather than focusing on an activity’s outcome, behavioural biometrics focuses on how a user conducts the specified activity. This means real-time AI algorithms can detect behaviour even as the activity progresses (before it completes). For example, keyboard metrics can detect behaviour as the transaction progresses, without waiting for it to complete.
  • Finally, behavioural biometrics is agnostic of personally identifiable information (PII). I don’t need to know anything about you to be sure it’s you; I just need to ensure that you are the same person who logged in the last time. Hence, there is scope to create new algorithms which are PII-protecting by using behavioural biometrics.
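To illustrate the continuous-authentication point above, here is a hedged sketch: a running check of inter-key timing against an enrolled rhythm that can reject a session while the transaction is still in progress. The intervals and tolerance are made-up illustrative values, not measurements from any real system:

```python
import statistics

def continuous_check(baseline_intervals, live_intervals, tolerance=2.0):
    """Check typing rhythm as a transaction progresses: fail as soon as
    the running mean of inter-key intervals drifts beyond `tolerance`
    baseline standard deviations -- no need to wait for completion."""
    mean = statistics.mean(baseline_intervals)
    stdev = statistics.pstdev(baseline_intervals) or 1.0
    seen = []
    for interval in live_intervals:
        seen.append(interval)
        running = statistics.mean(seen)
        if abs(running - mean) / stdev > tolerance:
            return False  # rhythm no longer matches the enrolled user
    return True

enrolled = [0.21, 0.19, 0.22, 0.20, 0.18, 0.21]  # seconds between keystrokes
print(continuous_check(enrolled, [0.20, 0.22, 0.19]))  # True
print(continuous_check(enrolled, [0.55, 0.60, 0.58]))  # False
```

Note that nothing about the user's identity is stored here, only their rhythm, which illustrates the PII-agnostic point.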


To discuss these issues, please join a new meetup group in London  Behavioural Biometrics IoT and AI

 Image Shutterstock

Ajit Jaokar conducts a course at Oxford University on Data Science for Internet of Things. He is also a Research Data Scientist working on Behavioural Biometrics.


Book review: The Mathematical Corporation: Where Machine Intelligence and Human Ingenuity Achieve the Impossible




I heard about The Mathematical Corporation: Where Machine Intelligence and Human Ingenuity Achieve the Impossible by Josh Sullivan and Angela Zutavern through a tweet by Kirk Borne.


I conduct a course at Oxford University on Data Science for IoT. I have also recently launched a course on AI for fintech. The Mathematical Corporation covers many issues that I have encountered in my teaching and consulting.


About the book


In essence, The Mathematical Corporation calls for new leadership traits in the world of AI.


The book postulates a world where Leaders will develop new thinking skills for working with machines.


Leaders will address Big, unknown questions by querying the universe of data.


They will aspire to work towards breakthrough or impossible strategies


Leaders will embrace complexity (“Complexity is the new treasure”, “Complexity is a boon not a burden”).


Leaders will prefer complex models which solve big problems over over-simplified models which could lose the essence. (The word ‘impossible’ features many times in the book, e.g. ‘impossible problems’.)


The most significant statement for me was “Scientific leaders do not excel because of what they know but rather because of how much they inquire about what they don’t know”


The book proposes that business leaders adopt a similar mindset – for example, every investigation could begin with a literature review.


My impressions ..

Based on my experience, I very much agree with the ideas in the book.


For example: I recently heard about a start-up that wanted to deploy complex AI models – but did not have any data on which to train their Deep Learning models.


This happens more often than you think!


AI is a buzzword – but the technology behind AI has a lot of potential.

To exploit the true potential of AI, we need leaders to think about solving complex / impossible problems and adopt new ways of thinking


Like all good books, The Mathematical Corporation makes you think and raises questions


For example:

  • Will corporations address complex problems which may not agree with the short term / quarterly focus?
  • Will leaders choose a narrow tech focus and solve incremental problems (vs. complex ones)?
  • Will existing leaders adapt or should we wait for a new set of leaders?


Time will tell ..


The book link is The Mathematical Corporation: Where Machine Intelligence and Human Ingenuity Achieve the Impossible  by Josh Sullivan and Angela Zutavern


Data readiness strategies of AI Start-ups


Last week, at an event on AI, I asked the panel about how investors evaluate the Data readiness of AI start-ups. This subject is close to my work and my teaching. I teach a course on Implementing Enterprise AI and also teach Data Science for IoT at the University of Oxford.  Below are my perspectives.  


Professor Neil Lawrence has proposed the concept of Data readiness levels. The highest level of Data readiness represents data which is most useful for making predictions, i.e. “Can we use this data to prove the efficacy of a drug?”

In many cases, start-ups do not have data that is useful for making predictions. This applies very much to AI start-ups.

AI is based on Deep Learning algorithms. Deep Learning involves automatic feature detection from data. To do so, by definition, we need a lot of data. More specifically, we need a lot of labelled data to train the layers of a Deep Learning algorithm.

Many start-ups/companies do not have this data – and hence may not be able to solve the problem they set out to solve. Hence, one could argue that most AI start-ups are actually not Data ready.

I believe that there are various ways to address this problem

Data readiness strategies

  1. Unsupervised learning, e.g. autoencoders, which can learn a compressed representation similar to PCA – for example, the image processing example using autoencoders
  2. Semi-supervised learning: using unlabelled data with small amounts of labelled data, explained in a good paper by Yoshua Bengio
  3. Newer solutions like nanonets
  4. Synthetic data strategies
  5. Free or available data to initially train the model
  6. Model zoos
  7. With less data, one would run a mix of Deep Learning and Machine Learning algorithms – so feature selection and transformation strategies would apply
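As a toy illustration of the semi-supervised idea (strategy 2), the sketch below self-trains a nearest-centroid classifier: labels from a small labelled set are propagated to unlabelled points, and the centroids are then refit on the enlarged set. This is a minimal pure-Python sketch under invented data, not a production pipeline or the method from Bengio's paper:

```python
def centroid(points):
    """Mean of a set of points, coordinate by coordinate."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))

def self_train(labelled, unlabelled, rounds=3):
    """Semi-supervised self-training: start from a small labelled set,
    label unlabelled points with a nearest-centroid classifier,
    then refit the centroids on the enlarged labelled set."""
    data = dict(labelled)  # point -> label
    for _ in range(rounds):
        by_label = {}
        for point, label in data.items():
            by_label.setdefault(label, []).append(point)
        centroids = {lbl: centroid(pts) for lbl, pts in by_label.items()}
        for point in unlabelled:
            data[point] = min(
                centroids,
                key=lambda lbl: sum((a - b) ** 2 for a, b in zip(point, centroids[lbl])),
            )
    return data

# Two labelled points per class, plus a larger pool of unlabelled data
labelled = {(0.0, 0.0): "low", (0.2, 0.1): "low",
            (5.0, 5.0): "high", (4.8, 5.2): "high"}
unlabelled = [(0.1, 0.3), (4.9, 4.9), (5.1, 4.7)]
result = self_train(labelled, unlabelled)
print(result[(0.1, 0.3)], result[(4.9, 4.9)])  # → low high
```

The point of the toy is the shape of the strategy: a small amount of labelled data bootstraps labels for the rest, which is exactly what a data-poor start-up needs.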

My overall impression is:

AI is a very new field and there is competitive advantage to first movers. Thus, many companies are adopting variants of the above strategies and will move forward even when they have limited data initially. But, by the same token, companies must have a clear set of strategies in place as they address investors. I discuss these ideas in the Implementing Enterprise AI course
