Measuring Semantic Relatedness using the Distance and the Shortest Common Ancestor and Outcast Detection with Wordnet Digraph in Python

Measuring Semantic Relatedness using the Distance and the Shortest Common Ancestor and Outcast Detection with Wordnet Digraph in Python

The following problem appeared as an assignment in the Algorithm Course (COS 226) at Princeton University taught by Prof. Sedgewick.  The description of the problem is taken from the assignment itself. However, in the assignment, the implementation is supposed to be in java, in this article a python implementation will be described instead. Instead of using nltk, this implementation is going to be from scratch.

 

The Problem

 

  • WordNet is a semantic lexicon for the English language that computational linguists and cognitive scientists use extensively. For example, WordNet was a key component in IBM’s Jeopardy-playing Watson computer system.
  • WordNet groups words into sets of synonyms called synsets. For example, { AND circuitAND gate } is a synset that represent a logical gate that fires only when all of its inputs fire.
  • WordNet also describes semantic relationships between synsets. One such relationship is the is-a relationship, which connects a hyponym (more specific synset) to a hypernym (more general synset). For example, the synset gatelogic gate } is a hypernym of { AND circuitAND gate } because an AND gate is a kind of logic gate.
  • The WordNet digraph. The first task is to build the WordNet digraph: each vertex v is an integer that represents a synset, and each directed edge v→w represents that w is a hypernym of v.
  • The WordNet digraph is a rooted DAG: it is acyclic and has one vertex—the root— that is an ancestor of every other vertex.
  • However, it is not necessarily a tree because a synset can have more than one hypernym. A small subgraph of the WordNet digraph appears below.

 

wordnet-event.png

The WordNet input file formats

 

The following two data files will be used to create the WordNet digraph. The files are in comma-separated values (CSV) format: each line contains a sequence of fields, separated by commas.

  • List of synsets: The file synsets.txt contains all noun synsets in WordNet, one per line. Line i of the file (counting from 0) contains the information for synset i.
    • The first field is the synset id, which is always the integer i;
    • the second field is the synonym set (or synset); and
    • the third field is its dictionary definition (or gloss), which is not relevant to this assignment.

      wordnet-synsets.png

      For example, line 36 means that the synset { AND_circuitAND_gate } has an id number of 36 and its gloss is a circuit in a computer that fires only when all of its inputs fire. The individual nouns that constitute a synset are separated by spaces. If a noun contains more than one word, the underscore character connects the words (and not the space character).
  • List of hypernyms: The file hypernyms.txt contains the hypernym relationships. Line i of the file (counting from 0) contains the hypernyms of synset i.
    • The first field is the synset id, which is always the integer i;
    • subsequent fields are the id numbers of the synset’s hypernyms.

      wordnet-hypernyms.png

      For example, line 36 means that synset 36 (AND_circuit AND_Gate) has 42338 (gate logic_gate) as its only hypernym. Line 34 means that synset 34 (AIDS acquired_immune_deficiency_syndrome) has two hypernyms: 47569 (immunodeficiency) and 56099 (infectious_disease).

 

 

The WordNet data type 

 

Implement an immutable data type WordNet with the following API:

wn.png

 

  • The Wordnet Digraph contains 76066 nodes and 84087 edges, it’s very difficult to visualize the entire graph at once, hence small subgraphs will be displayed as and when required relevant to the context of the examples later.

 

  • The sca() and the distance() between two nodes v and w are implemented using bfs (bread first search) starting from the two nodes separately and combining the distances computed.

 

Performance requirements 

  • The data type must use space linear in the input size (size of synsets and hypernyms files).
  • The constructor must take time linearithmic (or better) in the input size.
  • The method isNoun() must run in time logarithmic (or better) in the number of nouns.
  • The methods distance() and sca() must make exactly one call to the length() and ancestor() methods in ShortestCommonAncestor, respectively.


The Shortest Common Ancestor
 

 

  • An ancestral path between two vertices v and w in a rooted DAG is a directed pathfrom v to a common ancestor x, together with a directed path from w to the same ancestor x.
  • shortest ancestral path is an ancestral path of minimum total length.
  • We refer to the common ancestor in a shortest ancestral path as a shortest common ancestor.
  • Note that a shortest common ancestor always exists because the root is an ancestor of every vertex. Note also that an ancestral path is a path, but not a directed path.

wordnet-sca.png

  • The following animation shows how the shortest common ancestor node 1 for thenodes 3 and 10  for the following rooted DAG is found at distance 4 with bfs, along with the ancestral path 3-1-5-9-10.sca2.gif


      

  • We generalize the notion of shortest common ancestor to subsets of vertices. A shortest ancestral path of two subsets of vertices A and B is a shortest ancestral path over all pairs of vertices v and w, with v in A and w in B.
  • The figure (digraph25.txt) below shows an example in which, for two subsets, red and blue, we have computed several (but not all) ancestral paths, including the shortest one.wordnet-sca-set.png

    Shortest common ancestor data type 

    Implement an immutable data type ShortestCommonAncestor with the following API:

    sca.png

 

Basic performance requirements 

The data type must use space proportional to E + V, where E and V are the number of edges and vertices in the digraph, respectively. All methods and the constructor must take time proportional to EV (or better).

 

Measuring the semantic relatedness of two nouns

Semantic relatedness refers to the degree to which two concepts are related. Measuring semantic relatedness is a challenging problem. For example, let’s consider George W. Bushand John F. Kennedy (two U.S. presidents) to be more closely related than George W. Bush and chimpanzee (two primates). It might not be clear whether George W. Bush and Eric Arthur Blair are more related than two arbitrary people. However, both George W. Bush and Eric Arthur Blair (a.k.a. George Orwell) are famous communicators and, therefore, closely related.

Let’s define the semantic relatedness of two WordNet nouns x and y as follows:

  • A = set of synsets in which x appears
  • B = set of synsets in which y appears
  • distance(x, y) = length of shortest ancestral path of subsets A and B
  • sca(x, y) = a shortest common ancestor of subsets A and B

This is the notion of distance that we need to use to implement the distance() and sca() methods in the WordNet data type.

wordnet-distance.png

Finding semantic relatedness for some example nouns with the shortest common ancestor and the distance method implemented

 

apple and potato (distance 5 in the Wordnet Digraph, as shown below)

dag_apple_potato.png

As can be seen, the noun entity is the root of the Wordnet DAG.

beer and diaper (distance 13 in the Wordnet Digraph)

dag_beer_diaper.png

 

beer and milk (distance 4 in the Wordnet Digraph, with SCA as drink synset), as expected since they are more semantically closer to each other.

dag_beer_milk.png

bread and butter (distance 3 in the Wordnet Digraph, as shown below)

dag_bread_butter.png

cancer and AIDS (distance 6 in the Wordnet Digraph, with SCA as disease as shown below, bfs computed distances and the target distance between the nouns are also shown)

dag_cancer_AIDS.png

 

car and vehicle (distance 2 in the Wordnet Digraph, with SCA as vehicle as shown below)

dag_car_vehicle.png
cat and dog (distance 4 in the Wordnet Digraph, with SCA as carnivore as shown below)

dag_cat_dog.png

cat and milk (distance 7 in the Wordnet Digraph, with SCA as substance as shown below, here cat is identified as Arabian tea)

dag_cat_milk.png

Einstein and Newton (distance 2 in the Wordnet Digraph, with SCA as physicist as shown below)

dag_Einstein_Newton

Leibnitz and Newton (distance 2 in the Wordnet Digraph, with SCA as mathematician)

dag_Leibnitz_Newton

Gandhi and Mandela (distance 2 in the Wordnet Digraph, with SCA as national_leader synset)dag_Gandhi_Mandela

laptop and internet (distance 11 in the Wordnet Digraph, with SCA as instrumentation synset)dag_laptop_internet

school and office (distance 5 in the Wordnet Digraph, with SCA as construction synset as shown below)

dag_school_office

bed and table (distance 3 in the Wordnet Digraph, with SCA as furniture synset as shown below)
dag_table_bed

Tagore and Einstein (distance 4 in the Wordnet Digraph, with SCA as intellectual synset as shown below)

dag_Tagore_Einstein

Tagore and Gandhi (distance 8 in the Wordnet Digraph, with SCA as person synset as shown below)

dag_Tagore_Gandhi

Tagore and Shelley (distance 2 in the Wordnet Digraph, with SCA as author as shown below)
dag_Tagore_Shelley

text and mining (distance 12 in the Wordnet Digraph, with SCA as abstraction synset as shown below)
dag_text_mining

milk and water (distance 3 in the Wordnet Digraph, with SCA as food, as shown below)dag_water_milk

Outcast detection

 

Given a list of WordNet nouns x1x2, …, xn, which noun is the least related to the others? To identify an outcast, compute the sum of the distances between each noun and every other one:

di   =   distance(xix1)   +   distance(xix2)   +   …   +   distance(xixn)

and return a noun xt for which dt is maximum. Note that distance(xixi) = 0, so it will not contribute to the sum.

Implement an immutable data type Outcast with the following API:

outc.png

Examples

As expected, potato is the outcast  in the list of the nouns shown below (a noun with maximum distance from the rest of the nouns, all of which except potato are fruits, but potato is not). It can be seen from the Wordnet Distance heatmap from the next plot, as well as the sum of distance plot from the plot following the next one.
outcast_apple_pear_peach_banana_lime_lemon_blueberry_strawberry_mango_watermelon_potato

Again, as expected, table is the outcast  in the list of the nouns shown below (a noun with maximum distance from the rest of the nouns, all of which except table are mammals, but table is not). It can be seen from the Wordnet Distance heatmap from the next plot, as well as the sum of distance plot from the plot following the next one.

outcast_horse_zebra_cat_bear_table

Finally, as expected, bed is the outcast  in the list of the nouns shown below (a noun with maximum distance from the rest of the nouns, all of which except bed are drinks, but bed is not). It can be seen from the Wordnet Distance heatmap from the next plot, as well as the sum of distance plot from the plot following the next one.

outcast_water_soda_bed_orange_juice_milk_apple_juice_tea_coffee


Measuring Semantic Relatedness using the Distance and the Shortest Common Ancestor and Outcast Detection with Wordnet Digraph in Python

Some Variational Image Processing: Poisson Image Editing and its applications in Python

Some Variational Image Processing: Poisson Image Editing and its applications in Python

Poisson Image Editing

The goal of Poisson image editing is to perform seamless blending of an object or a texturefrom a source image (captured by a mask image) to a target image. We want to create a photomontage by pasting an image region onto a new background using Poisson image editing. This idea is from the P´erez et al’s SIGGRAPH 2003 paper Poisson Image Editing.

The following figures describe the basic theory. As can be seen, the problem is first expressed in the continuous domain as a constrained variational optimization problem and then can be solved using a discrete Poisson solver

f7.png

f8.png 

As described in the paper , the main task of Poisson image editing is to solve a huge linear system Ax = b (where I is the new unknown image and S and T are the known images).

 

Seamless Cloning

The following images are taken from an assignment , where the Poisson image editing had to be used to blend the source inside the mask inside the target image. The next few figures show the result obtained.

Source Imagebearhttps://sandipanweb.files.wordpress.com/2017/10/bear.png?w=150&h=82 150w” sizes=”(max-width: 701px) 100vw, 701px” />

Mask Imagemaskhttps://sandipanweb.files.wordpress.com/2017/10/mask.png?w=150&h=82 150w” sizes=”(max-width: 687px) 100vw, 687px” />

Target Imagewaterpool

Output Gray-Scale Image with Poisson Image Editing
pe_waterpoolhttps://sandipanweb.files.wordpress.com/2017/10/pe_waterpool.png?w=150&h=82 150w” sizes=”(max-width: 676px) 100vw, 676px” />

The next animation shows the solution gray-scale images obtained at different iterations using Conjugate Gradient method when solving the linear system of equations.

peditmit1

Output Color Image with Poisson Image Editingpe_waterpool_color1

The next animation shows the solution color images obtained at different iterationsusing Conjugate Gradient method to solve the linear system of equations, applying the discrete Poisson solver on each channel.

pbear.gif

 

 

The following images are taken from another on computational photography. Again, the same technique is used to blend the octopus from the source image to the target image.

Source Imagesource

Mask Imagemask

Target Imagetarget

Output Imagepe_target_color

The next animation shows the solution color images obtained at different iterationsusing Conjugate Gradient method to solve the linear system of equations, applying the discrete Poisson solver on each channel.

poct.gif

 

Again, Poisson gradient domain seamless cloning was used to blend the penguins’ image inside the following target image with appropriate mask.

Source Image
peng1

Target Image                                                                                                            trekking

Output Imagepe_trekking.jpg

The next animation again shows the solution color images obtained at different iterations using Conjugate Gradient method to solve the linear system of equations, applying the discrete Poisson solver on each channel.

pe-trekk

 

The next figures show how a source bird image is blended into the target cloud image with seamless cloning.

Source Image
bird1

Target Imagecloud

Output gray-scale imagepe_cloud

The next animation shows the solution gray-scale images obtained at the first few iterations using Conjugate Gradient method when solving the linear system of equations.

pedit2

 

Finally, the next figures show how the idol of the Goddess Durga is blended into the target image of the city of kolkata with seamless cloning. As can be seen, since the source mask had its own texture and there is a lots of variations in the background texture, the seamless cloning does not work that well.

Source Image
madurga

Target Image
kol

Output Image
pe_kol_color

The next animation shows the solution gray-scale images obtained at the first few iterations using Conjugate Gradient method when solving the linear system of equations.

pedit

The next animation again shows the solution color images obtained at different iterations using Conjugate Gradient method to solve the linear system of equations, applying the discrete Poisson solver on each channel.

pe-madurga

Gradient Mixing: Inserting objects with holes

Again, the next figures are taken from the same paper, this time the source objects contain holes. As can be seen from the following results, the seamless cloning does not work well in this case for inserting the object with holes into the target, the target texture is lost inside the mask after blending.

Source Image                                            Target Image
srch            dsth

Output Image with Poisson Seamless Cloning
pe_tran1

The next animation again shows the solution color images obtained at the first few iterations using Conjugate Gradient method while solving the linear system of equations for seamless cloning, applying the discrete Poisson solver on each channel.

pe-tran1

Using the mixing gradients method the blending result obtained is far better, as shown below, it preserves the target texture.

Output Image with Poisson Mixing Gradients
pe_tran

The next animation again shows the solution color images obtained at the first few iterations using Conjugate Gradient method while solving the linear system of equations for mixing gradients, applying the discrete Poisson solver on each channel.

pe-tran

 

The next few figures show the results obtained using mixing gradients on another set of images, the seamless cloning does not work well in this case, but mixing gradientworks just fine.

 

Source Image
liberty1

Target Image
vic

Output Image with mixing gradients
pe_vic

The next animation again shows the solution color images obtained at the first few iterations using Conjugate Gradient method while solving the linear system of equations for mixing gradients, applying the discrete Poisson solver on each channel.

pe-vic

Creating Cartoon-like images: Texture Flattening

As illustrated in the paper, by retaining only the gradients at edge locations, before integrating with the Poisson solver, one washes out the texture of the selected region, giving its contents a flat aspect.

The following figures show the output cartoon-like image obtained using texture flattening, using the canny-edge detector to generate the  mask.

Source Image
pe_face1

Mask Image created with Canny edge detector
mface1

Output cartoon obtained with texture flattening from the source with the mask
pe_cart1

The next animation shows the solution color images obtained at the first few iterationsusing Conjugate Gradient method while solving the linear system of equations for texture flattening, applying the discrete Poisson solver on each channel.

pe-cart1

 

Again, the next figures show the output cartoon-like image obtained using texture flattening, using the canny-edge detector to generate the  mask on my image.

Source Image
me2

Output image obtained with texture flattening
pe-cart-me

The next animation again shows the solution color images obtained at the first few iterations using Conjugate Gradient method while solving the linear system of equations for texture flattening, applying the discrete Poisson solver on each channel.

pe-cart-me


Some Variational Image Processing: Poisson Image Editing and its applications in Python

Using Bayesian Kalman Filter to predict positions of moving particles / objects in 2D (in R)

Using Bayesian Kalman Filter to predict positions of moving particles / objects in 2D (in R)

In this article, we shall see how the Bayesian Kalman Filter can be used to predict positions of some moving particles / objects in 2D.

This article is inspired by a programming assignment from the coursera course Robotics Learning by University of Pennsylvania, where the goal was to implement a Kalman filter for ball tracking in 2D space. Some part of the problem description is taken from the assignment description.

  • The following equations / algorithms are going to be used to compute the Bayesian state updates for the Kalman Filter.

    kfalgo.png

    th_kf.png

  • For the first set of experiments, a few 2D Brownian Motion like movements are simulated for a particle.
    • The noisy position measurements of the particle are captured at different time instants (by adding random Gaussian noise).
    • Next, Kalman Filter is used to predict the particle’s position at different time instants, assuming different position, velocity and measurement uncertainty parameter values.
    • Both the actual trajectory and KF-predicted trajectory of the particle are shown in the following figures / animations.
    • The positional uncertainty (as 2D-Gaussian distribution) assumed by the Kalman Filter is also shown as gray / black contour (for different values of uncertainties).

      kfout.png

      motion1.1motion1.2motion1.3

  • The next set of figures / animations show how the position of a moving bug is tracked using Kalman Filter.
    • First the noisy measurements of the positions of the bug are obtained at different time instants.
    • Next the correction steps and then the prediction steps are applied, after assuming some uncertainties in position, velocity and the measurements with some Gaussian distributions.

      motion3

      • The position of the bug as shown in the figure above is moving in the x and ydirection randomly in a grid defined by the rectangle [-100,100]x[-100,100].
      • The next figure shows how different iterations for the Kalman Filter predicts and corrects the noisy position observations. The uncertain motion model p(x_t|x_{t-1}) increases the spread of the contour.  We observe a noisy position estimate z_tThe contour of the corrected position p(x_t) has less spread than both the observation p(z_t|x_t) and the motion p(x_t|x_{t-1})  adjusted state.

      motion3

  • Next the GPS dataset from the UCI Machine Learning Repository is used to get the geospatial positions of some vehicles at different times.
    • Again some noisy measurement is simulated by adding random noise to the original data.
    • Then the Kalman Filter is again used to predict the vehicle’s position at different time instants, assuming different position, velocity and measurement uncertainties.
    • The position and measurement uncertainties (σ_p,  σ_m) are in terms of latitude / longitude values, where uncertainty in the motion model is σ_v.
    • Both the actual trajectory and KF-predicted trajectory of the vehicle are shown in the following figures / animations.
    • As expected, the more the uncertainties in the position / motion model, the more the actual trajectory differs from the KF-predicted one.

      kfout1kfout2kfout3

      motion2.2motion2.3motion2.4motion2.1.gif

Using Bayesian Kalman Filter to predict positions of moving particles / objects in 2D (in R)

SIR Epidemic model for influenza A (H1N1): Modeling the outbreak of the pandemic in Kolkata, West Bengal, India, 2010

SIR Epidemic model for influenza A (H1N1): Modeling the outbreak of the pandemic in Kolkata, West Bengal, India, 2010

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Summary

In this report, the spread of the pandemic influenza A (H1N1) that had an outbreak in Kolkata, West Bengal, India, 2010 is going to be simulated. The basic epidemic SIR model will be used, it describes three populations: a susceptible population, an infected population, and a recovered population and assumes the total population (sum of these 3 populations) as fixed over the period of study.

There are two parameters for this model: namely the attack rate (β) per infected person per day through contacts and the recovery rate (α). Initially there will be a small number of infected persons in the population. Now the following questions are to be answered with the simulation / analysis:

1. Whether the number of infected persons increase substantially, producing an epidemic, or the flue will fizzle out.
2. Assuming there is an epidemic, how will it end? Will there still be any susceptibles left when it is over?
3. How long will the epidemic last?

The Euler’s method will be primarily used to solve the system of differential equations for the SIR model and compute the equilibrium points (along with some
analytic solution attempts for a few simplified special cases). Here are the
conclusions obtained from the simulations:

1. When the recovery rate α is ≈ 0 or very very low compared to the attack rate β, the flu will turn out to be an epidemic and the entire population will be infected first (the higher β is the quicker the epidemic outbreak).

2. To be more precise, when the initial susceptible population S(0) is greater than the inverse of the basic reproduction number 1/ R0 = α / β, a proper epidemic will break out.

3. When the initial susceptible population S(0) is less than the inverse of the basic reproduction number 1/R0=α/β, then a proper epidemic will never break out.

4. If the initial susceptible population is non-zero, in the end (at equilibrium) there will always be some susceptible population.

5. When there is an epidemic, it will eventually end in the equilibrium point with 0 infected population, how fast it reaches the equilibrium depends upon the recovery rate (the higher α is the quicker the infection removal).

6. The time to reach the equilibrium can be computed using Euler’s method, it depends on the parameters α (the higher the quicker) and β (the higher the quicker) and the initial infected populated size I(0) (the higher the quicker).

lv.png

Introduction

In 2010, the pandemic influenza A (H1N1) had an outbreak in Kolkata, West Bengal, India. An increased number of cases with influenza like illness (ILI) were reported in Greater Kolkata Metropolitan Area (GKMA) during July and August 2010, as stated in [3]. The main motivation for this research project will be to understand the spread of the pandemic, compute the equilibrium points and find the impact of the initial values of the infected rate and the attack / recovery rate parameters on the spread of the epidemic, using simulations using the basic epidemic SIR model.

The Euler’s method will be primarily used to solve the system of differential equations
for SIR model and compute the equilibrium points. First a few simplified special cases will be considered and both analytic and numerical methods (with Euler method) will be used to compute the equilibrium points. Then the solution for the generic model will be found. As described in [6], the SIR model can also be effectively used (in a broader
context) to model the propagation of computer virus in computer networks, particularly for the networks with Erdos-Renyi type random graph topology.


SIR Epidemic Model

The SIR model is an epidemiological model that computes the theoretical number of people infected with a contagious illness in a closed population over time. One of the basic one strain SIR models is Kermack-McKendrick Model. The Kermack-McKendrick Model is used to explain the rapid rise and fall in the number of infective patients observed in epidemics. It assumes that the population size is fixed (i.e., no births, no deaths due to disease nor by natural causes), incubation period of the infectious agent
is instantaneous, and duration of infectivity is the same as the length of the disease. It also assumes a completely homogeneous population with no age, spatial, or social structure.

The following figure 2.1 shows an electron microscope image of the re-assorted H1N1influenza virus photographed at the CDC Influenza Laboratory. The viruses are 80 − 120 nm in diameter [1].

virus.png

2.1 Basic Mathematical Model

The starting model for an epidemic is the so-called SIR model, where S stands for susceptible population, the people that can be infected. I is the already infected population, the people that are contagious, and R stands for the recovered population, people who are not contagious any more.

2.1.1 Differential Equations

The SIR model can be defined using the following ordinary differential equations 2.1:

dif1

• The terms dS/dt , dI/dt , dR/dt in the differential equations indicate the rate of change of the susceptible population size, the infected population size and the recovered population size, respectively.

• The terms β and α indicate the attack rate (number of susceptible persons get infected per day) and the recovery rate of the flu (inverse of the number of days a person remains infected), respectively.

• High value of α means a person will be infected by the flu for less number of days and high value of β means that the epidemic will spread quickly.

• Also, as can be seen from below, from the differential equations it can be shown that the population (S + I + R) is assumed to be constant.

dif2.png

2.1.2 Collected data, units and values for the constants

• As can be seen from the following figure 2.2, the focus of this analysis will be limited to  the population in Kolkata Metropolitan Corporation (KMC, XII) area where the population can be assumed to be ≈ 4.5 million or 4500 thousands, as per [7].

• Units

– All population (S, I, R) units will be in thousands persons (so that total population
N = 4500).
– As can be derived from the differential equations 2.1, the unit of β will be in
10^(−6) /persons/day (β = 25 will mean 25 persons in a million gets infected by
susceptible-infected contact per infected person per day).
– Similarly, the units of α will be in 10^(−3) / day (α = 167 will mean 167 × 10^(−3) /
day gets recovered from the flu per day).

• The attack rate is 20-29/100000 and the number of days infected (i.e. the inverse of recovery rate) = 5−7 days on average (with a few exceptions), as per [3].

• Typical values for β and α can be assumed to be 25 /person / day and 10^3/6 ≈ 167 / day, respectively.

f1.jpg

2.2 Simplified Model 1 (with α = 0)

• At first a simplified model is is created assuming that α = 0 (/ day) and that R = 0, so once infected, a person stays contagious for ever. Because S(t) + I(t) + R(t) = S(t) + I(t) = N is constant (since population size N is fixed), S(t) can be eliminated and a single differential equation in just I(t) is obtained as shown in the equation below 2.2.

f2

• Also, let the (fixed) population size N = 4500 = S(0) + I(0), (in thousand persons), initially the number of persons infected = I(0) = 1 (in thousand persons) and number of persons susceptible S(0) = N −I(0) = 4499 (in thousand persons), respectively. Let β = 25 × 10^(−6) /persons / day) to start with.

2.2.1 Analytic Solution

• The analytic solution can be found by following the steps shown in the Appendix A and the final solution is shown in the below equations 2.3:

f3.png
• The following figure 2.3 shows the logistic (bounded) growth in I(t) (in thousands persons) w.r.t. the time (in days) for different values of attack rate β×10^(−6) (/ person / day). As expected, the higher the attack rate, the quicker all the persons in the population become infected.

f4.png

2.2.2 Finding the equilibrium points for I

• The equilibrium points are the points where the rate of change in I is zero, the points that satisfy the following equation

f5https://sandipanweb.files.wordpress.com/2017/07/f5.png?w=150&h=86 150w” sizes=”(max-width: 236px) 100vw, 236px” />

• Considering a small neighborhood of the equilibrium point at I = 0, it can be seen from the figure 2.4 that whenever I > 0, dI/dt > 0, so I increases and goes away from the equilibrium point.

• Hence, the equilibrium point at I = 0 is unstable.

• At I = N = 4500 (in thousand persons) it is a stable equilibrium. As can be seen from the following figure 2.4, in a small neighborhood of the equilibrium point at I = 4500, it always increases / decreases towards the equilibrium point.

• In a small  ε > 0 neighborhood at I = 4500 (in thousand persons),

1. dI/dt > 0, so I increases when I <= 4500 − ε .
2. dI/dt > 0, so I decreases when I >= 4500 +  ε .

• The same can be observed from the direction fields from the figure 2.5.

• Hence, the equilibrium at I = 4500 is stable.

f6.png

f7.png

2.2.3 Numerical Solution with the Euler’s Method

• The algorithm (taken from the course slides) shown in the following figure 2.6 will be used for numerical computation of the (equilibrium) solution using the Euler’s method.

f8.png

• As can be seen from the figure 2.6, then the infection at the next timestep can be (linearly) approximated (iteratively) by the summation of the the infection current timestep with the product of the difference in timestep and the derivative of the infection evaluated at the current timestep.

2.2.3.1 Finding the right step size (with β = 25 × 10^(−6)/person/day)

• In order to decide the best step size for the Euler method, first the Euler’s method is run with different step sizes as shown in the figure 2.7.

• As can be seen from the following table 2.1 and the figure 2.7, the largest differences in the value of I (with two consecutive step sizes) occurs around 78 days:

f9.png

f10.png
• As can be seen from the table in the Appendix B, the first time when the error becomes less than 1 person (in thousands) is with the step size 1/512 , hence this step size will be used for the Euler method.

2.2.3.2 Computing the (stable) equilibrium point

• Now, this timestep will be used to solve the problem to find the equilibrium time teq(in days). Find teq such that N − I(teq) < ε = 10^(−6) , the solution obtained is teq = 272.33398 days ≈ 273 days.

• Now, from the analytic solution 2.3 and the following figure 2.8, it can be verified that the teq solution that the Euler’s method obtained is pretty accurate (to the ε tolerance).

f11.png

2.2.3.3 Results with β = 29 × 10^(−6) / person / day, I(0) = 1 person

• Following the same iterations as above, the steepest error is obtained at t = 67 days in this case, as shown in the figure 2.9.

• The first time when error becomes less than one person for t = 67 days with the Euler ‘s method is with step size 1/512 again.

• The solution obtained is teq = 234.76953125 days ≈ 235 days, so the equilibrium (when all the population becomes infected) is obtained earlier as expected, since the attack rate β is higher.

f12.png

2.2.3.4 Results with β = 25 × 10−6 / person / day, with different initial values for infected persons (I(0))

• Following the same iterations as above, the equilibrium point is computed using the Euler’s method with different values of initial infected population I(0), as shown in the figure 2.10.

• The solutions obtained are teq = 272.33, 258.02, 251.85, 248.23, 245.66, 245.66 days for I(0) = 1, 5, 10, 15, 20 days, respectively. So the equilibrium is obtained earlier when the initial infected population size is higher, as expected.

f13.png

2.3 Simplified Model 2 (with β = 0)

• Next, yet another simplified model is considered by assuming that β = 0 and that α > 0, so the flu can no more infect anyone (susceptible, if any, possibly because everyone got infected), an infected person recovers from flu with rate α. This situation can be described again with a single differential equation in just I(t) as shown in the equation below 2.4.

f14https://sandipanweb.files.wordpress.com/2017/07/f14.png?w=150&h=30 150w” sizes=”(max-width: 351px) 100vw, 351px” />

• Also, let the the entire population be infected, N = 4500 = I(0), (in thousand persons), initially the number of persons susceptible = S(0) = 0, respectively. Let α = 167 × 10^(−3)
(/ day) to start with.

2.3.1 Analytic Solution

• The analytic solution can be found by following the steps shown in the below equations 2.5:

f15

• The following figure 2.11 shows the exponential decay in I(t) (in thousand persons)  w.r.t. the time (in days) for different values of recovery rate α × 10^(−3)  (/ day). As expected, the higher the recovery rate, the quicker all the persons in the population get rid of the infection.

f16.png

• Now, I(t) + R(t) = N (since S(t) = 0 forever, since no more infection) and I(0) = N, combining with the above analytic solution I(t) = I(0).exp(−αt) = N.exp(−αt), the following equation is obtained:
f17

• The following figure 2.12 shows the growth in R(t) (in thousand persons) w.r.t. the time (in days) for different values of recovery rate α × 10^(−3) (/ day). As expected, the higher the recovery rate, the quicker all the persons in the population move to the removed state.

f18.png

2.3.2 Numerical Solution with the Euler’s Method

2.3.2.1 Solution with α = 167 × 10−3 / day

• Following the same iterations as above, the steepest error is obtained at t = 6 in this case, as shown in the figure 2.16.

• The first time when error becomes less than one person for t = 67 with the Euler’s method is with step size 1/256 .

• The solution obtained with the Euler’s method is 133.076171875 days ≈ 133 days to remove the infection from population with 10^(−6) tolerance. From the analytic solution,
I(133) = N.exp(−αt) = 1.016478E−06, similar result is obtained.

2.3.2.2 Results

The following figure 2.16 shows the solutions obtained with different step sizes using the Euler’s method.

f19.png

2.4 Generic Model (with α, β > 0)

First, the numeric solution will be attempted for the generic model (using the Euler’s method) and then some analytic insights will be derived for the generic model.

2.4.1 Numerical Solution with the Euler’s  Method

• The following algorithm 2.14 shown in the next figure is going to be used to obtain the solution using Euler method (the basic program for Euler’s method, adapted to include three dependent variables and three differential equations).

f20.png

• As can be seen from the figure 2.14, first the vector X(0) is formed by combining the three variables S, I, R at timestep 0. Then value of the vector at the next timestep can be (linearly) approximated (iteratively) by the (vector) summation of the vector value at the current timestep with the product of the difference in timestep and the derivative of the
vector evaluated at the current timestep.

2.4.1.1 Equilibrium points

• At the equilibrium point,

f21

There will be no infected person at the equilibrium point (infection should get removed).

• As can be seen from the following figure 2.15 also, I = 0 is an equilibrium point, which is quite expected, since in the equilibrium all the infected population will move to the removed state.

• Also, at every point the invariant S + I + R = N holds.

• In this particular case shown in figure 2.15, the susceptible population S also becomes 0 at equilibrium (since all the population got infected initially, all of them need to move to removed state) and R = N = 4500 (in thousand persons).

f22.png

2.4.1.2 Results with Euler method

• As explained in the previous sections, the same iterative method is to find the right stepsize for the Euler method. The minimum of the two stepsizes determined is
∆t = 1/512 day and again this stepsize is going to be used for the Euler’s method.

• The following figures show the solutions obtained with different values of α, β with the initial infected population size I(0) = 1 (in thousand persons). Higher values for the parameter β obtained from the literature are used for simulation, since β = 25 × 10^(−6) /person /day is too small (with the results not interesting) for the growth of the epidemic using the Euler’s method (at least till ∆t = 1/ 2^15), after which the iterative Euler’s
method becomes very slow).

• As can be seen, from the figures 2.16, 2.17 and 2.19, at equilibrium, I becomes zero.

• The solution (number of days to reach equilibrium) obtained at α = 167×10^(−3) /day and β = 25×10^(−5) /person /day is teq = 143.35546875 ≈ 144 days with I(0) = 1 (in thousand persons), the corresponding figure is figure 2.16.

• The solution (number of days to reach equilibrium) obtained at α = 167 × 10^(−3) /day and β = 5 × 10^(−5) /person /day is teq ≈ 542 days with I(0) = 1 (in thousand persons), the corresponding figure is figure 2.17.

• Hence, higher the β value, the equilibrium is reached much earlier.

• The solution obtained at α = 500 × 10^(−3) /day and β = 25 × 10^(−5) /person /day is
teq ≈ 78 days with I(0) = 1 (in thousand persons), the corresponding figure is figure 2.19.

• Hence, higher the α value, the equilibrium is reached earlier.

• The solution obtained at α = 167×10^(−3) /day and β = 25×10^(−5) /person /day is
teq = 140 days with I(0) = 10. Hence, as expected, higher the number of initial infected population size, quicker the equilibrium is reached.

• At equilibrium, S does not necessarily become close to zero, since sometimes the entire  population may not get infected ever, as shown in the figure 2.17, where at equilibrium the susceptible population is non-zero.

f23f24f25f26

• As can be seen from the phase planes from following figure 2.21, at equilibrium, the infected population becomes 0.

f27.png

2.4.2 Analytic Solution and Insights

2.4.2.1 Basic Reproduction Number (R0)

The basic reproduction number (also called basic reproduction ratio) is defined by
R0 = β / α (unit is /day). As explained in [2], this ratio is derived as the expected number of new infections (these new infections are sometimes called secondary infections) from a single infection in a population where all subjects are susceptible. How the dynamics of the system depends on R0 will be discussed next.

2.4.2.2 The dynamics of the system as a function of R0

• By dividing the first equation by the third in 2.1, as done in [2], the following equation is obtained:

f28.png

• Now, at t → ∞, the equilibrium must have been already reached and all infections must have been removed, so that lim (t→∞) I(t) = 0.

• Also, let R∞ = lim (t→∞) R(t).

• Then from the above equation 2.7, R∞ = N − S(0).exp(R0.(R∞−R(0)))
.
• As explained in [2], the above equation shows that at the end of an epidemic, unless
S(0) = 0, not all individuals of the population have recovered, so some must remain  susceptible.

• This means that the end of an epidemic is caused by the decline in the number of infected individuals rather than an absolute lack of susceptible subjects [2].

• The role of the basic reproduction number is extremely important, as explained in [2]. From the differential equation, the following equation can be obtained:

f29

 S(t) > 1/R0 ⇒ dI(t)/dt > 0 ⇒ there will be a proper epidemic outbreak with an increase of the number of the infectious (which can reach a considerable fraction of the population).

• S(t) < 1 R0 ⇒ dI(t) dt < 0 ⇒ independently from the initial size of the susceptible population the disease can never cause a proper epidemic outbreak.

• As can be seen from the following figures 2.21 and 2.22 (from the simulation results obtained with Euler method), when S(0) > 1/R0 , there is a peak in the infection curve, indicating a proper epidemic outbreak.

• Also, from the figures 2.21 and 2.22, when S(0) > 1/R0 , the higher the the gap between S(0) and 1/R0 , the higher the peak is (the more people get infected) and the quicker the peak is attained.

• Again, from the figure 2.22, when 4490 = S(0) < 1/R0 = 5000, it never causes a proper epidemic outbreak .

f30https://sandipanweb.files.wordpress.com/2017/07/f30.png?w=197&h=300 197w” sizes=”(max-width: 619px) 100vw, 619px” />

f31.png 

• Again, by dividing the second equation by the first in 2.1, the following equation is obtained:

f32.pngf33.png 

• As can be noticed from the above figure 2.23 that because the formulas differ only by an additive constant, these curves are all vertical translations of each other.

• The line I(t) = 0 consists of equilibrium points.

• Starting out at a point on one of these curves with I(t) > 0, as time goes on one needs to travel along the curve to the left (because dS/dt < 0), eventually approaching at some positive value of S(t).

• This must happen since on any of these curves, as I(t) → ∞, as S(t) → 0, from equation 2.8.

• So the answer to question (2) is that the epidemic will end as with approaching some positive value and thus there must always be some susceptibles left over.

• As can be seen from the following figure 2.24 (from the simulation results obtained with the Euler’s method), when S(0) > 1/R0 , lesser the the gap between S(0) and 1/R0 , the higher the population remains susceptible at equilibrium (or at t → ∞).

f34.png

Conclusions

In this report, the spread of the pandemic influenza A (H1N1) that had an outbreak in Kolkata, West Bengal, India, 2010 was simulated using the basic epidemic SIR model.Initially there will be a small number of infected persons in the population, most of the population had susceptible persons (still not infected but prone to be infected) and zero removed persons. Given the initial values of the variables and the parameter (attack and recovery rates of the flu) values, the following questions were attempted to be answered with the simulation / analysis:

1. Whether the number of infected persons increase substantially, producing an epidemic, or the flue will fizzle out.

2. Assuming there is an epidemic, how will it end? Will there still be any susceptibles left when it is over?

3. How long will the epidemic last?
The following conclusions are obtained after running the simulations with
different values of the parameters and the initial values of the variables:

1. When the recovery rate α is ≈ 0 or very very low compared to the attack rate β (so that R0 = β / α >> 1) and I(0) > 1, the flu will turn out to be an epidemic and the entire population will be infected first (the higher β is the quicker the epidemic break out).

2. To be more precise, when the initial susceptible population S(0) is greater than the inverse of the basic reproduction number 1/R0 = α / β, a proper epidemic will break out.

3. When the initial susceptible population S(0) is less than the inverse of the basic reproduction number 1/R0 = α/β, then a proper epidemic will never break out.

4. If the initial susceptible population is non-zero, in the end (at equilibrium) there will always be some susceptible population.

5. When there is an epidemic, it will eventually end in the equilibrium point with 0 infected population, how fast it reaches the equilibrium depends upon the recovery rate (the higher α is the quicker the infection removal).

6. The time to reach the equilibrium can be computed using Euler method, it depends on the parameters α (the higher the quicker) and β (the higher the quicker) and the initial infected populated size I(0) (the higher the quicker).

7. Scope of improvement: The SIR model could be extended to The Classic Endemic Model [5] where the birth and the death rates are also considered for the population (this will be particularly useful when a disease takes a long time to reach the equilibrium state).

f35.pnghttps://sandipanweb.files.wordpress.com/2017/07/f35.png?w=137 137w, https://sandipanweb.files.wordpress.com/2017/07/f35.png?w=274 274w” sizes=”(max-width: 669px) 100vw, 669px” />

f36.png

f37.png


SIR Epidemic model for influenza A (H1N1): Modeling the outbreak of the pandemic in Kolkata, West Bengal, India, 2010