Laplacian Experiments/Stochastic

From Worden

Jump to: navigation, search

Contents

The stochastic operation

Previously we have looked at a continuous, deterministic system,

dxdt=fx,

where x is initially a graph's adjacency matrix, and subsequently is a matrix of real numbers expressing either a weight on each edge, or each edge's probability of existing.

[Note: see the Laplacian Paper draft for the current work on all this.]

Here, instead, we want to consider dynamics in which edges are strictly present or absent at all times, and come and go at stochastic rates given by fx or something related to it. This possibility was pointed out to me by Ma.

In particular, let's work with the Laplacianoid, x.

At any time, for any graph's adjacency matrix e,

-eeij=keikekj-eij+eik-eijekj

should be understood as a rate of change in the edge connecting vertices i and j, whether positive or negative. That is, if eij=1 and -eeij is negative, eij turns to 0 at rate eeij, and if the edge is 0 and the rate is positive, it turns to 1 at rate -eeij. Note that if eij is 0, the rate of change is always nonnegative, and if eij is 1, the rate of change is always nonpositive. (So we could also say that eij flips its state at rate wij=eeij.)

Simulations

The simulation indicates that this process has the ability to add a great deal of clustering (measured by transitivity of edges) to a graph. However, it also seems to have a tendency to reduce the graph density, which leads to a fairly trivial clustered graph. I would like to understand this better. Ideally we'd like to be able to produce complex graphs with certain desired qualities, such as a particular degree distributions, while adding clustering.

NOTE: there seems to be a bug introduced by the migration from lalashan to the yushan cluster, where the frames of the animations are out of order. I'll be fixing that shortly. -LW

ER20

First a directed random graph.

To run the simulation, click this link to remake the plots in the background; then reload this page. This shouldn't be necessary unless you're an editor working with the source code.

Click through to see the animation:
ER20.stochastic-experiment.animate.gif

(ER20.stochastic-experiment.transitivity.png)

(ER20.stochastic-experiment.density.png)

ER20.stochastic-experiment.Rstep
prereq: $(MF)/laplacian.functions.R $(MF)/graph.plotting.functions.R stochastic-laplacianoid.functions.R
library(igraph)
library(network)
 
pcs.prefix <<- 'ER20.stochastic-experiment'
pcs.n <<- 20
pcs.directed <<- TRUE 
lsp.next.event <- NULL
e.igraph <- erdos.renyi.game(pcs.n, 0.5)
e <- get.adjacency(e.igraph,type='both')
timeseries <- make.sequence.stochastic(e, laplacianoid.stochastic.process, 
  seq(0,4,by=0.002))

ER20 symmetric

Now an undirected random graph

Click for the animation:
[log]ER20.symmetric.stochastic-experiment.animate.gif

(ER20.symmetric.stochastic-experiment.transitivity.png)

(ER20.symmetric.stochastic-experiment.density.png)

ER20.symmetric.stochastic-experiment.Rstep
prereq: $(MF)/laplacian.functions.R $(MF)/graph.plotting.functions.R stochastic-laplacianoid.functions.R
library(igraph)
library(network)
 
pcs.prefix <<- 'ER20.symmetric.stochastic-experiment'
pcs.n <<- 20
pcs.directed <<- FALSE
lsp.next.event <- NULL
e.igraph <- erdos.renyi.game(pcs.n, 0.5)
e <- get.adjacency(e.igraph,type='both')
timeseries <- make.sequence.stochastic(e, laplacianoid.stochastic.process, 
   seq(0,10,by=0.1), directed=FALSE)

ER40 symmetric

A larger undirected random graph (40 vertices rather than 20)

Click for the animation:
ER40.symmetric.stochastic-experiment.animate.gif (click to remake in background)

(ER40.symmetric.stochastic-experiment.transitivity.png)

(ER40.symmetric.stochastic-experiment.density.png)

ER40.symmetric.stochastic-experiment.Rstep
prereq: $(MF)/laplacian.functions.R $(MF)/graph.plotting.functions.R stochastic-laplacianoid.functions.R
library(igraph)
library(network)
 
pcs.prefix <<- 'ER40.symmetric.stochastic-experiment'
pcs.n <<- 40
pcs.directed <<- FALSE
lsp.next.event <- NULL
e.igraph <- erdos.renyi.game(pcs.n, 0.5)
e <- get.adjacency(e.igraph,type='both')
timeseries <- make.sequence.stochastic(e, laplacianoid.stochastic.process, 
   seq(0,20,by=0.1), directed=FALSE)

PL100 symmetric

An undirected powerlaw graph of 100 vertices, sparse and treelike.

Animation:
PL100.symmetric.stochastic-experiment.animate.gif (click to remake in background)

(PL100.symmetric.stochastic-experiment.transitivity.png)

(PL100.symmetric.stochastic-experiment.density.png)

PL100.symmetric.stochastic-experiment.Rstep
prereq: $(MF)/laplacian.functions.R $(MF)/graph.plotting.functions.R stochastic-laplacianoid.functions.R
library(igraph)
library(network)
 
pcs.prefix <<- 'PL100.symmetric.stochastic-experiment'
pcs.n <<- 100
pcs.directed <<- FALSE
lsp.next.event <- NULL
e.igraph <- barabasi.game(pcs.n, directed=FALSE)
e <- get.adjacency(e.igraph,type='both')
timeseries <- make.sequence.stochastic(e, laplacianoid.stochastic.process, 
  seq(0,10,by=0.1), directed=FALSE)

the early times

The same experiment, but a close look at the first few moments, when the treelike structure collapses.

Animation:
PL100.symmetric.early.stochastic-experiment.animate.gif (click to remake in background)

(PL100.symmetric.early.stochastic-experiment.transitivity.png)

(PL100.symmetric.early.stochastic-experiment.density.png)

PL100.symmetric.early.stochastic-experiment.Rstep
prereq: $(MF)/laplacian.functions.R $(MF)/graph.plotting.functions.R stochastic-laplacianoid.functions.R
library(igraph)
library(network)
 
pcs.prefix <<- 'PL100.symmetric.early.stochastic-experiment'
pcs.n <<- 100
pcs.directed <<- FALSE
lsp.next.event <- NULL
e.igraph <- barabasi.game(pcs.n, directed=FALSE)
e <- get.adjacency(e.igraph,type='both')
timeseries <- make.sequence.stochastic(e, laplacianoid.stochastic.process, 
  seq(0,0.1,by=0.0003), directed=FALSE)

Average behavior

I'm having an issue with the makefiles on the wiki, but here's a result that I ran offline: the average of 100 realizations of the stochastic process on preferential-attachment graphs of 100 vertices.

Image:Laplacian_Experiments$PL100.100.stochastic.average.Rout.png

Math

Notes on the math below: why am I trying to do all this on directed graphs? The math that works for the ODE is for undirected graphs. If I do undirected graphs here, I will find that graph density is a martingale after all. In fact, now that I've fixed a bug in the simulation code, the undirected graphs do seem to behave as if density might be a martingale. Check the Laplacian Paper for the undirected-graph version of this analysis.

So density drifts downward, and transitivity goes to 1, which is not very useful. Why?

Ok, again

wij=eeij

with

ee=Loee+eLie

or

-eeij=keikekj-eij+eik-eijekj.

So if eij is 0 it increases to 1 at rate keikekj-eij+eik-eijekj and if eij is 1 it decreases to 0 at rate -keikekj-eij+eik-eijekj; that is,

𝔼eijt+Δt=Δtkeikekj-eij+eik-eijekj.

Roughly speaking,

deijdt=limΔt0𝔼ΔeijΔt=keikekj-eij+eik-eijekj
=2keikekj-eijkeik+kekj.

So the change in graph density is

ddt1N2ijeij=1N2ijk2eikekj-eikeij-eijekj.

We're probably not going to get very far because of cascading dependencies on higher moments. But let's press on.

I'm using a directed definition of graph transitivity:

Te=ijkeijejkeik/ijkeijejk,

that is, the proportion of directed length-2 paths eijejk whose transitive edge eik is present. Let's consider the numerator and denominator separately.

ddtpqrepqeqr=pqrepqddteqr+ddtepqeqr
=pqrkepq2eqkekr-eqkeqr-eqrekr+2epkekq-epkepq-epqekqeqr.
=pqrk2epqeqkekr-epqeqkeqr-epqeqrekr+2epkekqeqr-epkepqeqr-epqekqeqr.
=pqrk4epqeqkekr-epqeqkeqr-epqeqrekr-epkepqeqr-epqekqeqr.

Now the numerator:

ddtpqrepqeqrepr=pqrddtepqeqrepr+epqddteqrepr+epqeqrddtepr
=pqrk ([2epkekq-epkepq-epqekq]eqrepr+epq[2eqkekr-eqkeqr-eqrekr]epr
+epqeqr[2epkekr-epkepr-eprekr])
=pqrk (2epkekqeqrepr-epkepqeqrepr-epqekqeqrepr+2epqeqkekrepr-epqeqkeqrepr-epqeqrekrepr
+2epqeqrepkekr-epqeqrepkepr-epqeqreprekr)
=pqrk 6epkekqeqrepr-epkepqeqrepr-epqekqeqrepr-epqeqkeqrepr-epqeqrekrepr-epqeqrepkepr-epqeqreprekr

There are all these moments corresponding to different configurations of two, three, and four edges, all of which may have different densities, and they aren't going to close themselves.

What can we learn from these equations?

  • First: graph density is not necessarily conserved in expectation (i.e. not a martingale), unless the three different kinds of directed V shapes that appear in its derivative can be shown to have the same densities as the graph evolves.
    • If that could be shown, it would probably be true for more complicated combinations as well, and that would make for lots of good results.
  • Do we have to do the master equation now?

Not yet... let's look at those other Vs.

ddtijqeiqeij=ijkqeiq2eikekj-eikeij-eijekj+2eikekq-eikeiq-eiqekqeij
=ijkq4eiqeikekj-2eijeikeiq-2eiqeijekj
ddtijqeijeqj=ijkqeij2eqkekj-eqkeqj-eqjekj+2eikekj-eikeij-eijekjeqj
=ijkq4eijeqkekj-2eijeikeqj-2eijekjeqj.

These two right hand sides have a Z shape in common but none of the other terms, and each has one term in common with the epqeqr case. That is, there are 8 third moments that have to be considered in the motion of the 3 second moments that arise in the motion of the graph density. Those 8 third-order quantities do not include the one that is used in the transitivity.

Still, the Laplacian is a very powerful concept, and we ought to be able to use its properties, not just do low-level algebra.

We know from analysis of the ODE counterpart of this process that the rate of change goes to zero when the graph consists of one or more disjoint complete graphs.

What can we say about irreversible change? The no-change condition is

k2eikekj-eijeik+ekj=0 for every eij.

This condition is satisfied when for every eij

eij=0 and keikekj=0

or

eij=1 and k2eikekj-eik+ekj=-keik-ekj2=0.

The first condition is that if i and j are not connected, they are not connected by a 2-path, or conversely, every 2-path eikekj is accompanied by its transitive edge eij. This is sufficient to tell us that transitivity as defined above is 1 (if there are any 2-paths, that is; it's undefined if not). This condition implies inductively that any longer path is also accompanied by its transitive edge.

The second case is that if i and j are connected, every edge leaving i and every edge entering j must be part of a 2-path from i to j. That is, every out-neighbour of i must be an in-neighbour of j and vice versa. This is certainly true of a union of disjoint complete graphs, but there may be other ways to satisfy it as well.

Consider a graph with two vertices i and j and eij=1, and suppose it is an equilibrium of this process. If there are no other edges, the out degree of i and the in degree of j are each 1. Then there must be one 2-path from i to j. This is not consistent. In order to be an equilibrium, this graph must also include the self-edges eii and ejj. Given those, all degrees are 2, but there are three 2-paths from i to j. The only consistent equilibrium is the complete graph. How to generalize this argument? (It's probably not necessary, since I have a linear algebra argument that the complete graphs are the only solutions.)

Here's another way of looking at it. An edge is added (eij goes from 0 to 1) at some rate when

k2eikekj-eijeik+ekj=2keikekj>0.

This is only possible when it's the transitive closure of a 2-path. No other edges will be added to the graph.

An edge is removed at some rate when

k2eikekj-eijeik+ekj=-keik-ekj2<0

This is satisfied any time there are any edges entering j or leaving i that aren't part of a 2-path from i to j.

Any graph generated by this process must be a subgraph of the transitive closure of the initial graph, since no other edges can be added. It is unlikely to be the entire transitive closure unless the initial graph is quite dense, because edges are being removed. Once the graph is disconnected it cannot be reconnected. Thus E-R and Molloy-Reed random graphs, which are treelike, at least when not overly dense, are likely to become disconnected by this process and made into many small disjoint complete graphs.

This seems to imply a downward ratcheting of density, which seems to contradict the martingale property - maybe there's clarity to be had from writing Fokker-Planck equations?

  • Is it possible that if a graph isn't strongly connected, the only absorbing state it can reach is 0?

Stochastic Laplacianoid code

stochastic-laplacianoid.functions.R
make.sequence.stochastic <- function(n, evolution.function, times, plot.function
, directed=TRUE)
{ t.from <- NULL
  timeseries <- list()
  for (t.to in times)
  { if (!is.null(t.from))
    { e <- evolution.function(e, t.from, t.to, directed)
    }
    timeseries <- c(timeseries,list(list(t.to,e)))
    t.from <- t.to
  }
  return(timeseries)
}
 
laplacianoid.operation <- function(x, y)
{ L.o <- out.degree.laplacian(x)
  L.i <- in.degree.laplacian(x)
  Lx.y <- (L.o %*% y + y %*% L.i)
  return(Lx.y)
}
 
laplacianoid.stochastic.process <- function(e, t.from, t.to, directed=TRUE)
{ # evolve network from time t.from to time t.to
  if (is.null(lsp.next.event))
  { lsp.next.event <<- t.from
  }
  while(1)
  { if (lsp.next.event <= t.from)
    { Le.e <- laplacianoid.operation(e,e)
      # entries of Le.e are rates of change (positive or negative)
      # of the entries of e
      # a.Le.e is the switching rates
      lsp.switching.rates <<- abs(Le.e)
      # don't double count the rates on undirected edges
      if (!directed) 
      { lsp.switching.rates[lower.tri(lsp.switching.rates)] <<- 0
      }
      lsp.row.rates <<- rowSums(lsp.switching.rates)
      lsp.total.rate <<- sum(lsp.row.rates)
      if (lsp.total.rate <= 0)
      { return(e) 
      }
      wait.time <- rexp(1,lsp.total.rate)
      lsp.next.event <<- lsp.next.event + wait.time
    }
    if (t.to < lsp.next.event)
    { return(e)
    }
    if (t.from < lsp.next.event)
    { choose.edge <- runif(1) * lsp.total.rate
      ri <- 1
      while(lsp.row.rates[ri] < choose.edge)
      { choose.edge <- choose.edge - lsp.row.rates[ri]
        ri <- ri + 1
      }
      ci <- 1
      while( lsp.switching.rates[ri,ci] < choose.edge )
      { choose.edge <- choose.edge - lsp.switching.rates[ri,ci]
        ci <- ci + 1
      }
      e[ri,ci] <- 1 - e[ri,ci]
      if (!directed) { e[ci,ri] = e[ri,ci] }
      t.from <- lsp.next.event
    }
  }
}

plot-experiment-timeseries.Rstep
library(igraph)
library(network)
 
pcs.index <<- 0
pcs.coord <<- NULL
pcs.xaxis <- c()
pcs.density <- c()
pcs.directed.dyads <- c()
pcs.directed.triangles <- c()
pcs.transitivity <- c()
pcs.degree.distribution <- list()
pcs.spectral.gap <- c()
 
# the igraph transitivity function only handles undirected graphs.
# Here's a directed transitivity function that operates on the adjacency matrix,
# by counting how many directed 2-paths e_{ik}e_{kj} are accompanied by
# the transitive edge e_{ij}:
# T = sum_{ijk} e_{ik}e_{kj}e_{ij} / sum_{ijk} e_{ik}e_{kj}
directed.transitivity <- function(e)
{ # the ij entry of e.2 is sum_k e_{ik}e_{kj}.
  e.2 <- e %*% e
  # sum those over i and j.
  twopaths <- sum(e.2)
  # the ij entry of e.2 * e is e.2_{ij}e_{ij}.  We sum those over i and j.
  triangles <- sum(e.2 * e)
  return(triangles / twopaths)
}
 
plot.clustering.stochastic <- function(timeseries)
{ for (row in timeseries)
  { t <- row[[1]]
    # e.t is the adjacency matrix at time t
    e.t <- row[[2]]
    e.n <- network(e.t, directed=pcs.directed, loops=TRUE)
 
    if (exists('pcs.prefix') && pcs.prefix != FALSE) {
      pcs.coord <<- plot.network.png(e.n,
        paste(pcs.prefix,'animate.frame',format(pcs.index,width=5),'png',sep='.'),
        edgecol=matrix(sapply(e.t, make.edge.color),nrow(e.t),ncol(e.t)), 
        edgewid=matrix(sapply(e.t, make.edge.width),nrow(e.t),ncol(e.t)),
        coord=pcs.coord)
    }
    e.t.2 <- e.t %*% e.t
    e.t.tri <- e.t.2 * e.t
    dyads <- sum(e.t.2)
    triangles <- sum(e.t.tri)
    pcs.directed.dyads <<- 
      append(pcs.directed.dyads, dyads/(nrow(e.t)*nrow(e.t)*nrow(e.t)))
    pcs.directed.triangles <<- 
      append(pcs.directed.triangles, triangles/(nrow(e.t)*nrow(e.t)*nrow(e.t)))
    e.i <- graph.adjacency(e.t, mode='undirected', diag=TRUE)
    pcs.degree.distribution <<- c(pcs.degree.distribution, list(degree.distribution(e.i)))
 
    e.L <- standard.laplacian(e.t)
    e.L.eigen <- eigen(e.L)
    pcs.spectral.gap <<- append(pcs.spectral.gap,
        e.L.eigen[['values']][1] - e.L.eigen[['values']][2])
 
    if (1) {
      pcs.transitivity <<- append(pcs.transitivity, triangles/dyads)
      pcs.density <<- append(pcs.density, sum(e.t)/(nrow(e.t)*ncol(e.t)))
    } else if (pcs.directed) {
      pcs.transitivity <<- append(pcs.transitivity, directed.transitivity(e.t))
      pcs.density <<- append(pcs.density, sum(e.t) / (nrow(e.t)*ncol(e.t)))
    } else {
      pcs.transitivity <<- append(pcs.transitivity, transitivity(e.i))
      pcs.density <<- append(pcs.density, graph.density(e.i,loops=TRUE))
    }
    pcs.xaxis <<- append(pcs.xaxis, t)
    pcs.index <<- pcs.index + 1
  }
}
 
if (exists('pcs.prefix') && pcs.prefix != FALSE) {
  plot.clustering.stochastic(timeseries)
  png(paste(pcs.prefix,'transitivity.png',sep='.'))
  plot(pcs.xaxis, pcs.transitivity, type='l', xlab="time", ylab="transitivity")
  png(paste(pcs.prefix,'density.png',sep='.'))
  plot(pcs.xaxis, pcs.density, type='l', xlab="time", ylab="density")
}

stochastic-experiments.mk
ifndef LX
LX = $(CURDIR)
endif
%.stochastic-experiment.animate.gif : %.stochastic-experiment.animate.Rout
	convert -adjoin $*.stochastic-experiment.animate.frame.*.png -delay 2 $@
 
%.stochastic-experiment.animate.Rout : %.stochastic-experiment.RData $(LX)/plot-experiment-timeseries.R
	$(RM) $*.stochastic-experiment.{transitivity,density}.png
	$(run-R)
 
%.stochastic-experiment.transitivity.png : %.stochastic-experiment.animate.Rout ;
%.stochastic-experiment.density.png : %.stochastic-experiment.animate.Rout ;
 
remove-all-pngs :
	$(RM) *.png
 
remove-all-frames :
	$(RM) *.frame.*.png

stochastic-simulation.R
# need: 'seed', 'graph.name'
set.seed(seed)
library(igraph)
library(network)
pcs.directed <<- FALSE
pcs.prefix <<- FALSE
lsp.next.event <- NULL
e.network <- do.call(paste(graph.name,'graph',sep='.'), list())
e <- as.matrix(e.network, 'adjacency')
# undirected
e <- pmax(e, t(e))
timeseries <- make.sequence.stochastic(e, laplacianoid.stochastic.process, 
  seq(0,10,by=0.01), directed=FALSE)
plot.clustering.stochastic(timeseries)
 
#plot(result, type='l')
plot(pcs.transitivity, type='l')
assign(paste(graph.name,seed,'realization.n.vertices',sep='.'), pcs.n)
assign(paste(graph.name,seed,'realization.x.axis',sep='.'), pcs.xaxis)
assign(paste(graph.name,seed,'realization.transitivity',sep='.'), pcs.transitivity)
assign(paste(graph.name,seed,'realization.density',sep='.'), pcs.density)
assign(paste(graph.name,seed,'realization.directed.dyads',sep='.'), pcs.directed.dyads)
assign(paste(graph.name,seed,'realization.directed.triangles',sep='.'), pcs.directed.triangles)
assign(paste(graph.name,seed,'realization.degree.distribution',sep='.'), pcs.degree.distribution)
assign(paste(graph.name,seed,'realization.spectral.gap',sep='.'), pcs.spectral.gap)

Personal tools