Gaussian Processes
5th February 2016
This was our second week of the STOR-i Research Topic Presentations. This week, there were six topics overall, covering quite a
wide range of Statistics and Operational Research.
Monday's topic was Medical and Bio Statistics and involved four talks. The first covered Variational Bayes with applications in RNA
sequencing, the second statistical models to help doctors make decisions, and the third and fourth were about clinical trials (though
not about using Multi-Armed Bandits as in Why Not Let Bandits in to Help in Clinical Trials?).
On Tuesday, we were given an introduction to Extreme Value theory and we touched on how it can be used in environmental situations,
as well as how covariance and dependence can be incorporated into the modelling. Wednesday was a busy day, and included two research
topics. In the morning, Statistical/Machine Learning took the lead. We had four talks looking at how privacy can be given a mathematical
definition, the theory behind the classification problem, clustering problems using projections and Gaussian processes. After an hour
off for lunch we started all over again on Logistics, Transportation and Operations. Not surprisingly, this was a set of far more
practical talks, looking at various problems from the real world. These included disaster management and evacuations and airport
capacity. The order of the day on Thursday was Business Forecasting. We had two speakers, the first talking about how traditional
methods for continuous cases can be adapted for discrete data predictions. The second speaker spoke about forecasting when the data
can be grouped in a hierarchical manner. Our final research topic, today, was on simulation. The first talk was about some work
on queueing theory and how infinite server queues can be used in health care modelling, whereas the second was about simulation itself.
In the next two blog posts, I intend to describe two of these topics in more detail. The first of these I would like to discuss is
the section of Statistical Learning on Gaussian Processes (GP). This was given by
. If these
evaluations are costly though, then the choice of which points to evaluate is difficult. One way is to consider the performance
measure as a Gaussian Process. Each evaluation will change the mean and variance in different areas, and this can then be used to
choose where the next observation should be made to optimise the information gained from each trial (see Figure 2). Although this
seems like a heuristic (not normally my cup of tea), it may be quite interesting to look into this area. GPs also have potential use
in probabilistic numerics, where performance may be better than Monte Carlo methods, and in classification problems.
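To make the idea of choosing the next evaluation point a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions and is not taken from the presentation: the squared-exponential kernel, the "mean plus a multiple of the standard deviation" selection rule (often called an upper confidence bound), the function names, the parameter values and the stand-in `performance` function are all hypothetical choices.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3, signal_var=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_obs, y_obs, x_new, noise_var=1e-6):
    """Posterior mean and variance of a zero-mean GP at the points x_new."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_on = rbf_kernel(x_obs, x_new)
    chol = np.linalg.cholesky(k_oo)
    # Solve (K + noise * I) alpha = y using the Cholesky factor
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
    mean = k_on.T @ alpha
    v = np.linalg.solve(chol, k_on)
    var = rbf_kernel(x_new, x_new).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative rounding errors

def next_point(x_obs, y_obs, candidates, kappa=2.0):
    """Pick the candidate maximising mean + kappa * standard deviation."""
    mean, var = gp_posterior(x_obs, y_obs, candidates)
    return candidates[np.argmax(mean + kappa * np.sqrt(var))]

def performance(x):
    """Hypothetical stand-in for a performance measure that is costly to evaluate."""
    return np.sin(3.0 * x) + 0.5 * x

candidates = np.linspace(0.0, 2.0, 201)
x_obs = np.array([0.2, 1.0, 1.8])   # assumed initial evaluations
y_obs = performance(x_obs)

for _ in range(5):
    # Each new evaluation updates the posterior mean and variance,
    # which in turn changes where the next evaluation is made.
    x_next = next_point(x_obs, y_obs, candidates)
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, performance(x_next))

best_x = x_obs[np.argmax(y_obs)]
print("evaluated points:", np.round(x_obs, 2))
print("best point found so far:", round(float(best_x), 2))
```

The value of `kappa` controls the trade-off: a larger value favours points where the variance is still high (exploring), while a smaller value favours points where the mean is already high (exploiting). Other selection rules could be swapped in for `next_point` without changing the rest of the sketch.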
References
[1] Figure 1 copied from slide 5 of Chris Nemeth's presentation.
[2] Figure 2 copied from slide 254 of Chris Nemeth's presentation.
Figure 2