Research Themes

Lancaster University’s academic expertise in STOR constitutes a comprehensive research base upon which to build STOR-i’s scientific agenda. We have internationally recognised expertise in statistical modelling and inference, forecasting, optimisation, simulation, and stochastic modelling with a particularly strong focus on computationally intensive approaches. There are a host of sub-areas in which we have particular strength including:

  • change-point analysis;
  • computational statistics;
  • extreme value theory;
  • heuristic methods;
  • mathematical programming;
  • spatial statistics;
  • statistical learning;
  • stochastic dynamic programming;
  • stochastic programming;
  • time series analysis.

Each of these research areas has real-world applications and addresses important challenges, as the example settings below illustrate:

Big Data and Networks

  • Networks can change in structure over time and, in some cases, the changes can be sudden. In a network of computer connections a sudden change could mean an attack by hackers or email spam. Predicting such a change could reduce the effect of such an attack.
  • Classification in a data stream setting differs from traditional classification due to the velocity of data streams and the variable nature of the underlying data distribution over time.
  • Clustering objects within large datasets into meaningful groups may allow the detection of outlying data points as well as the recognition of patterns. This is of particular importance in high dimensional datasets.
  • In an intelligence setting (see under ‘security’ below) a network of communicating participants has the potential to provide information of relevance to an intelligence question of interest. An issue which has both optimisation and inferential challenges concerns how best to interrogate such a network for maximum informational benefit.
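The sudden network changes described above are the territory of online change-point detection. A minimal sketch using a one-sided CUSUM statistic on a stream of connection counts (the data, target mean, drift and threshold here are all illustrative assumptions, not a recommended configuration):

```python
# One-sided CUSUM detector for an upward shift in the mean of a stream.
# Illustrative only: target mean, drift and threshold are assumed values.
def cusum_alarm(stream, target_mean, drift=0.5, threshold=10.0):
    """Return the index at which an alarm is raised, or None."""
    s = 0.0
    for t, x in enumerate(stream):
        # Accumulate evidence of an upward mean shift, clipped at zero.
        s = max(0.0, s + (x - target_mean - drift))
        if s > threshold:
            return t
    return None

# Simulated connection counts: mean 10 before time 50, mean 15 after
# (e.g. the onset of a spam burst or an attack).
import random
random.seed(1)
stream = [random.gauss(10, 1) for _ in range(50)] + \
         [random.gauss(15, 1) for _ in range(50)]
alarm = cusum_alarm(stream, target_mean=10)
```

Because each post-change observation adds roughly 4.5 to the statistic, the alarm typically fires within a few observations of the true change point.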

Energy

  • Change-point models can detect shifts in wind-speed behaviour, allowing us to predict when turbines need to be switched off and on. This would improve both efficiency and maintenance scheduling.
  • The design of robust and reliable offshore sites is a key concern in the oil extraction process, to protect against both environmental pollution and staff endangerment. Design codes set specific levels of reliability, expressed in terms of annual probability of failure, which need to be met or exceeded by companies. Hence, it is essential to estimate efficiently the extreme conditions marine structures are likely to experience in their lifetime.
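Estimating the extreme conditions a structure must withstand is commonly done with a peaks-over-threshold analysis: exceedances of a high threshold are modelled by a generalised Pareto distribution (GPD). A minimal sketch using method-of-moments estimates on simulated wave heights (the data, threshold and estimators are illustrative assumptions; a real design study would use maximum likelihood and account for dependence and seasonality):

```python
import random
import statistics

random.seed(42)
# Simulated significant wave heights (metres); illustrative only.
waves = [random.expovariate(0.5) for _ in range(20000)]

u = 6.0  # high threshold (assumed)
exceedances = [w - u for w in waves if w > u]

# Method-of-moments estimators for the GPD shape (xi) and scale (sigma):
#   xi    = (1 - mean^2 / var) / 2
#   sigma = mean * (mean^2 / var + 1) / 2
m = statistics.mean(exceedances)
v = statistics.variance(exceedances)
xi = 0.5 * (1 - m * m / v)
sigma = 0.5 * m * (m * m / v + 1)

# Probability that an observation exceeds a level x > u:
#   P(X > x) = p_u * (1 + xi * (x - u) / sigma) ** (-1 / xi)
p_u = len(exceedances) / len(waves)
x = 10.0
prob = p_u * (1 + xi * (x - u) / sigma) ** (-1 / xi)
```

The fitted tail then extrapolates beyond the observed data, which is exactly what annual-failure-probability design codes require.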

Environment

  • Accurate modelling of extreme weather-related events such as floods and windstorms helps to reduce infrastructure damage, transport disruption and human fatalities. Extreme value models are used to estimate the rate of occurrence of events more extreme than those already observed.
  • Improving existing models for weather-related insurance claims by better accounting for the spatial variation of weather and geographical features and the mechanisms that lead to claims.
  • Heat waves and droughts occur when several consecutive days are very hot or very dry. Estimating how the frequency of such events changes under global climate change is key to future planning and to developing emissions targets.
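At its simplest, counting heat waves comes down to counting runs of consecutive days above a temperature threshold. A minimal sketch (the three-day definition, the 30 °C threshold and the data are illustrative assumptions; real studies would model how the run frequency shifts under climate projections):

```python
def count_runs(temps, threshold, min_length=3):
    """Count maximal runs of >= min_length consecutive days above threshold."""
    runs, current = 0, 0
    for t in temps:
        if t > threshold:
            current += 1
        else:
            if current >= min_length:
                runs += 1
            current = 0
    if current >= min_length:  # a run reaching the end of the record
        runs += 1
    return runs

# Ten days of (illustrative) daily maxima in degrees Celsius.
temps = [28, 31, 32, 33, 29, 30, 31, 34, 35, 27]
heat_waves = count_runs(temps, threshold=30, min_length=3)  # → 2
```

Comparing such counts between historical records and climate-model output gives an empirical handle on how event frequency is changing.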

Learning and Decision Making

  • Decisions are often taken in sequence over time as our knowledge of the problem evolves. Learning more about the problem can improve the quality of future decisions. Good long-term decisions therefore require choosing actions that yield useful information about the problem as well as being effective in the short-term.
  • When gambling (on the outcome of a sporting event, or investing on the stock markets), holding an ‘edge’ on the market gives investment opportunities. In such a situation, the problem is how to best invest to exploit this. Alternatively, if someone obtains an edge from more nefarious means, such as insider trading, how can we detect this from the patterns in the markets?
  • Machine learning refers to designing computer programs that improve their performance at predefined tasks as they gain experience, becoming better as they gather more information. It is crucial to design these programs to be adaptive, so that they can accommodate changes in the incoming information.
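The adaptivity described above is the essence of online learning: the model is updated after every observation rather than trained once on a fixed dataset. A minimal sketch of an online perceptron classifying a stream of points (the data-generating rule and learning rate are illustrative assumptions):

```python
import random

random.seed(0)

# Online perceptron: weights are updated only when a mistake is made.
w = [0.0, 0.0]
b = 0.0
lr = 0.1  # learning rate (assumed)

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

errors = 0
for _ in range(2000):
    # Streamed points in the unit square; true label is +1 iff x0 + x1 > 1.
    x = (random.random(), random.random())
    y = 1 if x[0] + x[1] > 1 else -1
    if predict(x) != y:          # mistake-driven update
        errors += 1
        w[0] += lr * y * x[0]
        w[1] += lr * y * x[1]
        b += lr * y
```

Mistakes concentrate early in the stream; after enough examples the learned boundary approximates the true one, and the same update rule lets the model track a boundary that drifts over time.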

Logistics

  • A wide variety of work has to be scheduled into and through a complex facility, with each job requiring several operations at individual machines, usually in a pre-specified order. The scheduling of maintenance on all machines is designed to keep the facility operating well. However, machines do break down, necessitating the rescheduling of work when this happens. The facility is central to the success of the company it serves, and its effective management is therefore of huge importance.
  • Particular care has to be taken regarding the operation of fleets of vehicles which transport hazardous materials. Of particular importance is the determination of routes which serve customers well, yet minimise the risk of environmental damage if spillage occurs of the transported materials.
  • A region in which illicit activities are believed to be conducted is to be partitioned, with a single searcher allocated to each partition set. Such partition-based searches provide information on the spatial distribution of the activities, enabling refinement of the partition design to make search efforts more effective.
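A classic building block for such scheduling problems is the longest-processing-time (LPT) heuristic for parallel machines: sort jobs by decreasing duration and always give the next job to the least-loaded machine. A minimal sketch (the job durations are illustrative; a real facility adds precedence constraints, machine eligibility and rescheduling after breakdowns):

```python
import heapq

def lpt_schedule(durations, n_machines):
    """Assign jobs to machines; return the assignment and the makespan."""
    loads = [(0, m) for m in range(n_machines)]   # (load, machine) min-heap
    heapq.heapify(loads)
    assignment = {}
    for job, d in sorted(enumerate(durations), key=lambda jd: -jd[1]):
        load, m = heapq.heappop(loads)            # least-loaded machine
        assignment[job] = m
        heapq.heappush(loads, (load + d, m))
    makespan = max(load for load, _ in loads)
    return assignment, makespan

jobs = [7, 7, 6, 6, 5, 4, 4, 2]
assignment, makespan = lpt_schedule(jobs, n_machines=3)
```

LPT is not optimal (here it yields a makespan of 15 where 14 is achievable), but Graham's classical bound guarantees it is always within a factor 4/3 of the optimum, which is why it remains a standard baseline.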

Medicine

  • Trial design and sampling in pharmacokinetic studies with a view to reducing the use of animals in pre-clinical studies.
  • Use of bandit models to provide an idealised mathematical decision-making framework for optimally allocating patients among a number of treatments, with the flexibility to incorporate patient wellbeing into human clinical trials.
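The bandit models mentioned above balance learning which treatment is best against treating current patients well. A minimal sketch of one such allocation rule, Thompson sampling with Beta priors over two treatments' success probabilities (the success rates are simulated assumptions, not trial data):

```python
import random

random.seed(7)

true_success = [0.55, 0.70]   # unknown in practice; assumed for the simulation
alpha = [1, 1]                # Beta(1, 1) priors on each success probability
beta = [1, 1]
pulls = [0, 0]

for _ in range(1000):
    # Sample a plausible success rate for each treatment; treat the next
    # patient with whichever treatment drew the higher rate.
    draws = [random.betavariate(alpha[k], beta[k]) for k in range(2)]
    k = draws.index(max(draws))
    pulls[k] += 1
    success = random.random() < true_success[k]
    alpha[k] += success       # Bayesian update of the chosen arm only
    beta[k] += 1 - success
```

Allocation concentrates on the better treatment as evidence accumulates, yet the posterior sampling keeps occasionally trying the other arm, which is what protects against locking in a wrong early impression.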

Security

  • A smart video system with the ability to classify behaviour into normal and abnormal activities could allow the user to be alerted to anomalous behaviour in the monitored area without manually sifting through the recordings. Algorithms that help to do this separate foreground from background, allowing us to monitor the activities of interest clearly.
  • Limited assets are available to keep vulnerable regions under surveillance for signs of suspicious activity. The challenge of deploying such assets to maximum effect can be modelled as a service control problem for a queueing system with customer abandonments. How should surveillance be conducted in the face of an adversary determined to avoid detection?
  • Items of intelligence information which may shed light on an intelligence question of interest are available in huge amounts from a range of sources. However, resources for the proper analysis of such items are limited. How should these sources be sampled over a short horizon so that analysts can be provided with an appropriate number of the highest-value items?
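Allocating scarce analyst time across sources is, in its simplest form, a selection problem under a capacity constraint. A minimal greedy sketch that ranks items by value per hour of analysis until capacity runs out (the item names, values, costs and capacity are all hypothetical; a real deployment must also learn the value distributions of the sources over time):

```python
def greedy_select(items, capacity):
    """items: list of (name, value, hours). Pick by value-per-hour until full."""
    chosen, used, total_value = [], 0.0, 0.0
    for name, value, hours in sorted(items, key=lambda i: -i[1] / i[2]):
        if used + hours <= capacity:
            chosen.append(name)
            used += hours
            total_value += value
    return chosen, total_value

# Hypothetical intelligence items: (name, estimated value, analyst hours).
items = [("intercept-A", 9.0, 3.0),
         ("report-B", 4.0, 1.0),
         ("feed-C", 6.0, 4.0),
         ("tip-D", 2.0, 2.0)]
chosen, value = greedy_select(items, capacity=5.0)
```

Greedy selection by value density is not optimal for this knapsack-type problem in general, but it is a common and transparent baseline against which more sophisticated sampling policies are judged.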