How AI and humans can optimise air pollution monitoring

Air pollution is responsible for 7 million deaths per year according to the World Health Organization (WHO). Thus, it is crucial to dedicate resources to learning about and monitoring air quality in cities, both to assist authorities in urban planning and to raise people's awareness of the impact of air pollution on their everyday lives. In our research, we provide a framework and algorithms that utilise the power of Machine Learning to effectively monitor an environment over time.
In particular, our proposal relies on the willingness of people to participate in environmental air quality campaigns. People can use mobile air quality devices to take readings in their city or their neighbourhood. However, the major issue is when and where these readings should be taken to monitor the city efficiently. People cannot provide an unlimited number of measurements, and thus readings should be taken in such a way that the information gained about the environment is maximised. In other words, we need to solve an optimisation problem, constrained by the number of readings people can provide over a period of time, to explore the environment efficiently.
To solve the problem, we need a model of the environment as well as a way to measure the information carried by each reading (since we are interested in gaining the most information from a limited number of readings). To do that, we overlay a spatio-temporal stochastic process, a Gaussian process, over the area of interest. Gaussian processes can be used to interpolate over the environment, i.e., to predict the air quality value at unobserved locations, as well as to predict the state of the environment in the future. Importantly, Gaussian processes also provide a measure of uncertainty (and hence information) about each location in space and time through the predictive variance.
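To make the role of the predictive variance concrete, here is a minimal sketch of Gaussian process prediction over space and time. It is an illustration under assumptions, not the model from our paper: the squared-exponential kernel, the unit length scale, the noise level and the sample readings are all hypothetical choices made for the example.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between rows of A and rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(X_obs, y_obs, X_new, noise_var=1e-2):
    """Posterior mean and variance of a GP at X_new (constant prior mean)."""
    offset = y_obs.mean()  # assumption: use the sample mean as the prior mean
    K = rbf_kernel(X_obs, X_obs) + noise_var * np.eye(len(X_obs))
    K_s = rbf_kernel(X_obs, X_new)
    solved = np.linalg.solve(K, K_s)  # K^{-1} K_s
    mean = offset + solved.T @ (y_obs - offset)
    var = rbf_kernel(X_new, X_new).diagonal() - (K_s * solved).sum(axis=0)
    return mean, var

# Hypothetical readings: columns are (x, y, t), values are pollution levels.
X_obs = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
y_obs = np.array([40.0, 55.0, 35.0])
X_new = np.array([[0.1, 0.0, 0.0],   # close to an existing reading
                  [3.0, 3.0, 5.0]])  # far away in space and time
mean, var = gp_predict(X_obs, y_obs, X_new)
print(mean, var)
```

The variance is small next to an existing reading and stays close to the prior variance far away from all readings, which is what makes it usable as a measure of where a new reading would be most informative. Note that the variance depends only on where and when readings are taken, not on the values they return.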
The problem then becomes one of choosing a set of measurements such that a utility function, built on the predictive variance provided by the Gaussian process, is maximised. To solve this problem, we use techniques and algorithms from the broad areas of Artificial Intelligence and Multi-agent Systems.
In particular, an intelligent agent can decide when and where measurements should be taken to maximise the information gained about the air quality, while at the same time minimising the number of readings needed. The agent can employ greedy search techniques combined with meta-heuristics such as stochastic local search, unsupervised learning (clustering) and random simulations.
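The greedy part of this idea can be sketched in a few lines. The sketch below is plain greedy selection only; it leaves out the stochastic local search, clustering and random simulations, and it should not be read as the algorithm from the paper. The kernel, the grid of candidate points and the budget are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential covariance with unit prior variance."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def total_variance(X_sel, X_target, noise_var=1e-2):
    """Sum of GP predictive variances at X_target given readings at X_sel."""
    prior = float(len(X_target))  # each target has prior variance 1
    if len(X_sel) == 0:
        return prior
    K = rbf_kernel(X_sel, X_sel) + noise_var * np.eye(len(X_sel))
    K_s = rbf_kernel(X_sel, X_target)
    return prior - float((K_s * np.linalg.solve(K, K_s)).sum())

def greedy_select(candidates, X_target, budget):
    """Repeatedly add the candidate that leaves the least total variance."""
    chosen = []
    for _ in range(budget):
        remaining = [i for i in range(len(candidates)) if i not in chosen]
        best = min(
            remaining,
            key=lambda i: total_variance(candidates[chosen + [i]], X_target),
        )
        chosen.append(best)
    return chosen

# Hypothetical 5x5 grid; each point is (location, time) in arbitrary units.
grid = np.array([[x, t] for x in range(5) for t in range(5)], dtype=float)
chosen = greedy_select(candidates=grid, X_target=grid, budget=3)
history = [total_variance(grid[chosen[:k]], grid) for k in range(4)]
print(chosen, history)
```

Each step asks, for every remaining candidate, how much uncertainty would be left if a reading were taken there, and keeps the best one. Because the predictive variance does not depend on the measured values, this can be evaluated before any reading is actually taken.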
The main idea is to simulate the environment over time, asking "what if" kinds of questions. What if I take a measurement now and another one at night? What if I take a measurement downtown or near home? These kinds of questions are answered by running simulations on a cluster computing facility.
Finally, our findings indicate a significant improvement over other approaches.

Artificial Intelligence to save the environment or destroy the world?

In a recent post I briefly described my experiences from the AAMAS conference in Turkey. What I haven’t talked about is the topic and the details of my paper that got accepted there. This post aims to introduce my research and provide a summary of my recently published paper.

In 2014, the World Health Organization estimated that 7 million people had died from diseases associated with air pollution. These lives could potentially have been saved if measures had been taken in time. But can we really take measures when we do not know where and when pollution is high, and only know vaguely that air pollution is caused by traffic and industrial pollutants released into the atmosphere? What I mean is that a more collective effort is required to really understand air pollution in terms of its spatial as well as its temporal distribution. In fact, there are static sensors scattered in cities all over the world. But are they enough? These are expensive sensors placed in areas away from pollution sources in order to estimate the average pollution in that area. Is that what we want? Sort of. How about the kid who walks to school every day for 10 minutes next to a congested road? How about the cyclists who chose to cycle to be environmentally friendly and, importantly, to be healthier? But are they really healthy, cycling behind buses and cars? Well, I am sure that the air quality index displayed by the static sensor is nowhere near the reality of those who spend their time on busy roads.

Here is the alternative! Give people the power to measure their own pollution exposure! Well, this is already happening and this is what participatory sensing is about. Citizens carrying around sensors are taking all sorts of measurements. Let’s think about that. Well, carrying around a sensor… We all do. Our smartphone, which at least 7 out of 10 people in the UK own (according to studies in 2013), is a sensor. In fact, it is multiple sensors embedded in a single handheld device. At the moment, phones are not able to measure air pollution but we are getting there. I mean, phones can monitor your heart rate, and with each generation of phones a new sensor is added. Even if monitoring air pollution from your phone might be a few years away, there are mobile sensors that can easily be paired with a phone via USB or Bluetooth.

However, people live their lives and follow their own daily agenda. They are not going to run around the city all day and night taking measurements in order to cover their city in space and time. Even if they wanted to, their mobile phone’s battery would betray them. How long can it last with those battery-draining sensors switched on?

Enough of the introduction. My paper is focused on making these environmental campaigns, which expect citizens to take measurements, succeed. How? First of all, we assume that people incur a cost for taking a measurement. This could represent the inconvenience the user goes through in order to take a measurement. It could also represent a small payment, if the environmental campaign has the resources for it. Or it could even represent the battery life the user’s phone has just lost because of the activation of multiple sensors such as GPS (and Bluetooth if it is paired with an air quality sensor).

Another factor that we consider is the mobility patterns of people. It is known (at least in the research community) that people are typically predictable in their daily routine. I do not know about you but this is definitely true for me. Except sometimes. Sometimes I deviate. Or so I think. Anyway, there is a lot of literature on this topic and I will not go into details here.

So, the big question is: where, when and who should take measurements in order to best monitor the environment over a period of time, given that each user incurs a different cost for taking a single measurement? Well, this is the question that my algorithm addresses. It is about mapping each participant to a location at some point in time in such a way that the suggested measurement is as significant as possible for the effort of monitoring the environment. The good thing is that no one has to deviate from their route. Given that I always wake up and go to “work”, the algorithm could tell me to take a measurement at some point on my way. This is the point of using human mobility patterns in the first place: to exploit some available intel.
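As a rough illustration of the who, where and when question, here is a small sketch that greedily picks measurements by information gained per unit cost, using only points that participants are already expected to pass through. Everything specific in it is an assumption for the example: the participant names, their routes and costs, the budget, the kernel and the `assign_measurements` helper itself. It is a simplified stand-in, not the algorithm from my paper.

```python
import numpy as np

def rbf_kernel(A, B):
    """Squared-exponential covariance with unit variance and length scale."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist)

def variance_reduction(points, X_target, noise_var=1e-2):
    """Total drop in GP predictive variance at X_target from the readings."""
    if len(points) == 0:
        return 0.0
    X_sel = np.asarray(points, dtype=float)
    K = rbf_kernel(X_sel, X_sel) + noise_var * np.eye(len(X_sel))
    K_s = rbf_kernel(X_sel, X_target)
    return float((K_s * np.linalg.solve(K, K_s)).sum())

def assign_measurements(routes, costs, X_target, budget):
    """Greedily pick (who, (x, y, t)) by variance reduction per unit cost."""
    options = [(who, tuple(p)) for who, pts in routes.items() for p in pts]
    plan, points, spent = [], [], 0.0
    while True:
        base = variance_reduction(points, X_target)
        best, best_ratio = None, 0.0
        for who, p in options:
            if (who, p) in plan or spent + costs[who] > budget:
                continue  # already used, or this participant is too costly
            gain = variance_reduction(points + [p], X_target) - base
            if gain / costs[who] > best_ratio:
                best, best_ratio = (who, p), gain / costs[who]
        if best is None:
            return plan, spent
        plan.append(best)
        points.append(best[1])
        spent += costs[best[0]]

# Hypothetical predicted routes: (x, y, hour) points people pass anyway.
routes = {
    "alice": [(0.0, 0.0, 8.0), (1.0, 0.0, 8.5)],
    "bob": [(3.0, 3.0, 9.0), (0.0, 0.1, 8.0)],
}
costs = {"alice": 1.0, "bob": 2.0}  # hypothetical cost per measurement
X_target = np.array(
    [[x, y, t] for x in range(4) for y in range(4) for t in (8.0, 9.0)],
    dtype=float,
)
plan, spent = assign_measurements(routes, costs, X_target, budget=3.0)
print(plan, spent)
```

Because the options come from the predicted routes, nobody is asked to go anywhere they were not already going. The cost-per-gain ratio is one simple way to trade information against cost; other choices are possible.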

Well, what do you think? I think this is better than having people walk around like zombies taking measurements for your experiments, compensated with 20 dollars each, in a project that will cease to exist the day the funding is over and that no one will actually use in practice after you have successfully published your paper. Don’t you think? Or your phone could even deal with everything, given that privacy concerns are met. For example, you could set it to take measurements where and when the algorithm decides, without you explicitly knowing. These are the kinds of ideas circling in my mind. It might not be that good yet, but this is the idea of my work: to make participatory sensing campaigns a thing. We are still a bit far from a real-world trial but we are getting there. We need to get our facts right (in terms of assumptions about the problem) and make it as good as possible given the uncertain environment and the uncertain human behaviour that the AI will encounter.

So, will AI destroy the world? I don’t know. What I know is that the same research done to save the world could easily be modified to destroy it. Imagine that 10 F-16s are deployed to bomb different terrorist bases. Or a number of unmanned aerial vehicles are sent out with pre-determined targets. Now, someone could ask: where and when should these planes or drones release bombs, on their way to their targets or on their way back, in order to maximize the damage caused to the enemy, given that each bomb has a specific cost? Well, unfortunately, the solution is already given by my AI algorithm. However, thankfully, the answer cannot be computed yet, as one important component is missing.

What does it mean when we say we collect information by taking a measurement? We imply that there is some sort of model over the environment that gives us something beyond the air pollution index itself: a number or a value that tells us how much a reading is worth. Fortunately, for environmental monitoring there is a lot of work on how to do this. We chose Gaussian Processes because of their power and flexibility but, most importantly, because they give you the uncertainty over the locations of interest in both space and time. More about them in another post! So, to destroy the world you would need such a model.