Rob Emanuele is the maintainer of the open source geospatial library GeoTrellis, which brings geospatial capabilities to Apache Spark. He was the program chair for FOSS4G North America in 2015 and 2016, and he is a member of the LocationTech Project Management Committee.
What is the average predicted temperature of San Francisco from 2050 to 2099 based on forecasting models? How many tweets containing the hashtag #devoxx were sent from California? In general: how do we ask location-based questions of very large sets of geospatial data? To answer these types of questions, existing large-scale data processing frameworks like Hadoop, Accumulo, and Spark need to be "geospatially enabled". LocationTech is a working group within the Eclipse Foundation that is home to open source projects doing exactly that, including GeoTrellis, GeoWave, and GeoMesa (sense a pattern?). In this talk, I will introduce what we mean by "geospatial data at scale", describe some of the unique challenges in processing large sets of geospatial data, and show how LocationTech projects work with Apache projects to overcome those challenges, letting us get the most out of our large geospatial data.