The COVID-19 pandemic has made it evident that many people still lack internet access of sufficient quality for tasks such as online learning and telemedicine. Millions of taxpayer dollars are invested every year to improve internet quality where it is lacking; however, there is no systematic, general approach for identifying these unserved and underserved areas. Several crowdsourced data sets collect important metrics related to internet quality in many of these areas. We hypothesize that aggregating these otherwise independent data sets will give a better understanding of which areas lack adequate internet access and therefore require additional investment. In this work, we use a collection of crowdsourced data sets to analyze common internet performance metrics, such as throughput and latency, and determine the underserved areas in the state of California. To obtain more precise location information, we will employ various geolocation techniques that allow us to home in on areas requiring further attention. We will conduct statistical analyses of the metrics to determine the factors that lead to an area being underserved. Additionally, we will build a generalizable model that uses these internet performance metrics to predict whether an area is underserved. We envision that the model can be applied to other states to achieve a similar goal.