The scientific method consists of generating and analyzing data to create knowledge. Indeed, every materials scientist uses data from syntheses, characterization, and models to explain and optimize materials behavior. Yet, despite the centrality of data to progress in materials science, the world’s immense body of materials data remains unstandardized, unstructured, and trapped in myriad publications, isolated repositories, and private computers. This mishmash of scattered data not only prevents materials scientists from standing on the shoulders of giants, but also limits our ability to use large-scale data analytics to dramatically accelerate materials modeling, discovery, and manufacture (à la Moneyball).
Citrine Informatics is a team of Silicon Valley materials scientists dedicated to uniting all materials data on a single platform under a single data standard, and to putting user-friendly, data-driven tools into the hands of all materials researchers. We intend to provide this ecosystem of data, visualizations, and models for free to academic and government researchers, while charging companies for access so that our platform is sustainable. In this talk, we will review the present state of affairs in materials data, notable progress to date, opportunities for the future, and the significant challenges likely to arise along the way.