Compression and Analytics of Molecular Dynamics Simulation Data

Presented by Shuai Liu

Thursday, November 29, 2018
4:00 p.m.
ICSI Lecture Hall


Molecular dynamics (MD) simulation is an approach to exploring chemical or biological systems computationally when they cannot be investigated in nature. By modeling the interactions between atoms and molecules using Newtonian principles, hypotheses about the chemical or biological properties of certain molecules as they interact with their environment can be tested. Because the simulation is performed at the molecular level, however, an MD simulation can easily generate terabytes of data. Working with MD data therefore not only requires a large amount of storage but also makes it very difficult to transfer simulation outcomes over a network. In this talk, I report on my master's thesis work. I designed and analyzed algorithms for compressing MD simulation data in both the time and space domains. I also ran experiments to understand the relationship between compression performance and the underlying physical behavior of the simulation, for example by analyzing how the information entropy of MD simulation data correlates with different physical parameters. As one result, I was able to build an automatic detector for phase transitions in MD simulation data, which in turn can be used to increase compression performance in future work.


Shuai Liu is a Ph.D. candidate in the Chemistry department at UC Berkeley, advised by Dr. Alexander Hexemer at LBNL, working on materials characterization data analysis using machine learning. This work is part of an ongoing M.S. technical report advised by Profs. Gerald Friedland and Kannan Ramchandran. Before studying at Berkeley, Shuai received his B.S. degree from Peking University. His research interests are the applications of machine learning and information-theory-based methods to scientific data mining, especially for chemistry and materials data.