Wednesday, October 9, 2013

Molecular modelers won the Nobel Prize in Chemistry

Martin Karplus, Michael Levitt, and Arieh Warshel won the 2013 Nobel Prize in Chemistry today "for the development of multiscale models for complex chemical systems."

The Royal Swedish Academy of Sciences said the three scientists' research in the 1970s laid the foundation for computer programs that unveil chemical processes. "The work of Karplus, Levitt and Warshel is ground-breaking in that they managed to make Newton's classical physics work side-by-side with the fundamentally different quantum physics," the academy said. "Previously, chemists had to choose to use either/or." Together with a few earlier Nobel Prizes awarded for quantum chemistry, this award consecrates the field of computational chemistry.

Incidentally, Martin Karplus was the Harvard thesis adviser of Georgios Archontis, my postdoc co-adviser. Georgios is one of the early contributors to CHARMM, a widely used computational chemistry package. CHARMM was the computational tool I used when working with Georgios almost 15 years ago. In collaboration with Martin, Georgios and I studied glycogen phosphorylase inhibitors using a free energy perturbation analysis in CHARMM. In another project with Spyros Skourtis, I wrote a multiscale simulation program that couples molecular dynamics with quantum dynamics to study electron transfer in proteins and DNA molecules (i.e., it uses Newton's equations of motion to predict the trajectories of the atoms, constructs a time series of Hamiltonians from those trajectories, and then solves the time-dependent Schrödinger equation with that Hamiltonian series as the input).
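
For readers curious about the structure of that last step, here is a minimal Python sketch of this kind of mixed classical-quantum propagation. It is not the original program: the two-site donor-acceptor Hamiltonian with randomly fluctuating site energies merely stands in for the Hamiltonians that an MD trajectory would supply, and all names and numbers are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm

HBAR = 1.0  # atomic units for simplicity

def propagate(hamiltonians, psi0, dt):
    """Propagate a wavefunction through a time series of Hamiltonians.

    hamiltonians: iterable of (n, n) Hermitian matrices, one per MD snapshot
    psi0: initial state vector of length n (e.g., electron on the donor site)
    dt: MD time step separating consecutive Hamiltonians
    Returns the site populations |psi_i|^2 at every step.
    """
    psi = np.asarray(psi0, dtype=complex)
    populations = [np.abs(psi) ** 2]
    for H in hamiltonians:
        U = expm(-1j * H * dt / HBAR)   # short-time propagator for this snapshot
        psi = U @ psi
        populations.append(np.abs(psi) ** 2)
    return np.array(populations)

# Toy two-site donor-acceptor system whose site energies fluctuate,
# standing in for a (hypothetical) Hamiltonian series from an MD run.
rng = np.random.default_rng(0)
H_series = [np.array([[0.00 + 0.01 * rng.standard_normal(), 0.02],
                      [0.02, 0.05 + 0.01 * rng.standard_normal()]])
            for _ in range(1000)]
pops = propagate(H_series, psi0=[1.0, 0.0], dt=0.5)
print(pops[-1])  # how much population has moved to the acceptor site
```

The point is only the shape of the loop: each MD snapshot contributes one short-time propagator that advances the electronic wavefunction.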

We are thrilled by this news because many of the computational kernels of our Molecular Workbench software were actually inspired by CHARMM. The Molecular Workbench also advocates a multiscale philosophy and pedagogical approach, but its goal is to link concepts at different scales with simulations in order to help students connect the dots and build a more unified picture of science (see the image above).

We are glad to be part of the "Karplus genealogy tree," as Georgios put it when replying to my congratulatory email. We hope that, through our grassroots work in education, the power of molecular simulation from the top of the scientific research pyramid will enlighten millions of students and ignite their interest and curiosity in science.

Saturday, October 5, 2013

Computational process analytics: Compute-intensive educational research and assessment

Trajectories of building movement (good)
Computational process analytics (CPA) differs from traditional research and assessment methods in that it is not only data-intensive but also compute-intensive. A unique feature of CPA is that it automatically analyzes the performance of student artifacts (including all the intermediate products) using the same set of science-based computational engines that the students used to solve the problems. These computational engines take into account every detail in the artifacts, and every interaction among those details, that is relevant to the nature of the problems the students solved. They also recreate the scenarios and contexts of student learning (e.g., the results calculated in such a post-processing analysis are exactly the same as those presented as feedback to students while they were solving the problems). As such, the computational engines provide holistic, high-fidelity assessments of students' work that no human evaluator can ever match -- while no one can track, in a short evaluation time, the numerous variables students may have created over long and deep learning processes, a computer program can easily do the job. Utilizing disciplinarily intelligent computational engines for performance assessment was a major breakthrough in CPA, as this approach has the potential to revolutionize computer-based assessment.
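
In sketch form, the replay step can be as simple as the following Python snippet; the snapshot layout, the timestamp field, and the engine's analyze() method are hypothetical stand-ins for whatever the actual logging system and simulation engine provide, not our real API.

```python
import json
from pathlib import Path

def replay_artifacts(log_dir, engine):
    """Re-run the science engine on every intermediate artifact a student saved.

    log_dir: directory of JSON snapshots logged by the CAD tool (assumed layout)
    engine:  any object exposing analyze(artifact) -> float, standing in for the
             actual computational engine (e.g., a solar analysis)
    Returns a chronological list of (timestamp, score) pairs.
    """
    scores = []
    for snapshot in sorted(Path(log_dir).glob("*.json")):
        artifact = json.loads(snapshot.read_text())
        score = engine.analyze(artifact)   # same engine the student saw feedback from
        scores.append((artifact["timestamp"], score))
    return scores
```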

No building movement (bad)
To give an example, this weekend I am busy running analysis jobs on my computer to process 1 GB of data logged by our Energy3D CAD software. I am trying to reconstruct and visualize the learning and design trajectories of all the students, projected onto many different axes and planes of the state space. Doing so will take an estimated 30-40 hours of CPU time on my Lenovo X230 tablet, which is a pretty fast machine. Each step loads a sequence of artifacts, runs a solar simulation for each artifact, and analyzes the results (since I have automated the entire process, this is actually not as bad as it sounds). Our assumption is that the time evolution of the performance of these artifacts approximately reflects the time evolution of the performance of their designers. We should be able to tell how well a student was learning by examining whether the performance of her artifacts shows a systematic trend of improvement or is just random. This is far better than performance assessment based only on students' final products.
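
Once a performance value has been computed for every intermediate artifact, telling a systematic trend of improvement apart from random fluctuation can be done with a simple rank-correlation test. The sketch below shows one plausible way (a Spearman correlation between time order and performance); it is an illustration, not necessarily the statistic we will settle on.

```python
import numpy as np
from scipy.stats import spearmanr

def improvement_trend(scores):
    """Test whether a performance series shows systematic improvement.

    scores: chronological list of performance values, one per intermediate artifact
    Returns the Spearman rank correlation between time order and performance
    (and its p-value); values near +1 suggest steady optimization, values
    near 0 suggest random drift.
    """
    steps = list(range(len(scores)))
    rho, p_value = spearmanr(steps, scores)
    return rho, p_value

# Example: a noisy but improving series vs. a purely random one.
rng = np.random.default_rng(1)
improving = np.linspace(0.2, 0.9, 50) + 0.05 * rng.standard_normal(50)
random_series = rng.standard_normal(50)
print(improvement_trend(improving))      # rho close to 1
print(improvement_trend(random_series))  # rho close to 0
```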

After all the intermediate performance data have been retrieved by post-processing the artifacts, we can analyze them using our Process Analyzer -- a visual mining tool being developed to present the analysis results in various visualizations (it is our hope that the Process Analyzer will eventually become a powerful assessment assistant for teachers, as it would free them from having to deal with enormous amounts of raw data or complicated data mining algorithms). For example, the two images in this post show that one student went through a lot of optimization in her design while the other did not (there is no trajectory in the second image).
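
As a rough illustration of the kind of trajectory plot shown in the two images, the following sketch projects a sequence of high-dimensional design states onto a two-dimensional plane (here the first two principal components, chosen only for convenience) and draws the path. The actual Process Analyzer uses its own choices of axes and planes of the state space.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_design_trajectory(states, labels=("PC 1", "PC 2")):
    """Project a sequence of high-dimensional design states onto a 2-D plane.

    states: (n_steps, n_variables) array, one row per intermediate artifact
            (e.g., building position, window sizes, roof angle, ...)
    The plane used here is spanned by the first two principal components.
    """
    X = states - states.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    xy = X @ vt[:2].T                      # coordinates in the chosen plane
    plt.plot(xy[:, 0], xy[:, 1], "-o", markersize=3)
    plt.xlabel(labels[0])
    plt.ylabel(labels[1])
    plt.title("Design trajectory projected onto a plane of the state space")
    plt.show()
```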