The simulate Action
Row executes the simulate action (typically via a job queue on an HPC
resource). Row determines which directories are eligible and passes them
to the simulate action. The command line parsing is explained on the next
page. Eventually, the simulate_one function is called with a path to the
directory containing the signac state point.
pub fn simulate_one(directory: &Path) -> anyhow::Result<()> {
    // ...
}
This document highlights important sections of the code. Find the
complete code in src/simulate.rs.
Serialization and Wall Time Management
Most HPC resources limit the wall time for which a job may run. To enable
long-running simulation models, this workflow template monitors the wall time
used and serializes the entire simulation state to a file a few minutes before
the time is up. At that point, your HPC job ends and row will indicate that
the directory is eligible again. When you submit the new job, simulate will
deserialize the simulation state and continue the simulation from where it
left off.
Note
This example uses the postcard format to store the simulation model. Postcard is a simple and efficient binary file format. You could substitute any other format that serde supports.
When called, simulate_one is given a directory. It first needs to determine
whether to continue that directory from a serialized state or to initialize
it from the state point. The get_model function implements the necessary
logic. It first checks if model.postcard exists. If it does, it deserializes
the simulation state and returns it. If not, it initializes a new simulation
model from the signac state point:
fn get_model() -> anyhow::Result<LennardJonesModel> {
    match fs::read(MODEL_FILE) {
        // A serialized model exists: continue from it.
        Ok(bytes) => {
            debug!("Continuing simulation from `{MODEL_FILE}`.");
            postcard::from_bytes(&bytes).with_context(|| format!("could not read `{MODEL_FILE}`"))
        }
        Err(error) => match error.kind() {
            // No serialized model: initialize from the signac state point.
            io::ErrorKind::NotFound => {
                debug!("Constructing a new simulation model.");
                let state_point_bytes = fs::read("signac_statepoint.json")
                    .context("unable to read `signac_statepoint.json`")?;
                let state_point: StatePoint = serde_json::from_slice(&state_point_bytes)
                    .context("could not parse signac_statepoint.json")?;
                // Create the trajectory file that `simulate_one` opens later.
                let _ = HoomdGsdFile::create("trajectory.in-progress.gsd");
                LennardJonesModel::new(state_point)
            }
            // Any other I/O error is a failure.
            _ => Err(error).with_context(|| format!("Could not read `{MODEL_FILE}`")),
        },
    }
}
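The resume-or-initialize pattern above can be tried in isolation with only the standard library. The following is a minimal sketch, not the code from src/simulate.rs: the `Counter` type, the `load_or_init` and `save` functions, and the 8-byte little-endian encoding are all hypothetical stand-ins for the simulation model and for postcard.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a simulation model: only a step counter.
#[derive(Debug, PartialEq)]
pub struct Counter {
    pub step: u64,
}

/// Load the counter from `path` if the file exists, otherwise start at step 0.
pub fn load_or_init(path: &Path) -> io::Result<Counter> {
    match fs::read(path) {
        // A saved state exists: decode it (8 little-endian bytes in this sketch).
        Ok(bytes) => {
            if bytes.len() != 8 {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    "expected exactly 8 bytes",
                ));
            }
            let mut array = [0u8; 8];
            array.copy_from_slice(&bytes);
            Ok(Counter {
                step: u64::from_le_bytes(array),
            })
        }
        // No saved state: this is the first run.
        Err(error) if error.kind() == io::ErrorKind::NotFound => Ok(Counter { step: 0 }),
        // Any other I/O error (such as permission denied) is a real failure.
        Err(error) => Err(error),
    }
}

/// Save the counter so that a later call to `load_or_init` resumes from it.
pub fn save(path: &Path, counter: &Counter) -> io::Result<()> {
    fs::write(path, counter.step.to_le_bytes())
}
```

The important design choice is that only `ErrorKind::NotFound` means "start fresh". Every other read error is propagated, so a corrupt or unreadable state file never silently restarts the simulation from step 0.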
simulate_one breaks out of the main simulation loop when the wall time is nearly
up and then serializes the simulation state to a file:
let maybe_wall_time_limit = match env::var("ACTION_WALLTIME_IN_MINUTES") {
    Ok(value) => {
        let parsed_value = value
            .parse::<f64>()
            .context("error parsing ACTION_WALLTIME_IN_MINUTES")?;
        debug!("Limiting wall time to {parsed_value} minutes.");
        // Convert minutes to seconds.
        Some(parsed_value * 60.0)
    }
    Err(_) => None,
};
while model.step() < TOTAL_STEPS {
    // ...
    if let Some(wall_time_limit) = maybe_wall_time_limit
        && wall_time + WALL_TIME_BUFFER > wall_time_limit
    {
        info!("Stopping simulation, wall time limit reached.");
        break;
    }
}
let out_bytes: Vec<u8> = postcard::to_stdvec(&model)?;
// Use `with_context` and `format!` so that `{MODEL_FILE}` is interpolated.
let mut file = File::create(MODEL_FILE)
    .with_context(|| format!("failed to create `{MODEL_FILE}`"))?;
file.write_all(&out_bytes)
    .with_context(|| format!("failed to write `{MODEL_FILE}`"))?;
Row notifies the action of its wall time limit via the ACTION_WALLTIME_IN_MINUTES
environment variable.
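The wall time logic can be separated into small functions that are easy to test without running a simulation. This is a sketch under stated assumptions: `limit_in_seconds`, `should_stop`, `run`, and the 300 second `BUFFER_SECONDS` value are hypothetical names and values chosen for illustration, not those used in src/simulate.rs.

```rust
use std::num::ParseFloatError;
use std::time::Instant;

/// Hypothetical safety margin, in seconds, reserved for writing output files.
pub const BUFFER_SECONDS: f64 = 300.0;

/// Convert an optional limit in minutes (as text) to a limit in seconds.
pub fn limit_in_seconds(value: Option<&str>) -> Result<Option<f64>, ParseFloatError> {
    match value {
        Some(text) => Ok(Some(text.parse::<f64>()? * 60.0)),
        None => Ok(None),
    }
}

/// Return true when the elapsed time plus the buffer exceeds the limit.
/// Without a limit, never request a stop.
pub fn should_stop(elapsed_seconds: f64, limit_seconds: Option<f64>) -> bool {
    match limit_seconds {
        Some(limit) => elapsed_seconds + BUFFER_SECONDS > limit,
        None => false,
    }
}

/// Call `step` up to `total_steps` times, stopping early near the limit.
/// Returns the number of steps completed.
pub fn run<F: FnMut()>(total_steps: u64, limit_seconds: Option<f64>, mut step: F) -> u64 {
    let start = Instant::now();
    let mut completed = 0;
    while completed < total_steps {
        step();
        completed += 1;
        if should_stop(start.elapsed().as_secs_f64(), limit_seconds) {
            break;
        }
    }
    completed
}
```

A caller could obtain the text with `std::env::var("ACTION_WALLTIME_IN_MINUTES").ok()` and pass it as `value.as_deref()`. The buffer should exceed the time needed to serialize the model and close all output files.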
GSD Trajectory
simulate_one appends to a GSD trajectory for offline visualization and
analysis:
let mut gsd_file = HoomdGsdFile::open("trajectory.in-progress.gsd")
    .context("error opening trajectory.in-progress.gsd")?;
open works here because get_model created the GSD file.
While the simulation is active, simulate_one names the file
trajectory.in-progress.gsd. The row workflow configuration (described on a
later page) will use the existence of trajectory.gsd as an indication that the
simulation is complete (and therefore no longer eligible). To achieve this,
simulate_one closes the GSD file and then renames it when the simulation has
reached the target number of total steps:
drop(gsd_file);
if model.step() == TOTAL_STEPS {
    info!("Simulation complete.");
    fs::rename("trajectory.in-progress.gsd", "trajectory.gsd")
        .context("failed to rename `trajectory.in-progress.gsd` to `trajectory.gsd`")?;
}
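The rename-on-completion idea works with any file and needs only the standard library. In the sketch below, `finalize` and `is_eligible` are hypothetical helper names; in particular, `is_eligible` only illustrates the "output file is absent" test and is not how row itself is implemented.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Rename `in_progress` to `complete` only when every step has been run.
/// Returns whether the rename happened.
pub fn finalize(
    in_progress: &Path,
    complete: &Path,
    step: u64,
    total_steps: u64,
) -> io::Result<bool> {
    if step == total_steps {
        fs::rename(in_progress, complete)?;
        Ok(true)
    } else {
        // Leave the in-progress file in place for the next job to append to.
        Ok(false)
    }
}

/// A directory still needs work while the completed file is absent.
pub fn is_eligible(complete: &Path) -> bool {
    !complete.exists()
}
```

Because the final name appears only after the last step, an interrupted or time-limited job never leaves behind a file that looks complete.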
Log File
Parquet is a convenient file format for logging because it is widely supported and the binary format keeps the full precision of every logged value. Unfortunately, Parquet files are difficult to use when continuing a simulation job because they cannot be appended to.
The solution suggested by the Parquet developers is to create multiple files.
The create_unique method creates log.parquet.0 on the first submission, then
log.parquet.1 on the second, and so on:
let mut parquet_logger = ParquetLogger::<LogRecord>::create_unique("log.parquet")
    .context("error creating `log.parquet`")?;
When you visualize or post-process the log later, you should concatenate
the data frames from all log.parquet.* files.
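To make the numbering scheme concrete, here is one way such file names could be chosen and later enumerated. This is an assumption-laden sketch, not the implementation of `ParquetLogger::create_unique`: the functions `numbered_path`, `next_unused_path`, and `existing_parts` are hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Build the path `<base>.<index>`, for example `log.parquet.3`.
pub fn numbered_path(base: &Path, index: u64) -> PathBuf {
    let mut name = base.as_os_str().to_os_string();
    name.push(format!(".{index}"));
    PathBuf::from(name)
}

/// Return the first `<base>.<n>` (n = 0, 1, 2, ...) that does not exist yet.
pub fn next_unused_path(base: &Path) -> PathBuf {
    let mut index = 0;
    while numbered_path(base, index).exists() {
        index += 1;
    }
    numbered_path(base, index)
}

/// List the existing `<base>.<n>` files in order, stopping at the first gap.
pub fn existing_parts(base: &Path) -> Vec<PathBuf> {
    (0..)
        .map(|index| numbered_path(base, index))
        .take_while(|path| path.exists())
        .collect()
}
```

Checking for existence and then creating the file is not atomic, so this sketch assumes that only one job writes to a given directory at a time. A post-processing script can read the files returned by `existing_parts` in order and concatenate them.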
Error Handling
You might have noticed a recurring pattern in the above code:
.context("...")?. The ? operator is a shortcut that returns early whenever
the preceding code generates an error (see Recoverable Errors with Result
in The Rust Programming Language for details).
The .context method comes from the anyhow crate. Use it to describe what you
are doing that might cause an error. Without the .context, your program might
print I/O error as its entire output with no indication of what caused it. A later
page in this tutorial will show example error messages with context and show you
how to troubleshoot errors.
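To see what context adds, the following std-only sketch mimics the idea by hand. It is an analogue for illustration, not how anyhow works internally: `ContextError` and `read_with_context` are hypothetical names, and anyhow provides this behavior for you through `.context` and `.with_context`.

```rust
use std::fmt;
use std::fs;
use std::io;

/// Pairs a description of the attempted operation with the underlying error.
#[derive(Debug)]
pub struct ContextError {
    pub context: String,
    pub source: io::Error,
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print what was being attempted first, then the underlying cause.
        write!(f, "{}: {}", self.context, self.source)
    }
}

impl std::error::Error for ContextError {}

/// Read a file, attaching the file name to any I/O error.
pub fn read_with_context(path: &str) -> Result<Vec<u8>, ContextError> {
    // `?` returns early with the wrapped error when the read fails.
    let bytes = fs::read(path).map_err(|source| ContextError {
        context: format!("unable to read `{path}`"),
        source,
    })?;
    Ok(bytes)
}
```

A bare `io::Error` for a missing file reports only that something was not found. The wrapped error also names the file and the operation, which is usually what you need to start troubleshooting.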
Driving the Simulation and I/O
Not shown here is the standard code for advancing the simulation with advance(),
writing to the GSD file, and writing to the log file. See the full code in
src/simulate.rs. Tutorials such as Applying Interactions and
Patchy Particle Self-Assembly explain in detail how to advance simulation
models and write to log and GSD files.
Development of hoomd-rs is led by the Glotzer Group at the University of Michigan.
Copyright © 2024-2026 The Regents of the University of Michigan.