Filesystem Structure#
notata defines a consistent, human-readable directory layout for each run. This enables reproducibility, introspection, and long-term reuse without custom tooling.
Each run is stored in a uniquely named directory:
log_<run_id>/
Contents#
Below is a breakdown of the expected structure and the purpose of each file or folder.
log_<run_id>/
    log.txt
    metadata.json
    params.yaml (or params.json)
    data/
    plots/
    artifacts/
Description of Each Element#
log.txt A plain-text, chronological log file. Each line follows the format:
[YYYY-MM-DDTHH:MM:SS] LEVEL message
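Because the line format is fixed, log.txt can be parsed with a few lines of standard-library Python. The following is a minimal sketch, not part of notata: the helper name parse_log_line and the sample line are made up for illustration, and the pattern assumes the level is a single whitespace-free token.

```python
import re
from datetime import datetime

# Pattern for the documented format: [YYYY-MM-DDTHH:MM:SS] LEVEL message
LOG_LINE = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\]\s+(\S+)\s*(.*)$")


def parse_log_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    stamp, level, message = match.groups()
    return datetime.fromisoformat(stamp), level, message


# Hypothetical sample line, made up for illustration.
sample = "[2024-01-15T09:30:00] INFO starting run"
parsed = parse_log_line(sample)
```

Lines that do not match (for example, continuation lines of a multi-line message) come back as None, so a caller can decide whether to skip or append them.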
metadata.json JSON file with lifecycle metadata. Fields include:
run_id: user-supplied identifier
status: one of initialized, complete, failed
start_time, end_time: ISO timestamps
runtime_sec: total wall time
failure_reason: optional, present if run failed
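The metadata fields above can be read back with the standard json module. This is a minimal sketch under stated assumptions: the helper summarize_run is hypothetical, the demo values are invented, and the file is assumed to be a flat JSON object with the field names listed above.

```python
import json
import tempfile
from pathlib import Path


def summarize_run(run_dir):
    """Read metadata.json from a run directory and return a one-line summary."""
    meta = json.loads((Path(run_dir) / "metadata.json").read_text())
    summary = f"{meta['run_id']}: {meta['status']}"
    if meta["status"] == "failed":
        # failure_reason is documented as optional, so fall back gracefully.
        summary += f" ({meta.get('failure_reason', 'no reason recorded')})"
    elif "runtime_sec" in meta:
        summary += f" in {meta['runtime_sec']:.1f}s"
    return summary


# Build a throwaway run directory with made-up values for demonstration.
with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp) / "log_demo"
    run.mkdir()
    (run / "metadata.json").write_text(json.dumps({
        "run_id": "demo",
        "status": "complete",
        "start_time": "2024-01-15T09:30:00",
        "end_time": "2024-01-15T09:30:12",
        "runtime_sec": 12.5,
    }))
    demo_summary = summarize_run(run)
```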
params.yaml / params.json Parameters used for the run, written as a single top-level dictionary.
data/ NumPy array outputs.
Single array: saved as .npy via log.array(…)
Multiple arrays: saved as .npz via log.arrays(…)
plots/ Saved matplotlib figures. Extensions may include .png, .pdf, or .svg. Saved using log.plot(…).
artifacts/ Arbitrary outputs such as:
.json: structured output via log.json(…)
.txt: logs, notes, diagnostics via log.text(…)
.pkl: serialized objects via log.pickle(…)
Other formats like .bin via log.bytes(…)
Custom Structure via Manual Paths#
To organize outputs into custom subfolders, use the indexing interface:
log["data/intermediate/u_step100.npy"].write_bytes(...)
log["artifacts/eval/metrics.json"].write_text(json.dumps(metrics))
fig.savefig(log["plots/debug/loss_curve.pdf"])  # store an existing matplotlib figure manually
This ensures that parent directories are created as needed.
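To illustrate the idea behind the indexing interface, here is a minimal sketch of how such a lookup could resolve a relative path and create its parent directories. The RunDir class is a hypothetical stand-in written for this example; it is not notata's implementation and may differ from it in detail.

```python
import json
import tempfile
from pathlib import Path


class RunDir:
    """Hypothetical stand-in for a run logger; not notata's implementation."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def __getitem__(self, relative_path):
        # Resolve the path relative to the run directory and make sure
        # its parent folders exist before handing it back to the caller.
        path = self.root / relative_path
        path.parent.mkdir(parents=True, exist_ok=True)
        return path


with tempfile.TemporaryDirectory() as tmp:
    log = RunDir(Path(tmp) / "log_demo")
    target = log["artifacts/eval/metrics.json"]
    target.write_text(json.dumps({"loss": 0.25}))  # made-up metric value
    parent_created = target.parent.is_dir()
    written = json.loads(target.read_text())
```

Returning a plain pathlib.Path keeps the caller free to use write_text, write_bytes, or any library that accepts a path.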
Conventions#
All paths are relative to the run directory
All save methods create parent directories if needed
All logs and metadata are written in plain text or JSON formats
This structure is intentionally flat, discoverable, and designed to support both manual inspection and programmatic tooling.
Searching and Inspecting Runs#
Because notata stores everything as plain text and structured files in the filesystem, you can inspect results using standard shell tools — no special API or viewer required.
Grep and Search Examples#
Find all completed runs:
grep -l '"status": "complete"' outputs/log_*/metadata.json
List all runs with a specific parameter:
grep -l 'omega: 2.0' outputs/log_*/params.yaml
Extract a specific metric from all runs (assuming you save final metrics to a JSON artifact):
jq '.E' outputs/log_*/artifacts/final_state.json
Find all runs that failed:
grep -l '"status": "failed"' outputs/log_*/metadata.json
Find runs with energy above threshold:
jq -r 'select(.E > 1.0) | input_filename' outputs/log_*/artifacts/final_state.json
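The same threshold filter can be written in Python when the result needs further processing. This is a sketch using only the standard library; the function name runs_above is hypothetical, and it assumes each run stores a flat JSON object at artifacts/final_state.json, as in the jq examples above.

```python
import json
import tempfile
from pathlib import Path


def runs_above(outputs_dir, key, threshold, artifact="artifacts/final_state.json"):
    """Return names of run directories whose artifact has key > threshold."""
    matches = []
    for path in sorted(Path(outputs_dir).glob(f"log_*/{artifact}")):
        state = json.loads(path.read_text())
        # Runs that lack the key are treated as below any threshold.
        if state.get(key, float("-inf")) > threshold:
            # path is <outputs>/log_<run_id>/<artifact>; keep the run folder name.
            matches.append(path.relative_to(outputs_dir).parts[0])
    return matches


# Two throwaway runs with made-up energies for demonstration.
with tempfile.TemporaryDirectory() as tmp:
    outputs = Path(tmp) / "outputs"
    for run_id, energy in [("a", 0.5), ("b", 2.0)]:
        artifact_dir = outputs / f"log_{run_id}" / "artifacts"
        artifact_dir.mkdir(parents=True)
        (artifact_dir / "final_state.json").write_text(json.dumps({"E": energy}))
    high_energy = runs_above(outputs, "E", 1.0)
```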
Structured Exploration#
You can also use tools like:
find to filter files by path or type
jq to slice structured JSON content
awk, cut, or grep -r to scan logs for patterns
du -sh log_* to check size of each run
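The same kind of inspection works from Python with pathlib alone. As a rough counterpart to du, the sketch below sums file sizes per run directory; run_sizes is a hypothetical helper, and it reports apparent file sizes in bytes rather than the disk blocks that du counts.

```python
import tempfile
from pathlib import Path


def run_sizes(outputs_dir):
    """Map each log_* directory name to the total size of its files in bytes."""
    sizes = {}
    for run in sorted(Path(outputs_dir).glob("log_*")):
        if run.is_dir():
            sizes[run.name] = sum(
                f.stat().st_size for f in run.rglob("*") if f.is_file()
            )
    return sizes


# Throwaway layout with made-up file contents for demonstration.
with tempfile.TemporaryDirectory() as tmp:
    outputs = Path(tmp)
    (outputs / "log_a" / "data").mkdir(parents=True)
    (outputs / "log_a" / "log.txt").write_bytes(b"hello\n")         # 6 bytes
    (outputs / "log_a" / "data" / "x.bin").write_bytes(b"\x00" * 10)  # 10 bytes
    (outputs / "log_b").mkdir()
    (outputs / "log_b" / "log.txt").write_bytes(b"abc")             # 3 bytes
    sizes = run_sizes(outputs)
```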
This design is intentional: it makes your experiments composable with standard Unix tools.
Rationale#
Unlike most ML tracking systems that store metrics in databases or hide them behind dashboards, notata emphasizes:
Unix-first reproducibility
Transparency over abstraction
Permanent, discoverable logs
If it’s in a file, it’s searchable. If it’s structured, it’s scriptable.