Latest Run

Benchmarks & Accuracy

Empirical performance and accuracy measurements comparing Siderust to established astronomy libraries, tested against ERFA/SOFA as the reference implementation.


Key Findings

Up to 14×

Faster than ERFA (C)

Composite transforms such as BPN rotation and equatorial→horizontal run up to 14× faster. Single-step operations run within 20–45% of bare C speed.

< 5 mas

Milliarcsecond-level accuracy

Siderust's maximum angular error versus ERFA stays under 5 milliarcseconds across all tested coordinate transforms. GMST and the Kepler solver agree with the reference to machine precision.

1,000

Randomized test inputs per experiment

Transparent, reproducible methodology with documented model alignment. ERFA as reference, seeded (and therefore repeatable) random inputs, and open-source tooling.

Performance

Time per operation across 7 representative astronomical computations. Each measurement is the median of 10 rounds of 50,000 iterations each. Lower is faster.

[Chart: time per operation in nanoseconds (log scale), grouped by experiment]

Reading the chart: Each group of bars is one experiment; bar height = nanoseconds per operation (log scale). Shorter bars are faster, and Siderust's bars (orange) are shorter for composite transforms.

[Chart: speedup of Siderust relative to ERFA C, per experiment]

Reading the chart: Bar length = how many times faster Siderust is versus ERFA's C implementation. Values > 1× (green) mean Siderust is faster; < 1× (amber) means ERFA C is faster. The dashed line marks equal speed.
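The ratio behind each bar can be sketched as the ERFA time divided by the Siderust time. This is a minimal illustration; the function name and the sample timings are made up for the example and are not values from the benchmark run.

```python
def speedup_vs_erfa(erfa_ns_per_op: float, siderust_ns_per_op: float) -> float:
    """Return how many times faster Siderust is than ERFA C.

    Values above 1.0 mean Siderust is faster; below 1.0 mean ERFA C is faster.
    """
    if erfa_ns_per_op <= 0 or siderust_ns_per_op <= 0:
        raise ValueError("timings must be positive")
    return erfa_ns_per_op / siderust_ns_per_op


# Illustrative timings only (not measured results).
print(speedup_vs_erfa(1400.0, 100.0))  # 14.0, Siderust faster
print(speedup_vs_erfa(100.0, 125.0))   # 0.8, ERFA C faster
```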

What do these numbers mean?

Composite transforms (BPN rotation, equatorial→horizontal, Sun position) chain multiple steps: precession, nutation, sidereal time, spherical trigonometry. Siderust's architecture avoids intermediate allocations and redundant recomputations, yielding 2–14× speedups over ERFA's C code.

Single-step operations (GMST, Kepler solver, ecliptic transform, Moon position) are tight numerical kernels where Siderust runs within 20–45% of bare C speed, an expected result for safe Rust with bounds checking.

Astropy wraps ERFA via Python/pyerfa, adding interpreter overhead. Its performance here reflects the per-element Python loop cost, not the high-level astropy.coordinates stack.

Accuracy

p99 angular error versus ERFA reference across 1,000 randomized inputs per experiment. ERFA is the reference (zero error by definition). Lower is more accurate.

[Chart: p99 angular error in arcseconds (log scale), per library and experiment]

Reading the chart: Bar height = p99 angular error in arcseconds (log scale). Shorter bars = better accuracy. ERFA is the reference and is not shown. GMST/ERA and Kepler solver are omitted (both achieve machine precision ~10⁻¹² arcsec).

[Chart: speed versus accuracy, one dot per library × experiment]

Reading the chart: Each dot is one library × experiment combination. Bottom-left is ideal (fast + accurate). Siderust (orange) clusters in the fast-and-accurate zone.

What does "accuracy" mean here?

Angular error is the angular separation (arcseconds) between a library's output and ERFA's output for the same input. It measures agreement with the IAU 2006/2000A reference models, not absolute truth.
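As a rough illustration of the metric, the angular separation between two directions given as (RA, Dec) pairs can be computed as below. This is a sketch, not the lab's actual comparison code: the function name is hypothetical, and the Vincenty form is chosen here because it stays well-conditioned for the very small separations involved.

```python
import math

ARCSEC_PER_RAD = 180.0 / math.pi * 3600.0


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation between two directions (radians in, arcseconds out).

    Uses the Vincenty formula, which remains accurate for tiny separations
    where the plain arccos form loses precision.
    """
    d_ra = ra2 - ra1
    sin_d, cos_d = math.sin(d_ra), math.cos(d_ra)
    s1, c1 = math.sin(dec1), math.cos(dec1)
    s2, c2 = math.sin(dec2), math.cos(dec2)
    num = math.hypot(c2 * sin_d, c1 * s2 - s1 * c2 * cos_d)
    den = s1 * s2 + c1 * c2 * cos_d
    return math.atan2(num, den) * ARCSEC_PER_RAD


# A library output offset from the reference by 5 mas in declination.
offset_rad = 0.005 / ARCSEC_PER_RAD
print(angular_separation_arcsec(1.0, 0.5, 1.0, 0.5 + offset_rad))
```

The printed value is close to 0.005 arcseconds, i.e. 5 milliarcseconds.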

p99 means "in 99% of the 1,000 test cases, the error was at or below this value." This is stricter than the mean and more robust than the single worst case.
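The definition above corresponds to a nearest-rank percentile, sketched below. Whether the lab uses nearest-rank or an interpolating convention (NumPy's default interpolates, for instance) is an assumption here; the function name and the sample data are illustrative.

```python
import math


def percentile_nearest_rank(values, pct):
    """Smallest sample such that at least pct% of the samples are <= it."""
    if not values:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Illustrative errors: 1, 2, ..., 1000 (arbitrary units), not real results.
errors = list(range(1, 1001))
print(percentile_nearest_rank(errors, 99))  # 990
print(sum(errors) / len(errors))            # 500.5 (mean)
print(max(errors))                          # 1000 (worst case)
```

With 1,000 samples, p99 is the 990th smallest value: above the mean, but unaffected by the ten largest outliers.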

Model differences explain large gaps. libnova uses older models (Meeus precession, IAU 1980 nutation) that diverge by arcseconds to degrees from IAU 2006. Siderust uses IAU 2006 precession with IAU 2000B nutation (77 terms versus 1,365 in IAU 2000A), yielding ~5 milliarcsecond maximum deviation from ERFA's IAU 2000A.

Methodology

What was measured

Seven representative astronomical computations (coordinate frame rotations, sidereal time, ecliptic and horizontal transforms, Sun and Moon geocentric positions, and Kepler's equation) were tested across four libraries: Siderust (Rust), ERFA (C, derived from the IAU's SOFA reference implementation), Astropy/pyerfa (Python wrapping ERFA C), and libnova (C).
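To make one of these kernels concrete, here is a textbook Newton iteration for Kepler's equation, E − e·sin E = M. It is only a sketch of the problem being benchmarked, not Siderust's or any other library's implementation; the starting guess, tolerance, and function name are assumptions, and the mean anomaly is assumed to lie in [0, 2π].

```python
import math


def solve_kepler(mean_anomaly, eccentricity, tol=1e-14, max_iter=50):
    """Solve E - e*sin(E) = M for the eccentric anomaly E (radians).

    Handles elliptical orbits only (0 <= e < 1), with M in [0, 2*pi].
    """
    if not 0.0 <= eccentricity < 1.0:
        raise ValueError("this sketch handles elliptical orbits only")
    # Common heuristic: start at M for low eccentricity, at pi for high.
    ecc_anomaly = mean_anomaly if eccentricity < 0.8 else math.pi
    for _ in range(max_iter):
        residual = ecc_anomaly - eccentricity * math.sin(ecc_anomaly) - mean_anomaly
        slope = 1.0 - eccentricity * math.cos(ecc_anomaly)
        step = residual / slope
        ecc_anomaly -= step
        if abs(step) < tol:
            return ecc_anomaly
    raise RuntimeError("Newton iteration did not converge")


E = solve_kepler(1.0, 0.3)
# The residual of Kepler's equation should be at rounding-error level.
print(abs(E - 0.3 * math.sin(E) - 1.0))
```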

Test inputs

1,000 randomized inputs per experiment (seed = 42 for reproducibility), covering Julian Dates spanning ~1900–2100 and uniformly distributed sky positions. Each library received identical inputs in identical units (radians, JD TT).
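A seeded generator along these lines would produce such inputs. This is a sketch under assumptions: the exact Julian Date bounds (taken here as 1 January 1900 and 1 January 2100), the sampling scheme, and the function name are not from the lab repository, and Python's random stream will not reproduce the lab's actual values.

```python
import math
import random

JD_1900 = 2415020.5  # 1900-01-01 00:00 (assumed lower bound)
JD_2100 = 2488069.5  # 2100-01-01 00:00 (assumed upper bound)


def make_inputs(n=1000, seed=42):
    """Return n (jd_tt, ra, dec) tuples; angles in radians."""
    rng = random.Random(seed)  # private generator: the seed alone fixes the output
    inputs = []
    for _ in range(n):
        jd_tt = rng.uniform(JD_1900, JD_2100)
        ra = rng.uniform(0.0, 2.0 * math.pi)
        # Sampling sin(dec) uniformly gives uniform density on the sphere;
        # sampling dec uniformly would over-sample the poles.
        dec = math.asin(rng.uniform(-1.0, 1.0))
        inputs.append((jd_tt, ra, dec))
    return inputs


inputs = make_inputs()
print(len(inputs), inputs == make_inputs())  # same seed, same inputs
```

Feeding every library the same list keeps any accuracy difference attributable to the libraries rather than to the inputs.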

Performance measurement

10 rounds × 50,000 iterations per round. Reported metric: median ns/op. Benchmarks ran on a single core with turbo boost disabled where possible. Warm-up rounds were discarded.
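The measurement loop can be sketched as follows. This is an illustrative Python harness, not the lab's tooling: the number of warm-up rounds is an assumption (the page does not state it), and the figure includes Python call overhead, so absolute values are not comparable to the native results.

```python
import math
import statistics
import time


def median_ns_per_op(func, rounds=10, iterations=50_000, warmup_rounds=1):
    """Median over measured rounds of (elapsed ns / iterations) for func()."""
    samples = []
    for round_index in range(warmup_rounds + rounds):
        start = time.perf_counter_ns()
        for _ in range(iterations):
            func()
        elapsed = time.perf_counter_ns() - start
        if round_index >= warmup_rounds:  # warm-up rounds are discarded
            samples.append(elapsed / iterations)
    return statistics.median(samples)


# Time a trivial stand-in operation; the value depends on the machine.
print(median_ns_per_op(lambda: math.sin(0.5)))
```

Taking the median across rounds, rather than the mean, limits the influence of a round that was disturbed by scheduler or thermal noise.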

Hardware

All benchmarks ran on a single machine with an Intel Core Ultra 9 185H.

Model alignment

Before comparing outputs, each experiment documents an explicit alignment checklist: units, time scales, Earth orientation parameters, refraction policy, and astronomical model identifiers. Where libraries cannot match models exactly, the experiment is tagged model-mismatch and accuracy differences are attributed to model choices, not bugs. Experiments achieving full model alignment are tagged model-parity.
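One way to picture the checklist and the resulting tag is the sketch below. The class, its field names, and the rule that any unaligned item yields model-mismatch are assumptions made for illustration; the lab's actual record format is not shown on this page.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AlignmentChecklist:
    """One flag per checklist item; True means all libraries agree on it."""
    units: bool
    time_scales: bool
    earth_orientation: bool
    refraction_policy: bool
    model_identifiers: bool


def alignment_tag(checklist: AlignmentChecklist) -> str:
    """Return 'model-parity' only if every checklist item is aligned."""
    aligned = all(getattr(checklist, f.name) for f in fields(checklist))
    return "model-parity" if aligned else "model-mismatch"


# Hypothetical experiment where the nutation models differ between libraries.
nutation_differs = AlignmentChecklist(
    units=True,
    time_scales=True,
    earth_orientation=True,
    refraction_policy=True,
    model_identifiers=False,
)
print(alignment_tag(nutation_differs))  # model-mismatch
```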

Caveats & Transparency

⚠ Model mismatch. Most experiments compare different astronomical models (IAU 2000B vs 2000A nutation, Meeus vs IAU 2006 precession). Large accuracy differences reflect model choices, not correctness issues. Only equ_horizontal and kepler_solver achieve full model parity.
⚠ Astropy ≠ full stack. The "Astropy" results call ERFA C kernels directly via pyerfa in a Python loop. This does not measure the high-level astropy.coordinates framework, which adds overhead for unit handling, frame transforms, and time conversions.
⚠ Single machine. All benchmarks ran on one machine (Intel Core Ultra 9 185H). Performance ratios are more transferable than absolute ns/op values.
⚠ No EOP (Earth Orientation Parameters). All experiments set polar motion and UT1−UTC to zero. Accuracy in real applications depends on access to IERS bulletins for these corrections.
⚠ libnova uses different models. libnova implements Meeus-era algorithms. Its large errors (arcseconds to degrees) are expected and do not indicate bugs in libnova; it targets a different accuracy tier.

Why Siderust?

Performance where it counts. Composite transforms that are the bottleneck in real pipelines (pointing, observation planning, orbit propagation) run 2–14× faster than C ERFA while staying within 5 milliarcseconds of its output.

Validated accuracy. Agreement with ERFA/SOFA to within 5 milliarcseconds across the 1900–2100 date range, with publicly auditable test results.

Rust safety. Memory-safe, no undefined behavior, no data races. Critical for mission software and embedded systems where C/C++ bugs have real consequences.

Type-safe coordinates. Compile-time reference frame and unit enforcement; swapping RA for Dec or mixing J2000 with ITRF is a compile error, not a runtime bug.

no_std capable. Works on embedded targets without an allocator; ideal for spacecraft flight software, FPGAs, and resource-constrained systems.

Reproduce These Results

All benchmarks are generated by the open-source Siderust Lab repository. To reproduce:

# Clone the lab repo (includes submodules for all libraries)
git clone --recursive https://github.com/Siderust/lab.git
cd lab

# Build all adapters and run all experiments (N=1000, seed=42)
./run.sh

# Results appear in latest_results/<timestamp>/
# Each experiment directory contains per-library JSON files
# with full accuracy percentiles and performance samples.

Requires: Rust ≥ 1.80, GCC, Python ≥ 3.10, and ~2 GB disk. Full run takes approximately 5–10 minutes.