Datasets
ESPAI datasets are grouped into three categories: Physical Simulations (A),
Real Supernova Observations (B), and GenAI Simulator (C).
Each dataset comes with a minimal open sample (CSV, 5 rows) for quick inspection and the full files in compact formats (e.g. .parquet) for research use.
FAIR artefacts (metadata, README, provenance, dictionary, and citation) are being added incrementally and are clearly marked below.
Need the full files? See “Licence & Citation” for terms and preferred citation, then follow the repository or the contact instructions where noted.
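As a quick illustration of inspecting one of the 5-row sample files, here is a minimal sketch that uses only the Python standard library. The column names and values are hypothetical stand-ins, since the real schema is defined by each dataset's data dictionary; for a downloaded sample, pass an open file handle in place of the in-memory text.

```python
import csv
import io
from itertools import islice


def preview_rows(handle, n=5):
    """Return the header and up to n data rows from an open CSV handle."""
    reader = csv.reader(handle)
    header = next(reader, [])        # empty list if the file has no rows
    rows = list(islice(reader, n))   # stop after n data rows
    return header, rows


# In-memory stand-in for a sample file. Column names and values are
# hypothetical and do not reflect the actual ESPAI schema.
sample = io.StringIO(
    "time_days,band,magnitude\n"
    "0.5,g,18.2\n"
    "1.0,g,17.9\n"
    "1.5,r,17.8\n"
)

header, rows = preview_rows(sample)
print(header)
for row in rows:
    print(row)

# For a file on disk (the file name here is an assumption):
# with open("curves_sample.csv", newline="") as fh:
#     header, rows = preview_rows(fh)
```

All values are returned as strings; converting them to numbers is left to the caller because the column types depend on the dataset.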
A. Physical Simulations
A1. Physical models (4-parameters)
Overview. Semi-analytic explosions with four core parameters controlling energy, radius, ejecta mass, and radioactive contribution. The release includes time-dependent observables plus generation metadata.
Intended use: parameter-recovery benchmarks, uncertainty calibration, and sanity checks against semi-analytic expectations.
Primary files
Last updated: 2025-08-23
Preview (CSV)
First 5 rows from a small sample file for curves; the full dataset is available via the CSV download above.
First 5 rows from a small sample file for parameters; the full dataset is available via the CSV download above.
FAIR artefacts (status)
We’re progressively adding the artefacts below; items marked “Coming soon” will appear in the next releases.
- Metadata record · metadata.json
- README · README.md
- Data dictionary · dictionary.csv
- Provenance & methods · provenance.md
- Licensing & citation · LICENCE · citation.txt
- Supplementary materials · notebooks, diagrams, scripts
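Because the artefacts are arriving incrementally, it can help to check which ones a downloaded dataset folder already contains. The sketch below is one way to do that, assuming the artefact files sit at the top level of a per-dataset folder; that layout and the "Available" label are assumptions, while the file names and the "Coming soon" label come from the list above. Supplementary materials are left out because they have no fixed file name.

```python
from pathlib import Path
import tempfile

# File names taken from the FAIR artefact list above.
FAIR_ARTEFACTS = {
    "Metadata record": ["metadata.json"],
    "README": ["README.md"],
    "Data dictionary": ["dictionary.csv"],
    "Provenance & methods": ["provenance.md"],
    "Licensing & citation": ["LICENCE", "citation.txt"],
}


def artefact_status(dataset_dir):
    """Map each artefact label to 'Available' or 'Coming soon'.

    An artefact counts as available only if all of its files exist.
    """
    root = Path(dataset_dir)
    status = {}
    for label, filenames in FAIR_ARTEFACTS.items():
        present = all((root / name).is_file() for name in filenames)
        status[label] = "Available" if present else "Coming soon"
    return status


# Demo on a temporary folder holding two placeholder artefacts.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "README.md").write_text("placeholder\n", encoding="utf-8")
    (Path(tmp) / "metadata.json").write_text("{}\n", encoding="utf-8")
    for label, state in artefact_status(tmp).items():
        print(f"{label}: {state}")
```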
A2. Physical models (7-parameters)
Overview. An expanded physical grid with seven parameters to capture a broader range of progenitor and CSM scenarios. Useful for ablation studies and robustness tests when moving from compact to richer physical descriptions.
Intended use: stress-testing model generalisation and examining parameter identifiability under increased realism.
Primary files
Last updated:
Preview (CSV)
First 5 rows from a tiny sample file for curves; the full dataset is available via the .parquet download above.
First 5 rows from a tiny sample file for parameters; the full dataset is available via the .parquet download above.
FAIR artefacts (status)
We’re progressively adding the artefacts below; items marked “Coming soon” will appear in the next releases.
- Metadata record · metadata.json
- README · README.md
- Data dictionary · dictionary.csv
- Provenance & methods · provenance.md
- Licensing & citation · LICENCE · citation.txt
- Supplementary materials · notebooks, diagrams, scripts
B. Real Supernova Observations
B1. Real observations
Overview. Curated multi-band observations with cadence, passbands, uncertainty model, and basic quality flags. Where possible, records are mapped to standard naming and include minimal provenance notes.
Intended use: evaluating methods trained on synthetic data, domain shift studies, and end-to-end validation on real light curves.
Primary files
Last updated:
Preview (CSV)
First 5 rows from a tiny sample file.
FAIR artefacts (status)
We’re progressively adding the artefacts below; items marked “Coming soon” will appear in the next releases.
- Metadata record · metadata.json
- README · README.md
- Data dictionary · dictionary.csv
- Provenance & methods · provenance.md
- Licensing & citation · LICENCE · citation.txt
- Supplementary materials · notebooks, diagrams, scripts
C. GenAI Simulator
C1. Synthetic GenAI data
Overview. Synthetic light curves produced by the ESPAI GenAI model to augment coverage where observations are sparse.
Paired .parquet files provide the light-curve time series; releases will ship with transparent generation settings.
Intended use: data augmentation, controlled experiments on cadence/noise, and benchmarking generalisation.
Primary files
Last updated:
Preview (CSV)
First 5 rows from a tiny sample file.
FAIR artefacts (status)
We’re progressively adding the artefacts below; items marked “Coming soon” will appear in the next releases.
- Metadata record · metadata.json
- README · README.md
- Data dictionary · dictionary.csv
- Provenance & methods · provenance.md
- Licensing & citation · LICENCE · citation.txt
- Supplementary materials · notebooks, diagrams, scripts
Source Code
Heads up: FAIR artefacts are being published in stages. Items marked “Coming soon” will appear in the next updates; “External” links point to project-controlled sources (e.g., GitHub or a data catalogue) when appropriate.
ESPAI Core Repository
Planned contents
- Training & evaluation scripts
- Model architectures (physical + GenAI variants)
- Data loaders and preprocessing utilities
- Reproducible configs and example notebooks
Publications
This section lists journal & conference submissions, technical diagrams/notes, and selected findings for the community.
Journal & Conference Submissions
- Paper #1 (title TBC)
  Abstract · Preprint (coming soon)
  Scope: early results on physical-model light curves; baselines & GenAI augmentation plan.
- Paper #2 (title TBC)
Diagrams & Technical Notes
- Blocks and data flow of the DL model.
- Preprocessing, training, and evaluation steps.
Findings
- Brief description and selection criteria.
Licence & Citation
To support ethical reuse and proper attribution, ESPAI provides default licensing and citation templates for datasets and software.
Important: if a dataset or repository includes its own LICENSE, citation.txt, or DOI,
that local file overrides the defaults below. Always prefer the per-item files when present.
If you adapt the datasets or code, indicate changes and, where practical, link back to this hub so others can find the original materials.
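The override rule above can be sketched as a small lookup: use the item's own citation.txt when it exists, otherwise fall back to the hub default. This is only an illustration; the function name, the folder layout, and the choice to treat an empty citation.txt as missing are assumptions, and the default text is passed in rather than hard-coded.

```python
from pathlib import Path
import tempfile


def resolve_citation(item_dir, default_citation):
    """Prefer the item's own citation.txt; otherwise return the default."""
    local = Path(item_dir) / "citation.txt"
    if local.is_file():
        text = local.read_text(encoding="utf-8").strip()
        if text:  # assumption: an empty file does not count as a citation
            return text
    return default_citation


# Placeholder standing in for the hub's default citation text.
DEFAULT = "<default citation from the hub template>"

with tempfile.TemporaryDirectory() as tmp:
    # No per-item file yet, so the default is returned.
    print(resolve_citation(tmp, DEFAULT))

    # Once the item ships its own citation.txt, that text takes priority.
    (Path(tmp) / "citation.txt").write_text(
        "Per-item citation text\n", encoding="utf-8"
    )
    print(resolve_citation(tmp, DEFAULT))
```

The same pattern would apply to a per-item licence file or DOI, which the rule above also places ahead of the defaults.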
Datasets — Licence & how to cite
Licence (default): Creative Commons Attribution 4.0 International (CC BY 4.0). You must provide appropriate credit and indicate if changes were made.
Recommended dataset citation (plain text)
ESPAI Project (2025). ESPAI Light Curves — Physical models (4-parameters), v0.1. Koexai. URL: https://ESPAI.koexai.com/resources/ Licence: CC BY 4.0.
Dataset BibTeX (template)
@dataset{ESPAI_A1_v0_1_2025,
author = {{ESPAI Project}},
title = {ESPAI Light Curves — Physical models (4-parameters)},
year = {2025},
version = {0.1},
url = {https://ESPAI.koexai.com/resources/},
license = {CC BY 4.0},
note = {Replace with DOI when available}
}
Tip: if a dataset provides its own citation.txt or DOI, please use that instead of the template above.
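For anyone filling in the template programmatically, the sketch below builds an @dataset entry with the same fields and swaps the placeholder note for a doi field once a DOI exists. The function name and signature are hypothetical, and the DOI in the usage note is a dummy value. The author is written with double braces so that BibTeX treats "ESPAI Project" as a single corporate name instead of splitting it into given and family names.

```python
def dataset_bibtex(key, title, year, version, url, licence, doi=None):
    """Build an @dataset entry with the same fields as the template above."""
    fields = [
        # Inner braces here become {{ESPAI Project}} once wrapped below.
        ("author", "{ESPAI Project}"),
        ("title", title),
        ("year", str(year)),
        ("version", str(version)),
        ("url", url),
        ("license", licence),
    ]
    if doi:
        fields.append(("doi", doi))
    else:
        fields.append(("note", "Replace with DOI when available"))
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@dataset{{{key},\n{body}\n}}"


entry = dataset_bibtex(
    key="ESPAI_A1_v0_1_2025",
    title="ESPAI Light Curves — Physical models (4-parameters)",
    year=2025,
    version="0.1",
    url="https://ESPAI.koexai.com/resources/",
    licence="CC BY 4.0",
)
print(entry)
```

Passing a value such as doi="10.0000/example" (a dummy identifier) replaces the note line with a doi field.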
Software — Licence & how to cite
Licence (intended): MIT Licence (to be confirmed in the repository).
A copy of the licence will be included as LICENSE in the repo.
Recommended software citation (plain text)
ESPAI Project (2025). ESPAI Core (v0.1) — Generative models and characterisation tools. Source code. URL: https://ESPAI.koexai.com/resources/ Licence: MIT.
Software BibTeX (template)
@software{ESPAI_core_v0_1_2025,
author = {{ESPAI Project}},
title = {ESPAI Core},
year = {2025},
version = {0.1},
url = {https://ESPAI.koexai.com/resources/},
license = {MIT},
note = {Replace with repository URL and tag when public}
}