AstroMLab is a dynamic group of astrophysicists and computer scientists passionate about pushing the boundaries of Large Language Models (LLMs) in astronomy. Our team includes:
While LLMs are advancing rapidly, we believe that real progress in AI-driven astronomical research requires deep domain knowledge. This conviction drives us to tackle the challenges in applying LLMs to astronomy head-on.
Our ultimate aim is to:
Despite being a young group, we’ve made significant strides:
Our flagship model, AstroSage-8B, demonstrates remarkable performance when compared to other models in the 7B class. It achieves a substantial lead of 3.5 percentage points over its closest competitor, which translates to an estimated 10-fold reduction in computational costs (see the AstroBench page for details).
Model | Score (%) |
---|---|
AstroSage-8B (AstroMLab) | 79.1 |
AstroLLaMA-2-70B (AstroMLab) | 76.0 |
LLaMA-3.1-8B | 73.7 |
Phi-3.5-4B | 72.8 |
Gemma-2-9B | 71.5 |
LLaMA-2-70B | 70.7 |
Qwen-2.5-7B | 70.4 |
Yi-1.5-9B | 68.4 |
InternLM-2.5-7B | 64.5 |
Mistral-7B-v0.3 | 63.9 |
ChatGLM3-6B | 50.4 |
AstroLLaMA-2-7B (UniverseTBD) | 44.3 |
The exceptional performance of AstroSage-8B showcases the potential for more efficient and cost-effective agentic research in astronomy. This advancement opens up new possibilities for widespread application of AI in astronomical research, making sophisticated analysis more accessible to a broader range of institutions and researchers.
We are fully committed to open source:
We are grateful for our supporters:
Our team is expanding, and we’d love to hear from you!
Name | Affiliation |
---|---|
Yuan-Sen Ting | The Ohio State University |
Tirthankar Ghosal | Oak Ridge National Laboratory |
Tijmen de Haan | KEK |
Josh Nguyen | University of Pennsylvania |
Rui Pan | University of Illinois Urbana-Champaign |
Hardik Arora | Indian Institutes of Technology |
Emily Herron | Oak Ridge National Laboratory |
Yuwei Yang | Australian National University |
Zechang Sun | Tsinghua University |
Alberto Accomazzi | NASA Astrophysics Data System |
Azton Wells | Argonne National Laboratory |
Nesar Ramachandra | Argonne National Laboratory |
Sandeep Madireddy | Argonne National Laboratory |
Yuan-Sen Ting, et al., 2024, arXiv:2407.11194
We present a comprehensive evaluation of proprietary and open-weights large language models using the first astronomy-specific benchmarking dataset. This dataset comprises 4,425 multiple-choice questions curated from the Annual Review of Astronomy and Astrophysics, covering a broad range of astrophysical topics.
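As a rough illustration of how a score on a multiple-choice benchmark like this one might be computed, the Python sketch below grades predicted option indices against an answer key. The `MCQ` record layout, the `accuracy_percent` helper, and the toy questions are assumptions made for this example; they are not the benchmark's actual data format or evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MCQ:
    """One multiple-choice question (hypothetical record layout)."""
    question: str
    options: tuple[str, ...]  # answer choices shown to the model
    answer: int               # index of the correct option


def accuracy_percent(questions, predictions):
    """Return the percentage of questions answered correctly.

    `predictions` holds one predicted option index per question.
    None (no parsable answer) is counted as incorrect.
    """
    if len(questions) != len(predictions):
        raise ValueError("need exactly one prediction per question")
    if not questions:
        raise ValueError("no questions to score")
    correct = sum(
        1
        for q, p in zip(questions, predictions)
        if p is not None and p == q.answer
    )
    return 100.0 * correct / len(questions)


# Toy questions for illustration only; not taken from the benchmark.
toy = [
    MCQ("Which element is most abundant in the Sun?",
        ("Helium", "Hydrogen", "Carbon", "Iron"), 1),
    MCQ("What type of galaxy is the Milky Way?",
        ("Elliptical", "Irregular", "Barred spiral", "Lenticular"), 2),
    MCQ("Which planet has the shortest orbital period around the Sun?",
        ("Venus", "Mercury", "Mars", "Earth"), 1),
    MCQ("What is the approximate age of the Universe?",
        ("4.6 billion years", "13.8 billion years",
         "1 billion years", "100 billion years"), 1),
]

# Hypothetical model outputs: two correct, one unparsable, one wrong.
toy_predictions = [1, 2, None, 0]
toy_score = accuracy_percent(toy, toy_predictions)
print(f"{toy_score:.1f}%")
```

Counting an unparsable answer as incorrect is one possible convention; a real evaluation would need to state how such cases are handled, since that choice affects the reported score.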
Key findings:
Rui Pan, Josh Nguyen, et al., 2024
We introduce two new models, AstroLLaMA-3-8B and AstroLLaMA-2-70B, which build on the previous AstroLLaMA series. We also quantitatively assess specialized LLMs in astronomy using recently curated, high-quality astronomical multiple-choice questions (MCQs).
Key points:
AstroLLaMA-2-7B and AstroLLaMA-2-7B-Chat form the first open-source conversational AI tool tailored for the astronomy community.