Eywa
Heterogeneous Scientific Foundation Model Collaboration
Bring Domain-Specific Foundation Models into Agentic Systems.
In the Avatar franchise, Eywa ("All Mother") is a planetary-scale network that connects and coordinates diverse life forms.
We bring Eywa from that fictional world into the digital one to orchestrate heterogeneous foundation models.
✨ What's inside
- Three execution modes. Eywa has three instantiations: single-agent, multi-agent, and orchestration.
- A cross-domain benchmark. `eywabench.parquet` covers scientific domains spanning material, energy, space, biology, clinic, drug, economy, business, and infrastructure.
- Foundation model - language model "Tsaheylu". Robust and stable communication channels between a domain-specific foundation model and a language model.
- Async + worker pool. `--num_workers` runs many tasks concurrently; each worker gets its own pair of MCP servers on isolated ports (see the sketch below).
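The per-worker port isolation boils down to giving each worker a dedicated port pair. The following is a minimal sketch of that idea, assuming a hypothetical base port and server names; it is not the repo's actual `launch_mcp_servers.py` code.

```python
# Minimal sketch of "one pair of MCP servers per worker" port isolation.
# BASE_PORT and the server names are assumptions, not the repo's exact scheme.
import multiprocessing
from fastmcp import FastMCP

BASE_PORT = 8000  # hypothetical base port

def serve(name: str, port: int) -> None:
    mcp = FastMCP(name)
    # Tools wrapping Chronos2 / TabPFN would be registered on `mcp` here.
    mcp.run(transport="http", host="127.0.0.1", port=port)

def launch(num_workers: int) -> None:
    procs = []
    for w in range(num_workers):
        # Worker w gets its own (chronos, tabpfn) pair on two dedicated ports.
        for offset, name in enumerate(("chronos", "tabpfn")):
            p = multiprocessing.Process(target=serve, args=(name, BASE_PORT + 2 * w + offset))
            p.start()
            procs.append(p)
    for p in procs:
        p.join()

if __name__ == "__main__":
    launch(num_workers=4)
```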
⚙️ Environment Setup
This repository requires several dependencies: langchain, langchain-openai, langchain-mcp-adapters, langchain-google-genai, fastmcp, autogluon, and tabpfn.
Since a recent update, TabPFN requires signing up with PriorLabs; during execution it will pop up a browser window to sign in.
Note: The commands below are a recommended starting point, not a fully portable installer. Dependency resolution can vary across operating systems, CUDA builds, and upstream package updates. If installation fails, create a fresh environment and follow the upstream installation guides, especially for AutoGluon and TabPFN.
```bash
conda create -n eywa python=3.11
conda activate eywa
pip install autogluon
pip install tabpfn
pip install langchain langchain-openai langchain-mcp-adapters langchain-google-genai fastmcp==2.14.5
```
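To catch resolver problems early, a quick import check (nothing repo-specific) can verify that the packages above installed cleanly:

```python
# Quick sanity check that the core dependencies above import cleanly.
import fastmcp
import langchain_openai
import tabpfn
from autogluon.tabular import TabularPredictor

print("environment OK")
```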
🚀 Quickstart
```bash
# 1. Set up your environment yourself; then drop API keys into .env
cp .env.example .env
# edit OPENAI_API_KEY / GOOGLE_API_KEY

# 2. Launch foundation-model MCP servers (one pair per worker)
python launch_mcp_servers.py --num_workers 4

# 3. Run an experiment (in a second terminal)
python main.py --eywa --eywabench_name eywabench --exp_name first-run

# 4. Aggregate per-domain metrics
python eywabench/run_eval.py --exp_name first-run_gpt-5-nano_single-agent_eywa
```
💡 Outputs land in `experiments/<output_folder>/<exp_name>_<model>_<setting>[_<multi_agent_type>][_eywa]/`; the quickstart run above, for example, resolves to `experiments/<output_folder>/first-run_gpt-5-nano_single-agent_eywa/`, which is the name passed to `run_eval.py`. Re-running with the same `--exp_name` resumes any unfinished tasks.
🧪 Three Stages of Eywa
Single agent
```bash
# Plain LLM baseline
python main.py --eywabench_name eywabench --exp_name baseline --model gpt-5-nano

# LLM + foundation model (Eywa)
python main.py --eywabench_name eywabench --exp_name eywa-run --model gpt-5-nano --eywa

# Parallelize across 16 workers
python main.py --eywabench_name eywabench --exp_name eywa-run --eywa --num_workers 16
```
Multi agent
`--agents` takes specs of the form `<type>:<llm>`, where `<type>` is `base` (plain LLM) or `eywa` (LLM + FM via MCP); a parsing sketch follows the command below.
```bash
# Swapping an EywaAgent in for an LLM agent in the MAS -> EywaMAS
python main.py --eywa --setting multi-agent --multi_agent_type debate \
    --agents eywa:gpt-5-nano base:gpt-5-nano \
    --exp_name eywa-debate
```
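For illustration, parsing such a spec amounts to a single split on the first colon. The helper below is hypothetical, not the repo's actual CLI code:

```python
# Hypothetical parser for --agents specs of the form <type>:<llm>.
def parse_agent_spec(spec: str) -> tuple[str, str]:
    agent_type, llm = spec.split(":", 1)
    if agent_type not in ("base", "eywa"):
        raise ValueError(f"unknown agent type: {agent_type!r}")
    return agent_type, llm

print(parse_agent_spec("eywa:gpt-5-nano"))  # -> ('eywa', 'gpt-5-nano')
```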
Orchestration
```bash
python main.py --eywa --setting orchestration --exp_name orch
```
The planner emits a JSON plan per task that picks the setting, the LLM(s), whether to use Eywa, and which foundation model. Plans are cached in orchestration.jsonl and reused on resumed runs.
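For a sense of what such a plan contains, here is an illustrative `orchestration.jsonl` record; the field names and values are assumptions based on the description above, not the exact schema:

```json
{"task_id": "materials-0042", "setting": "single-agent", "llm": "gpt-5-nano", "eywa": true, "foundation_model": "tabpfn"}
```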
🧱 Repository layout
```
HMAS-exp-dev/
├── main.py                         # Async runner: dispatches tasks → records results
├── launch_mcp_servers.py           # Spawns one chronos + tabpfn server per worker
│
├── eywa/
│   ├── agents/
│   │   ├── base_agent.py           # SingleAgent + traced invocation helper
│   │   ├── eywa.py                 # EywaAgent (FM-LLM "Tsaheylu" via MCP)
│   │   └── multi_agent.py          # Multi-agent system topologies
│   ├── foundation_models/
│   │   ├── time_series/chronos.py  # Chronos2 wrapped as an MCP server
│   │   └── tabular/tabpfn.py       # TabPFN wrapped as an MCP server
│   ├── utils/                      # path / model_provider / parse / choose_prompt / langchain_mcp
│   └── configs/                    # JSON CLI overrides
│
├── eywabench/
│   ├── eywabench.parquet           # The benchmark
│   ├── eval/                       # timeseries / tabular / nlp scorers
│   └── run_eval.py                 # Per-domain + overall metrics
│
├── prompts/                        # User and orchestration prompt templates
├── .env.example                    # Required API keys
└── LICENSE
```
🧮 Scoring
- Time-series forecasting → utility = 1 − (sMAPE/2 + MAAPE·2/π) / 2 ∈ [0, 1] (a sketch follows this list)
- Tabular classification → accuracy ∈ [0, 1]
- Tabular regression → same composite as time-series
- Deep-principle physics / MMLU → soft match: exact normalized → numeric relative error → token-F1 + character-similarity fallback (capped at 0.8)
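As a sketch of the composite time-series score, assuming the textbook sMAPE and MAAPE definitions (the actual scorer lives in `eywabench/eval/` and may differ in detail):

```python
import numpy as np

def ts_utility(y_true, y_pred, eps=1e-8):
    """Composite utility = 1 - (sMAPE/2 + MAAPE*2/pi) / 2, mapped into [0, 1].
    Assumes textbook sMAPE/MAAPE; not necessarily the exact eywabench/eval code."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    smape = np.mean(2.0 * np.abs(y_pred - y_true)
                    / (np.abs(y_true) + np.abs(y_pred) + eps))              # in [0, 2]
    maape = np.mean(np.arctan(np.abs((y_true - y_pred) / (y_true + eps))))  # in [0, pi/2]
    return 1.0 - (smape / 2.0 + maape * 2.0 / np.pi) / 2.0

print(ts_utility([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))  # close to 1 for a good forecast
```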
🌐 Supported Models

| Provider | Models |
|---|---|
| OpenAI | gpt-5-nano, gpt-4.1-nano, gpt-5-mini |
| Google | gemini-2.5-flash, gemini-2.5-flash-lite, gemini-3-flash-preview |
Adding another provider is a one-liner in `eywa/utils/model_provider.py` plus a branch in `eywa/agents/base_agent.initialize_model`, as sketched below.
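A hypothetical sketch of that change, using Anthropic as the example provider; the names and structure are assumptions based on the description above, not the repo's exact code:

```python
from langchain_openai import ChatOpenAI
from langchain_google_genai import ChatGoogleGenerativeAI

# eywa/utils/model_provider.py -- the "one-liner": map the new model to a provider.
MODEL_PROVIDER = {
    "gpt-5-nano": "openai",
    "gemini-2.5-flash": "google",
    "claude-sonnet-4-5": "anthropic",  # new entry
}

# eywa/agents/base_agent.py -- initialize_model gains one branch.
def initialize_model(model_name: str):
    provider = MODEL_PROVIDER[model_name]
    if provider == "openai":
        return ChatOpenAI(model=model_name)
    if provider == "google":
        return ChatGoogleGenerativeAI(model=model_name)
    if provider == "anthropic":  # new branch
        from langchain_anthropic import ChatAnthropic
        return ChatAnthropic(model=model_name)
    raise ValueError(f"no provider registered for {model_name!r}")
```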
📈 Key Results
📄 Citation
If Eywa is useful in your research, please cite:

```bibtex
@misc{eywa2026,
  title  = {Heterogeneous Scientific Foundation Model Collaboration},
  author = {Zihao Li and Jiaru Zou and Feihao Fang and Xuying Ning and Mengting Ai and
            Tianxin Wei and Sirui Chen and Xiyuan Yang and Jingrui He},
  year   = {2026},
  note   = {\url{https://github.com/Violet24K/Eywa}}
}
```
📜 License
Released under the Apache License 2.0.
