IPFS Accelerate Python
Enterprise-grade hardware-accelerated machine learning inference with IPFS network-based distribution
Table of Contents
- Overview
- Installation
- Quick Start
- MCP++ Server
- Architecture
- Supported Hardware
- Supported Models
- Documentation
- IPFS & Distributed Features
- Performance & Optimization
- Troubleshooting
- Testing & Quality
- Contributing
- License
Overview
IPFS Accelerate Python combines hardware acceleration, distributed computing, and IPFS network integration to deliver fast machine learning inference across multiple platforms and devices, from data centers to browsers.
Key Highlights
- 8+ Hardware Platforms - CPU, CUDA, ROCm, OpenVINO, Apple MPS, WebNN, WebGPU, Qualcomm
- Distributed by Design - IPFS content addressing, P2P inference, global caching
- 300+ Models - Full HuggingFace compatibility plus custom architectures
- Canonical MCP++ Server - The unified ipfs_accelerate_py.mcp_server runtime is now the default startup path
- Browser-Native - WebNN and WebGPU for client-side acceleration
- Production Ready - Real-time monitoring, enterprise security, compliance validation
- High Performance - Intelligent caching, batch processing, model optimization
Installation
Quick Start (5 minutes)
# 1. Create virtual environment
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# 2. Install IPFS Accelerate
pip install -U pip setuptools wheel
pip install ipfs-accelerate-py
# 3. Verify installation
python -c "from ipfs_accelerate_py import IPFSAccelerator; print('Ready!')"
NVIDIA CUDA (PyTorch)
By default, pip may install a CPU-only PyTorch wheel from PyPI (e.g. torch==...+cpu) because the CUDA-enabled wheels are published on PyTorch's wheel indexes.
If you have an NVIDIA GPU and want to ensure CUDA is available in PyTorch, install PyTorch from the CUDA wheel index:
python -m pip install -U pip
python -m pip install --upgrade --force-reinstall -r install/requirements_torch_cu124.txt
python -c "import torch; print('torch=', torch.__version__); print('cuda_available=', torch.cuda.is_available()); print('torch_cuda=', torch.version.cuda)"
If you're on an NVIDIA GB10 / DGX Spark-class system (CUDA capability 12.1, CUDA 13.0), stable builds may warn that your GPU is unsupported. In that case, use the CUDA 13.0 nightly wheels:
./scripts/install_torch_cuda_cu130_nightly.sh
If you're installing from source/editable mode, you can also run:
python -m pip install -e . --no-deps
python -m pip install --upgrade --force-reinstall -r install/requirements_torch_cu124.txt
python -m pip install -r requirements.txt
Installation Profiles
Choose the profile that matches your needs:
| Profile | Use Case | Installation |
|---|---|---|
| Core | Basic inference | pip install ipfs-accelerate-py |
| Full | Models + API server | pip install ipfs-accelerate-py[full] |
| MCP | MCP server extras | pip install ipfs-accelerate-py[mcp] |
| Dev | Development setup | pip install -e . |
Detailed instructions: Installation Guide | Troubleshooting | Getting Started
Quick Start
Python API
from ipfs_accelerate_py import IPFSAccelerator
# Initialize with automatic hardware detection
accelerator = IPFSAccelerator()
# Load any HuggingFace model
model = accelerator.load_model("bert-base-uncased")
# Run inference (automatically optimized for your hardware)
result = model.inference("Hello, world!")
print(result)
Command Line Interface
# Start the default MCP++ server for automation
ipfs-accelerate mcp start
# Run the canonical FastAPI MCP service directly
python -m ipfs_accelerate_py.mcp_server.fastapi_service
# Run the direct MCP server CLI with p2p/task options
python -m ipfs_accelerate_py.mcp.cli --host 0.0.0.0 --port 9000
# Run inference directly
ipfs-accelerate inference generate \
--model bert-base-uncased \
--input "Hello, world!"
# List available models and hardware
ipfs-accelerate models list
ipfs-accelerate hardware status
# Start GitHub Actions autoscaler
ipfs-accelerate github autoscaler
Remote libp2p task pickup (ipfs_datasets_py)
If you want a remote machine running the ipfs_accelerate_py MCP server to also pick up libp2p task submissions coming from ipfs_datasets_py, you can start the MCP server CLI with the built-in P2P task worker:
# Remote machine (runs MCP + worker + libp2p TaskQueue service)
python -m ipfs_accelerate_py.mcp.cli \
--host 0.0.0.0 --port 9000 \
--p2p-task-worker --p2p-service --p2p-listen-port 9710 \
--p2p-queue ~/.cache/ipfs_datasets_py/task_queue.duckdb
# Optional (off-host clients): set the public IP that will be embedded in the announced multiaddr
export IPFS_DATASETS_PY_TASK_P2P_PUBLIC_IP="YOUR_PUBLIC_IP"
By default, the libp2p TaskQueue service writes an announce file into your XDG cache dir and clients will try to use it automatically:
- Default announce file: ~/.cache/ipfs_accelerate_py/task_p2p_announce.json
- Disable announce file (opt-out): IPFS_ACCELERATE_PY_TASK_P2P_ANNOUNCE_FILE=0 (or IPFS_DATASETS_PY_TASK_P2P_ANNOUNCE_FILE=0)
If your client machine can read that announce file (same host/user, or a shared filesystem path you set via IPFS_ACCELERATE_PY_TASK_P2P_ANNOUNCE_FILE / IPFS_DATASETS_PY_TASK_P2P_ANNOUNCE_FILE), you do not need to set any remote multiaddr env vars.
Otherwise, the process also prints a multiaddr=... line. On the client machine, set:
export IPFS_DATASETS_PY_TASK_P2P_REMOTE_MULTIADDR="/ip4/.../tcp/9710/p2p/..."
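The announce file is plain JSON, but its exact schema is not documented here, so the `multiaddr` key in this sketch is an assumption. A minimal stdlib reader a client could use to pick up the announced address:

```python
import json
from pathlib import Path

# Default location per the docs; the "multiaddr" key is an assumed field name.
ANNOUNCE_FILE = Path.home() / ".cache" / "ipfs_accelerate_py" / "task_p2p_announce.json"

def read_announced_multiaddr(path=ANNOUNCE_FILE):
    """Return the announced multiaddr, or None if the file is absent or invalid."""
    try:
        data = json.loads(Path(path).read_text())
    except (OSError, ValueError):
        return None
    return data.get("multiaddr")

# Example with a synthetic announce file:
sample = Path("/tmp/task_p2p_announce.json")
sample.write_text(json.dumps({"multiaddr": "/ip4/203.0.113.7/tcp/9710/p2p/QmPeer"}))
print(read_announced_multiaddr(sample))
```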
Notes:
- This mode requires ipfs_datasets_py to be installed on the remote machine (and libp2p installed via ipfs_datasets_py[p2p]).
Real-World Examples
| Example | Description | Complexity |
|---|---|---|
| Basic Usage | Simple inference with BERT | Beginner |
| Hardware Selection | Choose specific accelerator | Intermediate |
| Distributed Inference | P2P model sharing | Advanced |
| Browser Integration | WebNN/WebGPU in browsers | Advanced |
More examples: examples/ | Quick Start Guide
MCP++ Server
The MCP server in this repository has completed its unification cutover.
- Canonical runtime: ipfs_accelerate_py/mcp_server
- Compatibility facade: ipfs_accelerate_py/mcp
- Current default: create_mcp_server() and the main MCP startup paths now select the unified runtime by default
- Cutover status: approved and frozen, with a focused release-candidate matrix of 120 passing tests
Current entrypoints
| Entry point | Best for | Notes |
|---|---|---|
| ipfs-accelerate mcp start | End-user server startup | Main product CLI for MCP server management and dashboard workflows |
| python -m ipfs_accelerate_py.mcp.cli | Direct server/process control | Starts the MCP server and can also host TaskQueue/libp2p worker services |
| python -m ipfs_accelerate_py.mcp_server.fastapi_service | Standalone HTTP/FastAPI hosting | Reads IPFS_MCP_* env vars and mounts the MCP app at /mcp by default |
| from ipfs_accelerate_py.mcp_server import create_server | Programmatic embedding | Stable import target for the canonical runtime package |
Supported MCP++ profile chapters
The unified runtime currently advertises these additive MCP++ profiles:
- mcp++/profile-a-idl
- mcp++/profile-b-cid-artifacts
- mcp++/profile-c-ucan
- mcp++/profile-d-temporal-policy
- mcp++/profile-e-mcp-p2p
Unified control-plane features
- Meta-tools: tools_list_categories, tools_list_tools, tools_get_schema, tools_dispatch, tools_runtime_metrics
- Migrated native categories: ipfs, workflow, p2p
- Security and governance: UCAN validation, temporal/deontic policy evaluation, policy audit logging, secrets vault support, and risk scoring/frontier execution
- Observability: runtime metrics, audit-to-metrics bridging, OpenTelemetry hooks, and Prometheus exporter support
- Transport coverage: compatibility-tested process helpers, FastAPI mounting, and MCP+p2p handler parity with mixed-version negotiation hardening
Cutover and rollback controls
These controls remain available for validation and operational rollback:
- IPFS_MCP_FORCE_LEGACY_ROLLBACK=1 - force the compatibility facade to stay on the legacy wrapper
- IPFS_MCP_UNIFIED_CUTOVER_DRY_RUN=1 - validate the unified startup path while keeping legacy runtime behavior active
- IPFS_MCP_ENABLE_UNIFIED_BRIDGE=1 - explicitly request the unified bridge on compatibility-facade paths
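Taken together, a typical validation sequence might look like the following. This is a sketch combining the documented flags with the documented `ipfs-accelerate mcp start` command, not a prescribed procedure:

```shell
# 1. Dry run: exercise the unified startup path while legacy behavior stays active
IPFS_MCP_UNIFIED_CUTOVER_DRY_RUN=1 ipfs-accelerate mcp start

# 2. Normal start: the unified runtime is already the default
ipfs-accelerate mcp start

# 3. Emergency rollback: pin the compatibility facade to the legacy wrapper
IPFS_MCP_FORCE_LEGACY_ROLLBACK=1 ipfs-accelerate mcp start
```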
Recommended documentation
- Canonical MCP server README
- MCP Cutover Checklist
- MCP Server Unification Plan
- MCP++ Conformance Checklist
- MCP++ Spec Gap Matrix
Architecture
IPFS Accelerate Python is built on a modular, enterprise-grade architecture:
┌─────────────────────────────────────────────────────────┐
│                    Application Layer                    │
│      Python API • CLI • MCP Server • Web Dashboard      │
└────────────────────────────┬────────────────────────────┘
                             │
┌────────────────────────────┴────────────────────────────┐
│                Hardware Abstraction Layer               │
│      Unified interface across 8+ hardware platforms     │
└────────────────────────────┬────────────────────────────┘
                             │
┌────────────────────────────┴────────────────────────────┐
│                    Inference Backends                   │
│   CPU • CUDA • ROCm • MPS • OpenVINO • WebNN • WebGPU   │
└────────────────────────────┬────────────────────────────┘
                             │
┌────────────────────────────┴────────────────────────────┐
│                    IPFS Network Layer                   │
│      Content addressing • P2P • Distributed caching     │
└─────────────────────────────────────────────────────────┘
Core Components
- Hardware Abstraction: Unified API across 8+ platforms with automatic selection
- IPFS Integration: Content-addressed storage, P2P distribution, intelligent caching
- Performance Modeling: ML-powered optimization and resource management
- MCP Server: Canonical ipfs_accelerate_py.mcp_server MCP++ runtime with compatibility facade and cutover controls
- Monitoring: Real-time metrics, profiling, and analytics
Detailed architecture: docs/architecture/overview.md | CI/CD
Supported Hardware
Run anywhere - from powerful servers to edge devices and browsers:
| Platform | Status | Acceleration | Requirements | Performance |
|---|---|---|---|---|
| CPU (x86/ARM) | ✅ | SIMD, AVX | Any | Good |
| NVIDIA CUDA | ✅ | GPU + TensorRT | CUDA 11.8+ | Excellent |
| AMD ROCm | ✅ | GPU + HIP | ROCm 5.0+ | Excellent |
| Apple MPS | ✅ | Metal | M1/M2/M3 | Excellent |
| Intel OpenVINO | ✅ | CPU/GPU | Intel HW | Very Good |
| WebNN | ✅ | Browser NPU | Chrome, Edge | Good |
| WebGPU | ✅ | Browser GPU | Modern browsers | Very Good |
| Qualcomm | ✅ | Mobile DSP | Snapdragon | Good |
Hardware Selection
The framework automatically detects and selects the best available hardware:
# Automatic (recommended)
accelerator = IPFSAccelerator() # Uses best available
# Manual selection
accelerator = IPFSAccelerator(device="cuda") # Force CUDA
accelerator = IPFSAccelerator(device="mps") # Force Apple MPS
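Internally, automatic selection amounts to walking a preference order over the backends that were actually detected. The ordering and probe inputs below are illustrative stand-ins, not the library's exact policy:

```python
# Illustrative preference-order device picker; IPFSAccelerator's real detection
# logic is more involved (driver probes, capability checks, etc.).
PREFERENCE = ("cuda", "rocm", "mps", "openvino", "cpu")

def pick_device(available):
    """Return the most-preferred device present in `available`."""
    for device in PREFERENCE:
        if device in available:
            return device
    return "cpu"  # CPU always works as a fallback

print(pick_device({"cpu", "mps"}))  # -> mps
print(pick_device({"cpu"}))         # -> cpu
```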
Hardware guides: Hardware Optimization | Platform Support
Supported Models
Pre-trained Models (300+)
| Category | Models | Status |
|---|---|---|
| Text | BERT, RoBERTa, DistilBERT, ALBERT, GPT-2/Neo/J, T5, BART, Pegasus, Sentence Transformers | ✅ |
| Vision | ViT, DeiT, BEiT, ResNet, EfficientNet, DETR, YOLO | ✅ |
| Audio | Whisper, Wav2Vec2, WavLM, Audio Transformers | ✅ |
| Multimodal | CLIP, BLIP, LLaVA | ✅ |
| Custom | PyTorch models, ONNX, TensorFlow (converted) | ✅ |
Model Loading
# From HuggingFace Hub
model = accelerator.load_model("bert-base-uncased")
# From IPFS (content-addressed)
model = accelerator.load_model("ipfs://QmXxxx...")
# Local model
model = accelerator.load_model("./my_model/")
# With specific hardware
model = accelerator.load_model("gpt2", device="cuda")
Full model list: Supported Models | Custom Models Guide
Documentation
Essential Guides
| Guide | Description | Audience |
|---|---|---|
| Getting Started | Complete beginner tutorial | Everyone |
| Quick Start | Get running in 5 minutes | Everyone |
| Installation | Detailed setup instructions | Users |
| FAQ | Common questions & answers | Everyone |
| API Reference | Complete API documentation | Developers |
| Architecture | System design & components | Architects |
| Hardware Optimization | Platform-specific tuning | Engineers |
| Testing Guide | Testing & benchmarking | QA/DevOps |
Specialized Topics
| Topic | Resources |
|---|---|
| IPFS & P2P | IPFS Integration • P2P Networking |
| GitHub Actions | Autoscaler • CI/CD |
| Docker & K8s | Container Guide • Deployment |
| MCP Server | Canonical MCP Server README • MCP Setup • Protocol Docs • Cutover Checklist |
| Browser Support | WebNN/WebGPU • Examples |
Documentation Quality
Our documentation has been professionally audited (January 2026):
- ✅ 200+ files covering all features
- ✅ 93/100 quality score (Excellent)
- ✅ Comprehensive - From beginner to expert
- ✅ Well-organized - Clear structure and navigation
- ✅ Verified - All examples tested and working
Documentation Hub: docs/ | Full Index | Audit Report
IPFS & Distributed Features
Why IPFS?
IPFS integration provides enterprise-grade distributed computing:
- Content Addressing - Cryptographically secure, immutable model distribution
- Global Network - Automatic peer discovery and geographic optimization
- Intelligent Caching - Multi-level LRU caching across the network
- Load Balancing - Automatic distribution across available peers
- Fault Tolerance - Robust error handling and fallback mechanisms
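The content-addressing guarantee can be illustrated with a plain SHA-256: the identifier is a pure function of the bytes, so any peer serving the same bytes serves the same ID, and any tampering changes the ID. Real IPFS CIDs add multihash/multibase encoding and chunking on top of this idea, so this is a simplified sketch:

```python
import hashlib

def content_id(data: bytes) -> str:
    # Simplified stand-in for a CID: a hash of the content itself.
    return hashlib.sha256(data).hexdigest()

weights_a = b"model-weights-v1"
weights_b = b"model-weights-v1"  # same bytes, as served by a different peer
weights_c = b"model-weights-v2"  # modified content

print(content_id(weights_a) == content_id(weights_b))  # True: same bytes, same ID
print(content_id(weights_a) == content_id(weights_c))  # False: any change alters the ID
```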
IPFS Backend Router (New!)
The IPFS Backend Router provides a flexible, pluggable backend system with automatic fallback:
Backend Preference Order:
- ipfs_kit_py - Full distributed storage (preferred)
- HuggingFace Cache - Local storage with IPFS addressing
- Kubo CLI - Standard IPFS daemon
from ipfs_accelerate_py import ipfs_backend_router
# Store model weights to IPFS
cid = ipfs_backend_router.add_path("/path/to/model", pin=True)
print(f"Model CID: {cid}")
# Retrieve from anywhere
ipfs_backend_router.get_to_path(cid, output_path="/cache/model")
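The preference order above can be sketched as a simple fallback chain. The probe dictionary here is a hypothetical stand-in for the router's real availability checks:

```python
# Sketch of the documented backend preference order as a fallback chain.
BACKEND_ORDER = ("ipfs_kit_py", "hf_cache", "kubo")

def choose_backend(available: dict) -> str:
    """Pick the first usable backend in preference order."""
    for name in BACKEND_ORDER:
        if available.get(name):
            return name
    raise RuntimeError("no IPFS backend available")

# e.g. a CI machine with only the HuggingFace cache present:
print(choose_backend({"ipfs_kit_py": False, "hf_cache": True, "kubo": False}))
```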
Configuration:
# Prefer ipfs_kit_py (default)
export ENABLE_IPFS_KIT=true
# Use HF cache only (good for CI/CD)
export IPFS_BACKEND=hf_cache
# Force Kubo CLI
export IPFS_BACKEND=kubo
Full documentation: IPFS Backend Router Guide
Distributed Inference
# Enable P2P inference
accelerator = IPFSAccelerator(enable_p2p=True)
# Model is automatically shared across peers
model = accelerator.load_model("bert-base-uncased")
# Inference uses best available peer
result = model.inference("Distributed AI!")
Advanced Features
| Feature | Description | Status |
|---|---|---|
| P2P Workflow Scheduler | Distributed task execution with merkle clocks | ✅ |
| GitHub Actions Cache | Distributed cache for CI/CD | ✅ |
| Autoscaler | Dynamic runner provisioning | ✅ |
| MCP Server | Model Context Protocol (14+ tools) | ✅ |
Learn more: IPFS Guide | P2P Architecture | Network Setup
Testing & Quality
# Run all tests
pytest
# Run specific test suite
pytest test/test_inference.py
# Run with coverage report
pytest --cov=ipfs_accelerate_py --cov-report=html
# Run benchmarks
python data/benchmarks/run_benchmarks.py
Quality Metrics
| Metric | Status | Details |
|---|---|---|
| Test Coverage | ✅ | Comprehensive test suite |
| Documentation | ✅ 93/100 | Audit Report |
| Code Quality | ✅ | Linted, type-checked |
| Security | ✅ | Regular vulnerability scans |
| Performance | ✅ | Benchmarked across platforms |
Testing guide: docs/guides/testing/TESTING_README.md | CI/CD Setup
Performance & Optimization
Benchmarks
| Hardware | Model | Throughput | Latency |
|---|---|---|---|
| NVIDIA RTX 3090 | BERT-base | ~2000 samples/sec | <1ms |
| Apple M2 Max | BERT-base | ~800 samples/sec | 2-3ms |
| Intel i9 (CPU) | BERT-base | ~100 samples/sec | 10-15ms |
| WebGPU (Browser) | BERT-base | ~50 samples/sec | 20-30ms |
Optimization Tips
# Enable mixed precision for 2x speedup
accelerator = IPFSAccelerator(precision="fp16")
# Use batch processing for better throughput
results = model.batch_inference(inputs, batch_size=32)
# Enable model quantization for 4x memory reduction
model = accelerator.load_model("bert-base-uncased", quantize=True)
# Use intelligent caching for repeated queries
accelerator = IPFSAccelerator(enable_cache=True)
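Why batching raises throughput can be seen with a back-of-the-envelope cost model: every call pays a fixed launch/setup overhead, so larger batches amortize that overhead over more samples. The constants below are illustrative only, not measured values:

```python
import math

def total_cost(n_samples, batch_size, overhead=5.0, per_sample=1.0):
    """Toy cost model: fixed overhead per batch plus linear per-sample work."""
    batches = math.ceil(n_samples / batch_size)
    return batches * overhead + n_samples * per_sample

print(total_cost(32, batch_size=1))   # 32 batches: 32*5 + 32 = 192.0
print(total_cost(32, batch_size=32))  # 1 batch:     5 + 32 =  37.0
```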
Performance guide: Hardware Optimization | Benchmarking
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| Import errors | pip install --upgrade ipfs-accelerate-py |
| CUDA not found | Install CUDA Toolkit 11.8+ |
| Slow inference | Check hardware selection, enable caching |
| Memory errors | Use quantization, reduce batch size |
| Connection issues | Check IPFS daemon, firewall settings |
Quick Fixes
# Verify installation
python -c "import ipfs_accelerate_py; print(ipfs_accelerate_py.__version__)"
# Check hardware detection
ipfs-accelerate hardware status
# Test basic inference
ipfs-accelerate inference test
# View logs
ipfs-accelerate logs --tail 100
Get help: Troubleshooting Guide | FAQ | GitHub Issues
Contributing
We welcome contributions! Here's how to get started:
Quick Contribution Guide
- Fork & Clone: Get your own copy of the repository
- Create Branch: git checkout -b feature/your-feature
- Make Changes: Follow our coding standards
- Run Tests: pytest to ensure everything works
- Submit PR: Open a pull request with a clear description
Areas We Need Help
- Bug Reports - Found an issue? Let us know!
- Documentation - Help improve guides and examples
- Testing - Add tests for edge cases
- Translations - Translate docs to other languages
- Features - Suggest or implement new features
Community & Guidelines
- GitHub Discussions - Ask questions, share ideas
- Issue Tracker - Report bugs, request features
- Security Policy - Report security vulnerabilities
- Email: starworks5@gmail.com
Full guides: CONTRIBUTING.md | Code of Conduct | Security Policy
License
This project is licensed under the GNU Affero General Public License v3.0 or later (AGPLv3+).
What this means:
- ✅ Free to use, modify, and distribute
- ✅ Commercial use allowed
- ✅ Patent protection included
- ⚠️ Source code must be disclosed for network services
- ⚠️ Modifications must use the same license
Details: LICENSE | AGPL FAQ
Acknowledgments
Built with amazing open source technologies:
- HuggingFace Transformers - ML model ecosystem
- IPFS - Distributed file system
- PyTorch - Deep learning framework
- FastAPI - Modern web framework
Special thanks to all contributors who make this project possible!
Project Information
- Changelog - Version history and release notes
- Security Policy - Security reporting and best practices
- Contributing Guide - How to contribute
- License - AGPLv3+ license details
Show Your Support
If you find this project useful:
- Star this repository on GitHub
- Share with your network
- Report issues to help improve it
- Contribute features or fixes
- Write about your experience
Made with ❤️ by Benjamin Barber and contributors
Homepage • Documentation • Issues • Discussions
