py-basic-agent
A tiny but real agentic system you can run locally. It uses:
- an LLM (via Ollama) for reasoning and tool selection
- pgvector for long‑term memory (document embeddings in Postgres)
- a Rich + prompt_toolkit REPL for a pleasant CLI
- optional MCP (Model Context Protocol) tools (HTTP façade) for file I/O
- a minimal ETL pipeline (CSV/JSON → transform → save), plus simple web search/fetch tools
RAG support: The REPL now includes a small Retrieval‑Augmented Generation flow. See RAG (Retrieval‑Augmented Generation) below.
Contents
- Quick start
- Requirements
- Setup (Linux / macOS / Windows)
- Ollama hosting
- Environment variables
- Reset after embedding update
- How it works (core concepts)
- Using the REPL
- ETL mini‑DSL
- MCP (Model Context Protocol) tools
- RAG (Retrieval‑Augmented Generation)
- Troubleshooting
- Project layout
Quick start
Before you start: make sure Ollama is running and reachable. See Ollama hosting for OS‑specific setup (macOS runs Ollama outside Docker; Linux/Windows may use the optional ollama service).
# 1) Start dependencies (DB + optional MCP file server)
docker compose up -d pgvector mcpfs
# 2) Run the agent REPL
docker compose run --rm app
In the REPL:
/research Who founded NVIDIA and when?
/etl -p ./data/sales_orders.csv -t "reorder:date,region,product,units,unit_price; rename:unit_price->price; limit:3"
/mcp add-http -n fs -u http://host.docker.internal:8765
/mcp tools
/mcp call fs list_files '{"path":"./data"}'
Type /help for a full list of commands, or exit() to quit.
RAG quick taste:
/rag ingest -p ./knowledge
/rag show -q "quartz-8127" -k 3
/rag ask What is the RAG demo code?
Requirements
You don’t need Python locally. Everything runs in containers.
- Docker Desktop or Engine (Compose v2)
- Ollama running on your host (for the LLM + embeddings)
  - pull the models you'll use:
    ollama pull llama3.1:8b
    ollama pull all-minilm
  - make sure Ollama is reachable from Docker at http://host.docker.internal:11434
- ~2–4 GB free RAM for the small models
Setup (Linux / macOS / Windows)
1) Clone and prepare .env
git clone <your-repo-url>
cd py-basic-agent
cp .env.example .env
Update values as needed (see Environment variables).
2) Start Postgres (pgvector) and MCP file server
docker compose up -d pgvector mcpfs
The pgvector container auto‑creates the database, role, and tables via the mounted db/schema.sql. No manual DB bootstrapping needed.
3) Run the REPL
docker compose run --rm app
(Optional) Prepare a knowledge folder for RAG
Create knowledge/ with demo docs; you can use the included intro.md and policies.md examples.
mkdir -p knowledge
# these files may already exist in the repo
# echo "# RAG Demo — py-basic-agent" > knowledge/intro.md
# echo "# PII Handling & Redaction Policy (Demo)" > knowledge/policies.md
Mount it in compose (see RAG section) or use a repo‑relative path with /rag ingest -p ./knowledge.
Ollama hosting
- macOS (recommended): run Ollama outside Docker (host app). This is required on macOS so the models can use Metal and to avoid Docker networking issues. Keep OLLAMA_HOST=http://host.docker.internal:11434.
- Linux/Windows (optional in‑container): you may run Ollama as a container with Compose. When running the in‑container service, set OLLAMA_HOST=http://ollama:11434.
⚠️ Experimental: the dockerized Ollama path depends on GPU drivers/permissions. If you hit timeouts or slow responses, prefer running Ollama on the host and use host.docker.internal:11434.
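Not sure whether Ollama is reachable from a container? A minimal check, assuming only Ollama's standard /api/tags endpoint (this script is illustrative, not part of the repo):

```python
import json
import os
import urllib.request

# Read the same OLLAMA_HOST the agent uses; default matches the table below.
host = os.environ.get("OLLAMA_HOST", "http://host.docker.internal:11434")
try:
    with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as resp:
        models = [m["name"] for m in json.load(resp).get("models", [])]
    print(f"Ollama reachable at {host}; models: {models}")
except OSError as exc:
    print(f"Ollama NOT reachable at {host}: {exc}")
```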
Environment variables
All live in .env. Defaults are safe for local use.
| Variable | What it does | Example |
|---|---|---|
| OLLAMA_HOST | Base URL for Ollama. The app calls /api/chat and /api/embeddings. | http://host.docker.internal:11434 |
| AGENT_LLM_MODEL | Chat model used by the agent. | llama3.1:8b |
| AGENT_TEMPERATURE | LLM sampling temperature (float). | 0.2 |
| AGENT_MAX_STEPS | Max tool steps in the agent loop before giving up. | 8 |
| AGENT_DB_URL | Postgres URL used by PgVectorMemory. Provided by compose. | postgresql://agent:agentpass@pgvector:5432/agentdb |
| AGENT_EMBED_MODEL | Embedding model name in Ollama. | all-minilm |
| AGENT_EMBED_DIM | Embedding vector dimension. Must match schema.sql. | 384 |
| SERPER_API_KEY | API key for the Serper search tool (if enabled). | sk-… |
| AGENT_VERBOSE | Controls extra logging in some contexts (true/false). | true |
| KB_PATH | Default folder for RAG ingestion inside the container. If you don’t mount it, use /rag ingest -p ./knowledge. | /knowledge |
Changing models
- Use a different chat model: set AGENT_LLM_MODEL (e.g., llama3.2:3b).
- Use a different embedding model: set AGENT_EMBED_MODEL and update AGENT_EMBED_DIM, then update the DB schema (vector size in schema.sql) and recreate the DB or migrate:
ALTER TABLE docs ALTER COLUMN embedding TYPE vector(<new_dim>);
Reset after embedding update
If you change AGENT_EMBED_MODEL or AGENT_EMBED_DIM, ensure the database vector dimension matches the embedding dimension. Easiest path is to recreate the DB volume (auto‑initialized from db/schema.sql).
- Edit .env to the new model & dim (defaults use all-minilm).
- Edit db/schema.sql to match (e.g., VECTOR(384)).
- Drop and rebuild Postgres (recreates volume and schema):
docker compose down -v
docker compose up -d pgvector
- (Optional) verify the table shape:
docker compose exec -T pgvector psql -X -U agent -d agentdb -c "\d+ docs"
- Restart the REPL:
docker compose run --rm app
Prefer a migration instead of dropping the volume? Use the ALTER TABLE … TYPE vector(<dim>) command above.
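Before re‑ingesting, it can help to confirm that the model's live output matches AGENT_EMBED_DIM. A small sketch, assuming only the /api/embeddings endpoint mentioned above (not part of the repo):

```python
import json
import os
import urllib.request

host = os.environ.get("OLLAMA_HOST", "http://host.docker.internal:11434")
model = os.environ.get("AGENT_EMBED_MODEL", "all-minilm")
expected = int(os.environ.get("AGENT_EMBED_DIM", "384"))

# Ask Ollama to embed a throwaway string and measure the vector length.
req = urllib.request.Request(
    f"{host}/api/embeddings",
    data=json.dumps({"model": model, "prompt": "dimension check"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=60) as resp:
    dim = len(json.load(resp)["embedding"])

print(f"{model} returns {dim}-dim vectors; AGENT_EMBED_DIM={expected}")
assert dim == expected, "update AGENT_EMBED_DIM and db/schema.sql to match"
```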
How it works (core concepts)
1) The Agent loop (agent/core.py)
- Builds a message list: a system prompt + user task + any observations from tools.
- Calls the LLM with the messages.
- The LLM responds with a JSON tool call or a final answer, e.g. { "tool": "search", "input": {"query": "NVIDIA founders"} }.
- The agent executes the tool, captures an observation (and often a short LLM summary), appends that back to the conversation, and repeats until a final answer or the step limit is reached.
2) The LLM client (agent/llm.py)
- Thin async wrapper around the Ollama APIs:
  - /api/chat for conversation
  - /api/embeddings for vectorization
- Normalizes messages and enforces length limits.
- Includes a system prompt that teaches the LLM to return JSON tool calls or a final JSON with a readable summary.
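The chat path might look roughly like this (a sketch assuming httpx; the real client is agent/llm.py and may differ in details):

```python
import os
import httpx

async def chat(messages: list[dict], timeout: float = 120.0) -> str:
    host = os.environ.get("OLLAMA_HOST", "http://host.docker.internal:11434")
    payload = {
        "model": os.environ.get("AGENT_LLM_MODEL", "llama3.1:8b"),
        "messages": messages,  # [{"role": "...", "content": "..."}, ...]
        "stream": False,
        "options": {"temperature": float(os.environ.get("AGENT_TEMPERATURE", "0.2"))},
    }
    async with httpx.AsyncClient(timeout=timeout) as client:
        resp = await client.post(f"{host}/api/chat", json=payload)
        resp.raise_for_status()
        return resp.json()["message"]["content"]
```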
3) Tools (agent/tools.py)
- Search: Serper (or stub). Returns top results.
- Fetch: gets a URL, returns {title, url, text}.
- ETL: load_csv/json → transform (select/rename/limit) → save.
- Memory: router that calls the configured memory implementation.
Each tool is pure async and returns structured data.
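For example, a fetch‑style tool honoring the {title, url, text} contract could be sketched like this (illustrative, assuming httpx; agent/tools.py is the real code and is likely more careful with HTML and errors):

```python
import re
import httpx

async def fetch(url: str) -> dict:
    async with httpx.AsyncClient(follow_redirects=True, timeout=30.0) as client:
        resp = await client.get(url)
        resp.raise_for_status()
    html = resp.text
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.S | re.I)
    title = match.group(1).strip() if match else url
    text = re.sub(r"\s+", " ", re.sub(r"<[^>]+>", " ", html)).strip()  # crude tag strip
    return {"title": title, "url": url, "text": text}
```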
4) Memory (agent/memory/)
Two implementations (same interface):
- SimpleMemory (in‑process, substring search) — great for unit tests.
- PgVectorMemory (Postgres + pgvector, cosine similarity):
  - aupsert(docs) — store content + embeddings
  - aquery(query, k) — vector search
  - aadd(text, source, uri, meta) — quick note
  - adump(n) — latest notes dump (for system‑prompt context)

Embeddings come from Ollama (AGENT_EMBED_MODEL), and the vector size must match the DB schema. A sketch of the shared interface follows.
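A minimal sketch of that interface, modeled on the method names above (SimpleMemory in agent/memory/simple.py is the real in‑process implementation):

```python
class SimpleMemory:
    """In-process memory: substring search instead of vector similarity."""

    def __init__(self) -> None:
        self._docs: list[dict] = []

    async def aupsert(self, docs: list[dict]) -> int:
        self._docs.extend(docs)  # PgVectorMemory also stores embeddings here
        return len(docs)

    async def aadd(self, text: str, source: str = "", uri: str = "",
                   meta: dict | None = None) -> None:
        self._docs.append({"text": text, "source": source, "uri": uri,
                           "meta": meta or {}})

    async def aquery(self, query: str, k: int = 6) -> list[dict]:
        # PgVectorMemory ranks by cosine similarity; here, plain substring match
        return [d for d in self._docs if query.lower() in d["text"].lower()][:k]

    async def adump(self, n: int = 10) -> list[dict]:
        return self._docs[-n:]  # latest notes, for system-prompt context
```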
5) REPL (agent/repl.py)
- Rich header + tool hints + keybindings
- History, completion for commands and file paths
- Commands:
  - /research <question>
  - /etl -p <path> -t "<transform>" [-l <out>]
  - /etl_from_source -p <url> -t "<transform>" [-l <out>]
  - /where <path>
  - /mcp … (see below)
  - /rag … (see below)
  - /help, exit()
Using the REPL
Start it:
docker compose run --rm app
Try research:
/research Compare NVIDIA vs AMD GPU market share over the last 2 years.
Try ETL:
/etl -p ./data/sales_orders.csv -t "reorder:date,region,product,units,unit_price; rename:unit_price->price; limit:3"
Check a path:
/where ./data/sales_orders.csv
ETL mini‑DSL
Chain operations with semicolons. Works for CSV columns and JSON keys.
- reorder:colA,colB,colC — reorder columns; unspecified columns are appended in original order.
- rename:old1->new1,old2->new2 — rename fields/columns. Quote names with spaces or numeric keys: rename:'1958'->y1958,'unit price'->price.
- limit:K — truncate rows/objects to K.
Examples:
/etl -p ./data/sales_orders.csv -t "reorder:date,region,product,units,unit_price; rename:unit_price->price; limit:3" -l ./data/sales_orders.sample.parquet
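To make the semantics concrete, here is a hedged sketch of how such a transform string could be parsed and applied to a list of row dicts (illustrative only; the repo's ETL tool also handles quoted names):

```python
def apply_transform(rows: list[dict], spec: str) -> list[dict]:
    for op in (p.strip() for p in spec.split(";") if p.strip()):
        name, _, arg = op.partition(":")
        if name == "limit":
            rows = rows[: int(arg)]                     # limit:K
        elif name == "rename":                          # rename:old->new,...
            pairs = (pair.split("->") for pair in arg.split(","))
            mapping = {old.strip(): new.strip() for old, new in pairs}
            rows = [{mapping.get(k, k): v for k, v in r.items()} for r in rows]
        elif name == "reorder":                         # reorder:colA,colB,...
            head = [c.strip() for c in arg.split(",")]
            rows = [
                {c: r[c] for c in head if c in r}
                | {k: v for k, v in r.items() if k not in head}  # rest keep order
                for r in rows
            ]
    return rows
```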
MCP (Model Context Protocol) tools
MCP lets you add external tools to the agent at runtime (HTTP façade or stdio servers).
Examples:
/mcp add-http -n fs -u http://host.docker.internal:8765
/mcp tools
/mcp call fs list_files '{"path":"./data"}'
RAG (Retrieval‑Augmented Generation)
A small, readable RAG layer is built into the REPL so you can demonstrate how articles/notes become ground truth for answers.
Files & mounting
Place .md and .txt files under knowledge/ in your repo. Then either:
- run /rag ingest -p ./knowledge (no compose changes), or
- mount the folder into the container and use the default KB_PATH:

```yaml
# compose.yml (app service excerpt)
services:
  app:
    environment:
      KB_PATH: /knowledge
    volumes:
      - ./:/app
      - ./knowledge:/knowledge:ro
```
Commands
/rag ingest [-p PATH] [--glob "*.md,*.txt"]
/rag add -t "text" [-s source] [-u uri]
/rag show -q "query" [-k 6]
/rag ask <question> [-k 6]
- ingest — indexes files into pgvector (chunks ≈ 800 words, 150 overlap; see the sketch after this list).
- add — stores a one‑off snippet as a doc chunk.
- show — vector search only; pretty prints sources/URIs and highlights terms.
- ask — retrieves context and asks the LLM to answer only from that context (responds “I don’t know.” when not covered).
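A minimal sketch of that chunking, assuming word‑based windows with overlap (the repo's implementation may differ):

```python
def chunk_words(text: str, size: int = 800, overlap: int = 150) -> list[str]:
    """Split text into ~`size`-word chunks, overlapping by `overlap` words."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i : i + size]) for i in range(0, len(words), step)]
```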
Demo knowledge
The repo includes knowledge/intro.md and knowledge/policies.md with respective test tokens:
- RAG‑DEMO‑INTRO‑CODE: quartz-8127
- RAG‑PII‑POLICY‑CODE: heron-4512
PII behavior to demo:
/rag ask What is Jane Roe's SSN?
# → Per policy, I cannot share that information. [REDACTED]
/rag ask What is the PII policy code?
# → heron-4512
Smoke checks
/rag ingest
/rag show -q "quartz-8127" -k 3
/rag ask What is the RAG demo code?
Expect: ingest counts > 0, retrieval shows intro.md, answer includes quartz-8127.
Notes
- Embedding dim must match the DB schema (VECTOR(dim)); change AGENT_EMBED_MODEL / AGENT_EMBED_DIM together and re‑ingest.
- Apostrophes are safe in /rag ask (the parser avoids shlex pitfalls).
- If /rag ingest returns files=0, verify mounts: /where /knowledge vs /where ./knowledge.
Troubleshooting
- Ollama not reachable: check OLLAMA_HOST; test with curl $OLLAMA_HOST/api/tags from inside a container.
- No embeddings / zero scores: ensure the embed model is pulled (all-minilm) and the dims match the schema.
- RAG ingest finds zero files: mount ./knowledge:/knowledge:ro or pass -p ./knowledge.
- Docker networking: on macOS/Windows use host.docker.internal; on Linux, see extra_hosts in compose.
Project layout
app/
  agent/
    core.py        # agent loop
    llm.py         # Ollama chat/embeddings client
    repl.py        # REPL (commands include research/etl/mcp/rag)
    tools.py       # tool registry (search/fetch/etl/memory)
    memory/
      __init__.py  # get_memory() factory
      pg_store.py  # PgVectorMemory impl
      simple.py    # SimpleMemory impl
knowledge/
  intro.md         # RAG demo document (token quartz-8127)
  policies.md      # optional PII policy demo (token heron-4512)
compose.yml
.env.example
README.md
