io.github.Embassy-of-the-Free-Mind/source-library
Search rare historical texts with OCR, translations & DOI citations.
Source Library v2
A Next.js application for digitizing and translating historical texts. Built for the Embassy of the Free Mind.
Stack
- Framework: Next.js 14 (App Router)
- Database: MongoDB Atlas
- AI: Google Gemini for OCR and Translation
- Storage: Vercel Blob for images
- Deployment: Vercel
Getting Started
npm install
npm run dev
Architecture
Image System
All page images go through /api/image for consistent sizing and cropping:
| Tier | Size | Quality | Use Case |
|---|---|---|---|
| Thumbnail | 400px | 70% | Grid views, page navigation |
| Display | 1200px | 80% | Main reading view |
| Full | 2400px | 90% | Magnifier, fullscreen |
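The tier table can be captured as a small lookup, sketched below. The sizes and quality values come from the table; the type names, the `imageUrl` helper, and the query parameter names (`url`, `size`, `q`) are assumptions made for illustration, since the actual parameters accepted by /api/image are not documented here.

```typescript
// Tier values are copied from the table above. Everything else (names,
// query parameters) is a hypothetical sketch, not the project's actual API.
type ImageTier = "thumbnail" | "display" | "full";

interface TierSpec {
  size: number;    // target size in pixels (the table does not say width vs. longest edge)
  quality: number; // output quality in percent
}

const IMAGE_TIERS: Record<ImageTier, TierSpec> = {
  thumbnail: { size: 400, quality: 70 },  // grid views, page navigation
  display: { size: 1200, quality: 80 },   // main reading view
  full: { size: 2400, quality: 90 },      // magnifier, fullscreen
};

// Build a request path for the image route for a given source image and tier.
// The parameter names `url`, `size`, and `q` are assumed for this example.
function imageUrl(sourceUrl: string, tier: ImageTier): string {
  const { size, quality } = IMAGE_TIERS[tier];
  return `/api/image?url=${encodeURIComponent(sourceUrl)}&size=${size}&q=${quality}`;
}
```

Keeping the tiers in one table-like constant means components ask for a tier by name rather than repeating pixel sizes.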
Split Pages
Books with two-page spreads can be split. Each page stores:
- crop.xStart and crop.xEnd (0-1000 scale)
- cropped_photo (optional pre-generated Vercel Blob URL)
Cropping happens on demand via Sharp. OCR crops the page inline automatically and saves the result for future use.
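As a rough illustration of the 0-1000 scale, the sketch below converts stored crop bounds into a pixel rectangle. The `xStart` and `xEnd` fields come from the text above; the `Region` type, the `cropToRegion` helper, and the choice to keep the full image height are assumptions, since only horizontal bounds are described.

```typescript
// Horizontal crop bounds on a 0-1000 scale, as described above.
interface Crop {
  xStart: number;
  xEnd: number;
}

// Pixel rectangle (hypothetical type for this sketch).
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

const CROP_SCALE = 1000;

// Map 0-1000 crop bounds onto an image of the given pixel dimensions.
// Assumption: the crop spans the full height, because only x bounds are stored.
function cropToRegion(crop: Crop, imageWidth: number, imageHeight: number): Region {
  if (crop.xStart < 0 || crop.xEnd > CROP_SCALE || crop.xStart >= crop.xEnd) {
    throw new RangeError(`Invalid crop bounds: ${crop.xStart}-${crop.xEnd}`);
  }
  const left = Math.round((crop.xStart / CROP_SCALE) * imageWidth);
  const right = Math.round((crop.xEnd / CROP_SCALE) * imageWidth);
  return { left, top: 0, width: right - left, height: imageHeight };
}
```

The returned object has the `left`, `top`, `width`, and `height` fields that Sharp's `extract()` expects, so it could be passed along as `sharp(input).extract(region)`.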
Processing Pipeline
- Import: Upload images or import from Internet Archive
- Split: Detect and split two-page spreads (ML or manual)
- OCR: Extract text using Gemini Vision
- Translate: Translate to English using Gemini
- Summarize: Generate summaries and key themes
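The stage order listed above can be modeled as an ordered constant, as in this minimal sketch. The stage identifiers and the `nextStage` helper are hypothetical names, and the sketch assumes the stages run strictly in the listed order.

```typescript
// Stage names mirror the list above; the identifiers are illustrative only.
const PIPELINE_STAGES = ["import", "split", "ocr", "translate", "summarize"] as const;
type Stage = (typeof PIPELINE_STAGES)[number];

// Return the stage that follows `current`, or null after the final stage.
function nextStage(current: Stage): Stage | null {
  const i = PIPELINE_STAGES.indexOf(current);
  return PIPELINE_STAGES[i + 1] ?? null;
}
```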
Key Directories
src/
├── app/            # All routes, pages, and API endpoints
│   ├── api/        # API routes
│   ├── book/       # Book pages (detail, read, pipeline)
│   └── page.tsx    # Homepage
├── components/     # Reusable React components
├── hooks/          # Reusable React hooks for component logic
└── lib/            # Business logic, utilities (mongodb, ai, types), and services
