io.github.Embassy-of-the-Free-Mind/source-library

Search rare historical texts with OCR, translations & DOI citations.

Source Library v2

A Next.js application for digitizing and translating historical texts. Built for the Embassy of the Free Mind.

Stack

Framework: Next.js 14 (App Router)
Database: MongoDB Atlas
AI: Google Gemini for OCR and Translation
Storage: Vercel Blob for images
Deployment: Vercel

Getting Started

npm install
npm run dev

Open http://localhost:3000

Architecture

Image System

All page images go through /api/image for consistent sizing and cropping:

Tier	Size	Quality	Use Case
Thumbnail	400px	70%	Grid views, page navigation
Display	1200px	80%	Main reading view
Full	2400px	90%	Magnifier, fullscreen
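
The tiers above could be captured in a small lookup table, as in the sketch below. The tier values come from the table; the helper buildImageUrl and its query parameter names (url, w, q) are hypothetical and are not taken from the actual /api/image route.

```typescript
// Tier settings copied from the table above.
type ImageTier = "thumbnail" | "display" | "full";

interface TierSettings {
  size: number;    // target size in pixels
  quality: number; // quality in percent
}

const IMAGE_TIERS: Record<ImageTier, TierSettings> = {
  thumbnail: { size: 400, quality: 70 },  // grid views, page navigation
  display: { size: 1200, quality: 80 },   // main reading view
  full: { size: 2400, quality: 90 },      // magnifier, fullscreen
};

// Hypothetical helper: the parameter names `url`, `w`, and `q` are
// assumptions for illustration, not the documented /api/image contract.
function buildImageUrl(sourceUrl: string, tier: ImageTier): string {
  const { size, quality } = IMAGE_TIERS[tier];
  const params = new URLSearchParams({
    url: sourceUrl,
    w: String(size),
    q: String(quality),
  });
  return `/api/image?${params.toString()}`;
}
```

Keeping the tiers in one place means a grid view and the reader can request the same page at different sizes without repeating the numbers, for example buildImageUrl(pageUrl, "thumbnail") versus buildImageUrl(pageUrl, "display").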

Split Pages

Books with two-page spreads can be split. Each page stores:

crop.xStart and crop.xEnd (0-1000 scale)
cropped_photo (optional pre-generated Vercel Blob URL)

Cropping happens on demand via Sharp. When OCR runs, it crops the image inline and saves the result for future use.
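
As a rough illustration of the 0-1000 scale, the sketch below converts crop.xStart and crop.xEnd into a pixel region. The function name cropToRegion is hypothetical, and it assumes the crop always spans the full image height, since only horizontal bounds are stored. The returned shape (left, top, width, height) matches the options accepted by Sharp's extract method.

```typescript
// Horizontal crop bounds on the 0-1000 scale described above.
interface Crop {
  xStart: number;
  xEnd: number;
}

// Pixel region in the shape Sharp's `extract` expects.
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Hypothetical helper, not taken from the repository.
// Assumption: the crop covers the full image height.
function cropToRegion(crop: Crop, imageWidth: number, imageHeight: number): Region {
  if (crop.xStart < 0 || crop.xEnd > 1000 || crop.xStart >= crop.xEnd) {
    throw new RangeError("crop must satisfy 0 <= xStart < xEnd <= 1000");
  }
  // Scale both edges to pixels, then derive the width from the rounded edges
  // so that adjacent halves of a spread neither overlap nor leave a gap.
  const left = Math.round((crop.xStart / 1000) * imageWidth);
  const right = Math.round((crop.xEnd / 1000) * imageWidth);
  return { left, top: 0, width: right - left, height: imageHeight };
}
```

For a 3000 by 2000 pixel spread, a crop of 0 to 500 gives the left page (left 0, width 1500) and a crop of 500 to 1000 gives the right page (left 1500, width 1500).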

Processing Pipeline

1. Import - Upload images or import from Internet Archive
2. Split - Detect and split two-page spreads (ML or manual)
3. OCR - Extract text using Gemini Vision
4. Translate - Translate to English using Gemini
5. Summarize - Generate summaries and key themes
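
One way to model the stage order in code is an ordered list plus a helper that picks the next unfinished stage. This is only a sketch: the names PIPELINE_STAGES and nextStage are hypothetical, and it assumes the stages always run strictly in the order listed, which may not hold for books that need no splitting.

```typescript
// Stage names taken from the list above, in processing order.
const PIPELINE_STAGES = ["import", "split", "ocr", "translate", "summarize"] as const;

type PipelineStage = (typeof PIPELINE_STAGES)[number];

// Hypothetical helper: returns the first stage that has not been completed,
// or null once every stage is done.
function nextStage(completed: PipelineStage[]): PipelineStage | null {
  for (const stage of PIPELINE_STAGES) {
    if (!completed.includes(stage)) {
      return stage;
    }
  }
  return null;
}
```

With this model, a book that has been imported and split would report "ocr" as its next stage, and a fully processed book would report null.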

Key Directories

src/
├── app/              # All routes, pages, and API endpoints
│   ├── api/          # API routes
│   ├── book/         # Book pages (detail, read, pipeline)
│   └── page.tsx      # Homepage
├── components/       # Reusable React components
├── hooks/            # Reusable React hooks for component logic
└── lib/              # Business logic, utilities (mongodb, ai, types), and services