NLP-powered reading app for in-depth textual analysis

A transformer-based reading environment that tokenizes, parses, annotates, and assigns dictionary entries to text in 30+ modern and historical languages. It reads a variety of input formats: PDFs, HTML snapshots, Word documents, and EPUBs.

Built for serious learners and professionals who want research-grade linguistic understanding.

Output showcase

Read PDFs, ebooks, and Word docs with their layout intact

Documents are ingested with their original page geometry preserved — page numbers, running headers, centred text blocks, line wrapping, and book typography. Dependency arcs are drawn directly over the live text, so syntax is visible on the page.

Read web snapshots the same way

Save an HTML snapshot of any web page with the companion Chrome extension — Language Capture — and load it into the reader as a frozen, fully-analysable text. Infoboxes, hyperlinks, and visual structure stay intact.

Use the selector tool to highlight specific chunks of prose for lookup.

Tokenization across writing systems

Arabic clitics are split off from their host words. Sanskrit sandhi chains are decomposed into their constituent morphemes. Scriptio continua languages (Thai, Japanese, Chinese), which are written without spaces between words, are segmented by a multilingual transformer pretrained on 2.6 terabytes of text from over 100 languages.

Hover any word for layered dictionary lookup

Hovering a token surfaces a stack of curated dictionaries assembled per language: Wiktionary alongside specialist sources like JMdict (Japanese), CC-CEDICT (Chinese), KRDICT (Korean), and historical lexica for classical languages.

Entries are filtered by lemma and POS, so you only see context-suitable candidates.
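As a rough illustration of the filtering step, the sketch below keeps only the dictionary entries whose headword matches the token's lemma and whose part of speech matches the token's POS tag. The `Entry` class, the `filter_entries` function, and the sample glosses are hypothetical names and made-up data for this example, not the app's actual schema or dictionary content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A hypothetical dictionary entry; field names are assumptions."""
    headword: str
    pos: str
    gloss: str
    source: str


def filter_entries(entries, lemma, pos):
    """Keep entries whose headword equals the lemma (case-insensitively)
    and whose POS tag equals the token's POS tag."""
    lemma_key = lemma.casefold()
    return [
        e for e in entries
        if e.headword.casefold() == lemma_key and e.pos == pos
    ]


# Made-up sample data for the English word "lead".
candidates = [
    Entry("lead", "VERB", "to guide or conduct", "Wiktionary"),
    Entry("lead", "NOUN", "a heavy metallic element", "Wiktionary"),
    Entry("leaden", "ADJ", "made of lead; dull", "Wiktionary"),
]

# A token tagged as a verb with lemma "lead" (e.g. the form "leads").
matches = filter_entries(candidates, lemma="lead", pos="VERB")
print([e.gloss for e in matches])
```

Filtering on both keys is what removes the noun sense here: a lemma match alone would still return two candidates.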

Dictionary entries given grammatical shape

Each entry is annotated with a POS tag, a language-specific fine-grained POS tag, a grapheme-level script/romanization breakdown, and a full morphological feature bundle: aspect, mood, tense, voice, person, number, case, gender and more. Inflected forms are also resolved back to their dictionary lemma.
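To make the idea of a feature bundle concrete, here is a minimal sketch that unpacks a morphological feature string into a dictionary. It assumes the pipe-separated `Name=Value` notation used by Universal Dependencies; the page does not say which notation the app uses internally, and the `parse_feats` helper and the sample token are illustrative only.

```python
def parse_feats(feats):
    """Turn a 'Name=Value|Name=Value' string into a dict.

    An empty string or '_' (the usual placeholder for 'no features'
    in CoNLL-U files) yields an empty bundle.
    """
    if not feats or feats == "_":
        return {}
    bundle = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        bundle[name] = value
    return bundle


# Illustrative token: Latin "amabantur" ("they were being loved"),
# an inflected form resolved back to the lemma "amo".
token = {
    "form": "amabantur",
    "lemma": "amo",
    "upos": "VERB",
    "feats": "Aspect=Imp|Mood=Ind|Number=Plur|Person=3|Tense=Past|Voice=Pass",
}

bundle = parse_feats(token["feats"])
print(token["form"], "->", token["lemma"], bundle)
```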

Sentence structure visible at a glance

Dependency arcs drawn over the live text show how each sentence is built, and POS shading gives shape to phrase and clause boundaries. Each word’s grammatical role in the sentence is flagged, marking out subjects, objects, verbs and function words.
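One way to picture the arcs: a dependency parse gives every word a head word and a relation label, and each head-dependent pair becomes one arc. The sketch below derives the arcs for a three-word sentence. The tuple layout, the `build_arcs` function, and the choice to draw shorter arcs first are assumptions made for illustration, not a description of the app's renderer.

```python
# Each token: (position, form, head position, relation). Head 0 is the root.
tokens = [
    (1, "The", 2, "det"),
    (2, "cat", 3, "nsubj"),
    (3, "sleeps", 0, "root"),
]


def build_arcs(tokens):
    """Return one arc per non-root token, shortest spans first so that
    short arcs can be drawn beneath the longer ones that enclose them."""
    arcs = []
    for position, _form, head, relation in tokens:
        if head == 0:
            continue  # the root has no incoming arc
        arcs.append({
            "left": min(position, head),
            "right": max(position, head),
            "label": relation,
            "span": abs(position - head),
        })
    return sorted(arcs, key=lambda arc: arc["span"])


for arc in build_arcs(tokens):
    print(arc)
```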

Named entities are tagged with colour-coded chips so you can scan a passage for the who, what, where and when.

Crowdsourced LLM annotations

LLM annotations cover per-token glosses for word-sense disambiguation in context, on-demand synthetic dictionary entries for forms not covered by the existing sources, sentence-level translations anchored to the source text, and free-form notes attached to dictionary entries. All of these are community-editable: once one user generates an annotation, every other reader can see, alter, and benefit from it.

A built-in LLM chatbot always has your page loaded as context, so you can ask it questions about what you are reading.

30+ languages

The world’s most widely spoken modern languages, plus the classical languages that open up the heritage of major civilisations.

Historical & Classical

Classical Chinese, Ancient Greek, Latin, Sanskrit, Old English

Modern

Chinese, Traditional Chinese, Japanese, Korean, Vietnamese, French, Spanish, German, Italian, Portuguese, Russian, Dutch, Greek, Arabic, Hebrew, Hindi, Persian, Turkish, Indonesian, Thai, Tagalog, Swahili, Armenian

Free
$0 / month
  • Full NLP pipeline
  • Dictionary entries across all 30+ languages
  • 1,000 tokens per day
  • No credit card required
Pro
$10 / month
Monthly billing
  • Unlimited access
  • Unlocks all LLM features:
    • Per-token glosses
    • Sentence-level translations
    • Built-in chatbot with full page context
    • Synthetic dictionary entry creation
    • Free-form note creation on entries
  • Cancel anytime

Create your account

Free to start. 1,000 tokens per day, no card required.
