PDF → JSON

Convert PDF to structured JSON

Get the full DoclingDocument: every heading, table cell, and text block with its type, reading order, and bounding box. Generated in your browser, so the PDF stays on your device.

A document you can program against

Markdown is for reading; JSON is for code. The DoclingDocument is a typed tree: each node knows whether it is a heading, paragraph, list item, table, or figure, where it sits in the reading order, and where it appears on the page. That makes it straightforward to filter by element type, rebuild tables, or route content by document region.
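As a rough illustration of the typed tree, the sketch below walks a document in reading order using only the standard library. It assumes the exported JSON has a `body` node whose `children` are `{"$ref": "#/texts/0"}` style pointers into top-level arrays such as `texts` and `groups`; the inline `SAMPLE` is a hand-written stand-in, and `resolve` and `walk` are hypothetical helper names, not part of any library.

```python
import json

# Hand-written stand-in for an exported file (assumed layout; a real
# export carries many more fields per node).
SAMPLE = json.loads("""
{
  "body": {"self_ref": "#/body",
           "children": [{"$ref": "#/texts/0"}, {"$ref": "#/groups/0"}]},
  "groups": [{"self_ref": "#/groups/0", "label": "list",
              "children": [{"$ref": "#/texts/1"}]}],
  "texts": [
    {"self_ref": "#/texts/0", "label": "section_header",
     "text": "Intro", "children": []},
    {"self_ref": "#/texts/1", "label": "list_item",
     "text": "First point", "children": []}
  ]
}
""")


def resolve(doc, ref):
    """Follow a JSON-pointer style reference such as '#/texts/0'."""
    node = doc
    for part in ref.lstrip("#/").split("/"):
        # Array segments are numeric indexes, object segments are keys.
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def walk(doc, node=None, depth=0):
    """Yield (depth, label, text) for each node under body, depth-first."""
    node = doc["body"] if node is None else node
    for child in node.get("children", []):
        item = resolve(doc, child["$ref"])
        yield depth, item.get("label", ""), item.get("text", "")
        yield from walk(doc, item, depth + 1)


outline = list(walk(SAMPLE))
for depth, label, text in outline:
    print("  " * depth + f"[{label}] {text}")
```

Resolving references lazily keeps the walk independent of which top-level array a node lives in, so the same loop handles text, groups, tables, and figures.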

What the JSON gives you

Element types

Headings, paragraphs, lists, tables, code, and figures are each tagged, so you can extract exactly the parts you need.
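A minimal sketch of pulling out one element type, assuming each entry in the top-level `texts` array carries a `label` string (for example `section_header` or `list_item`) and that headings carry a numeric `level`. The sample data and the helper name `texts_with_label` are illustrative only.

```python
from collections import Counter

# Illustrative stand-in for the "texts" array of an exported document.
doc = {
    "texts": [
        {"label": "section_header", "level": 1, "text": "Overview"},
        {"label": "text", "text": "Some paragraph."},
        {"label": "section_header", "level": 2, "text": "Details"},
        {"label": "list_item", "text": "A bullet"},
    ]
}


def texts_with_label(doc, label):
    """Return every text item whose label matches, in document order."""
    return [t for t in doc.get("texts", []) if t.get("label") == label]


# Build a simple outline of (level, title) pairs from the headings.
headings = [
    (h.get("level", 1), h["text"])
    for h in texts_with_label(doc, "section_header")
]

# Count how many items of each type the document contains.
counts = Counter(t["label"] for t in doc["texts"])

print(headings)
print(dict(counts))
```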

Layout & bounding boxes

Every element carries page-relative coordinates, useful for region-based extraction and visual overlays.
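Region-based extraction could look something like the sketch below. It assumes each item has a `prov` list whose entries hold a `page_no` and a `bbox` with `l`, `t`, `r`, `b` fields, and that the query region is expressed in the same coordinate origin as the boxes. The helper names and sample values are hypothetical.

```python
def normalized(bbox):
    """Return (x_min, y_min, x_max, y_max) whichever vertical edge is larger."""
    return (
        min(bbox["l"], bbox["r"]),
        min(bbox["t"], bbox["b"]),
        max(bbox["l"], bbox["r"]),
        max(bbox["t"], bbox["b"]),
    )


def overlaps(a, b):
    """True when two boxes share any area."""
    ax0, ay0, ax1, ay1 = normalized(a)
    bx0, by0, bx1, by1 = normalized(b)
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def items_in_region(doc, page_no, region):
    """Collect the text of items that touch the region on the given page."""
    hits = []
    for item in doc.get("texts", []):
        for prov in item.get("prov", []):
            if prov.get("page_no") == page_no and overlaps(prov["bbox"], region):
                hits.append(item["text"])
                break  # one match per item is enough
    return hits


# Illustrative data: a 612 x 792 page with a bottom-left origin (assumed).
doc = {
    "texts": [
        {"text": "Running header",
         "prov": [{"page_no": 1, "bbox": {"l": 72, "t": 780, "r": 540, "b": 760}}]},
        {"text": "Body paragraph",
         "prov": [{"page_no": 1, "bbox": {"l": 72, "t": 500, "r": 540, "b": 400}}]},
        {"text": "Header on page two",
         "prov": [{"page_no": 2, "bbox": {"l": 72, "t": 780, "r": 540, "b": 760}}]},
    ]
}

top_strip = {"l": 0, "t": 792, "r": 612, "b": 700}
print(items_in_region(doc, 1, top_strip))
```

Normalizing each box before comparing keeps the overlap test valid for both top-left and bottom-left origins, as long as the region and the boxes agree on which one is in use.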

Canonical schema

It is the same DoclingDocument the official library emits, so existing Docling tooling reads it without changes.

Output comes from IBM's docling-core library running in WebAssembly, so it is byte-identical to the output of the canonical Docling CLI.

Three steps

1. Open the converter and drop in a PDF or scanned image.

2. Parsade runs the Docling pipeline on your GPU and builds the document tree.

3. Switch to the JSON tab and copy or download the DoclingDocument.

Questions

What does the JSON output contain?

It is a DoclingDocument: a structured tree of the document with element types, reading order, heading levels, table cells, and bounding-box coordinates for each element on each page.
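For the table cells in particular, a grid can be rebuilt with a few lines of code. This sketch assumes each table stores `num_rows`, `num_cols`, and a `table_cells` list under `data`, with each cell giving start and end row and column offsets where the end offset is exclusive. Those field names are assumptions to check against an actual export, and `table_to_rows` is a hypothetical helper.

```python
def table_to_rows(table):
    """Rebuild a table as a list of rows, repeating text across spanned cells."""
    data = table["data"]
    rows = [["" for _ in range(data["num_cols"])] for _ in range(data["num_rows"])]
    for cell in data["table_cells"]:
        # End offsets are treated as exclusive (assumption).
        for r in range(cell["start_row_offset_idx"], cell["end_row_offset_idx"]):
            for c in range(cell["start_col_offset_idx"], cell["end_col_offset_idx"]):
                rows[r][c] = cell["text"]
    return rows


def cell(text, row, col):
    """Build a single-span cell for the illustrative sample below."""
    return {
        "text": text,
        "start_row_offset_idx": row, "end_row_offset_idx": row + 1,
        "start_col_offset_idx": col, "end_col_offset_idx": col + 1,
    }


# Illustrative 2 x 2 table.
table = {
    "data": {
        "num_rows": 2,
        "num_cols": 2,
        "table_cells": [
            cell("Item", 0, 0), cell("Qty", 0, 1),
            cell("Apples", 1, 0), cell("3", 1, 1),
        ],
    }
}

rows = table_to_rows(table)
for row in rows:
    print(" | ".join(row))
```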

Is the JSON the same as the official Docling output?

Yes. Parsade runs IBM's docling-core library in the browser via Pyodide, so the JSON is byte-identical to what the canonical Docling CLI produces server-side.

Can I convert PDF to JSON without uploading the file?

Yes. The whole pipeline runs on your device inside the browser tab. The PDF is never sent to a server.

Get the structured document

Convert a PDF to a DoclingDocument JSON, all in your browser.