Convert PDF to Markdown API
Turn any PDF URL into clean, structured Markdown with a single API call. Built for LLM pipelines, RAG systems, vector databases, and documentation workflows - with headings, tables, and lists preserved.
Why convert PDFs to Markdown?
Large language models and RAG systems work best with structured text. Raw PDF text loses all formatting - headings become indistinguishable from body text, tables collapse into unreadable strings, and lists lose their hierarchy. Markdown preserves that structure in a format every LLM understands.
If you are building a RAG pipeline, you need to chunk documents at semantic boundaries - section headings, paragraph breaks, table edges. Markdown gives you those boundaries as syntax, making it trivial to split documents into meaningful chunks for embedding.
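As a rough illustration of chunking on those boundaries, here is a minimal sketch that groups Markdown lines under their nearest heading. The names `chunkByHeadings` and `maxLevel` are hypothetical and not part of any PDFPipe SDK. The sketch assumes ATX-style headings (`#` through `######`) and skips anything inside fenced code blocks.

```javascript
// Sketch: split Markdown into chunks at heading boundaries.
// chunkByHeadings and maxLevel are illustrative names, not a PDFPipe API.
const FENCE = /^\s*(`{3}|~{3})/; // opening or closing line of a fenced code block
const HEADING = /^(#{1,6})\s+(.+)$/; // ATX heading: 1-6 '#' characters, then text

function chunkByHeadings(markdown, maxLevel = 3) {
  const chunks = [];
  let current = { heading: null, level: 0, lines: [] };
  let inFence = false;

  // Store the section collected so far, skipping an empty preamble
  const flush = () => {
    const body = current.lines.join("\n").trim();
    if (current.heading !== null || body !== "") {
      chunks.push({ heading: current.heading, level: current.level, body });
    }
  };

  for (const line of markdown.split(/\r?\n/)) {
    // Toggle fence state so a "# comment" inside code is not read as a heading
    if (FENCE.test(line)) inFence = !inFence;
    const match = inFence ? null : HEADING.exec(line);
    if (match && match[1].length <= maxLevel) {
      flush();
      current = { heading: match[2].trim(), level: match[1].length, lines: [] };
    } else {
      // Deeper headings and body text stay inside the current chunk
      current.lines.push(line);
    }
  }
  flush();
  return chunks;
}
```

Each chunk carries its heading and level as metadata, which can be stored alongside the embedding so retrieved passages keep their place in the document.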
PDFPipe converts any PDF URL to clean Markdown with proper heading levels, formatted tables, ordered and unordered lists, and bold/italic emphasis. It works with both inline PDFs and auto-download files, including those behind authentication, redirects, or JavaScript triggers.
How it works
One POST request. We handle the rest.
1. Send a request
curl -X POST https://api.pdfpipe.dev/v1/convert \
  -H "Authorization: Bearer pk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/whitepaper.pdf",
    "format": "markdown"
  }'
2. Get the response
{
  "requestId": "req_01J9X7K2M...",
  "status": "complete",
  "format": "markdown",
  "pagesProcessed": 12,
  "creditsUsed": 1,
  "resultUrl": "https://pdfpipe-results.s3..."
}
Structured Markdown, ready for AI
PDFPipe preserves document structure as proper Markdown syntax. Headings, tables, lists, and emphasis are all retained - giving your LLM or RAG pipeline the semantic context it needs to generate accurate responses.
- Heading hierarchy preserved (H1 through H6)
- Tables converted to Markdown pipe syntax
- Ordered and unordered lists maintained
- Bold and italic emphasis detected
- Clean chunk boundaries for RAG pipelines
# Quarterly Financial Report
## Q4 2025
**Prepared by:** Acme Corp
**Date:** January 15, 2026
---
## Executive Summary
Revenue for Q4 2025 reached **$4.2M**, a 23% increase
over the previous quarter. Operating margins improved
to 18.5%, driven by reduced infrastructure costs.
## Key Metrics
| Metric | Value |
|--------|-------|
| Revenue | $4,200,000 |
| Operating Margin | 18.5% |
| Customer Acquisition Cost | $142 |
| Monthly Active Users | 52,400 |
## Recommendations
1. Expand API capacity to handle projected Q1 growth
2. Invest in automated onboarding pipeline
3. Evaluate enterprise tier pricing model
const response = await fetch(
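  "https://api.pdfpipe.dev/v1/convert",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer pk_live_...",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      url: "https://example.com/whitepaper.pdf",
      format: "markdown",
    }),
  }
);
// Fail fast on HTTP errors, since fetch only rejects on network failures
if (!response.ok) {
  throw new Error(`Conversion request failed: ${response.status}`);
}
const data = await response.json();
// Download the converted Markdown from the result URL
const result = await fetch(data.resultUrl);
const markdown = await result.text();
// Split into chunks for your RAG pipeline.
// The lookahead keeps each heading line attached to its own chunk.
const chunks = markdown.split(/\n(?=#{1,3} )/).filter(Boolean);
// Feed each chunk to your vector database
// (vectorDb and embed are placeholders for your own clients)
for (const chunk of chunks) {
  await vectorDb.upsert({
    content: chunk,
    embedding: await embed(chunk),
  });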
}
Built for AI workflows
Feed PDFs into ChatGPT, Claude, or any LLM. Build RAG pipelines that chunk on heading boundaries. Index documents into Pinecone, Weaviate, or Chroma. PDFPipe gives you the Markdown that makes all of it work.
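Because tables arrive in Markdown pipe syntax, they can also be turned into structured rows before prompting or indexing. The following is a minimal sketch, assuming simple tables with a header row, a delimiter row, and no escaped pipes inside cells. `parsePipeTable` is a hypothetical helper, not part of PDFPipe.

```javascript
// Sketch: turn a Markdown pipe table into an array of row objects.
// parsePipeTable is an illustrative helper, not a PDFPipe API.
function parsePipeTable(tableText) {
  // Strip the outer pipes, then split on the inner ones and trim each cell
  const splitRow = (line) =>
    line
      .trim()
      .replace(/^\|/, "")
      .replace(/\|$/, "")
      .split("|")
      .map((cell) => cell.trim());

  const lines = tableText
    .split(/\r?\n/)
    .filter((line) => line.trim().startsWith("|"));
  if (lines.length < 2) return [];

  const headers = splitRow(lines[0]);
  // lines[1] is the delimiter row (|---|---|); data rows start at index 2
  return lines.slice(2).map((line) => {
    const cells = splitRow(line);
    return Object.fromEntries(headers.map((h, i) => [h, cells[i] ?? ""]));
  });
}
```

Applied to a two-column table with `Metric` and `Value` headers, this yields one object per row keyed by those headers, which is often easier to filter or serialize than raw table text.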
Start converting PDFs to Markdown today
Free tier includes 10 requests per month. No credit card required.