Extract Structured Data from Documents with Claude
Turn long documents into clean JSON, tables, or custom schemas. Claude's long context window and reliable instruction-following make it well suited to this kind of extraction.
Use cases
- Extract key terms from contracts
- Turn meeting transcripts into action items with owners and dates
- Parse resumes into structured profiles
- Convert unstructured notes into a database-ready format
Step 1: Define your schema
Write the exact structure you want. Example: {"summary": "string", "action_items": [{"task": "string", "owner": "string", "due": "YYYY-MM-DD"}], "decisions": ["string"]}
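As a minimal sketch, the example schema can also live in code so the same definition feeds both the prompt and later validation. The names SCHEMA and schema_as_text are illustrative, not part of any library.

```python
import json

# The example schema from Step 1, kept as a Python dict so it can be reused.
# The string values are type hints for the model, not real data.
SCHEMA = {
    "summary": "string",
    "action_items": [{"task": "string", "owner": "string", "due": "YYYY-MM-DD"}],
    "decisions": ["string"],
}


def schema_as_text(schema: dict) -> str:
    """Render the schema as indented JSON, ready to paste into a prompt."""
    return json.dumps(schema, indent=2)
```

Keeping one definition avoids the prompt and the validation code drifting apart when the schema changes.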
Step 2: Craft the prompt
"Extract the following from this document. Return valid JSON matching this schema exactly. If a field is missing, use null. Do not add fields not in the schema."
Paste the schema and the document after this instruction. For very long documents, either send the whole text if it fits within Claude's 200K-token context window, or chunk it by section and merge the results.
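The prompt assembly and the chunk-and-merge approach might look roughly like this. It is a sketch under stated assumptions: sections are taken to be separated by blank lines, the 20,000-character limit is an arbitrary placeholder, and the helper names (build_prompt, chunk_by_section, merge_results) are hypothetical. The merge step assumes the Step 1 example schema.

```python
INSTRUCTIONS = (
    "Extract the following from this document. Return valid JSON matching "
    "this schema exactly. If a field is missing, use null. "
    "Do not add fields not in the schema."
)


def build_prompt(schema_text: str, document: str) -> str:
    """Combine the instruction, the schema, and the document into one prompt."""
    return f"{INSTRUCTIONS}\n\nSchema:\n{schema_text}\n\nDocument:\n{document}"


def chunk_by_section(document: str, max_chars: int = 20000) -> list[str]:
    """Pack blank-line-separated sections into chunks of at most max_chars.

    Assumption: a blank line marks a section boundary. A single section
    longer than max_chars is kept whole as its own oversized chunk.
    """
    chunks, current = [], ""
    for section in document.split("\n\n"):
        if current and len(current) + len(section) + 2 > max_chars:
            chunks.append(current)
            current = section
        else:
            current = f"{current}\n\n{section}" if current else section
    if current:
        chunks.append(current)
    return chunks


def merge_results(results: list[dict]) -> dict:
    """Merge per-chunk extractions: join summaries, concatenate the lists."""
    summaries = [r["summary"] for r in results if r.get("summary")]
    merged = {"summary": " ".join(summaries), "action_items": [], "decisions": []}
    for r in results:
        # `or []` treats a null field the same as an empty list.
        merged["action_items"].extend(r.get("action_items") or [])
        merged["decisions"].extend(r.get("decisions") or [])
    return merged
```

Joining the per-chunk summaries is the simplest merge. A second pass that asks Claude to condense them into a single summary may read better.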
Step 3: Validate output
Parse the JSON and check that every required field is present. If the output is malformed, retry with a follow-up such as "Fix the JSON", or catch the problem with validation in code.
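One way to sketch this step, assuming the Step 1 example schema: validate raises ValueError on any problem, and extract_with_retry sends a "Fix the JSON" follow-up until the output passes or the attempts run out. The call_model argument is a hypothetical function you supply that sends a prompt to Claude and returns the reply text. It is not a real SDK call.

```python
import json
from datetime import datetime
from typing import Callable

REQUIRED_FIELDS = ("summary", "action_items", "decisions")


def validate(raw: str) -> dict:
    """Parse model output and check it against the example schema.

    Raises ValueError for malformed JSON, missing fields, or bad dates.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for item in data["action_items"] or []:
        if not isinstance(item, dict):
            raise ValueError("each action item must be an object")
        due = item.get("due")
        if due is not None:
            if not isinstance(due, str):
                raise ValueError("due must be a string or null")
            datetime.strptime(due, "%Y-%m-%d")  # ValueError if not YYYY-MM-DD
    return data


def extract_with_retry(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call the model, validate, and ask for a fix when validation fails."""
    raw = call_model(prompt)
    for attempt in range(1, max_attempts + 1):
        try:
            return validate(raw)
        except ValueError as err:
            if attempt == max_attempts:
                raise
            # Send the error and the bad output back so the model can repair it.
            raw = call_model(f"Fix the JSON. Problem: {err}\n\n{raw}")
```

Including the validation error in the retry message gives the model something concrete to correct, which a bare "Fix the JSON" does not.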
Step 4: Integrate
Feed the JSON into your DB, CRM, or next workflow step. Use Make, n8n, or Zapier to automate: new doc → Claude extraction → write to Airtable/Notion/Sheets.
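If the destination is a spreadsheet, one simple hand-off is to flatten the action items into CSV, which Sheets and Airtable can import. This sketch assumes the Step 1 example schema. The function name action_items_to_csv is illustrative.

```python
import csv
import io


def action_items_to_csv(extraction: dict) -> str:
    """Flatten the action_items list into CSV text with a header row."""
    columns = ["task", "owner", "due"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for item in extraction.get("action_items") or []:
        # Null or missing values become empty cells.
        writer.writerow({col: item.get(col) or "" for col in columns})
    return buffer.getvalue()
```

Automation tools such as Make, n8n, or Zapier can usually take the JSON directly, so a flattening step like this mainly helps for manual imports or custom scripts.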
Tips
- Include 1–2 examples in the prompt for complex schemas.
- For tables, ask for Markdown or CSV if that's easier to parse.
- Always verify critical extractions (legal, financial) with a human.
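For the table tip above, a Markdown table returned by the model can be turned into rows with a few lines of code. This is a rough sketch, assuming a standard pipe-delimited table with a header row followed by a separator row. It does not handle escaped pipes inside cells, and parse_markdown_table is a hypothetical helper name.

```python
def parse_markdown_table(text: str) -> list[dict]:
    """Convert a pipe-delimited Markdown table into a list of row dicts."""

    def cells(line: str) -> list[str]:
        # Drop the outer pipes, then split on the inner ones.
        return [cell.strip() for cell in line.strip().strip("|").split("|")]

    lines = [ln for ln in text.splitlines() if ln.strip().startswith("|")]
    if len(lines) < 2:
        return []
    header = cells(lines[0])
    # lines[1] is the |---|---| separator row, so data starts at index 2.
    return [dict(zip(header, cells(ln))) for ln in lines[2:]]
```

If the model returns CSV instead, Python's built-in csv module handles quoting and embedded commas, which makes it the more robust option for messy cell contents.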