Vision OCR LLM
by John Snow Labs Inc
Extracts text from forms, invoices, receipts, medical records, legal documents, and complex structured layouts.
This 30B-parameter vision-language model delivers production-grade optical character recognition with enterprise-level accuracy across diverse document types. Powered by a Mixture-of-Experts architecture that activates only 3B parameters per token, the model achieves exceptional OCR performance while maintaining computational efficiency. It excels at extracting text from forms, invoices, receipts, medical records, legal documents, and complex structured layouts, achieving 88% accuracy on industry-standard OCR benchmarks.
With specialized training in form understanding, it achieves a Character Error Rate of 14.7 on the FUNSD benchmark, making it well suited to automated document processing pipelines.
The 32K context window enables processing of multi-page documents and batch operations in a single inference pass.
Optimized for high-throughput production environments, it processes thousands of documents efficiently while maintaining consistent accuracy across diverse document formats including tables, multi-column layouts, and mixed-content documents.
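To illustrate how multi-page documents might be grouped to fit the 32K context window, here is a minimal sketch of greedy page batching. It is not part of the product: the function name pack_pages, the per-page token counts, the reading of "32K" as 32,000 tokens, and the reserve held back for the prompt and the generated text are all assumptions made for illustration. Actual token usage per page depends on image resolution and on the model's tokenizer.

```python
from typing import List

# Assumption: "32K" is read as 32,000 tokens; the exact limit may differ.
CONTEXT_TOKENS = 32_000


def pack_pages(page_tokens: List[int], budget: int = CONTEXT_TOKENS,
               reserve: int = 2_000) -> List[List[int]]:
    """Group page indices into batches whose token totals fit the budget.

    page_tokens holds a hypothetical token estimate for each page.
    reserve keeps room for the prompt and the generated text.
    """
    usable = budget - reserve
    if usable <= 0:
        raise ValueError("reserve must be smaller than budget")

    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, tokens in enumerate(page_tokens):
        if tokens > usable:
            raise ValueError(f"page {index} alone exceeds the usable budget")
        # Close the current batch when the next page would overflow it.
        if current and used + tokens > usable:
            batches.append(current)
            current, used = [], 0
        current.append(index)
        used += tokens
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Four pages with made-up token estimates.
    print(pack_pages([12_000, 12_000, 12_000, 5_000]))
```

Pages stay in their original order, so each batch corresponds to a contiguous page range of the source document.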
OCR Performance
Achieves 88% accuracy on OCRBench evaluations
Demonstrates a Character Error Rate of 14.7 on the FUNSD form understanding benchmark
Handles 20+ languages with consistent accuracy
Robust text extraction from receipts, invoices, forms, and business documents
Excellent performance on complex layouts and structured documents
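For readers unfamiliar with the metric, Character Error Rate is conventionally the character-level edit distance (substitutions, deletions, and insertions) between the OCR output and the reference text, divided by the number of reference characters. The sketch below shows that conventional definition. The listing does not state how its FUNSD figure was computed or normalized, so treating 14.7 as a percentage under this definition is an assumption.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance computed one row at a time."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One wrong character out of seven.
    print(character_error_rate("invoice", "invoise"))
```

Lower is better: a score of 0 means the output matches the reference exactly, and the value can exceed 1 when the output contains many spurious characters.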
Technical Specifications
30B total parameters with 3B active per inference (MoE architecture)
Maximum context length: 32K tokens
Image resolution: Up to 8MP/4K (3840 × 2160)
Fast inference through efficient architecture design
Supports batch processing for high-volume workflows
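As a worked example of the resolution limit, 3840 × 2160 comes to 8,294,400 pixels, or roughly 8.3 MP. The sketch below computes downscaled dimensions for an oversized scan so that it fits within that pixel count while keeping its aspect ratio. Reading the limit as a total pixel budget (rather than separate width and height caps) is an assumption, and the helper name fit_to_pixel_budget is hypothetical. The listing does not say how oversized images are handled.

```python
import math
from typing import Tuple

# 3840 x 2160 = 8,294,400 pixels, about 8.3 MP.
MAX_PIXELS = 3840 * 2160


def fit_to_pixel_budget(width: int, height: int,
                        max_pixels: int = MAX_PIXELS) -> Tuple[int, int]:
    """Return dimensions scaled down to stay within max_pixels.

    Images already within the budget are returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    pixels = width * height
    if pixels <= max_pixels:
        return width, height
    # Scaling both sides by sqrt(ratio) scales the area by ratio.
    scale = math.sqrt(max_pixels / pixels)
    new_width = max(1, math.floor(width * scale))
    new_height = max(1, math.floor(height * scale))
    return new_width, new_height


if __name__ == "__main__":
    # An 8K scan is halved on each side to reach 4K.
    print(fit_to_pixel_budget(7680, 4320))
```

Rounding down on both sides keeps the result at or below the budget, at the cost of a sub-pixel change in aspect ratio.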
Document Understanding
Strong performance on charts and data visualizations
Excellent table extraction and structure preservation
Reliable text extraction from complex multi-column layouts
Handles documents with varying quality and orientations
Effective processing of mixed-content documents
Production Advantages
Real-time inference suitable for automated workflows
Consistent performance across diverse document types
Optimized for integration with document management systems
Balances accuracy and speed for enterprise-scale deployments
Ideal for high-volume document processing pipelines
Other apps from John Snow Labs Inc
Generative AI Lab
John Snow Labs Inc
Generative AI Lab is an end-to-end no-code platform for data labeling and for training DL models and LLMs.
Applicable to: Containers
John Snow Labs - Healthcare NLP
John Snow Labs Inc
NLP & OCR libraries, models and notebooks for text and image annotation and model training & tuning
Applicable to: Virtual Machines
Medical Visual LLM - 30B
John Snow Labs Inc
Medical vision-language model combining top-tier depth and accuracy in processing complex medical cases and literature medical expertise.
Applicable to: Virtual Machines
Medical LLM - Medium
John Snow Labs Inc
Use for chat, RAG, medical summarization, open-book question answering with context of up to 32K tokens.
Applicable to: Virtual Machines
John Snow Labs - Finance and Legal NLP
John Snow Labs Inc
NLP & OCR libraries, models and notebooks for text and image annotation and model training & tuning
Applicable to: Virtual Machines