feat: Add OCR fallback when MLM is unavailable for image processing

feat: Add OCR fallback when MLM is unavailable for image processing - Add OCR text extraction using easyocr when MLM client/model is not configured - Support both Chinese and English text recognition - Add OCR results under "OCR Text" section in markdown output - Only execute OCR as fallback when MLM description is not available
2024-12-15 11:52:58 +08:00 · 2024-12-15 11:52:58 +08:00 · 02cc0cef84
commit 02cc0cef84
parent 81e3f24acd
1 changed files with 14 additions and 0 deletions
--- a/src/markitdown/_markitdown.py
+++ b/src/markitdown/_markitdown.py
@ -798,6 +798,20 @@ class ImageConverter(MediaConverter):
                ).strip()
                + "\n"
            )
+        # add ocr only when MLM is not available
+        if mlm_client is None or mlm_model is None:
+            try:
+                import easyocr
+                reader = easyocr.Reader(['ch_sim','en']) # support chinese and english 
+                ocr_result = reader.readtext(local_path)
+                if ocr_result:
+                    md_content += "\n"
+                    for detection in ocr_result:
+                        text = detection[1]  # extract text
+                        md_content += f"- {text}\n"
+            except ImportError:
+                # easyocr not installed
+                pass

        return DocumentConverterResult(
            title=None,