feat: Add OCR fallback when MLM is unavailable for image processing

- Add OCR text extraction using easyocr when MLM client/model is not configured
- Support both Chinese and English text recognition
- Add OCR results under "OCR Text" section in markdown output
- Only execute OCR as fallback when MLM description is not available
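
The fallback described in this commit message could be sketched as a standalone helper, shown here only to illustrate the control flow. The function names (`_get_reader`, `format_ocr_section`, `ocr_fallback`), the cached reader, and the exact `# OCR Text:` heading format are assumptions for illustration and are not part of this commit. The only easyocr calls used are `easyocr.Reader(lang_list)` and `Reader.readtext(path)`, which returns `(bbox, text, confidence)` tuples by default.

```python
from functools import lru_cache
from typing import Any, Optional, Sequence, Tuple


@lru_cache(maxsize=1)
def _get_reader(languages: Tuple[str, ...] = ("ch_sim", "en")) -> Optional[Any]:
    """Build the easyocr reader once (hypothetical helper).

    Returns None when easyocr is not installed, so callers can skip OCR.
    """
    try:
        import easyocr  # optional dependency, imported lazily
    except ImportError:
        return None
    return easyocr.Reader(list(languages))


def format_ocr_section(detections: Sequence[Sequence[Any]]) -> str:
    """Turn easyocr detections into a markdown bullet list.

    Each detection is assumed to be (bbox, text, confidence).
    The "# OCR Text:" heading format is an assumption.
    """
    texts = [d[1].strip() for d in detections if d[1].strip()]
    if not texts:
        return ""
    return "\n# OCR Text:\n" + "".join(f"- {t}\n" for t in texts)


def ocr_fallback(local_path: str, mlm_client: Any = None, mlm_model: Any = None) -> str:
    """Return an OCR section only when no MLM client/model is configured."""
    if mlm_client is not None and mlm_model is not None:
        return ""  # an MLM description is available, so OCR is skipped
    reader = _get_reader()
    if reader is None:
        return ""  # easyocr not installed
    return format_ocr_section(reader.readtext(local_path))
```

Caching the reader avoids reloading the recognition models for every image, which the inline version in the diff would do on each conversion.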
2024-12-15 11:52:58 +08:00
commit 02cc0cef84
parent 81e3f24acd
1 changed file with 14 additions and 0 deletions
--- a/src/markitdown/_markitdown.py
+++ b/src/markitdown/_markitdown.py
@@ -798,6 +798,20 @@ class ImageConverter(MediaConverter):
                ).strip()
                + "\n"
            )
        # Run OCR only as a fallback when no MLM client or model is configured
        if mlm_client is None or mlm_model is None:
            try:
                import easyocr
                reader = easyocr.Reader(["ch_sim", "en"])  # Simplified Chinese and English
                ocr_result = reader.readtext(local_path)
                if ocr_result:
                    md_content += "\n# OCR Text:\n"
                    for detection in ocr_result:
                        text = detection[1]  # each detection is (bbox, text, confidence)
                        md_content += f"- {text}\n"
            except ImportError:
                # easyocr not installed
                pass
        return DocumentConverterResult(
            title=None,