Commit graph

  • f86147a536 Merge 928ddab91a into 73ba69d8cd Pedro Miola 2025-02-09 16:44:35 +0000
  • 98928e101a Merge dbf09026bc into 73ba69d8cd Mauro Druwel 2025-02-09 17:59:34 +0600
  • 90367ddec6 Merge 238ff216dc into 73ba69d8cd Marcos Romero Lamas 2025-02-09 12:04:14 +0700
  • 73ba69d8cd For csv files mimetypes.guess_type is returning "application/vnd.ms-excel" on windows causing an invalid mime type in plaintextconverter. In reference to issue: https://github.com/microsoft/markitdown/issues/150 (#273) wunde005 2025-02-08 22:58:13 -0600
  • 016b149f3d Merge branch 'main' into main afourney 2025-02-08 20:54:24 -0800
  • 2a4f7bb6a8 fix: argparse CLI option ordering, fixes #268 (#290) Werner Robitza 2025-02-09 05:50:38 +0100
  • 448d0c7bd7 Fixed formatting. Adam Fourney 2025-02-08 20:49:30 -0800
  • 926c64a1c3 Merge branch 'main' into fix-options afourney 2025-02-08 20:40:11 -0800
  • 7cf5e0bb23 feat(pptx): support image description with LLM for pptx files (#306) masquare 2025-02-09 05:37:34 +0100
  • 959ea53f96 Merge branch 'main' into main afourney 2025-02-08 20:33:15 -0800
  • 3090917a49 Typo fixed (#270) James Hickey 2025-02-09 00:30:13 -0400
  • d000f5a828 Merge branch 'main' into patch-1 afourney 2025-02-08 20:29:07 -0800
  • 7bea2672a0 remove leading and trailing \n for HtmlConverter (#262) ZeyuTeng96 2025-02-09 12:28:35 +0800
  • 621e96ad3f Merge branch 'main' into patch-2 afourney 2025-02-08 20:27:29 -0800
  • d7fa9425b0 Merge b38ece12be into bf6a15e9b5 FeuRicardo 2025-02-05 16:54:51 +0000
  • b38ece12be Merge branch 'main' into embedded_image FeuRicardo 2025-02-05 13:54:49 -0300
  • b5b22a825d Merge 9c4a542193 into bf6a15e9b5 lumin 2025-02-05 20:20:10 +0900
  • 977660f85a Merge 2b4317dc9e into bf6a15e9b5 lumin 2025-02-05 20:20:10 +0900
  • 657d63e5d1 Merge branch 'main' into fix-docker Sebastian Yaghoubi 2025-02-03 00:22:17 -0800
  • 6964e34a96 Merge 5a3ca479f1 into bf6a15e9b5 suke 2025-02-03 16:13:43 +0900
  • 238ff216dc Merge branch 'main' into equation-support Marcos Romero Lamas 2025-02-01 23:22:10 +0100
  • 3dd2f0a118 test: add test file Marcos Romero Lamas 2025-02-01 18:54:14 +0100
  • bf6a15e9b5 Kennyzhang/docintel docs (#312) KennyZhang1 2025-02-01 01:23:26 -0500
  • 4e4ca4e6fb include reference to doc intel setup docs KennyZhang1 2025-01-31 12:21:35 -0500
  • 8df960093b updated docs to include doc intelligence Kenny Zhang 2025-01-31 12:09:14 -0500
  • e562fb4e94 Merge b89f51acdc into bfde857420 Ramazan Değirmenci 2025-01-31 00:45:17 +0000
  • b89f51acdc feat: Add comprehensive XML support with structured Markdown conversion ramomen 2025-01-31 03:40:03 +0300
  • 7a3e9223ca feat(pptx): support image description with LLM for pptx files masquare 2025-01-27 13:18:40 +0100
  • 86ab5483ca Merge branch 'main' into main Athroniaeth 2025-01-26 12:24:06 +0100
  • 4e1ffc677d Update pyproject.toml Ayman Hamed Moustafa 2025-01-26 00:11:16 +0200
  • f1f5c2f2fd Update pyproject.toml Ayman Hamed Moustafa 2025-01-26 00:09:20 +0200
  • bfde857420 Add support for conversion via Document Intelligence (#303) KennyZhang1 2025-01-24 17:09:32 -0500
  • d89b3a3db9 Merge 7b1088cc80 into f58a864951 Ayman Hamed Moustafa 2025-01-24 09:23:36 +0000
  • 7b1088cc80 Merge branch 'main' into main Ayman Hamed Moustafa 2025-01-24 11:23:33 +0200
  • 277c234fd1 more toml import fixes Kenny Zhang 2025-01-23 17:55:57 -0500
  • 8ffbfae913 modified project toml file Kenny Zhang 2025-01-23 17:53:21 -0500
  • 46c4890bb4 formatting changes Kenny Zhang 2025-01-23 17:47:30 -0500
  • 9bbf547517 Add “convert_local_content” method to set failed tests Athroniaeth 2025-01-21 23:59:25 +0100
  • dbc93dd584 Adds tests for adding the “convert_local_content” method Athroniaeth 2025-01-21 23:58:22 +0100
  • fea4a0687e feat: surround eqs and convert them to latex Marcos Romero Lamas 2025-01-21 01:06:39 +0100
  • 002c6d1b30 feat: preprocess eqns before html conversion Marcos Romero Lamas 2025-01-20 01:13:30 +0100
  • ca6dc80e22 feat: add some deps Marcos Romero Lamas 2025-01-20 01:04:12 +0100
  • 1c9a938a44 Merge branch 'main' into feature/llm-description-in-markdown dzemeuksis 2025-01-17 14:33:02 +0100
  • ca5a25140f I changed the prompt as suggested in the PR comments. Michał Zemełka 2025-01-17 14:29:08 +0100
  • 01fea457ed fix: argparse CLI option ordering, fixes #268 Werner Robitza 2025-01-17 11:26:07 +0100
  • 33a0cd8efe small formatting change joshbradley/add-file-input-support Josh Bradley 2025-01-14 18:04:14 -0500
  • 1310bd48ad push doc intel converter to the top of the stack Kenny Zhang 2025-01-14 15:04:25 -0500
  • 1e856c3eb6 Make this a TypeScript SDK uratmangun.ovh 2025-01-13 20:10:34 +0700
  • 8176a4e2cb Make this a TypeScript SDK uratmangun.ovh 2025-01-13 20:10:14 +0700
  • 9230300100 ran tests for docintel and offline for many filetypes Kenny Zhang 2025-01-10 14:11:48 -0500
  • 928ddab91a feat: adding support for images inside docx PedroMiolaSilva 2025-01-10 09:45:32 -0300
  • b211ddbe82 temp fix for ContentFormat import bug Kenny Zhang 2025-01-09 16:03:35 -0500
  • 811e4413aa added isolated doc_intel main conversion function Kenny Zhang 2025-01-09 15:27:03 -0500
  • 62a0d6c082 initialized doc intel client instance field Kenny Zhang 2025-01-09 14:51:02 -0500
  • 06080eb2e8 added DocumentIntelligenceConverter class implementation Kenny Zhang 2025-01-09 14:41:14 -0500
  • 48f1216728 migrate to use HTML converter + add convert_em method to it Raduan77 2025-01-09 20:23:54 +0100
  • d8422ea55e For csv files mimetypes.guess_type is returning "application/vnd.ms-excel" on windows causing an invalid mime type in plaintextconverter. In reference to issue: https://github.com/microsoft/markitdown/issues/150 Eric Wunderlin 2025-01-09 13:22:53 -0600
  • d6debbdaf7 added cli params for doc intel Kenny Zhang 2025-01-09 13:43:16 -0500
  • 42fb33a32e add options to keep data uris VoidIsVoid 2025-01-09 18:40:50 +0800
  • 9db3fec959 Merge remote-tracking branch 'origin/add-epub-support' into add-epub-support Raduan77 2025-01-09 11:22:22 +0100
  • f1d9d1f16c merge w/ main Raduan77 2025-01-09 11:19:57 +0100
  • 3aebc24f2f merge w/ main Raduan77 2025-01-09 11:17:07 +0100
  • 68cc8aa672 add support for EML Raduan77 2025-01-09 11:14:50 +0100
  • 8392721f93 Typo fixed James Hickey 2025-01-08 17:53:06 -0400
  • cbbc829917 Merge branch 'main' into embedded_image FeuRicardo 2025-01-07 17:14:11 -0300
  • a1766c5981 update: cli options added for engine selection tungsten106 2025-01-07 15:05:30 +0800
  • 94876e873e Merge 57ccae421b into f58a864951 AbSadiki 2025-01-06 21:44:54 +0100
  • c47f856250 Merge d46cff8857 into f58a864951 Vijay Soni 2025-01-06 21:44:53 +0100
  • f58a864951 Set exiftool path explicitly. (#267) afourney 2025-01-06 12:43:47 -0800
  • 1bf73938a6 Set exiftool path explicitly. Adam Fourney 2025-01-06 10:22:42 -0800
  • 265aea2edf Removed the holiday away message from README.md (#266) afourney 2025-01-06 09:06:21 -0800
  • 42395849ea Removed the holiday away message from README.md Adam Fourney 2025-01-06 09:04:01 -0800
  • 2f655da810 added the ability to call Ollama client seamlessly Ayman Hamed 2025-01-06 17:11:19 +0200
  • da1007085c Add API endpoints for file conversion Brian Yang 2025-01-06 00:45:58 -0500
  • 08a45fa4bd remove leading and trailing \n for HtmlConverter ZeyuTeng96 2025-01-06 09:59:46 +0800
  • 1c0362f375 rm todo yeungadrian 2025-01-04 13:17:43 +0000
  • dbf09026bc Remove newlines in image alt_text Mauro Druwel 2025-01-04 13:26:27 +0100
  • afda281a67 Add more images Mauro Druwel 2025-01-04 13:08:21 +0100
  • 1b0d1491be Pre-commit Mauro Druwel 2025-01-04 12:54:03 +0100
  • 3a6f023f0b Underscores, length limit, unique name, tests Mauro Druwel 2025-01-04 12:53:02 +0100
  • 0a9e1f4d75 Merge branch 'main' into main Mauro Druwel 2025-01-04 11:11:16 +0100
  • 05b78e7ce1 Recognize json as plain text (if no other handlers are present). (#261) afourney 2025-01-03 16:40:43 -0800
  • 18667a86f7 Forgot the test file! Adam Fourney 2025-01-03 16:38:34 -0800
  • d7b47ae326 Recognize json as plain text (if no other handlers are present). Adam Fourney 2025-01-03 16:30:44 -0800
  • bbacf89b53 remove pandas, use calamine + tabulate yeungadrian 2025-01-04 00:22:59 +0000
  • 436407288f If puremagic has no guesses, try again after ltrim. (#260) afourney 2025-01-03 16:03:11 -0800
  • 7f63bb424b If puremagic has no guesses, try again after ltrim. Adam Fourney 2025-01-03 15:59:33 -0800
  • b95312172f combine xlsx and xls to excel, replace openpxyl/xlrd with calamine yeungadrian 2025-01-03 23:49:37 +0000
  • 7548720917 Merge dd977ca1d8 into 731b39e7f5 Hemanth HM 2025-01-03 23:27:05 +0000
  • 731b39e7f5 Added a test for leading spaces. (#258) afourney 2025-01-03 14:34:33 -0800
  • 5452c6b014 Added a test for leading spaces. Adam Fourney 2025-01-03 14:31:23 -0800
  • 08ed32869e Feature/ Add xls support (#169) yeungadrian 2025-01-03 21:58:17 +0000
  • 3acd5067f8 Merge branch 'main' into feature/xls-support afourney 2025-01-03 13:56:02 -0800
  • fe5824f0d7 Merge branch 'main' into main afourney 2025-01-03 13:42:34 -0800
  • d248621ba4 feat: outlook ".msg" file converter (#196) Murat Can Kurtuluş 2025-01-04 00:34:39 +0300
  • 39e862e3ca Merge branch 'main' into feat/outlook-msg-converter afourney 2025-01-03 13:31:02 -0800
  • a6bfa30628 Merge e2470fc413 into 4678c8a2a4 Tom 2025-01-03 22:30:13 +0100
  • 4678c8a2a4 fix(transcription): IS_AUDIO_TRANSCRIPTION_CAPABLE should be iniztialized (#194) AbSadiki 2025-01-03 16:29:26 -0500
  • e2470fc413 Add Ollama integration for image descriptions Tom 2025-01-03 13:48:19 -0700
  • 61c0c584ab refactor: split _markitdown.py into modular components t3tra-dev 2025-01-03 20:50:21 +0900