@@ -43,6 +52,13 @@ You can help by looking at issues or helping review PRs. Any issue or PR is welc
- Run pre-commit checks before submitting a PR: `pre-commit run --all-files`
+- Run pre-commit checks before submitting a PR: `pre-commit run --all-files`
+
+
+### How to Contribute
+
+You can help by looking at issues or helping review PRs. Any issue or PR is welcome, but we have also marked some as 'open for contribution' and 'open for reviewing' to help facilitate community contributions. These are ofcourse just suggestions and you are welcome to contribute in any way you like.
+
## Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
From 38295fb235189edd1409bd2d3d78f862d83a5044 Mon Sep 17 00:00:00 2001
From: Lalitha A R <165548623+lalithaar@users.noreply.github.com>
Date: Wed, 25 Dec 2024 13:45:21 +0530
Subject: [PATCH 11/15] Update contributing.md
---
docs/contributor-guide/contributing.md | 62 +++++++++++++++++---------
1 file changed, 40 insertions(+), 22 deletions(-)
diff --git a/docs/contributor-guide/contributing.md b/docs/contributor-guide/contributing.md
index 37c104f..149792a 100644
--- a/docs/contributor-guide/contributing.md
+++ b/docs/contributor-guide/contributing.md
@@ -1,29 +1,45 @@
-# How to Contribute
+# Contributing to MarkItDown
-This project welcomes contributions and suggestions.
+Welcome! We're pleased that you're considering contributing to this project. Whether you're fixing a typo, reporting a bug, suggesting a feature, or writing code, your contributions are highly valued and appreciated.
## Steps to Contribute
-1. Fork the repository.
-2. Create a branch for your feature or bug fix.
-3. Write your code and tests.
-4. Submit a pull request.
+
+Follow these steps to get started:
+
+1. **Fork the Repository**
+ Create a copy of the repository by forking it on GitHub.
+
+2. **Create a Branch**
+ Make a branch for your feature or bug fix. Use a meaningful name like `feature/add-login` or `fix/typo-readme`.
+
+3. **Write Your Code**
+ Add your changes, write tests if necessary, and ensure your code is clean and well-documented.
+
+4. **Run Tests and Pre-Commit Checks**
+ Before submitting, please make sure your code passes all tests and follows the code formatting guidelines (see the [Running Tests and Checks](#running-tests-and-checks) section).
+
+5. **Submit a Pull Request (PR)**
+ Open a pull request to share your changes with us. Reviewers will help you improve it.
## Contributor License Agreement (CLA)
-Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
-the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
+Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
-When you submit a pull request, a CLA bot will automatically determine whether you need to provide
-a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
-provided by the bot. You will only need to do this once across all repos using our CLA.
+- When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment).
+- Follow the instructions provided by the bot.
+- You will only need to do this once across all repos using our CLA.
-This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
-For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
-contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
+## Code of Conduct
+
+We have adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). By participating in this project, you agree to uphold these standards.
+
+- **For FAQs or more information:** Visit the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq).
+- **For questions or concerns:** Contact [opencode@microsoft.com](mailto:opencode@microsoft.com).
## Getting Started
-To start contributing, refer to the [Running Tests and Checks](#running-tests-and-checks) section.
-## Issues and PRs
+Ready to contribute? Start here:
+
+### Issues and PRs
You can help by looking at issues or helping review PRs. Any issue or PR is welcome, but we have also marked some as 'open for contribution' and 'open for reviewing' to help facilitate community contributions. These are of course just suggestions and you are welcome to contribute in any way you like.
@@ -35,7 +51,7 @@ You can help by looking at issues or helping review PRs. Any issue or PR is welc
-### Running Tests and Checks
+## Running Tests and Checks
- Install `hatch` in your environment and run tests:
```sh
@@ -52,12 +68,12 @@ You can help by looking at issues or helping review PRs. Any issue or PR is welc
- Run pre-commit checks before submitting a PR: `pre-commit run --all-files`
-- Run pre-commit checks before submitting a PR: `pre-commit run --all-files`
+### Pre-Commit Checks
-
-### How to Contribute
-
-You can help by looking at issues or helping review PRs. Any issue or PR is welcome, but we have also marked some as 'open for contribution' and 'open for reviewing' to help facilitate community contributions. These are ofcourse just suggestions and you are welcome to contribute in any way you like.
+Before submitting your pull request, run these checks to ensure code quality:
+```sh
+pre-commit run --all-files
+```
## Trademarks
@@ -66,3 +82,5 @@ trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos are subject to those third-party's policies.
+
+Thank you for helping make this project better!
From 351a70b7e4ac51bd6e53031d6637f41545f591a5 Mon Sep 17 00:00:00 2001
From: Lalitha A R <165548623+lalithaar@users.noreply.github.com>
Date: Wed, 25 Dec 2024 13:48:15 +0530
Subject: [PATCH 12/15] Create code of conduct.md
---
docs/contributor-guide/code of conduct.md | 52 +++++++++++++++++++++++
1 file changed, 52 insertions(+)
create mode 100644 docs/contributor-guide/code of conduct.md
diff --git a/docs/contributor-guide/code of conduct.md b/docs/contributor-guide/code of conduct.md
new file mode 100644
index 0000000..b59dbb0
--- /dev/null
+++ b/docs/contributor-guide/code of conduct.md
@@ -0,0 +1,52 @@
+>This project has adopted the Microsoft Open Source Code of Conduct.
+>
+
+This code of conduct outlines expectations for participation in Microsoft-managed open source communities, as well as steps for reporting unacceptable behavior. We are committed to providing a welcoming and inspiring community for all. People violating this code of conduct may be banned from the community.
+
+
+
+Our Pledge
+We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
+
+We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
+
+Our Standards
+Examples of behavior that contributes to a positive environment for our community include:
+
+Demonstrating empathy and kindness toward other people
+Being respectful of differing opinions, viewpoints, and experiences
+Giving and gracefully accepting constructive feedback
+Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
+Focusing on what is best not just for us as individuals, but for the overall community
+
+Examples of unacceptable behavior include:
+
+The use of sexualized language or imagery, and sexual attention or advances of any kind
+Trolling, insulting or derogatory comments, and personal or political attacks
+Public or private harassment
+Disruptive behavior
+Publishing others' private information, such as a physical or email address, without their explicit permission
+Other conduct which could reasonably be considered inappropriate in a professional setting
+Enforcement Responsibilities
+Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
+
+Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
+
+Scope
+This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
+
+This Code of Conduct also applies to actions taken outside of these spaces, and which have a negative impact on community health.
+
+Enforcement and Reporting
+We encourage all communities to resolve issues on their own whenever possible. Instances of abusive, harassing, or otherwise unacceptable behavior should be reported to the community leaders responsible for enforcement in a given project or to opencode@microsoft.com. If you are a Microsoft employee looking for support, please use the Community 911 reporting process.
+
+Your report will be handled in accordance with the issue resolution process described in the Code of Conduct FAQ. All project and community leaders are obligated to respect the privacy and security of the reporter of any incident.
+
+Attribution
+This Code of Conduct is adapted from the Contributor Covenant, version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
+
+Community Impact Guidelines were inspired by Mozilla's code of conduct enforcement ladder.
+
+Expanding scope to include external impact on community health inspired by Facebook's Open Source Code of Conduct and Mozilla's Community Participation Guidelines.
+
+For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.
From c4e4a747e38370581d89f4c30f6782cd92c33939 Mon Sep 17 00:00:00 2001
From: Lalitha A R <165548623+lalithaar@users.noreply.github.com>
Date: Wed, 25 Dec 2024 13:55:27 +0530
Subject: [PATCH 13/15] Update code of conduct.md
---
docs/contributor-guide/code of conduct.md | 54 +++++++++++------------
1 file changed, 25 insertions(+), 29 deletions(-)
diff --git a/docs/contributor-guide/code of conduct.md b/docs/contributor-guide/code of conduct.md
index b59dbb0..d4c1d22 100644
--- a/docs/contributor-guide/code of conduct.md
+++ b/docs/contributor-guide/code of conduct.md
@@ -1,52 +1,48 @@
->This project has adopted the Microsoft Open Source Code of Conduct.
->
+>This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
+>We have adopted the Microsoft Open Source Code of Conduct. By participating in this project, you agree to uphold these standards.
-This code of conduct outlines expectations for participation in Microsoft-managed open source communities, as well as steps for reporting unacceptable behavior. We are committed to providing a welcoming and inspiring community for all. People violating this code of conduct may be banned from the community.
+>For FAQs or more information: Visit the [Code of Conduct FAQ](https://www.contributor-covenant.org/faq/).
+>For questions or concerns: Contact [opencode@microsoft.com](mailto:opencode@microsoft.com).
+**This code of conduct outlines expectations for participation in Microsoft-managed open source communities, as well as steps for reporting unacceptable behavior. We are committed to providing a welcoming and inspiring community for all. People violating this code of conduct may be banned from the community.**
-
-Our Pledge
+# Our Pledge
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
-Our Standards
+# Our Standards
Examples of behavior that contributes to a positive environment for our community include:
-Demonstrating empathy and kindness toward other people
-Being respectful of differing opinions, viewpoints, and experiences
-Giving and gracefully accepting constructive feedback
-Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
-Focusing on what is best not just for us as individuals, but for the overall community
+- Demonstrating empathy and kindness toward other people
+- Being respectful of differing opinions, viewpoints, and experiences
+- Giving and gracefully accepting constructive feedback
+- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
+- Focusing on what is best not just for us as individuals, but for the overall community
Examples of unacceptable behavior include:
-The use of sexualized language or imagery, and sexual attention or advances of any kind
-Trolling, insulting or derogatory comments, and personal or political attacks
-Public or private harassment
-Disruptive behavior
-Publishing others' private information, such as a physical or email address, without their explicit permission
-Other conduct which could reasonably be considered inappropriate in a professional setting
-Enforcement Responsibilities
+- The use of sexualized language or imagery, and sexual attention or advances of any kind
+- Trolling, insulting or derogatory comments, and personal or political attacks
+- Public or private harassment
+- Disruptive behavior
+- Publishing others' private information, such as a physical or email address, without their explicit permission
+- Other conduct which could reasonably be considered inappropriate in a professional setting
+
+# Enforcement Responsibilities
+
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
-Scope
+# Scope
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
This Code of Conduct also applies to actions taken outside of these spaces, and which have a negative impact on community health.
-Enforcement and Reporting
-We encourage all communities to resolve issues on their own whenever possible. Instances of abusive, harassing, or otherwise unacceptable behavior should be reported to the community leaders responsible for enforcement in a given project or to opencode@microsoft.com. If you are a Microsoft employee looking for support, please use the Community 911 reporting process.
+# Enforcement and Reporting
+
+We encourage all communities to resolve issues on their own whenever possible. Instances of abusive, harassing, or otherwise unacceptable behavior should be reported to the community leaders responsible for enforcement in a given project or to opencode@microsoft.com. If you are a Microsoft employee looking for support, please use the [Community 911 reporting process](https://aka.ms/community-911-landingpage).
Your report will be handled in accordance with the issue resolution process described in the Code of Conduct FAQ. All project and community leaders are obligated to respect the privacy and security of the reporter of any incident.
-Attribution
-This Code of Conduct is adapted from the Contributor Covenant, version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
-
-Community Impact Guidelines were inspired by Mozilla's code of conduct enforcement ladder.
-
-Expanding scope to include external impact on community health inspired by Facebook's Open Source Code of Conduct and Mozilla's Community Participation Guidelines.
-
-For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.
From 40d0ac284f4ef26957a0824fff13f1764ba6af7a Mon Sep 17 00:00:00 2001
From: Lalitha A R <165548623+lalithaar@users.noreply.github.com>
Date: Wed, 25 Dec 2024 13:57:40 +0530
Subject: [PATCH 14/15] Delete docs/contributor-guide/code of conduct.md
---
docs/contributor-guide/code of conduct.md | 48 -----------------------
1 file changed, 48 deletions(-)
delete mode 100644 docs/contributor-guide/code of conduct.md
diff --git a/docs/contributor-guide/code of conduct.md b/docs/contributor-guide/code of conduct.md
deleted file mode 100644
index d4c1d22..0000000
--- a/docs/contributor-guide/code of conduct.md
+++ /dev/null
@@ -1,48 +0,0 @@
->This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
->We have adopted the Microsoft Open Source Code of Conduct. By participating in this project, you agree to uphold these standards.
-
->For FAQs or more information: Visit the [Code of Conduct FAQ](https://www.contributor-covenant.org/faq/).
->For questions or concerns: Contact [opencode@microsoft.com](mailto:opencode@microsoft.com).
-
-**This code of conduct outlines expectations for participation in Microsoft-managed open source communities, as well as steps for reporting unacceptable behavior. We are committed to providing a welcoming and inspiring community for all. People violating this code of conduct may be banned from the community.**
-
-# Our Pledge
-We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
-
-We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
-
-# Our Standards
-Examples of behavior that contributes to a positive environment for our community include:
-
-- Demonstrating empathy and kindness toward other people
-- Being respectful of differing opinions, viewpoints, and experiences
-- Giving and gracefully accepting constructive feedback
-- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
-- Focusing on what is best not just for us as individuals, but for the overall community
-
-Examples of unacceptable behavior include:
-
-- The use of sexualized language or imagery, and sexual attention or advances of any kind
-- Trolling, insulting or derogatory comments, and personal or political attacks
-- Public or private harassment
-- Disruptive behavior
-- Publishing others' private information, such as a physical or email address, without their explicit permission
-- Other conduct which could reasonably be considered inappropriate in a professional setting
-
-# Enforcement Responsibilities
-
-Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
-
-Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
-
-# Scope
-This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
-
-This Code of Conduct also applies to actions taken outside of these spaces, and which have a negative impact on community health.
-
-# Enforcement and Reporting
-
-We encourage all communities to resolve issues on their own whenever possible. Instances of abusive, harassing, or otherwise unacceptable behavior should be reported to the community leaders responsible for enforcement in a given project or to opencode@microsoft.com. If you are a Microsoft employee looking for support, please use the [Community 911 reporting process](https://aka.ms/community-911-landingpage).
-
-Your report will be handled in accordance with the issue resolution process described in the Code of Conduct FAQ. All project and community leaders are obligated to respect the privacy and security of the reporter of any incident.
-
From 10e57415d209e8cd9ba26b0d8bb06acf4ceb4c72 Mon Sep 17 00:00:00 2001
From: Lalitha A R <165548623+lalithaar@users.noreply.github.com>
Date: Wed, 25 Dec 2024 13:59:14 +0530
Subject: [PATCH 15/15] Delete docs/user-guide/readme.md
---
docs/user-guide/readme.md | 110 --------------------------------------
1 file changed, 110 deletions(-)
delete mode 100644 docs/user-guide/readme.md
diff --git a/docs/user-guide/readme.md b/docs/user-guide/readme.md
deleted file mode 100644
index 1002576..0000000
--- a/docs/user-guide/readme.md
+++ /dev/null
@@ -1,110 +0,0 @@
-> [!IMPORTANT]
-> (12/19/24) Hello! MarkItDown team members will be resting and recharging with family and friends over the holiday period. Activity/responses on the project may be delayed during the period of Dec 21-Jan 06. We will be excited to engage with you in the new year!
-
-# MarkItDown
-
-[](https://pypi.org/project/markitdown/)
-
-[](https://github.com/microsoft/autogen)
-
-
-MarkItDown is a utility for converting various files to Markdown (e.g., for indexing, text analysis, etc).
-It supports:
-- PDF
-- PowerPoint
-- Word
-- Excel
-- Images (EXIF metadata and OCR)
-- Audio (EXIF metadata and speech transcription)
-- HTML
-- Text-based formats (CSV, JSON, XML)
-- ZIP files (iterates over contents)
-
-To install MarkItDown, use pip: `pip install markitdown`. Alternatively, you can install it from the source: `pip install -e .`
-
-## Usage
-
-### Command-Line
-
-```bash
-markitdown path-to-file.pdf > document.md
-```
-
-Or use `-o` to specify the output file:
-
-```bash
-markitdown path-to-file.pdf -o document.md
-```
-
-You can also pipe content:
-
-```bash
-cat path-to-file.pdf | markitdown
-```
-
-### Python API
-
-Basic usage in Python:
-
-```python
-from markitdown import MarkItDown
-
-md = MarkItDown()
-result = md.convert("test.xlsx")
-print(result.text_content)
-```
-
-To use Large Language Models for image descriptions, provide `llm_client` and `llm_model`:
-
-```python
-from markitdown import MarkItDown
-from openai import OpenAI
-
-client = OpenAI()
-md = MarkItDown(llm_client=client, llm_model="gpt-4o")
-result = md.convert("example.jpg")
-print(result.text_content)
-```
-
-### Docker
-
-```sh
-docker build -t markitdown:latest .
-docker run --rm -i markitdown:latest < ~/your-file.pdf > output.md
-```
-
-
-Batch Processing Multiple Files
-
-This example shows how to convert multiple files to markdown format in a single run. The script processes all supported files in a directory and creates corresponding markdown files.
-
-
-```python convert.py
-from markitdown import MarkItDown
-from openai import OpenAI
-import os
-client = OpenAI(api_key="your-api-key-here")
-md = MarkItDown(llm_client=client, llm_model="gpt-4o-2024-11-20")
-supported_extensions = ('.pptx', '.docx', '.pdf', '.jpg', '.jpeg', '.png')
-files_to_convert = [f for f in os.listdir('.') if f.lower().endswith(supported_extensions)]
-for file in files_to_convert:
- print(f"\nConverting {file}...")
- try:
- md_file = os.path.splitext(file)[0] + '.md'
- result = md.convert(file)
- with open(md_file, 'w') as f:
- f.write(result.text_content)
-
- print(f"Successfully converted {file} to {md_file}")
- except Exception as e:
- print(f"Error converting {file}: {str(e)}")
-
-print("\nAll conversions completed!")
-```
-2. Place the script in the same directory as your files
-3. Install required packages: like openai
-4. Run script ```bash python convert.py ```
-
-Note that original files will remain unchanged and new markdown files are created with the same base name.
-
-