# MarkItDown Sample Plugin
[![PyPI](https://img.shields.io/pypi/v/markitdown.svg)](https://pypi.org/project/markitdown/)
![PyPI - Downloads](https://img.shields.io/pypi/dd/markitdown)
[![Built by AutoGen Team](https://img.shields.io/badge/Built%20by-AutoGen%20Team-blue)](https://github.com/microsoft/autogen)
This project shows how to create a sample plugin for MarkItDown. The most important parts are as follows:
First, implement your custom DocumentConverter:
```python
from typing import Union

from markitdown import DocumentConverter, DocumentConverterResult


class RtfConverter(DocumentConverter):
    def convert(self, local_path, **kwargs) -> Union[None, DocumentConverterResult]:
        # Bail if not an RTF file
        extension = kwargs.get("file_extension", "")
        if extension.lower() != ".rtf":
            return None

        # Implement the conversion logic here ...

        # Return the result
        return DocumentConverterResult(
            title=title,
            text_content=text_content,
        )
```
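The converter above returns `None` to signal that it does not handle a file. The sketch below illustrates that contract in isolation, without depending on `markitdown`. `StubResult`, `StubRtfConverter`, and `first_match` are hypothetical stand-ins written for this example, and the "first non-`None` result wins" loop is an assumption about how a host might use such converters, not a description of MarkItDown's internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubResult:
    """Hypothetical stand-in for DocumentConverterResult."""
    title: Optional[str]
    text_content: str


class StubRtfConverter:
    """Mirrors the shape of RtfConverter without importing markitdown."""

    def convert(self, local_path, **kwargs) -> Optional[StubResult]:
        extension = kwargs.get("file_extension", "")
        if extension.lower() != ".rtf":
            return None  # Decline: not an RTF file
        # Placeholder "conversion"; a real plugin would parse the RTF here.
        return StubResult(title=None, text_content=f"(converted {local_path})")


def first_match(converters, local_path, **kwargs):
    """Try each converter in order and return the first non-None result."""
    for converter in converters:
        result = converter.convert(local_path, **kwargs)
        if result is not None:
            return result
    return None


converters = [StubRtfConverter()]
accepted = first_match(converters, "notes.rtf", file_extension=".RTF")
declined = first_match(converters, "notes.pdf", file_extension=".pdf")
```

Because the extension check lowercases its input, `.RTF` is accepted, while the `.pdf` file is declined and `first_match` returns `None`.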
Next, make sure your package implements and exports the following:
```python
from markitdown import MarkItDown

# The version of the plugin interface that this plugin uses.
# The only supported version is 1 for now.
__plugin_interface_version__ = 1


# The main entrypoint for the plugin. This is called each time MarkItDown instances are created.
def register_converters(markitdown: MarkItDown, **kwargs):
    """
    Called during construction of MarkItDown instances to register converters provided by plugins.
    """

    # Simply create and attach an RtfConverter instance
    markitdown.register_converter(RtfConverter())
```
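As a quick sanity check, a package can be inspected for the two exports described above. The following is a minimal sketch; `check_plugin_module` and `SUPPORTED_INTERFACE_VERSION` are hypothetical names introduced for this example and are not part of MarkItDown.

```python
import types

# Assumption for this sketch: version 1 is the only supported interface version.
SUPPORTED_INTERFACE_VERSION = 1


def check_plugin_module(module):
    """Return a list of problems found with a candidate plugin module."""
    problems = []
    version = getattr(module, "__plugin_interface_version__", None)
    if version != SUPPORTED_INTERFACE_VERSION:
        problems.append(f"unsupported __plugin_interface_version__: {version!r}")
    if not callable(getattr(module, "register_converters", None)):
        problems.append("missing callable register_converters")
    return problems


# A stand-in module that exports both required names.
good = types.ModuleType("good_plugin")
good.__plugin_interface_version__ = 1
good.register_converters = lambda markitdown, **kwargs: None

# A stand-in module that exports neither.
bad = types.ModuleType("bad_plugin")

good_problems = check_plugin_module(good)
bad_problems = check_plugin_module(bad)
```

Running a check like this against your own package (after `import markitdown_sample_plugin`, for instance) can catch a misspelled export before the plugin is installed.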
Finally, create an entrypoint in the `pyproject.toml` file:
```toml
[project.entry-points."markitdown.plugin"]
sample_plugin = "markitdown_sample_plugin"
```
Here, the key `sample_plugin` can be any name, but should ideally be the name of the plugin. The value is the fully qualified name of the package implementing the plugin.
Once the plugin package is installed (e.g., `pip install -e .`), MarkItDown will automatically discover and register it for use.
## Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos are subject to those third-party's policies.