Commit 6ca52e8

updated fixing
1 parent 6d60865 commit 6ca52e8

2 files changed, +107 -1 lines changed

packages/markitdown/src/README.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
# Openize.MarkItDown

![Python Version](https://img.shields.io/badge/python-3.7%2B-blue)
![License](https://img.shields.io/badge/license-MIT-green)
![Status](https://img.shields.io/badge/status-alpha-orange)

Openize.MarkItDown is a Python package that converts Word, PDF, Excel, and PowerPoint documents into Markdown. It can save the converted output locally or pass it to an LLM for further processing.

## Features

- Convert `.docx`, `.pdf`, `.xlsx`, and `.pptx` files to Markdown.
- Save Markdown files locally or send them to an LLM for processing.
- Structured around the **Factory and Strategy patterns** for extensibility.
- Accepts both Windows and Linux file paths.
- Includes a command-line interface.
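
The Factory and Strategy structure mentioned above can be pictured roughly as follows. This is a minimal sketch, not the package's actual code: the names `Converter`, `DocxConverter`, `XlsxConverter`, `ConverterFactory`, and `to_markdown` are hypothetical, and the converter bodies are placeholders rather than real Aspose calls.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class Converter(ABC):
    """Strategy interface: one implementation per input format (hypothetical)."""

    @abstractmethod
    def convert(self, path: Path) -> str:
        """Return Markdown text for the document at `path`."""


class DocxConverter(Converter):
    def convert(self, path: Path) -> str:
        # Placeholder body; a real converter would call a document library here.
        return f"# {path.stem}\n\n(converted from Word)"


class XlsxConverter(Converter):
    def convert(self, path: Path) -> str:
        # Placeholder body, as above.
        return f"# {path.stem}\n\n(converted from Excel)"


class ConverterFactory:
    """Factory: maps a file extension to the matching strategy class."""

    _registry = {".docx": DocxConverter, ".xlsx": XlsxConverter}

    @classmethod
    def for_path(cls, path: Path) -> Converter:
        suffix = path.suffix.lower()
        try:
            return cls._registry[suffix]()
        except KeyError:
            raise ValueError(f"Unsupported file type: {suffix!r}") from None


def to_markdown(filename: str) -> str:
    """Pick a converter by extension and run it."""
    path = Path(filename)
    return ConverterFactory.for_path(path).convert(path)
```

In a layout like this, supporting another format would mean writing one new `Converter` subclass and adding one registry entry, which is presumably the kind of extensibility the feature list refers to.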

## Requirements

This package depends on the Aspose libraries, which are commercial products:

- [Aspose.Words](https://purchase.aspose.com/buy/words/python)
- [Aspose.Cells](https://purchase.aspose.com/buy/cells/python)
- [Aspose.Slides](https://purchase.aspose.com/buy/slides/python)

You'll need to obtain valid licenses for these libraries separately. The package will install these dependencies, but you're responsible for complying with Aspose's licensing terms.

## Installation

### From TestPyPI

```sh
pip install -i https://test.pypi.org/simple/ openize-markitdown
```

### From Source

```sh
git clone https://github.com/openize-com/Openize.MarkItDown.git
cd Openize.MarkItDown
pip install -e .
```

## Usage

### Command Line Interface

```sh
# Convert a file and save locally
markitdown document.docx

# Specify output directory
markitdown document.docx -o output_folder

# Process with an LLM (requires OPENAI_API_KEY environment variable)
markitdown document.docx --llm
```

### Python API

```python
from openize.markitdown import DocumentProcessor

# Initialize with custom output directory
processor = DocumentProcessor(output_dir="my_markdown_files")

# Convert files and save locally
processor.process_document("document.docx")
processor.process_document("presentation.pptx")
processor.process_document("spreadsheet.xlsx")
processor.process_document("sample.pdf")

# Send to LLM for processing (requires OPENAI_API_KEY environment variable)
processor.process_document("document.docx", insert_into_llm=True)
```
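
To convert a whole folder rather than individual files, the calls above could be wrapped in a small loop. The sketch below is illustrative only: `find_convertible` and `convert_folder` are hypothetical helpers, not part of the package, and the only package call used is `process_document(path)` as shown in this README.

```python
from pathlib import Path

# Extensions listed under Features in this README.
SUPPORTED = {".docx", ".pdf", ".xlsx", ".pptx"}


def find_convertible(folder):
    """Return supported files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )


def convert_folder(folder, processor):
    """Call processor.process_document() on every supported file in `folder`.

    `processor` is expected to be a DocumentProcessor, or any object with a
    compatible process_document(path) method.
    """
    converted = []
    for path in find_convertible(folder):
        processor.process_document(str(path))
        converted.append(path.name)
    return converted
```

With the `processor` created above, a call such as `convert_folder("incoming_docs", processor)` would convert each supported file in that folder and return the names it handled. Subfolders are not searched in this sketch.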

## Environment Variables

- `OPENAI_API_KEY`: Required when using the `insert_into_llm=True` option or the `--llm` flag.
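
Because the LLM step depends on this variable, a caller may want to check for it before requesting LLM processing. The following is a rough sketch under that assumption; `llm_enabled` and `process` are hypothetical helpers, and only the two `process_document` call forms shown in this README are used.

```python
import os


def llm_enabled(environ=None):
    """Return True when OPENAI_API_KEY is set to a non-empty value."""
    env = os.environ if environ is None else environ
    return bool(env.get("OPENAI_API_KEY", "").strip())


def process(processor, filename, environ=None):
    """Convert `filename`; request LLM processing only if a key is available.

    Returns True when the LLM path was used, False otherwise.
    """
    if llm_enabled(environ):
        processor.process_document(filename, insert_into_llm=True)
        return True
    # No key found: fall back to a plain local conversion.
    processor.process_document(filename)
    return False
```

Falling back to a local conversion is one possible policy. Raising an error when the key is missing would be an equally reasonable choice, depending on the caller.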

## Running Tests

```sh
# Install test dependencies
pip install pytest pytest-mock
```

```sh
# Run the tests
pytest
```
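
For contributors adding tests, a test that avoids the Aspose dependencies might look like the sketch below. It is not taken from the project's test suite: `convert_many` is a hypothetical helper, and the standard library's `unittest.mock` is used directly (the `mocker` fixture from pytest-mock wraps the same machinery).

```python
from unittest.mock import MagicMock


def convert_many(processor, filenames):
    """Hypothetical helper under test: convert each file in order."""
    for name in filenames:
        processor.process_document(name)
    return len(filenames)


def test_convert_many_calls_processor_once_per_file():
    # MagicMock stands in for DocumentProcessor, so no Aspose license is needed.
    processor = MagicMock()
    count = convert_many(processor, ["a.docx", "b.pdf"])
    assert count == 2
    assert processor.process_document.call_count == 2
    processor.process_document.assert_any_call("b.pdf")
```

Saved in a file whose name starts with `test_`, a function like this would be collected by `pytest` automatically.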

## Contributing

We appreciate your interest in contributing to this project! To ensure a smooth collaboration, please follow these steps when submitting a pull request:

1. **Fork & Clone** – Fork the repository and clone it to your local machine.
2. **Create a Branch** – Use a new branch for your contribution.
3. **Sign the Contributor License Agreement (CLA)** – Before your first contribution can be accepted, you must sign our CLA via [CLA Assistant](https://cla-assistant.io). You will be prompted to sign it when submitting your first pull request. You can also review the CLA here: [https://cla.openize.com/agreement](https://cla.openize.com/agreement).
4. **Submit a Pull Request (PR)** – Once your changes are ready, open a PR with a clear description.
5. **Review & Feedback** – Our maintainers will review your PR and provide feedback if needed.

By contributing, you agree to the terms of the CLA and confirm that your changes comply with the project's licensing policies.

## License

This package is licensed under the MIT License. However, it depends on the Aspose libraries, which are proprietary and closed-source.

⚠️ Users must obtain a valid license for the Aspose libraries separately. This repository does not include or distribute any proprietary components.

packages/markitdown/src/setup.cfg

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ version = 25.3.1
 author = Openize
 author_email = packages@openize.com
 description = A document converter for Word, PDF, Excel, and PowerPoint to Markdown.
-long_description = file: ../../../../README.md
+long_description = file:README.md
 long_description_content_type = text/markdown
 license = MIT
 license_files = LICENSE
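
A likely reason for this change: setuptools sandboxes the `file:` directive, so it will not read anything outside the directory that contains `setup.cfg`. The old value climbed four levels up from `packages/markitdown/src`, while the new one points at the README added next to `setup.cfg` in this commit. The sketch below shows one way to check such a value. The helper names are hypothetical, and it assumes the option lives under `[metadata]`, as is conventional for setuptools.

```python
import configparser
from pathlib import Path


def file_directive_targets(cfg_text, option="long_description"):
    """Return the paths named by a `file:` value in the [metadata] section."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    value = parser.get("metadata", option)
    if not value.startswith("file:"):
        return []
    # The directive accepts a comma-separated list of files.
    return [part.strip() for part in value[len("file:"):].split(",")]


def stays_inside(project_dir, relative_path):
    """True if `relative_path` resolves to a location under `project_dir`."""
    root = Path(project_dir).resolve()
    target = (root / relative_path).resolve()
    return target == root or root in target.parents


# The two values from the diff above, wrapped in a minimal config.
OLD = "[metadata]\nlong_description = file: ../../../../README.md\n"
NEW = "[metadata]\nlong_description = file:README.md\n"
```

Running `stays_inside("packages/markitdown/src", path)` over the targets of `OLD` and `NEW` separates the path that escapes the project directory from the one that stays within it.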
