Welcome to Visual Intelligence, a desktop application that extracts text from images and PDFs in Turkish and English using Tesseract OCR. It also provides text summarization, keyword extraction, and detection of tables and QR codes/barcodes. The interface is built with TailwindCSS and vanilla JavaScript.
- Text Extraction: Extract text from images and PDFs in Turkish and English using Tesseract OCR.
- Text Summarization: Automatically summarize extracted text to highlight key points.
- Keyword Extraction: Identify and extract important keywords from the text.
- Table Detection: Detect and extract tables from documents for easy data manipulation.
- QR/Barcode Detection: Scan and decode QR codes and barcodes from images.
- User-Friendly Interface: Enjoy a clean and modern interface designed for ease of use.
Visual Intelligence is built using a variety of technologies that work together to provide a seamless experience:
- Python: The core programming language for backend processing.
- Flask: A lightweight web framework for building the app.
- Tesseract OCR: An open-source Optical Character Recognition engine.
- Hugging Face: For advanced NLP tasks like summarization and keyword extraction.
- HTML/CSS: For structuring and styling the application interface.
- TailwindCSS: A utility-first CSS framework for rapid UI development.
- PyWebview: To create a web-based GUI for the desktop application.
- Image Processing Libraries: For handling image manipulation and analysis.
To get started with Visual Intelligence, you can download the latest release from the Releases section.
Make sure you have the following installed on your machine:
- Python 3.x
- pip (Python package installer)
- Tesseract OCR
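A quick way to confirm the prerequisites above is a short Python check. This is a minimal sketch rather than part of the project: the function name `check_prerequisites` is hypothetical, and it assumes the Tesseract executable is installed under the name `tesseract` and is on your PATH.

```python
import importlib.util
import shutil
import sys


def check_prerequisites():
    """Report which prerequisites from the list above are available."""
    return {
        # The interpreter running this script must be Python 3.
        "python3": sys.version_info.major == 3,
        # pip is importable as a module when it is installed for this interpreter.
        "pip": importlib.util.find_spec("pip") is not None,
        # Assumption: the Tesseract CLI is on PATH under the name "tesseract".
        "tesseract": shutil.which("tesseract") is not None,
    }


for name, found in check_prerequisites().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Anything reported as missing should be installed before continuing with the steps below.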
- Clone the Repository: Open your terminal and run the following command:
    git clone https://github.com/Zurdo1007/visual-intelligence.git
- Navigate to the Project Directory:
    cd visual-intelligence
- Install Dependencies: Run the following command to install the required packages:
    pip install -r requirements.txt
- Install Tesseract OCR: Follow the installation instructions for your operating system in the Tesseract documentation, and make sure the Turkish (tur) and English (eng) language data are installed.
- Run the Application: Start the application by executing:
    python app.py
- Access the Application: Open your web browser and go to http://localhost:5000 to start using Visual Intelligence.
Once you have installed Visual Intelligence, you can begin extracting text from images and PDFs. Here’s how to use the main features.
To extract text:
- Upload an Image or PDF: Click on the "Upload" button to select your file.
- Select Language: Choose Turkish or English for text extraction.
- Extract Text: Click on the "Extract" button to process the file. The extracted text will appear in the text area.
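The extraction step can be sketched in Python as a call to the Tesseract command-line tool. This is a minimal sketch, not the application's actual code: the function names and the `LANGUAGE_CODES` mapping are hypothetical, and it assumes the `tesseract` executable is on your PATH with the `tur` and `eng` language data installed. It handles images only; a PDF would first need its pages converted to images.

```python
import subprocess

# Tesseract language codes for the two languages the application supports.
LANGUAGE_CODES = {"Turkish": "tur", "English": "eng"}


def build_tesseract_command(image_path, language):
    """Build a Tesseract CLI call that writes the recognized text to stdout."""
    if language not in LANGUAGE_CODES:
        raise ValueError(f"Unsupported language: {language}")
    # "stdout" as the output base tells Tesseract to print instead of writing a file.
    return ["tesseract", image_path, "stdout", "-l", LANGUAGE_CODES[language]]


def extract_text(image_path, language="English"):
    """Run Tesseract on an image and return the extracted text."""
    result = subprocess.run(
        build_tesseract_command(image_path, language),
        capture_output=True,
        encoding="utf-8",  # Tesseract emits UTF-8, which matters for Turkish letters
        check=True,        # raise CalledProcessError if Tesseract fails
    )
    return result.stdout.strip()
```

For example, `extract_text("scan.png", "Turkish")` would return the recognized text of a hypothetical `scan.png`.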
To summarize text:
- Extract Text First: Extract text from your document as described above.
- Summarize: Click on the "Summarize" button to generate a summary of the extracted text.
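Summarization with Hugging Face could look roughly like the sketch below. It is an illustration under assumptions, not the application's code: it assumes the `transformers` package is installed in a version that provides the `summarization` pipeline task, the helper names and the 400-word chunk size are hypothetical, and the length limits are arbitrary. Long OCR output is split into chunks because summarization models accept only a limited input length. The pipeline's default model is trained on English text, so Turkish input would need a different model, which is not specified here.

```python
def split_into_chunks(text, max_words=400):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize(text, max_words=400):
    """Summarize each chunk with a Hugging Face pipeline and join the results."""
    # Imported here so the chunking helper works without transformers installed.
    from transformers import pipeline

    # With no model argument, the pipeline downloads a default model on first use.
    summarizer = pipeline("summarization")
    summaries = [
        summarizer(chunk, max_length=130, min_length=30, do_sample=False)[0]["summary_text"]
        for chunk in split_into_chunks(text, max_words)
    ]
    return " ".join(summaries)
```

Summarizing chunk by chunk keeps each request within the model's input limit, at the cost of a summary that reads as several short summaries joined together.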
To extract keywords:
- Extract Text First: Extract text from your document as described above.
- Extract Keywords: Click on the "Extract Keywords" button to identify important keywords from the text.
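To give a feel for what keyword extraction produces, here is a simple frequency-based stand-in. It is not the application's method, which relies on Hugging Face models; the function name, the stopword list, and the thresholds are all hypothetical choices made for this sketch, and the stopword list covers English only.

```python
import re
from collections import Counter

# A deliberately small English stopword list, for illustration only.
STOPWORDS = {"the", "and", "for", "from", "with", "that", "this", "are", "was", "not"}


def extract_keywords(text, top_n=5, min_length=3):
    """Return the top_n most frequent non-stopword words in text."""
    words = re.findall(r"\w+", text.lower())
    counts = Counter(
        word for word in words
        if len(word) >= min_length and word not in STOPWORDS and not word.isdigit()
    )
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]


print(extract_keywords("OCR extracts text. OCR text quality depends on OCR settings.", top_n=2))
# prints ['ocr', 'text']
```

Word frequency ignores meaning and context, which is the gap that model-based keyword extraction is meant to close.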
To detect tables:
- Upload an Image or PDF: Click on the "Upload" button to select a file that contains tables.
- Detect Tables: Click on the "Detect Tables" button to extract table data.
To scan a QR code or barcode:
- Upload an Image: Click on the "Upload" button to select an image containing a QR code or barcode.
- Scan: Click on the "Scan" button to decode the QR code or barcode.
We welcome contributions to improve Visual Intelligence. If you would like to contribute, please follow these steps:
- Fork the Repository: Click on the "Fork" button at the top right of the page.
- Create a Branch: Create a new branch for your feature or bug fix:
    git checkout -b feature/YourFeatureName
- Make Changes: Implement your changes in the codebase.
- Commit Your Changes: Commit your changes with a descriptive message:
    git commit -m "Add feature: YourFeatureName"
- Push to Your Fork:
    git push origin feature/YourFeatureName
- Create a Pull Request: Go to the original repository and click on "New Pull Request".
Visual Intelligence is licensed under the MIT License. See the LICENSE file for more details.
For questions or feedback, feel free to reach out:
- GitHub: Zurdo1007
- Email: zurdo1007@example.com
The latest release of Visual Intelligence is available from the Releases section.