Classification Algorithms

Overview

This repository contains Python implementations of several classical classification algorithms along with utility scripts for running experiments and generating visualizations. The focus is on a minimal yet educational approach to algorithms such as k-nearest neighbors, a perceptron classifier, and a one-vs-rest strategy for handling multi-class problems. An additional script demonstrates how these implementations compare with ensemble models from scikit-learn.

Project Structure

/src: Library modules used by the example scripts.
- config.py: Paths and default hyperparameters.
- data/: Dataset loading utilities.
- models/: Implementations of KNNClassifier, Perceptron, and OneVsRestClassifier.
- utils/: Simple evaluation metrics (accuracy, precision, recall, F1) and a confusion matrix helper.
- visualization/: Functions for plotting results and ensuring output directories exist.
/scripts: Command line scripts illustrating how to train and evaluate the models.
- run_knn.py – optimize k and p for k-NN, then evaluate on the wine dataset.
- run_perceptron.py – tune the learning rate of a perceptron using the banknote dataset.
- run_one_vs_rest.py – apply a perceptron in a one-vs-rest setup for the wine dataset.
- run_ensemble.py – compare the custom models with several scikit-learn ensemble methods.
/data: Contains the raw datasets used by the scripts (e.g. data_banknote_authentication.csv).
/docs: Output directories for figures generated by the example runs.
README.md: This file.

Getting Started

Clone the repository

git clone <repo-url>
cd classification-algorithms

Install dependencies

Create a virtual environment and install the required packages:

python -m venv venv
source venv/bin/activate
pip install numpy pandas matplotlib seaborn scikit-learn

Run the example scripts

Each script can be run directly from the repository root. Results (plots and printed metrics) are saved under docs/.

python scripts/run_knn.py
python scripts/run_perceptron.py
python scripts/run_one_vs_rest.py
python scripts/run_ensemble.py

Usage

The scripts are meant as demonstrations of the provided algorithms. They perform typical data loading, preprocessing and evaluation steps.

KNN: loads the wine dataset, splits it into training/validation/test parts, searches for the best k and Minkowski p value, then plots metrics and a t-SNE visualization.
Perceptron: uses the banknote authentication dataset and sweeps the learning rate to find the best model.
One-vs-Rest: wraps the perceptron for multi-class classification on the wine dataset and shows macro and micro averaged metrics.
Ensemble: compares the custom models with scikit-learn random forest, bagging, gradient boosting and others.

Generated plots are stored in the corresponding docs/task_*_results folder.
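The k and Minkowski p search described for the KNN script can be illustrated with a small NumPy sketch. This is not the repository's KNNClassifier: the function names (minkowski_distances, knn_predict, grid_search) are hypothetical, and synthetic blobs stand in for the wine dataset so the example runs without any data files.

```python
import numpy as np


def minkowski_distances(A, B, p):
    """Pairwise Minkowski distances between rows of A (n, d) and B (m, d)."""
    diff = np.abs(A[:, None, :] - B[None, :, :])
    return (diff ** p).sum(axis=2) ** (1.0 / p)


def knn_predict(X_train, y_train, X_query, k, p):
    """Majority vote among the k nearest training points (integer labels)."""
    dist = minkowski_distances(X_query, X_train, p)
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])


def grid_search(X_train, y_train, X_val, y_val, ks, ps):
    """Return the (k, p, accuracy) triple with the best validation accuracy."""
    best = (None, None, -1.0)
    for k in ks:
        for p in ps:
            pred = knn_predict(X_train, y_train, X_val, k, p)
            acc = float(np.mean(pred == y_val))
            if acc > best[2]:
                best = (k, p, acc)
    return best


# Synthetic three-class data (an assumption; the scripts use the wine dataset).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 4.0], [0.0, 4.0]])
y = np.repeat(np.arange(3), 40)
X = centers[y] + rng.normal(scale=0.8, size=(120, 2))

# Hold out part of the data for validation.
idx = rng.permutation(120)
train, val = idx[:80], idx[80:]

best_k, best_p, best_acc = grid_search(
    X[train], y[train], X[val], y[val], ks=[1, 3, 5, 7], ps=[1, 2]
)
print(f"best k={best_k}, p={best_p}, validation accuracy={best_acc:.3f}")
```

Selecting k and p on a validation split, rather than on the test split, keeps the final test score an honest estimate of generalization.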

Documentation

Models

KNNClassifier
- fit(X, y) – memorize the training data.
- predict(X) – return predicted labels for new samples.
- optimize_k and optimize_p – helper functions to evaluate different hyperparameters.
Perceptron
- fit(X, y) – train weights using a simple gradient update rule.
- predict and predict_proba – produce class labels or raw scores.
- optimize_learning_rate – sweep over a range of learning rates.
OneVsRestClassifier – trains one binary perceptron per class and selects the class with the highest score when predicting.
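The one-vs-rest scheme described above can be sketched as follows. The classes SimplePerceptron and SimpleOneVsRest are hypothetical stand-ins, not the repository's Perceptron and OneVsRestClassifier; the update shown is the classic perceptron rule, and the data are synthetic, linearly separable blobs rather than the wine dataset.

```python
import numpy as np


class SimplePerceptron:
    """Binary perceptron for labels in {0, 1} (illustrative only)."""

    def __init__(self, learning_rate=0.1, n_epochs=50):
        self.learning_rate = learning_rate
        self.n_epochs = n_epochs

    def fit(self, X, y):
        self.w = np.zeros(X.shape[1])
        self.b = 0.0
        for _ in range(self.n_epochs):
            for xi, yi in zip(X, y):
                pred = 1 if xi @ self.w + self.b > 0 else 0
                error = yi - pred  # 0 when correct, +1 or -1 on a mistake
                self.w += self.learning_rate * error * xi
                self.b += self.learning_rate * error
        return self

    def decision_function(self, X):
        """Raw score; a positive value means class 1."""
        return X @ self.w + self.b

    def predict(self, X):
        return (self.decision_function(X) > 0).astype(int)


class SimpleOneVsRest:
    """Train one binary model per class and predict the highest-scoring class."""

    def __init__(self, make_model):
        self.make_model = make_model

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.models_ = [
            self.make_model().fit(X, (y == c).astype(int)) for c in self.classes_
        ]
        return self

    def predict(self, X):
        scores = np.column_stack([m.decision_function(X) for m in self.models_])
        return self.classes_[scores.argmax(axis=1)]


# Synthetic, well-separated three-class data (an assumption for illustration).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 4.0], [-4.0, -3.0], [4.0, -3.0]])
y = np.repeat(np.arange(3), 30)
X = centers[y] + rng.normal(scale=0.5, size=(90, 2))

ovr = SimpleOneVsRest(SimplePerceptron).fit(X, y)
train_accuracy = float(np.mean(ovr.predict(X) == y))
print(f"training accuracy: {train_accuracy:.3f}")
```

Comparing raw scores instead of thresholded 0/1 outputs resolves cases where several binary models, or none of them, claim a sample.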

Utilities

Metrics: accuracy, precision, recall, F1 and confusion matrix implemented with NumPy.
Visualization: plotting functions for pairplots, t-SNE embeddings, confusion matrices and basic metric tables.
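As a rough illustration of how such metrics can be computed with NumPy alone, the sketch below builds a confusion matrix and derives macro- and micro-averaged precision, recall and F1 from it. The function names and signatures are assumptions made for this example and may differ from those in src/utils.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # count each (true, predicted) pair
    return cm


def precision_recall_f1(cm, average="macro"):
    """Precision, recall and F1 from a confusion matrix.

    average="macro" averages the per-class scores;
    average="micro" pools the counts over all classes first.
    """
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    if average == "micro":
        tp, fp, fn = tp.sum(), fp.sum(), fn.sum()
    # Undefined ratios (zero denominators) are reported as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        precision = np.where(tp + fp > 0, tp / (tp + fp), 0.0)
        recall = np.where(tp + fn > 0, tp / (tp + fn), 0.0)
        f1 = np.where(
            precision + recall > 0,
            2 * precision * recall / (precision + recall),
            0.0,
        )
    return float(np.mean(precision)), float(np.mean(recall)), float(np.mean(f1))


y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = np.trace(cm) / cm.sum()
macro = precision_recall_f1(cm, average="macro")
micro = precision_recall_f1(cm, average="micro")

print(cm)
print("accuracy:", accuracy)
print("macro (precision, recall, F1):", macro)
print("micro (precision, recall, F1):", micro)
```

For single-label multi-class problems, micro-averaged precision, recall and F1 all equal the accuracy, while macro averaging weights every class equally regardless of its size.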

Testing

The repository currently does not include automated unit tests. Running the example scripts serves as an integration test of the modules.

Future Extensions

Additional algorithms such as logistic regression or SVM.
More extensive visualizations and reporting utilities.
Unit tests and continuous integration configuration.

Contact

For questions, feel free to open an issue on the repository (JGZimek/classification-algorithms) or reach out to the repository owner.
