This repository contains Python implementations of several classical classification algorithms along with utility scripts for running experiments and generating visualizations. The focus is on a minimal yet educational approach to algorithms such as k-nearest neighbors, a perceptron classifier, and a one-vs-rest strategy for handling multi-class problems. An additional script demonstrates how these implementations compare with ensemble models from scikit-learn.
- /src: Library modules used by the example scripts.
  - config.py: Paths and default hyperparameters.
  - data/: Dataset loading utilities.
  - models/: Implementations of `KNNClassifier`, `Perceptron`, and `OneVsRestClassifier`.
  - utils/: Simple evaluation metrics (accuracy, precision, recall, F1) and a confusion matrix helper.
  - visualization/: Functions for plotting results and ensuring output directories exist.
- /scripts: Command line scripts illustrating how to train and evaluate the models.
  - `run_knn.py` – optimize k and p for k-NN, then evaluate on the wine dataset.
  - `run_perceptron.py` – tune the learning rate of a perceptron using the banknote dataset.
  - `run_one_vs_rest.py` – apply a perceptron in a one-vs-rest setup for the wine dataset.
  - `run_ensemble.py` – compare the custom models with several scikit-learn ensemble methods.
- /data: Contains the raw datasets used by the scripts (e.g. `data_banknote_authentication.csv`).
- /docs: Output directories for figures generated by the example runs.
- README.md: This file.
- Clone the repository

      git clone <repo-url>
      cd classification-algorithms

- Install dependencies

  Create a virtual environment and install the required packages:

      python -m venv venv
      source venv/bin/activate
      pip install numpy pandas matplotlib seaborn scikit-learn

- Run the example scripts

  Each script can be executed directly. Results (plots and printed metrics) will be saved under `docs/`.

      python scripts/run_knn.py
      python scripts/run_perceptron.py
      python scripts/run_one_vs_rest.py
      python scripts/run_ensemble.py

The scripts are meant as demonstrations of the provided algorithms. They perform typical data loading, preprocessing and evaluation steps.
- KNN: loads the wine dataset, splits it into training/validation/test parts, searches for the best `k` and Minkowski `p` value, then plots metrics and a t-SNE visualization.
- Perceptron: uses the banknote authentication dataset and sweeps the learning rate to find the best model.
- One-vs-Rest: wraps the perceptron for multi-class classification on the wine dataset and shows macro and micro averaged metrics.
- Ensemble: compares the custom models with scikit-learn random forest, bagging, gradient boosting and others.
Generated plots are stored in the corresponding `docs/task_*_results` folder.
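
The `k` and `p` search described for the KNN script can be illustrated with a small, self-contained sketch. It is not the repository's code: the function names `knn_predict` and `search_k_and_p`, the candidate grids, and the synthetic two-class data (standing in for the wine dataset) are all assumptions made for illustration. It scores each (k, p) pair by accuracy on a held-out validation split.

```python
import numpy as np


def knn_predict(X_train, y_train, X_query, k=3, p=2):
    """Predict labels by majority vote among the k nearest training points."""
    # Pairwise Minkowski distances of order p, shape (n_query, n_train).
    diffs = np.abs(X_query[:, None, :] - X_train[None, :, :])
    dists = np.sum(diffs ** p, axis=2) ** (1.0 / p)
    # Indices of the k closest training points for every query row.
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = y_train[nearest]
    # Majority vote; labels must be non-negative integers for np.bincount.
    return np.array([np.bincount(row).argmax() for row in votes])


def search_k_and_p(X_train, y_train, X_val, y_val, ks=(1, 3, 5), ps=(1, 2)):
    """Return (best_k, best_p, best_accuracy) measured on the validation split."""
    best = (None, None, -1.0)
    for k in ks:
        for p in ps:
            preds = knn_predict(X_train, y_train, X_val, k=k, p=p)
            acc = float(np.mean(preds == y_val))
            if acc > best[2]:
                best = (k, p, acc)
    return best


# Synthetic, well-separated two-class data (an assumption, not the wine data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 2)), rng.normal(4.0, 1.0, (60, 2))])
y = np.array([0] * 60 + [1] * 60)
order = rng.permutation(len(y))
X, y = X[order], y[order]
X_train, y_train = X[:80], y[:80]
X_val, y_val = X[80:], y[80:]

best_k, best_p, best_acc = search_k_and_p(X_train, y_train, X_val, y_val)
print(best_k, best_p, best_acc)
```

Selecting hyperparameters on a validation split, rather than on the test split, keeps the final test metrics an honest estimate of performance on unseen data.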
- KNNClassifier
  - `fit(X, y)` – memorize the training data.
  - `predict(X)` – return predicted labels for new samples.
  - `optimize_k` and `optimize_p` – helper functions to evaluate different hyperparameters.
- Perceptron
  - `fit(X, y)` – train weights using a simple gradient update rule.
  - `predict` and `predict_proba` – produce class labels or raw scores.
  - `optimize_learning_rate` – sweep over a range of learning rates.
- OneVsRestClassifier – trains one binary perceptron per class and selects the class with the highest score when predicting.
- Metrics: accuracy, precision, recall, F1 and confusion matrix implemented with NumPy.
- Visualization: plotting functions for pairplots, t-SNE embeddings, confusion matrices and basic metric tables.
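
To make the one-vs-rest idea concrete, here is a minimal sketch that trains one binary perceptron per class and predicts the class whose model returns the highest raw score. The class names `TinyPerceptron` and `TinyOneVsRest`, the method `decision_scores`, the classic mistake-driven update rule, and the synthetic three-cluster data are all assumptions for illustration. The repository's own classes may differ in signatures and in the exact update they apply.

```python
import numpy as np


class TinyPerceptron:
    """Binary perceptron for labels in {0, 1} (hypothetical stand-in class)."""

    def __init__(self, learning_rate=0.1, epochs=100):
        self.learning_rate = learning_rate
        self.epochs = epochs

    def fit(self, X, y):
        self.w = np.zeros(X.shape[1])
        self.b = 0.0
        for _ in range(self.epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                pred = 1 if xi @ self.w + self.b > 0 else 0
                error = yi - pred  # -1, 0 or +1
                if error != 0:
                    # Classic perceptron rule: update only on mistakes.
                    self.w += self.learning_rate * error * xi
                    self.b += self.learning_rate * error
                    mistakes += 1
            if mistakes == 0:  # every training sample is classified correctly
                break
        return self

    def decision_scores(self, X):
        # Raw, uncalibrated scores w.x + b (not probabilities).
        return X @ self.w + self.b


class TinyOneVsRest:
    """One binary model per class; predict the class with the highest score."""

    def __init__(self, **perceptron_kwargs):
        self.perceptron_kwargs = perceptron_kwargs

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.models_ = [
            TinyPerceptron(**self.perceptron_kwargs).fit(X, (y == c).astype(int))
            for c in self.classes_
        ]
        return self

    def predict(self, X):
        scores = np.column_stack([m.decision_scores(X) for m in self.models_])
        return self.classes_[np.argmax(scores, axis=1)]


# Three well-separated synthetic clusters (an assumption, not the wine data).
rng = np.random.default_rng(0)
centers = np.array([[-3.0, -2.0], [3.0, -2.0], [0.0, 4.0]])
X = np.vstack([rng.normal(c, 0.5, (30, 2)) for c in centers])
y = np.repeat([0, 1, 2], 30)

model = TinyOneVsRest(learning_rate=0.1).fit(X, y)
train_accuracy = float(np.mean(model.predict(X) == y))
print(train_accuracy)
```

Because each binary model's scores are uncalibrated, comparing them with `argmax` is a heuristic. It tends to work when every class is linearly separable from the rest, as in this toy data.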
The repository currently does not include automated unit tests. Running the example scripts serves as an integration test of the modules.
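
In the absence of unit tests, a small hand-checkable example helps confirm how the metrics behave. The sketch below builds a confusion matrix with NumPy and derives macro and micro averaged precision from it. The function names and the toy labels are assumptions for illustration, not the repository's `utils` API.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns index the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def macro_micro_precision(cm):
    """Return (macro, micro) precision computed from a confusion matrix."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0).astype(float)  # TP + FP for each class
    # Per-class precision; a class that is never predicted gets 0 here.
    per_class = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    macro = float(per_class.mean())      # unweighted mean over classes
    micro = float(tp.sum() / cm.sum())   # pooled over all samples
    return macro, micro


y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 1, 1, 1, 2, 0])

cm = confusion_matrix(y_true, y_pred, n_classes=3)
macro, micro = macro_micro_precision(cm)
print(cm)
print(macro, micro)  # per-class precision is 3/4, 2/3 and 1/1; micro is 6/8
```

Macro averaging weights every class equally, so a small class with poor precision pulls the score down. Micro averaging pools all predictions and, for single-label multi-class problems, equals overall accuracy.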
- Additional algorithms such as logistic regression or SVM.
- More extensive visualizations and reporting utilities.
- Unit tests and continuous integration configuration.
For questions, feel free to open an issue or reach out to the repository owner.