Deep Learning
- Designing and training CNNs for vision tasks
- Tuning transformer-based models for NLP and speech
- Applying transfer learning and fine-tuning on domain-specific data
- Evaluating models and performing error analysis
Data Scientist / ML Engineer
Data Scientist with 3 years of professional experience working with Python, ML models, and data pipelines. Skilled in developing and optimizing machine learning solutions, including deep learning models and data-driven applications. Currently pursuing an MSc in Data Science at Warsaw University of Technology. Enjoys applying data science to football analytics as a hobby.
Built a U-Net model with a ResNet-34 encoder to detect changes between satellite image pairs from the LEVIR-CD dataset.
Handled severe class imbalance with a weighted loss, tracked experiments with MLflow, and packaged the model in a Docker container. The model reached a validation F1 of 0.81 and also proved useful on Google Earth imagery outside the training dataset.
Compared a Transformer approach with a CNN approach (based on mel spectrograms) for audio classification on the TF Speech Recognition dataset.
Wav2Vec2 transformer embeddings with MLP/RNN heads achieved an F1 of 0.92 and 94.4% accuracy. Tackled severe class imbalance with a two-stage CNN pipeline, reaching an F1 of 0.83.
Developed RL agents using PPO and Advantage Actor-Critic in the OpenAI Gym environment.
Built a training pipeline with TensorBoard experiment tracking, model checkpointing, and hyperparameter optimization. Trained agents across multiple scenarios, achieving convergence through systematic reward shaping.
Implemented and compared CNN architectures such as ResNet and VGG16 in PyTorch for 10-class image classification on the CINIC dataset.
Experimented with data augmentation, regularization, hyperparameter optimization, and ensemble methods, with a soft-voting ensemble achieving 81.6% accuracy. Applied few-shot learning.
Streamlit-based app facilitating exploration of StatsBomb 360 contextual event data.
The tool has two modes: choosing a particular moment of the match with a time slider, or selecting freeze frames for particular shots taken during the chosen match. It also draws Voronoi diagrams for match events to visualize pitch control.
Library designed to accelerate clustering tasks using scikit-learn.
Serves as a quick tool for selecting the optimal clustering algorithm and its hyperparameters, providing visualizations and metrics for comparison, with easy HTML reporting.
End-to-end explanation of how the Poisson model, the Skellam distribution, and Elo ratings can be leveraged to predict football match outcomes, with a practical implementation in Python, a discussion of real-world performance, and data visualizations.
Mathematical explanation of how to adapt Elo ratings to account for the possibility of draws in sport event predictions.
Unveiling the peculiarities of the Belgian side's tactics and performance in the 2023/2024 season through data-driven analysis, with particular focus on their focal point, striker Kévin Denkey.
Searching for Ibrahim Osman's replacement for Nordsjælland: data-driven scouting in the Polish Ekstraklasa 2023/24. The article walks through the whole process of identifying the best-suited players using data.
Full walkthrough of calculating and plotting a popular metric of football match momentum from event data, in a style resembling Opta visualizations.
Modern, geometric approach to showcasing each player's passing behaviour using football event data, with a full walkthrough of the Python code.
Experience
Education
Feel free to reach out for any professional matters!