A Data Science graduate student based in Berlin, Germany. I'm passionate about Machine Learning, AI, and Data Analytics, with expertise in Python, PyTorch, and Deep Learning.

I'm Shahabub Alam, a Data Science graduate student currently pursuing my M.Sc. at the University of Potsdam, with completed coursework at TU Dortmund. I'm passionate about Machine Learning, AI, and Data Analytics, with a proven track record of delivering impactful projects and research publications. My expertise spans Deep Learning, NLP, Computer Vision, and Statistical Modeling.
I've worked as a Research Assistant at DFKI (German Research Center for Artificial Intelligence), developing ML-powered models for human-computer interaction. I've also contributed to various research projects at Technische Universität Berlin and ESCP Business School, building NLP assistants, BI dashboards, and data-driven solutions. My core technical skills include Python (PyTorch, TensorFlow, Scikit-learn), SQL, Data Engineering, and Full-Stack Development.
I'm open to full-time opportunities in Data Science, Machine Learning, and AI where I can contribute, learn, and grow. If you have a good opportunity that matches my skills and experience, don't hesitate to contact me.
Technologies and tools I work with to build innovative solutions.
Professional experience that I have accumulated over several years.
Projects I have worked on, each with its own case study.

Comprehensive research project on Bengali speech recognition using the OOD-Speech dataset (~1,178 hours from 22,645 speakers). Benchmarked Whisper, IndicWav2Vec, and BengaliAI regional models, achieving a 76% relative WER reduction. Implemented domain-generalization fine-tuning with GroupDRO, resulting in an additional 40% WER improvement. Built interactive demos with Gradio and FastAPI for real-time ASR.
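For readers unfamiliar with the metric, here is a minimal sketch of how word error rate (WER) and a relative WER reduction can be computed. The function names and the sample values are hypothetical illustrations, not code or results from the project.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction of the baseline WER removed by the improved system."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - improved_wer) / baseline_wer


# Hypothetical example: one substitution and one deletion over four words.
example_wer = word_error_rate("a b c d", "a x c")
```

Under this definition, a drop from a hypothetical baseline WER of 0.50 to 0.12 would be a relative reduction of about 0.76, which is how a figure such as "76% relative WER reduction" is typically read.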

AI-powered research engine for searching 65,000+ legal and medical documents with instant, accurate answers. Built with privacy-by-design principles, fully GDPR compliant with no data storage. Features source citations, confidence scores, and real-time query processing for EU legal documents and medical guidelines.

Advanced time series and causal modeling system for supply chain optimization. Features agentic AI mode for autonomous model selection and hyperparameter tuning. Built with LightGBM and ensemble models, achieving a MAPE (mean absolute percentage error) of 20-30%. Includes interactive demo, real-time forecasting, GDPR compliance, and production-ready architecture. Reduces inventory costs by 15-20% and saves 8-16 hours/month vs manual forecasting.
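As a quick illustration of the forecasting metric mentioned above, MAPE is an error measure (lower is better). The sketch below assumes plain Python lists and a hypothetical function name; the sample numbers are made up and are not project data.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # MAPE is undefined when an actual value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical demand values: errors of 20% and 25% average to 22.5%.
example_mape = mape([100.0, 200.0], [80.0, 250.0])
```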

Advanced ML-powered recommendation engine using the Brazilian Olist dataset (100K orders, 99K+ users, 33K+ products). Implemented collaborative filtering, content-based filtering, and hybrid approaches for personalized product recommendations. Features real-time recommendation serving, model performance comparison (Precision@10, Recall@10, NDCG@10), and interactive web demo. Achieved improved user engagement through matrix factorization and item embeddings.
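The ranking metrics named above can be sketched as follows, assuming binary relevance (an item is either relevant to the user or not). The function names and the sample product IDs are hypothetical and are not taken from the project or the Olist dataset.

```python
import math
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the top-k recommendations that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """DCG of the ranking divided by the DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical recommendation list and ground truth for one user.
ranked_items = ["p1", "p2", "p3", "p4"]
relevant_items = {"p1", "p3", "p9"}
scores = (
    precision_at_k(ranked_items, relevant_items, 4),
    recall_at_k(ranked_items, relevant_items, 4),
    ndcg_at_k(ranked_items, relevant_items, 4),
)
```

Setting k to 10 gives Precision@10, Recall@10, and NDCG@10; in practice the scores are averaged over all evaluated users.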
Published research papers and conference proceedings in Machine Learning, Computer Vision, and NLP.
2025 International Conference on Electrical, Computer and Communication Engineering
2024 International Conference on Decision Aid Sciences and Applications
2024 International Conference on Decision Aid Sciences and Applications
Indonesian Journal of Electrical Engineering and Computer Science
International Journal of Advances in Intelligent Informatics
2020 11th International Conference on Computing, Communication and Networking Technologies
International Journal of Computer Applications

Professional certifications and credentials demonstrating expertise in various technologies and domains.
Snowflake
DASA 2025 (Presenter)
Please contact me directly at msa.nabid.cse@gmail.com or through this form.