Stefan Mićić
Machine Learning Developer and Data Engineer
If a seasoned MLOps & Data Engineer is what you need to elevate your AI and Machine Learning efforts, look no further. Stefan brings over 8 years of relevant industry experience, working with both niche companies and prominent names like HTEC and PepsiCo. From building solutions from scratch to introducing innovative approaches and enhancements - everything is a breeze for this expert.
Provectus
July 2024 - Present
Provectus is an AI consultancy and solutions provider that helps businesses integrate AI technologies to achieve their unique objectives, offering open, cloud-native solutions without vendor lock-in or license fees.
Led the development of a robust document summarizer leveraging Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), and Natural Language Processing (NLP) techniques. Delivered a scalable, high-accuracy solution for extracting insights from unstructured text data; a simplified sketch follows the tech stack below.
Reviewed, supported, and mentored team members, fostering collaboration and continuous improvement. Provided technical guidance to ensure project success and long-term maintainability.
Planned, groomed, and estimated work for sprints and quarterly roadmaps, ensuring alignment with business goals and timely delivery of milestones.
Reduced operational costs and improved application robustness by implementing microservices, enabling modular development, easier maintenance, and enhanced scalability.
Improved team delivery timelines and productivity within the first month of leadership, streamlining processes and setting clear priorities to meet deadlines effectively.
Main tech stack:
AWS, LLM, Python, Docker
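A simplified sketch of the RAG summarization flow referenced above. It is illustrative rather than the production implementation: retrieval here uses TF-IDF as a stand-in for embedding-based search, and call_llm is a placeholder for the hosted LLM call (e.g. via an AWS-hosted model).

```python
# Illustrative RAG-style summarization sketch, not the production system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def call_llm(prompt: str) -> str:
    # Placeholder: forward the prompt to the chosen LLM endpoint and return the completion.
    raise NotImplementedError


def summarize(question: str, chunks: list[str], top_k: int = 4) -> str:
    # Rank document chunks against the question and keep only the most relevant ones.
    vectorizer = TfidfVectorizer().fit(chunks + [question])
    scores = cosine_similarity(
        vectorizer.transform(chunks), vectorizer.transform([question])
    ).ravel()
    context = "\n\n".join(chunks[i] for i in scores.argsort()[::-1][:top_k])
    # Ground the summary strictly in the retrieved context.
    return call_llm(f"Using only the context below, answer: {question}\n\nContext:\n{context}")
```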
Neptune Technologies LLC
July 2023 - June 2024
Neptune Technologies leverages AI to provide innovative solutions for organizations and individuals, offering advanced investment platforms, institutional-grade AI models, and AI-powered market intelligence to streamline operations and drive business growth.
Designed and implemented infrastructure on AWS to support the development and CI/CD of AI models, following MLOps best practices.
Deployed a chatbot with real-time data access using the RAG approach combined with LLMs (GPT, Falcon, Raven, Llama 2...); see the sketch below the tech stack.
Provided leadership and mentoring, with a strong focus on thorough documentation.
Main tech stack:
AWS (Cloud9, CodePipeline, EC2, ECS, Lambda, Polly, Comprehend, Bedrock...), Python, OpenAI GPT & Docker
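A rough sketch of how a single chatbot turn could be served, assuming a recent boto3 with the Bedrock Converse API. The Lambda handler shape, model ID, and retrieve_context helper are illustrative placeholders, not the actual implementation.

```python
# Hypothetical Lambda handler for a Bedrock-backed chatbot turn (illustrative only).
import boto3

bedrock = boto3.client("bedrock-runtime")


def retrieve_context(question: str) -> str:
    # Placeholder for the real-time retrieval step (vector store / live data source).
    return ""


def lambda_handler(event, context):
    question = event["question"]
    grounding = retrieve_context(question)
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # illustrative model id
        messages=[{
            "role": "user",
            "content": [{"text": f"Context:\n{grounding}\n\nQuestion: {question}"}],
        }],
    )
    return {"answer": response["output"]["message"]["content"][0]["text"]}
```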
Plus Power
March 2023 - August 2023
Plus Power develops, owns, and operates battery energy storage systems to enhance grid efficiency and reliability, providing innovative solutions that store and dispatch energy during demand peaks, supporting the transition to renewable energy across the U.S. and Canada.
Significantly increased test coverage from 5% to 85%, driving better code reliability and performance.
Used Terraform to set up DataDog monitoring for observability.
Worked with end-to-end pipelines using AWS SageMaker, from data preprocessing and model training to deployment, ensuring seamless operations; see the sketch below the tech stack.
Utilized Grafana for real-time monitoring and worked with DynamoDB to support scalable solutions.
Main tech stack:
AWS, Grafana, DataDog, Terraform, SageMaker, Python, Docker
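An illustrative sketch of the SageMaker train-and-deploy flow using the SageMaker Python SDK's SKLearn estimator. The entry point script, role ARN, S3 paths, and instance types are placeholders, not the actual project configuration.

```python
# Sketch of a SageMaker training job followed by a real-time endpoint deployment.
import sagemaker
from sagemaker.sklearn.estimator import SKLearn

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

estimator = SKLearn(
    entry_point="train.py",          # training script: preprocessing + model fit
    framework_version="1.2-1",
    instance_type="ml.m5.xlarge",
    role=role,
    sagemaker_session=session,
)
estimator.fit({"train": "s3://example-bucket/train/"})  # placeholder S3 input

# Serve the trained model behind a real-time endpoint.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```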
PepsiCo
December 2022 - March 2023
PepsiCo is a global food and beverage leader, known for its diverse portfolio of brands including Pepsi, Mountain Dew, Frito-Lay, Gatorade, Tropicana, and Quaker, providing products to customers worldwide.
Designed and implemented a comprehensive end-to-end pipeline utilizing Apache Spark and Scikit-Learn. The pipeline seamlessly integrates data preprocessing, feature engineering, model training, and evaluation stages, ensuring a streamlined and efficient workflow; a simplified sketch follows the tech stack below.
Leveraged the power of Apache Spark to handle large-scale data processing tasks, enabling efficient parallel computation and distributed data processing across clusters.
Utilized Scikit-Learn's robust library of machine learning algorithms and tools to develop predictive models, perform feature selection, and optimize model performance.
Implemented automation mechanisms and optimization techniques to enhance the efficiency of the pipeline, reducing manual intervention and maximizing productivity.
Delivered a fully operational end-to-end solution, empowering stakeholders to extract actionable insights from data and drive informed decision-making.
Main tech stack:
PySpark, Snowflake, Databricks, Airflow & Python
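A simplified sketch of how such a Spark + Scikit-Learn pipeline fits together: Spark handles distributed preprocessing and feature engineering, and the aggregated sample is trained and evaluated with scikit-learn. The table, features, and model choice are illustrative, not the actual pipeline.

```python
# Illustrative Spark preprocessing feeding a scikit-learn model (placeholder data).
from pyspark.sql import SparkSession, functions as F
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

spark = SparkSession.builder.appName("feature-pipeline").getOrCreate()

# Distributed preprocessing and feature engineering.
features = (
    spark.read.parquet("s3://example-bucket/sales/")            # placeholder path
    .withColumn("revenue_per_unit", F.col("revenue") / F.col("units"))
    .groupBy("store_id")
    .agg(
        F.avg("revenue_per_unit").alias("avg_rpu"),
        F.sum("units").alias("total_units"),
        F.max("label").alias("label"),                          # placeholder target
    )
)

# Collect the aggregated sample and train/evaluate with scikit-learn.
pdf = features.toPandas()
X, y = pdf[["avg_rpu", "total_units"]], pdf["label"]
model = RandomForestClassifier(n_estimators=200, random_state=42)
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
model.fit(X, y)
```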
Lifebit
January 2020 - February 2022
Lifebit provides a federated platform for precision medicine, genomics, and healthcare, enabling secure access, collaboration, and analysis of distributed clinical and biomedical data to drive research and innovation in drug discovery and disease surveillance.
Optimized already trained Deep Learning models through ONNX conversion, quantization, and EC2 instance optimization, resulting in a 3x decrease in cost and a 4x increase in throughput; see the sketch below the tech stack.
Implemented automated model evaluation and deployment pipelines using AWS services (CloudWatch, SQS, S3, EC2), Valohai, and GitHub Actions. With a streamlined workflow, new code pushes to main trigger the deployment of a new model version to staging, facilitating seamless testing and comparison against the current production model.
Took charge of decision-making and strategic planning for the CI/CD pipeline, including defining the architecture, versioning experiments, and optimizing the ML inference architecture. This proactive approach ensured efficient and effective deployment of models while maintaining high standards of performance and reliability.
Main tech stack:
AWS, Valohai, Terraform, K8s, GH Actions, CodeShip, Sentry, Uptrends, W&B
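An illustrative sketch of the ONNX-conversion and post-training quantization step applied to an already trained PyTorch model. The ResNet-18 stand-in, input shape, and file names are placeholders.

```python
# Export a trained PyTorch model to ONNX, then quantize weights to int8.
import torch
from torchvision.models import resnet18
from onnxruntime.quantization import quantize_dynamic, QuantType

model = resnet18(weights=None).eval()            # stand-in for the trained model
dummy_input = torch.randn(1, 3, 224, 224)

# Export with a dynamic batch dimension so the runtime can batch requests.
torch.onnx.export(
    model, dummy_input, "model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)

# Post-training dynamic quantization to int8 weights to cut cost and raise throughput.
quantize_dynamic("model.onnx", "model.int8.onnx", weight_type=QuantType.QInt8)
```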
HTEC
January 2019 - December 2019
HTEC is a digital solutions provider offering a wide range of services including innovation strategy, experience design, technical strategy, and product and platform engineering to drive digital transformation and deliver advanced engineering solutions across various industries.
Led a team of engineers in developing object detection and classification models specifically optimized for Android devices to detect and classify COVID test results. Leveraged techniques such as quantization, TensorFlow Lite, and MobileNet to ensure the model was lightweight and efficient for mobile deployment; a conversion sketch follows the tech stack below.
Spearheaded a high-impact R&D initiative to enhance an existing super-resolution model with over 95% accuracy. Conducted extensive experimentation with custom loss functions and novel layer designs, implementing these from scratch to boost performance. Introduced innovative techniques, including multi-phase learning and combined loss functions, achieving significant gains in visual fidelity.
Enabled ONNX Runtime to leverage the MIGraphX library, enhancing model support and performance. This integration expanded the library’s compatibility and improved inference efficiency for deep learning workflows.
Identified and addressed performance bottlenecks in BERT-like models by designing custom operators in PyTorch and C++. This involved pinpointing inefficiencies and replacing them with optimized components, significantly improving runtime and resource efficiency in inference tasks.
Main tech stack:
Machine Learning · PyTorch · Computer Vision · Deep Learning · Python · C++
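A hedged sketch of the mobile-optimization step from the first bullet: post-training quantization of a MobileNet-based classifier via the TensorFlow Lite converter. The classification head, class count, and file name are assumptions for illustration.

```python
# Convert a MobileNet-based classifier to a quantized TensorFlow Lite model.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet"
)
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(3, activation="softmax"),    # e.g. negative / positive / invalid
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]    # enables post-training quantization
tflite_model = converter.convert()

with open("test_classifier.tflite", "wb") as f:         # illustrative file name
    f.write(tflite_model)
```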
SmartCat
January 2018 - December 2018
SmartCat is a service-based company that provides data engineering, applied analytics, machine learning, and AI solutions, simplifying complex processes to assist businesses in making data-driven decisions across various industries.
Developed high-accuracy YOLO-based models to detect people and assets in real-time video streams, achieving over 97% on all business-critical metrics. Following detection, implemented CNN-based categorization (ResNet with transfer learning) to assess behaviors, such as identifying if a person is sitting or standing; a transfer-learning sketch follows the tech stack below.
Managed the entire MLOps lifecycle, using MLFlow for model versioning, LakeFS for data versioning, AWS S3 for data storage, and TensorFlow Serving in Docker for scalable deployment. Built a fully automated pipeline that allowed clients to trigger data preprocessing, model training, versioning, and deployment with minimal manual intervention.
Contributed to multiple projects focused on calculating critical KPIs by implementing robust ETL processes. Leveraged Spark (in both Scala and Python) for data transformations, along with Snowflake, AWS (S3, Lambda, Fargate), and orchestrators like Airflow and Prefect to build scalable and efficient data pipelines that supported business decision-making and analytics.
Main tech stack:
Machine Learning · Computer Vision · Apache Spark · MLOps · Scala · Python · Amazon Web Services (AWS) · Keras
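A minimal sketch of the ResNet transfer-learning step used for behavior classification on detected person crops. The dataset path, class setup, and hyperparameters are illustrative assumptions, not the actual training configuration.

```python
# Transfer learning: frozen ResNet50 backbone plus a small classification head.
import tensorflow as tf

base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3)
)
base.trainable = False                                   # freeze the pretrained backbone

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),      # e.g. sitting vs. standing
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Placeholder directory of cropped person images, one subfolder per class.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/posture_crops", image_size=(224, 224), batch_size=32
)
model.fit(train_ds, epochs=5)
```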
Master's Degree, Data Science
Faculty of Technical Sciences, Novi Sad
2020-2021

Serbian

English