Building an Application with Microservices and Machine Learning
By Ahmed Sulaimon • 2025-05-07
Developing a production-grade application that combines a microservices architecture with machine learning was both a technical challenge and a rewarding learning experience. This post outlines my process of designing and implementing a real-time grocery price comparison tool, covering the vision, system architecture, ML integration, CI/CD and DevOps, challenges, technical decisions and trade-offs, what I'd improve, and key takeaways.
1. The Vision
The goal was to build a platform that empowers users to make smarter grocery decisions by:
- Tracking historical and real-time price trends
- Offering predictive insights into future price drops
- Recommending the best times to buy products based on statistical confidence
This tool aims to bring dynamic pricing awareness to everyday grocery shoppers, something typically reserved for large-scale retailers or marketplaces.
2. System Architecture
To ensure scalability and maintainability, I adopted a microservices-based architecture with clear service separation:
- Scraping Service – Python + Selenium used for dynamic scraping across UK retailers (Sainsbury’s, Aldi, Iceland, Morrisons).
- Price Analysis Service – Python-based service with embedded ML models to forecast prices and compute confidence scores.
- API Gateway – A central Flask-based service to route traffic and orchestrate communication between services.
- Frontend – Built with Flutter for responsive, cross-platform UI. Features include voice search and accessibility enhancements.
Architecture Overview:
Frontend (Flutter) → API Gateway (Flask) → [Scraping Service] ↔ [Price Analysis Service (ML)]
All services were dockerized for modular deployment and fault isolation.
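To make the gateway's role concrete, here is a minimal sketch of how a Flask service can proxy requests to the internal services. The service names, ports, and route shape are illustrative assumptions, not the actual deployment configuration:

```python
# Minimal sketch of the API gateway's proxy role. Service URLs and route
# shape are illustrative assumptions, not the real deployment config.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical internal service addresses (Docker network aliases).
SERVICES = {
    "scrape": "http://scraping-service:5001",
    "analyse": "http://price-analysis-service:5002",
}

@app.route("/api/<service>/<path:endpoint>", methods=["GET", "POST"])
def proxy(service, endpoint):
    base = SERVICES.get(service)
    if base is None:
        return jsonify({"error": "unknown service"}), 404
    # Forward the request to the downstream service and relay its response.
    resp = requests.request(
        method=request.method,
        url=f"{base}/{endpoint}",
        params=request.args,
        json=request.get_json(silent=True),
        timeout=10,
    )
    return resp.content, resp.status_code, {
        "Content-Type": resp.headers.get("Content-Type", "application/json")
    }

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Keeping the gateway this thin is what made Flask a comfortable fit, as discussed in the trade-offs section below.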
3. Machine Learning Component
At the heart of the system lies a Weighted Moving Average (WMA) model used to predict near-future price movements. The ML pipeline included:
- Unit normalization (e.g., pence → pounds, standardizing pack sizes)
- Smoothing short-term volatility so one-off price spikes don't skew predictions
- Confidence scoring (range: 0–1) to reflect prediction reliability
- Output translated into natural-language recommendations, such as:
Good time to buy — £0.20 drop detected, 85% confidence
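To make this concrete, here is a minimal sketch of the forecast and confidence logic. The weighting scheme, the volatility-based confidence heuristic, and the sample prices are illustrative assumptions rather than the exact production code:

```python
# Minimal sketch of the WMA forecast with a 0-1 confidence score. The
# weights, confidence heuristic, and sample data are illustrative.

def wma(prices):
    """Recency-weighted average: the newest price gets the largest weight."""
    weights = range(1, len(prices) + 1)
    return sum(w * p for w, p in zip(weights, prices)) / sum(weights)

def confidence(prices):
    """Map recent volatility to a 0-1 score: a stabler history scores higher."""
    mean = sum(prices) / len(prices)
    spread = max(prices) - min(prices)
    return max(0.0, 1.0 - spread / mean)

# Prices already normalized to pounds per standard pack size.
history = [1.20, 1.20, 1.15, 1.15, 1.10, 1.05, 1.00]
baseline = wma(history[:-1])    # weighted average before the latest price
drop = baseline - history[-1]   # how far today's price sits below trend
if drop > 0.05:                 # only flag meaningful drops
    print(f"Good time to buy — £{drop:.2f} drop detected, "
          f"{confidence(history):.0%} confidence")
```

Run against the sample history above, this prints a recommendation in the same shape as the production output: "Good time to buy — £0.12 drop detected, 82% confidence".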
4. CI/CD & DevOps
To maintain smooth development and safe deployments, I implemented:
- Docker-based workflows for consistent builds across environments
- Basic health checks and logging for every service
This allowed me to test and deploy services independently without disrupting the entire system, a major advantage of the microservices model.
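As a sketch of what a per-service health check with basic logging can look like (the endpoint name and log format here are assumptions, not the exact production setup):

```python
# Sketch of a per-service health endpoint with basic logging. Endpoint
# name and log format are assumptions; the real services may differ.
import logging
from flask import Flask, jsonify

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
log = logging.getLogger("price-analysis")

app = Flask(__name__)

@app.route("/health")
def health():
    # Docker healthchecks can poll this endpoint to decide whether
    # the container should keep receiving traffic.
    log.info("health probe")
    return jsonify({"status": "ok"}), 200
```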
5. Challenges Faced
As with any real-world system, the project presented several hurdles:
- Service coordination – Ensuring that loosely coupled services remained in sync
- Scraping at scale – Managing session headers, delay timers, and rate limits to avoid detection
- Data inconsistency – Handling missing units, mislabelled products, and naming discrepancies
- Performance trade-offs – Balancing prediction accuracy with low-latency responses, especially for mobile users
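To give one concrete example of the data-inconsistency problem: the same product might be listed as "£1.50" for a 1 kg pack at one retailer and "150p" for 500 g at another. Below is an illustrative sketch of normalizing such listings to a comparable price per 100 g; the regexes and unit table are hypothetical, not the production parser:

```python
# Illustrative sketch of unit normalization: parsing mixed listings like
# "£1.50" / "150p" with sizes like "1kg" / "500 g" into a comparable
# price per 100 g in pounds. Regexes and unit table are hypothetical.
import re

UNIT_GRAMS = {"kg": 1000, "g": 1}

def normalize(price_text, size_text):
    """Return price in pounds per 100 g, or None if the listing is unparsable."""
    m = re.match(r"(?:£(\d+(?:\.\d+)?)|(\d+)p)", price_text.strip())
    if not m:
        return None
    pounds = float(m.group(1)) if m.group(1) else int(m.group(2)) / 100
    s = re.match(r"(\d+(?:\.\d+)?)\s*(kg|g)", size_text.strip().lower())
    if not s:
        return None
    grams = float(s.group(1)) * UNIT_GRAMS[s.group(2)]
    return round(pounds / grams * 100, 4)

print(normalize("£1.50", "1kg"))   # 0.15 (pounds per 100 g)
print(normalize("150p", "500 g"))  # 0.3
```

Returning None for unparsable listings, rather than guessing, is what lets downstream services fail gracefully on bad data.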
6. Technical Decisions & Trade-offs
Why Weighted Moving Average (WMA) Over ARIMA or LSTM?
Although I considered ARIMA and LSTM, I chose WMA because:
- Faster inference time — ideal for user-facing mobile apps
- Lower computational cost — suitable for containerized services
- Smaller dataset requirement — more advanced models didn’t add value given limited historical data
This trade-off struck the right balance between simplicity and practical performance.
Why Flask for the API Gateway Instead of FastAPI or Express?
Despite FastAPI's modern async features, I chose Flask due to:
- A familiar ecosystem that allowed quicker setup
- The gateway's limited role of routing and proxying, which didn’t demand asynchronous capabilities
- Flask’s mature middleware support for logging, debugging, and request validation
Why Selenium for Scraping?
I used headless Selenium because:
- DOM interaction and cookie/session handling were essential
- Selenium provided flexibility for lazy loading, form interactions, and dynamic content parsing
Bot detection mitigation included:
- Rotating user agents and randomized headers
- Simulating human-like delays
- Running full Chromium in headless mode inside containers so traffic resembles a real browser
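Putting those pieces together, a stripped-down version of the scraper setup might look like the following. The user-agent strings, delay bounds, and URL are placeholders, not the real configuration:

```python
# Sketch of the headless-Selenium setup with the mitigations above.
# User-agent strings, delay bounds, and the URL are placeholders.
import random
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
]

options = Options()
options.add_argument("--headless=new")          # run without a display
options.add_argument("--no-sandbox")            # needed in many containers
options.add_argument(f"user-agent={random.choice(USER_AGENTS)}")

driver = webdriver.Chrome(options=options)
try:
    driver.get("https://example-retailer.test/groceries")  # placeholder URL
    time.sleep(random.uniform(2.0, 5.0))        # human-like delay
    html = driver.page_source                   # hand off to the parser
finally:
    driver.quit()
```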
Why Microservices Instead of a Monolith?
Although microservices introduced some complexity, they were the better choice for:
- Independent development and deployment of each module
- Resilience and fault isolation across services
- Mimicking real-world distributed system designs, something important for my long-term career path
7. What I’d Improve
Replace WMA with Adaptive Forecasting
While WMA was effective, it lacks the ability to model non-linear patterns and seasonality. In the future, I’d explore:
- Facebook Prophet or XGBoost regressors
- Online learning algorithms that adapt with each new data point
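As a hedged sketch of the Prophet option: the library expects a dataframe with `ds` (date) and `y` (value) columns and returns a forecast with uncertainty intervals, which could replace the hand-rolled confidence score. The dataframe below is dummy data, included only for shape:

```python
# Hedged sketch of a Prophet-based replacement for WMA. Requires the
# `prophet` package; the dataframe here is dummy data for illustration.
import pandas as pd
from prophet import Prophet

# Prophet requires columns named `ds` (date) and `y` (value).
df = pd.DataFrame({
    "ds": pd.date_range("2025-01-01", periods=90, freq="D"),
    "y": [1.20 + 0.05 * ((i // 7) % 2) for i in range(90)],  # toy weekly pattern
})

m = Prophet(weekly_seasonality=True)
m.fit(df)
future = m.make_future_dataframe(periods=7)   # forecast a week ahead
forecast = m.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(7))
```

The `yhat_lower`/`yhat_upper` interval is what would let the recommendation engine express seasonality-aware confidence instead of a simple volatility heuristic.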
Move Scraping to Serverless or Queue-Based Infrastructure
Currently, scraping runs on a fixed schedule within a container. A better approach would be:
- Deploying scraping tasks via AWS Lambda or Cloud Functions
- Using queues or cron triggers for event-driven execution
- Adding observability tools like Prometheus and Grafana for monitoring
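A minimal sketch of what a queue-driven scrape worker could look like as an AWS Lambda handler consuming SQS messages; the message schema and the `run_scrape()` helper are hypothetical:

```python
# Sketch of an event-driven scrape worker as an AWS Lambda handler
# consuming SQS messages. The message schema and run_scrape() helper
# are hypothetical, not existing project code.
import json

def run_scrape(retailer, category):
    """Hypothetical: invoke the scraping logic for one retailer/category."""
    ...

def handler(event, context):
    # Each SQS record carries one scrape task, e.g. enqueued by a cron rule.
    for record in event["Records"]:
        task = json.loads(record["body"])
        run_scrape(task["retailer"], task["category"])
    return {"status": "done", "tasks": len(event["Records"])}
```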
Improve API Gateway Security
The current setup lacks critical security layers. Future upgrades would include:
- JWT-based authentication, rate limiting, and CORS enforcement
- Centralized logging, error alerts, and audit trails
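For instance, JWT verification could be added at the gateway with a small decorator. This sketch uses PyJWT with a symmetric key and simplified error handling, an assumption rather than a finished design:

```python
# Sketch of JWT verification at the gateway using PyJWT. The key handling
# and claim checks are simplified assumptions, not a finished design.
import functools
import jwt
from flask import g, jsonify, request

SECRET_KEY = "change-me"  # would come from a secrets manager in production

def require_jwt(view):
    """Reject requests without a valid Bearer token (HS256 only)."""
    @functools.wraps(view)
    def wrapped(*args, **kwargs):
        auth = request.headers.get("Authorization", "")
        if not auth.startswith("Bearer "):
            return jsonify({"error": "missing token"}), 401
        try:
            # PyJWT verifies the signature and the `exp` claim by default.
            g.claims = jwt.decode(auth[7:], SECRET_KEY, algorithms=["HS256"])
        except jwt.InvalidTokenError:
            return jsonify({"error": "invalid token"}), 401
        return view(*args, **kwargs)
    return wrapped

# Usage: stack @require_jwt beneath an @app.route(...) decorator.
```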
Add Internationalization and More Retailers
To scale globally, the system should support:
- Multi-currency and multi-language capabilities
- Modular scraping logic to onboard more retailers from different regions
8. Lessons Learned
This project helped reinforce key software engineering principles:
- The importance of versioned APIs to avoid downstream breakage
- How to design and manage resilient CI/CD pipelines
- The gap between notebook-based ML prototypes and production-ready models
- How to build systems that can fail gracefully without affecting user experience
9. Final Reflections
What began as an ambitious stack experiment turned into a robust, deployable platform. It challenged me to think like both a systems architect and a data engineer, skill sets that are increasingly in demand.
I now feel significantly more confident tackling roles that involve distributed architecture, data pipelines, and machine learning integration.