AWS Data Warehousing
#AWS #Google Data Studio #PostgreSQL #SQL #ETL

Ingesting data from 5 sources, giving product insights and traceability from factory to field, accessible through reports and dashboards.

Project Context

While working for a solar manufacturing start-up (~200 employees), I pitched and then led the development of a data warehouse (technically more a central database than a data warehouse) to store and leverage business and product data from factory to field, for two distinct business cases.

The project entailed building data pipelines from 5B’s Enterprise Resource Planning (ERP) software (Odoo), its Customer Relationship Management (CRM) software (Salesforce), and its project planning and field reporting software into Amazon Web Services (AWS). Daily Extract, Transform, Load (ETL) pipelines landed the data in S3; it was then stored in a single PostgreSQL database and served to internal customers via dashboards, reports, and direct SQL query access for more in-depth analysis.
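The daily flow described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline that was actually built: a local directory stands in for S3, an in-memory SQLite database stands in for PostgreSQL, and every name here (`extract`, `transform`, `load`, `run_daily`, the `id`/`name` record fields) is a hypothetical placeholder.

```python
import json
import sqlite3
import tempfile
from datetime import date
from pathlib import Path


def extract(source_name, records, staging_dir, run_date):
    """Dump one source's raw records to a dated staging file (stand-in for S3)."""
    path = Path(staging_dir) / source_name / f"{run_date.isoformat()}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records))
    return path


def transform(path):
    """Normalise raw records: trim names and drop rows without an id."""
    rows = json.loads(Path(path).read_text())
    return [(r["id"], r["name"].strip()) for r in rows if r.get("id") is not None]


def load(conn, table, rows):
    """Insert-or-replace rows so that re-running the daily job is idempotent.

    Table names come from trusted configuration here, never from user input.
    In PostgreSQL the equivalent would be INSERT ... ON CONFLICT DO UPDATE.
    """
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} (id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.executemany(f"INSERT OR REPLACE INTO {table} (id, name) VALUES (?, ?)", rows)
    conn.commit()


def run_daily(conn, sources, staging_dir, run_date):
    """Run extract -> transform -> load once for every configured source."""
    for name, records in sources.items():
        staged = extract(name, records, staging_dir, run_date)
        load(conn, name, transform(staged))
```

Keeping the raw dump in a dated staging file before loading means a bad transform can be re-run from the staged copy without calling the source systems again, which is one common reason to stage in object storage first.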

Challenges

This project wasn’t without its challenges. Understanding and querying the ERP data effectively was tough, as we were using Odoo, an open-source ERP, with a lot of customization. For example, I remember struggling for a while with the SQL queries needed to link each component of a 4-level Bill of Materials (BoM) to Purchase Orders and Transfer Orders, through inventories and Production Orders, in order to get supplier traceability for individual products.
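One common way to walk a multi-level BoM in SQL is a recursive common table expression, which PostgreSQL supports via `WITH RECURSIVE`. The sketch below is not the query I wrote at the time and does not use Odoo's real schema: the `bom_line` and `purchase_order_line` tables, their columns, and the `trace_suppliers` helper are simplified, hypothetical stand-ins, the inventory and Production Order hops are left out, and SQLite is used only so the example runs on its own.

```python
import sqlite3

# Hypothetical, simplified schema (NOT Odoo's actual tables).
SCHEMA = """
CREATE TABLE bom_line (parent TEXT, child TEXT);
CREATE TABLE purchase_order_line (po TEXT, product TEXT, supplier TEXT);
"""

# Explode the BoM level by level, then attach any purchase order per component.
TRACE_QUERY = """
WITH RECURSIVE exploded(component, level) AS (
    SELECT child, 1 FROM bom_line WHERE parent = :root
    UNION ALL
    SELECT b.child, e.level + 1
    FROM bom_line AS b
    JOIN exploded AS e ON b.parent = e.component
)
SELECT e.component, e.level, p.po, p.supplier
FROM exploded AS e
LEFT JOIN purchase_order_line AS p ON p.product = e.component
ORDER BY e.level, e.component, p.po
"""


def trace_suppliers(conn, root):
    """Return (component, level, po, supplier) rows for every part under root."""
    return conn.execute(TRACE_QUERY, {"root": root}).fetchall()
```

The `LEFT JOIN` keeps components that were manufactured in-house rather than purchased, so gaps in traceability show up as rows with no purchase order instead of silently disappearing.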

Another challenge was that I had to wear a lot of different hats on this project: database architecture, data analysis for reports and dashboards, and cloud and data engineering to set up AWS and the various pipelines, all of which were new to me, while also managing and teaching an intern. I handled this by making it clear to all stakeholders that the project was a proof of concept, keeping to a constant bi-monthly release schedule, and reaching out to other employees about any similar experience they might have had.

Project Outcome

After 6 months of work (at 50% FTE) from myself and a skilled intern on this Proof of Concept (POC), the project had no clear business impact, apart from revealing and fixing a few mistakes earlier than would otherwise have happened, but it clearly set a more data-driven culture in motion. The entire company was aware of the project, as dashboards were projected on the office walls, and we received many requests to integrate and report on other data streams and to build customer-facing data products. Ultimately, the product leaders decided to continue the project, and I helped hire a replacement for my role who could build valuable products on top of this initial data infrastructure.