CSV Upload Service to migrate school data

Led a team of 3 developers to build a nightly job that migrates school district data, removing the burden of manual data entry.

Problem

Our company had built a robust learning management system (LMS), but we had no way to sync it with a school district’s existing system. Schools had to enter their data manually through Excel uploads or by filling out forms on the front end, which meant maintaining data in two places. For small schools this process was time-consuming and error-prone; for school districts it was a deal-breaker.

Our CEO asked me to build a system that would allow our platform to sync with any external LMS. A discovery phase revealed that most school districts model their data after IMS standards and make it accessible through CSV files or a REST API.

Impact

Schools were unable to use the LMS for 3 days or longer, depending on the school size. Manual entry errors (e.g. a student registered in the wrong class) would interrupt the school day and pull the student, teacher, counselor and IT staff into resolving the error.

Solution

An ETL (Extract, Transform, Load) application that connects to a school district’s server over SFTP, monitors its CSV files for changes, updates our LMS nightly, and lays the foundation for eventual REST integration.

After researching SaaS products that could handle this process for us, I realized that none of them was an end-to-end solution that could handle our complicated data massaging, so I needed a custom ETL tool.

I came up with an architecture that would include a cron job to check for new CSV records nightly, migrate them to S3 for archive purposes, and trigger a Lambda function to handle our complex data massaging and update the necessary databases across our APIs. I also designed the application so that IT users could enter their SFTP and REST credentials and instantly sync their data with no manual involvement on our part.
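As a rough illustration of the nightly flow described above, here is a minimal Python sketch of two of its steps: detecting which CSV files changed since the last run, and massaging rows into an internal shape. It is not the production code. The SFTP download, S3 archive and Lambda trigger are left out, files are passed in as in-memory bytes, and the function names, the output fields and the column names (sourcedId, givenName, familyName) are assumptions made for the example.

```python
from __future__ import annotations

import csv
import hashlib
import io


def changed_files(current: dict[str, bytes], last_hashes: dict[str, str]) -> dict[str, bytes]:
    """Return only the files whose content hash differs from the previous run."""
    changed = {}
    for name, content in current.items():
        digest = hashlib.sha256(content).hexdigest()
        if last_hashes.get(name) != digest:
            changed[name] = content
    return changed


def transform_students(raw: bytes) -> list[dict[str, str]]:
    """Parse a student CSV and normalize it to a hypothetical internal shape."""
    reader = csv.DictReader(io.StringIO(raw.decode("utf-8")))
    return [
        {
            # Column names are assumed for illustration only.
            "external_id": row["sourcedId"].strip(),
            "full_name": f'{row["givenName"].strip()} {row["familyName"].strip()}',
        }
        for row in reader
    ]
```

In a setup like this, the cron job would call changed_files with the files it just downloaded and the hashes stored from the previous night, archive the changed ones, and hand each to a transform such as transform_students before writing to the LMS.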

After completing a prototype, I assigned the remaining tasks to my development team for completion.

Sprinkle of Sales

I made sure to build transaction monitoring into the scope because I learned from previous ETL jobs the importance of knowing the source of data movement. This feature could later be built into a full-blown dashboard for school districts to monitor their data pipelines. The Sales team loved this idea because we could potentially charge more for it.
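To make the idea of transaction monitoring concrete, here is a small Python sketch of what one monitored sync record could look like. The class name, its fields and the status rules are hypothetical and chosen only to show how each data movement can be traced back to its source.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SyncTransaction:
    """One record per file processed in a sync run (hypothetical shape)."""

    district_id: str
    source: str  # where the data came from, e.g. "sftp" or "rest"
    file_name: str
    rows_read: int = 0
    rows_written: int = 0
    errors: list[str] = field(default_factory=list)
    started_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def status(self) -> str:
        """Summarize the outcome so a dashboard could filter on it."""
        if not self.errors:
            return "ok"
        return "partial" if self.rows_written > 0 else "failed"
```

Records like these could be stored per nightly run and later surfaced in a district-facing dashboard.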

Architectural Diagram

Results


Video Overview