Implementation
One of the largest pharma companies situated on the west coast wanted to revamp their existing data warehouse. This was a big challenge as they were already using one of the most cutting edge solutions which was currently delivering all the requirements however it was not capable of delivering all the future needs as the new product launch was suppose to explode their warehouse by 4 times in terms of the data volume. The current solution required a proprietary license and also it was not an ideal scenario for that software to process that large volume of data eventually.
We implemented a lambda solution that uses state of the art distributed in-memory computing system – Spark. This would meet their entire requirements both current and future and the system performance also grew by 4 times. This was also a huge gain as Apache Spark is an Open Source and runs on commodity hardware bringing their cost to mere one tenth of the earlier solution.
Eventually:
- The organization saved by more than 86% using the new platform.
- The processing improved by 4 times for example a report latency became one minute from 4 minutes.
- The environment can sustain up to 20x times their current need so all the future endeavors were taken care