Data has changed dramatically over the years in volume, variety, velocity, and veracity. We are living in the age of data: people, machines, and physical assets generate it in every possible way. At the same time, advances in machine learning and artificial intelligence are creating huge opportunities. The benefits to the business are limited only by how well you connect the dots with data.
Changes in data and in data-related technologies are redefining how data platforms are implemented. The implementation of a data platform is shaped by new technology, cost, integration with legacy data platforms, new data types, flexibility, application integration, and the need to capture data in motion.
Key considerations for a modern data platform
- Need to collect all data. The need to collect more data, and more types of data, continues to grow. We need to design for this evolving need.
- Schema on read vs. schema on write. A modern data platform needs to consider a schema-on-read approach for storing raw data, because it is hard to define a schema up front for future uses of the data.
- A modern data platform has to adopt a bottom-up approach to data pipeline creation.
- A modern data platform needs to consider integration between multiple data platform technologies, because it involves multiple technologies built for different purposes. In most cases, a complete data platform will combine the data warehouse and data lake approaches.
- Data generated by IoT devices needs to be stored and processed differently. A constant stream of IoT data requires a different approach to data quality and data integrity during processing.
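
To make the schema-on-read consideration more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a reference implementation: the sample events, the field names (`device_id`, `temp_c`, `humidity`), and the helper `read_with_schema` are all hypothetical. The raw records are kept exactly as received, and each consumer applies its own schema only when it reads them.

```python
import json

# Hypothetical raw events, stored as received with no schema enforced on write.
RAW_EVENTS = [
    '{"device_id": "t-1", "temp_c": 21.5, "ts": "2024-01-01T00:00:00Z"}',
    '{"device_id": "t-2", "temp_c": "22.1", "humidity": 40}',
    '{"device_id": "t-3"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time (hypothetical helper).

    `schema` maps a field name to a cast function. Only the requested
    fields are returned; fields missing from a record become None.
    """
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            row[field] = cast(value) if value is not None else None
        yield row


# Two consumers read the same raw data through different schemas.
temperature_view = list(
    read_with_schema(RAW_EVENTS, {"device_id": str, "temp_c": float})
)
humidity_view = list(
    read_with_schema(RAW_EVENTS, {"device_id": str, "humidity": int})
)

for row in temperature_view:
    print(row)
```

Because the raw events are never reshaped on write, a new field such as `humidity` can be queried later without reloading or migrating the stored data. The trade-off is that type casting and handling of missing values move to every reader.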