- The data structure cannot be too generic, because the development and accuracy of a model or solution for any problem depend heavily on the train-test data.
- Identify the right data features for each problem category.
- Data thresholds must be set by a global body, taking the various regional factors into account.
- Data may vary by location, so these variations need to be identified and included in the generic structure.
- Identify historically inaccurate data.
- The dependency on geolocation arises because data differs between developed and developing nations, and between regions within them.
- Set a threshold for the minimum dataset [quantity] needed for training.
This helps in understanding the quality and quantity of the data and the need for any additional problem-specific data points, which in turn supports effective time and cost management while building a product or solution.
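
The minimum-quantity and regional-threshold points above can be illustrated with a small check. This is only a sketch under stated assumptions: the region labels, the `region` field, the threshold values, and the function name `regions_below_threshold` are all hypothetical placeholders, since the actual thresholds would be set by the global body mentioned above.

```python
from collections import Counter

# Hypothetical minimum number of training records per region.
# Placeholder values only; real thresholds would come from the global body.
MIN_RECORDS = {"developed": 1000, "developing": 500}
DEFAULT_MIN_RECORDS = 1000  # assumed fallback for regions without a threshold


def regions_below_threshold(records, thresholds=MIN_RECORDS,
                            default=DEFAULT_MIN_RECORDS):
    """Return {region: (count, required)} for regions with too few records.

    Each record is assumed to be a dict with a "region" key.
    """
    counts = Counter(record["region"] for record in records)
    shortfalls = {}
    # Check every region that has a threshold or appears in the data,
    # so a region with no records at all is also reported.
    for region in set(thresholds) | set(counts):
        required = thresholds.get(region, default)
        if counts[region] < required:
            shortfalls[region] = (counts[region], required)
    return shortfalls


if __name__ == "__main__":
    sample = [{"region": "developed"}] * 1200 + [{"region": "developing"}] * 300
    print(regions_below_threshold(sample))  # {'developing': (300, 500)}
```

A check like this only covers quantity. Data quality, historically inaccurate records, and problem-specific features would need their own validation steps.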