This is the second part of the blog series on “Data Overwhelming in the Indian Social Sector”. In the last blog, we asked whether there is enough data and looked at the prominent entities in India that capture different data sets. In this blog, we dwell on the issues and challenges with the data sets currently available and imagine a way forward: a collaborative approach to make data more standardized and accessible.
There are primarily three types of entities engaged in the mammoth task of capturing this data. But this abundance of data creates three major issues:
- Lack of Structure & Granularity: In most of the above cases, data can be extracted in two formats:
– Machine-readable data is hard for humans to read directly and requires specialized software to process; examples include XML, CSV, XLS, and JSON file formats.
– Human-readable data is easy for humans to understand and does not require specialized tools; examples include HTML, PDF, and Word file formats.
– Data comes from 100+ government departments, with datasets spread across multiple websites, and it varies greatly across domains and socioeconomic factors, making it hard to analyze collectively.
- Cautious data-sharing ethics and duplication of efforts: Data collected from the same stakeholders can reveal multiple insights across various domains, creating opportunities for organizations to collaborate and cross-validate findings on social issues. When data is not shared, these opportunities are lost. For example, multiple organizations working in the same geographic area can end up re-profiling the same beneficiaries, wasting time and resources.
- The technical skills gap in the development sector:
Data scraping is a useful method for gathering large data sets when information is presented in similar formats across outlets, but it requires technical skill. In the development sector, the lack of monetary incentives often causes highly skilled tech talent to opt out, leaving these communities without technical support and guidance.
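To make the difference between machine-readable and human-readable formats concrete, here is a minimal Python sketch using only the standard library. The sample records, the column names, and the `TableScraper` class are all hypothetical and exist only for illustration. The same two rows are read once from CSV, where the structure is already explicit, and once from an HTML table, where the values have to be scraped out of the markup.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample: the same two records published in two formats.
CSV_TEXT = "district,households\nDistrict A,120\nDistrict B,95\n"
HTML_TEXT = """
<table>
  <tr><th>district</th><th>households</th></tr>
  <tr><td>District A</td><td>120</td></tr>
  <tr><td>District B</td><td>95</td></tr>
</table>
"""


def read_csv_rows(text):
    """Machine-readable: the csv module returns one dict per row directly."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


class TableScraper(HTMLParser):
    """Human-readable: cell text has to be collected tag by tag."""

    def __init__(self):
        super().__init__()
        self.rows = []         # completed rows, each a list of cell strings
        self._row = None       # row currently being built, if any
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append([cell.strip() for cell in self._row])
            self._row = None


def scrape_html_rows(text):
    """Scrape the table and pair each body row with the header row."""
    scraper = TableScraper()
    scraper.feed(text)
    header, *body = scraper.rows
    return [dict(zip(header, row)) for row in body]


csv_rows = read_csv_rows(CSV_TEXT)
html_rows = scrape_html_rows(HTML_TEXT)
print(csv_rows == html_rows)
```

Both functions return the same list of rows, but the CSV path is three lines while the HTML path needs a custom parser. A PDF or Word file would need even more work, which is where the skills gap starts to hurt.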
We imagine the way forward in the form of a three-pronged approach:
1. Use a standardized tool that saves registries of beneficiary data across multiple domains:
A standardized tool can improve data collection across organizations by letting each one collect multiple indicators in its preferred format. These indicators can then feed a single beneficiary registry, which enables longitudinal studies of impact, reduces duplicated efforts, and leads to more efficient use of resources. It also allows for cross-validation of the data and supports data-driven decisions.
2. Build one open-source platform for all and provide API integrations:
When new organizations work in the same geography, they may want to expand upon the existing beneficiary registry by collecting new indicators specific to their cause. Application Programming Interfaces (APIs) can provide these organizations with easy access and the ability to add new data in a modular fashion. This can allow new organizations to build on the existing registry and collect additional information in a streamlined and efficient manner.
3. Collaborate often & advocate for Tech4good communities:
Advocate for the creation of communities within the development sector that focus on building the technical skills of its professionals. These communities can answer the sector's technical queries and help manage the tech resources it needs to standardize and operate its data well enough to generate the needed insights.
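The first two prongs can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not a proposed implementation: the `Beneficiary` and `BeneficiaryRegistry` names, the ID format, and the indicator fields are all hypothetical, and a real platform would expose these operations through an API backed by a database rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Beneficiary:
    """One shared profile per person (hypothetical schema)."""
    beneficiary_id: str
    name: str
    # Maps a domain (e.g. "health") to its indicators and their values.
    indicators: dict = field(default_factory=dict)


class BeneficiaryRegistry:
    """In-memory stand-in for a shared registry."""

    def __init__(self):
        self._records = {}

    def register(self, beneficiary_id, name):
        """Return the existing profile if there is one, so nobody is re-profiled."""
        if beneficiary_id not in self._records:
            self._records[beneficiary_id] = Beneficiary(beneficiary_id, name)
        return self._records[beneficiary_id]

    def add_indicators(self, beneficiary_id, domain, values):
        """Attach indicators for one domain without touching other domains."""
        record = self._records[beneficiary_id]
        record.indicators.setdefault(domain, {}).update(values)
        return record

    def __len__(self):
        return len(self._records)


registry = BeneficiaryRegistry()

# A first organization profiles a beneficiary and records health indicators.
registry.register("B-001", "Sample Person")
registry.add_indicators("B-001", "health", {"immunised": True})

# A second organization in the same area reuses the profile and adds its own.
profile = registry.register("B-001", "Sample Person")
registry.add_indicators("B-001", "education", {"enrolled": True})

print(len(registry), sorted(profile.indicators))
```

The second organization never creates a new profile. It only adds an "education" block next to the existing "health" block, which is the modular extension that API access is meant to enable.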
In conclusion, we know that high-quality development data is the foundation for meaningful policy-making, efficient resource allocation, and effective public service delivery. Moving in the directions outlined above will require considerable collaboration, and as CTO Non-Profit members, we aim to learn more about data & tech to build an optimal data ecosystem for the Indian development sector.
Your Tech Lunch Box:
How to use Postman to access an API: What is Postman ?? || How to use Postman?? || Postman Tool For Beginners
This blog is part of our #TechTuesdays Series Curated for our CTO For Non-Profits Community