As Indian development sector professionals, when we wish to approach a problem statement, often the first question we ask ourselves is whether there is any data on this and if yes, where? This post is for anyone who belongs to the Indian development space, believes in a fact-based approach and wants to find the right data outlets to further their cause.
Before we move ahead, let’s rack our brains for a bit:
- Is there enough data?
By widely cited estimates, 90% of the world’s data was created in the last two years, and the total volume of data roughly doubles every two years. India, with a population of 1.4 billion people (17.7% of the total world population), is home to the second largest population in the world. Hence, it’s safe to assume that there is a high influx of data in the Indian development sector as well.
- Should we say “There is enough data” or “There are enough data”?
Data is the plural of datum (thing given). So, ideally, we should say “there are enough data”. Doesn’t that sound odd? That is because it is odd, to modern ears at least. In today’s day and age, we use data as a mass noun, similar to ‘rain’ or ‘sand’. If you’re a grammar aficionado, here is the tangent you were looking for.
Back to our core topic:
If there is a high influx of data in the Indian development sector, is this data captured? “Yes, there IS data!”
There are primarily three types of entities engaged in the mammoth task of capturing this data:
- Government Bodies: Includes the central & state government bodies and their domain-specific arms. These are the largest data providers due to their ability to operate at the national scale.
Here are some popular government data sources:
- Census of India: A government authority that is responsible for collecting information on the Indian population. As of today, the latest available census is that of 2011.
- National Family Health Survey (NFHS): A large-scale, multi-round survey conducted in a representative sample of households throughout India.
- National Sample Survey Organisation (NSSO): Conducts nationwide sample surveys on various socio-economic aspects.
- Open Budgets India: This platform has resulted from collective efforts by many organizations and individuals, led by the Centre for Budget and Governance Accountability (CBGA). Its aim is to strengthen the discourse and demand for the availability of all budget information in the public domain in a timely and accessible manner, at all levels of government in the country.
- API Setu: Started as a part of the Open API Platform Project, it is an API platform that acts as an API marketplace cum directory portal.
- Data.gov.in: The Open Government Data Portal, designed, developed and hosted by the National Informatics Centre (NIC) to facilitate access to government-owned shareable data.
- NGO Darpan: Collates all NGOs under a single platform
- National Data Bank: Aims to collate India-level statistics under one platform
- The National Data and Analytics Platform (NDAP): Aims to improve access and use of published Indian government data.
- NITI Aayog: An apex public policy think tank of the Government of India that has made an attempt to collate this data for the general public.
- Global Institutes/Extended Corporate Arms: Since CSR became mandatory under the Companies Act, 2013, private players have also been documenting their impact and, in turn, collecting useful data. Organizations such as ILO, J-PAL, BMGF, World Bank, WHO, and Tata Trusts create annual reports based on their project data collection, although the data may not always be openly available.
- World Bank Open Data: A collation of World Bank data across multiple indicators that are specific to India.
- International Labour Organisation Statistics (ILOSTAT): A labour statistics source maintained by the ILO, a UN agency, with country profiles for India and many other countries.
- J-PAL Dataverse: The Abdul Latif Jameel Poverty Action Lab data repository, hosted on Harvard Dataverse, provides reliable data across countries and domains, including India.
- World Health Organisation (WHO) Stats: Annual WHO reports that collate country-specific population health insights.
- Tata NIN Centre of Excellence in Public Health Nutrition: A centralized data repository with harmonized datasets across the nutrition value chain (crop production to health outcomes through distribution, warehousing, purchase, consumption and absorption)
- Individual organizations: Individual organizations are actively engaged in data collection, but can at times be restricted to particular geographies due to limited funds.
A good example of an individual organization that operates at scale is:
- TransUnion CIBIL: It collects data on loans to provide accurate information on individuals and groups. This data is often used by Microfinance Institutions (MFIs) to map out their areas of loan recovery.
We believe that all three entity types have an important role to play in collating development sector data. As CTO For Non-Profits members, our suggestion for making full use of this post is to explore the various data sets mentioned in it.
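For readers who would like to go beyond browsing, here is a minimal Python sketch of pulling one India-specific indicator from World Bank Open Data, one of the sources listed above. It assumes the public World Bank v2 REST API, the country code `IN` and the total population indicator code `SP.POP.TOTL`. The function names are our own, chosen for illustration, and the parsing assumes an annual series where each row's `date` is a plain year. Treat it as a starting point rather than a finished tool.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the public World Bank v2 API.
BASE_URL = "https://api.worldbank.org/v2"


def build_indicator_url(country: str, indicator: str, per_page: int = 100) -> str:
    """Build the request URL for one indicator and one country."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def parse_indicator_response(payload: list) -> dict:
    """Convert the [metadata, rows] payload into {year: value}.

    Years without a reported value are skipped. An error payload
    (a list with a single message object) yields an empty dict.
    """
    if len(payload) < 2 or payload[1] is None:
        return {}
    return {
        int(row["date"]): row["value"]
        for row in payload[1]
        if row.get("value") is not None
    }


def fetch_indicator(country: str, indicator: str) -> dict:
    """Download one indicator series. Requires network access."""
    url = build_indicator_url(country, indicator)
    with urlopen(url, timeout=30) as response:
        return parse_indicator_response(json.load(response))
```

Calling `fetch_indicator("IN", "SP.POP.TOTL")` should return a dictionary mapping years to India's total population. Swapping in another indicator code from the World Bank catalogue gives a different series with the same code.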
Your Tech Lunch Box:
What is an API: What is an API? Simply Explained
This blog is part of our #TechTuesdays Series Curated for our CTO For Non-Profits Community