The Learning and Educational Achievements in Punjab Schools (LEAPS) longitudinal project, initiated in 2003, was designed to map, understand, and improve the educational universe of primary education in Punjab, the largest province in Pakistan and the 12th largest schooling system in the world. We are now releasing this flagship LEAPS dataset, including both school- and household-level data and all accompanying documentation, for public use. The dataset has enabled multiple research publications by the LEAPS team and by other researchers, including graduate students.
About the Data
The LEAPS longitudinal follow-up dataset focuses on 112 villages in Pakistan, and follows 826 schools and 1,807 households from 2004 to 2011. It contains two main types of surveys: household and school surveys. This data and all its accompanying documentation is available for public use.
Data collection took place between 2004 and 2011. We conducted village selection (see Sampling Summary) in 2003, followed by round 1 of data collection in 2004, round 2 in 2005, round 3 in 2006, round 4 in 2007, and round 5 in 2011. A long-term follow-up at the individual level was also conducted between 2016 and 2019. Data and documentation for the long-term follow-up will be released separately at a later date.
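For users who merge or compare data across survey rounds, the round-to-year schedule described above can be kept in a small lookup. The following is a minimal Python sketch; the names ROUND_YEARS and years_between_rounds are illustrative assumptions and are not variables or functions that ship with the LEAPS data.

```python
# Survey round -> calendar year of data collection, as stated in the text above.
# ROUND_YEARS and years_between_rounds are hypothetical helper names,
# not identifiers from the LEAPS dataset or its documentation.
ROUND_YEARS = {1: 2004, 2: 2005, 3: 2006, 4: 2007, 5: 2011}


def years_between_rounds(first: int, second: int) -> int:
    """Return the number of calendar years separating two survey rounds."""
    return ROUND_YEARS[second] - ROUND_YEARS[first]


if __name__ == "__main__":
    for rnd in sorted(ROUND_YEARS):
        print(f"round {rnd}: {ROUND_YEARS[rnd]}")
    # Rounds 1 to 4 are one year apart; round 5 follows round 4 after a longer gap.
    print("gap between rounds 4 and 5:", years_between_rounds(4, 5))
```

Keeping the schedule in one place makes the uneven spacing explicit: rounds 1 through 4 are annual, while round 5 follows round 4 by four years, which matters when computing growth rates or age at interview.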
Data Contents
Information on the following variables was collected in each survey round. For a complete list of survey sections and variables, please see the documentation files.
Household surveys: household roster, educational attainment and decision making, adult time allocations, assets, and annual expenditures. In most rounds of data collection, separate surveys were administered to the male and female heads of household.
School surveys: teacher and student rosters, teacher training and education, school information including enrollment, fees, and facilities, student information, and student and teacher test scores in English, Urdu, and Math. All teacher and student information covers either 3rd- or 4th-grade teachers and students, depending on the year of data collection.
Sampling Summary
In 2003, the LEAPS team chose 112 villages at random from a list of villages in Punjab province with at least one existing private school according to the 2000 census of private schools. Following an accepted geographical stratification of the province into North, Center, and South, these villages are located in the three districts of Attock (North), Faisalabad (Center), and Rahim Yar Khan (South).
Our team first conducted a household and school census in these villages. The survey team then conducted the first round of data collection with all schools offering primary-level education as well as a sample of households in each village. Additional rounds of data collection were conducted in 2005, 2006, 2007, and 2011, at both the household and school level. The final sample consists of 112 villages: 37 in Attock, 43 in Faisalabad, and 32 in Rahim Yar Khan.
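As a quick sanity check after loading the data, the district breakdown reported above can be compared against the stated total of 112 villages. This is a minimal Python sketch; VILLAGES_BY_DISTRICT, EXPECTED_TOTAL, and counts_match_total are hypothetical names chosen for illustration, and in practice the counts would be tabulated from the data rather than typed in.

```python
# Village counts per district in the final sample, as reported in the text above.
# All names here are illustrative and do not come from the LEAPS data files.
VILLAGES_BY_DISTRICT = {"Attock": 37, "Faisalabad": 43, "Rahim Yar Khan": 32}
EXPECTED_TOTAL = 112


def counts_match_total(counts: dict, expected_total: int) -> bool:
    """Return True if the per-district village counts add up to the expected total."""
    return sum(counts.values()) == expected_total


if __name__ == "__main__":
    print("total villages:", sum(VILLAGES_BY_DISTRICT.values()))
    print("matches expected total:", counts_match_total(VILLAGES_BY_DISTRICT, EXPECTED_TOTAL))
```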
Download Links:
Data Use Notice: The public data has been cleaned and labeled to make it as accessible as possible to any researcher unfamiliar with the LEAPS project. For most variables, the data is clean and consistent for 99% of the cases. However, some inconsistencies may remain from measurement or processing errors that can no longer be explained from field records. For these remaining cases, it is the user’s responsibility to apply corrections as they see fit. Users are encouraged to use the notes command in Stata to read the notes that the LEAPS team attached to the data tables and variables; these notes identify most of the discrepancies and give additional context for specific tables and variables.
To protect the privacy of research subjects, the released public data is anonymized and does not contain any personally identifiable information. By downloading and using the data, researchers commit to not making any attempt to re-identify individuals from the microdata. Doing so would seriously jeopardize the accessibility of this dataset and of other datasets in the future, as the privacy of research subjects is of paramount importance for ethical and transparent research.
Required Citation: By downloading the data, users also commit to citing the data documentation report using the following citation:
Andrabi, Das, and Khwaja (2022), The Learning and Educational Achievement in Pakistan Schools (LEAPS) Longitudinal Dataset, 2004-2011, version 11-2022, released November 1st 2022