Liberating Data in NYC Schools Comprehensive Education Plans

Independent Researcher


Heliyon Journal


Leadership teams at NYC schools, made up of administrators, teachers, and parent coordinators, spend hours every year writing detailed plans for their schools. These Comprehensive Education Plans (CEPs) are shared with superintendents as PDFs. They are made public by NYS law, but the PDF format leaves the wealth of data they contain inaccessible. I wanted to change that.


Dr. Steven Azeka, Kenji Kanamaru


Bash, Python (PDFMiner, Flask, SQLAlchemy), PostgreSQL, Elasticsearch


I scraped all the PDFs, broke them into pages, converted them into searchable text, and stored them in a database. Users could then search for any text and get back every instance, grouped by year, with links to the human-readable PDFs.
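
The store-and-search step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL and Elasticsearch, it assumes the page text has already been extracted (for example with PDFMiner), and the table layout, the function names build_index and search, and the `#page=N` link format are all hypothetical.

```python
import sqlite3
from collections import defaultdict


def build_index(pages):
    """Store one row per PDF page.

    pages: iterable of (school, year, pdf_url, page_number, text) tuples.
    The schema is an assumption made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages ("
        "school TEXT, year INTEGER, pdf_url TEXT, page INTEGER, text TEXT)"
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?, ?)", pages)
    conn.commit()
    return conn


def search(conn, term):
    """Return {year: [link, ...]} for every page whose text contains term.

    Matching is a case-insensitive substring test; a real deployment
    would more likely hand this to a full-text search engine.
    """
    rows = conn.execute(
        "SELECT year, pdf_url, page FROM pages "
        "WHERE instr(lower(text), lower(?)) > 0 "
        "ORDER BY year, school, page",
        (term,),
    )
    hits = defaultdict(list)
    for year, pdf_url, page in rows:
        # Many browser PDF viewers jump to a page given a "#page=N" fragment.
        hits[year].append(f"{pdf_url}#page={page}")
    return dict(hits)
```

Calling search(conn, "literacy") on an index built this way returns a dictionary keyed by year, where each value is the list of page links that mention the term.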

I worked with summer fellows at the Robin Hood Foundation in NYC, who used the tool for an exploratory analysis of the literacy and math curricula in use. This initial analysis may have played some part in the Foundation coming to support initiatives in the city around high-quality instructional materials. Or not, who knows!

With Dr. Steve Azeka and Kenji Kanamaru, I co-authored a journal publication describing the tool and exploring how program officers in philanthropies used it and what it revealed to them.