Zia Qi, Research Technology Specialist, DataLab, School of Social Work
Zia Qi is, by her own description, “the one in the social work building who gets really excited about GPUs.” She was trained as a social worker but built her technical skills outside any curriculum, through real-world projects, deep self-directed learning, and the good fortune of working alongside people equally passionate about what the technology could do. Her AI journey is distinctive in this collection because the constraints she faced (data sensitivity, institutional privacy requirements, limited hardware) did not limit the work. They made it better.
The data lab she works in holds child welfare records from Michigan’s child protective services system: over 1.3 million text documents dating back to 2009, containing case narratives written by CPS workers under enormous time pressure. These narratives are dense with clinically and policy-relevant information — substance abuse, domestic violence, housing instability, firearms, opioids — that never makes it into the structured data fields that researchers and policymakers can actually query. The information is there. Getting it out reliably, at scale, has been the core challenge for years.
The obvious solution — send the documents to a cloud LLM — was not available. Sending sensitive child welfare records to a third-party API is not an option, full stop. That constraint forced Qi toward local, small models, and the constraint turned out to be generative. Rather than defaulting to the largest model available, she benchmarked — running candidate models against her actual tasks with her actual data, setting quality expectations explicitly before running any tests. What she found surprised her: models as small as 4 billion parameters could achieve near-perfect agreement with human coders on her classification tasks. There was no performance gain from scaling to much larger models. “Small is enough,” she said, “and small is what makes local possible, and local is the best solution when you handle sensitive data.” The hardware is a single office computer with a 96GB GPU. The entire system runs disconnected from the internet.
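The benchmarking habit described above can be sketched in a few lines of Python. The source does not say which agreement statistic the lab uses, so this sketch assumes Cohen's kappa alongside raw percent agreement. The 0.8 threshold and the function names are hypothetical, chosen only to show a quality bar fixed before any model is run.

```python
from collections import Counter


def percent_agreement(human, model):
    """Share of documents where the model label equals the human label."""
    if not human or len(human) != len(model):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(h == m for h, m in zip(human, model)) / len(human)


def cohens_kappa(human, model):
    """Agreement corrected for the agreement expected by chance."""
    n = len(human)
    observed = percent_agreement(human, model)
    human_counts = Counter(human)
    model_counts = Counter(model)
    labels = set(human_counts) | set(model_counts)
    # Chance agreement: both raters pick the same label independently.
    expected = sum(human_counts[l] * model_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label everywhere; kappa is undefined,
        # so report perfect agreement by convention in this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def passes_benchmark(human, model, min_kappa=0.8):
    """Compare against a threshold chosen up front (0.8 is an assumed value)."""
    return cohens_kappa(human, model) >= min_kappa
```

Running each candidate model's labels through `passes_benchmark` against the same human-coded sample gives a like-for-like comparison. If a small model clears the bar, a larger one has to show a measurable gain to justify its cost.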
The database infrastructure evolved alongside the models. When Qi arrived at the lab, analytical workflows ran on large CSV files, sometimes as large as 10 gigabytes, that had to be carefully assembled and broken apart by hand for every research request. A round of performance testing led her to restructure everything into DuckDB, a columnar analytics database that cut query times from minutes to seconds. It now holds all 1.3 million records in a format that any member of the lab can query without manipulating raw files.
The most recent layer is agentic. Qi built an agent around the DuckDB infrastructure that lets researchers query the database in plain English: the agent writes the SQL, executes it, and returns the answer. A 27-billion-parameter model, running locally, writes code rather than generating open-ended text, and code from small models can be verified in ways that text cannot. She has also built an automated weekly digest agent for the school’s research office. It reads incoming emails, summarizes key information, extracts attachments and links, and produces a formatted digest for distribution, all on local hardware, with no data leaving the building.
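One way to read the claim that generated code can be verified: before a model-written query runs, it can be checked mechanically. The sketch below is an assumption about what such a check might look like, not a description of Qi's agent. It uses Python's built-in `sqlite3` as a stand-in for DuckDB so that it runs with the standard library alone, and the function names and demo table are hypothetical.

```python
import sqlite3


def is_single_select(sql: str) -> bool:
    """Crude guard: accept exactly one statement that starts with SELECT."""
    body = sql.strip().rstrip(";").strip()
    if not body or ";" in body:
        # Deliberately conservative: this also rejects semicolons that
        # appear inside string literals.
        return False
    return body.split(None, 1)[0].lower() == "select"


def run_generated_sql(con, sql: str):
    """Check model-written SQL, then run it and return the rows."""
    if not is_single_select(sql):
        raise ValueError("only a single SELECT statement is allowed")
    # EXPLAIN compiles the query against the schema without running it,
    # so references to unknown tables or columns fail at this step.
    con.execute("EXPLAIN " + sql)
    return con.execute(sql).fetchall()


def build_demo_db():
    """Tiny synthetic table standing in for the real database."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE narratives (doc_id INTEGER, year INTEGER)")
    con.executemany(
        "INSERT INTO narratives VALUES (?, ?)",
        [(1, 2009), (2, 2009), (3, 2010)],
    )
    con.commit()
    return con
```

A query that passes both checks is at least well-formed and limited to reading, which is a guarantee that free-text output cannot offer. With DuckDB itself, opening the connection with `read_only=True` would add a second layer of protection. Whether the query answers the researcher's actual question still needs a human or a benchmark to confirm.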
The practical payoff is now visible. With the classification infrastructure in place, Qi is conducting a historical analysis of the opioid epidemic as it appears in Michigan’s child welfare records — tracing the shift from prescription opioids to heroin to fentanyl across 15 years of case narratives, from 2009 to the present. “That story is in the case narratives,” she said. “It has always been there. We just didn’t have a reliable way to get it out at scale until now.” Her closing advice was the same discipline that shaped every decision: benchmark everything. It is, she said, “the habit that makes the difference between the tools you can trust and the tools you hope are working.”