GASP 2023 presentations

Government Advances in Statistical Programming conference

June 14 and 15, 2023

We are uploading almost all the presentations or related papers. Please contact us by email for a quick response.

Full GASP 2023 program with abstracts, which includes information that is not on this page: (1) session list, (2) chairpersons, staff, and supporters, (3) abstracts of every paper

Videos from GASP 2023 are or will be here on YouTube: 

Presentations on Wednesday, June 14, 2023:

  • Cynthia Rudin (Duke University), plenary presentation, A New Technique for Dimension Reduction for Data Visualization, coauthored with Yingfan Wang, Haiyang Huang, and Yaron Shaposhnik

  • Bringing Search to the Economic Census: A NAPCS Classification Tool - Clayton Knappenberger (US Census Bureau)

  • Machine learning is the easy part: recoding write-in responses in the 2021 VIUS - Cecile Murray (US Census Bureau/Reveal Global Consulting)

  • Nowcasting European Production Price Indexes - Gergely Attila Kiss (Hungarian Central Statistical Office, Central European University)

  • Using CANS scores in the Community Risk Result (CRR) project in R - Jamie Joseph (Vanderbilt University), Rameela Raman (Vanderbilt University)

  • Open Source Software in the Federal Government: An Analysis of Code.Gov - Gizem Korkmaz (Westat), Rahul Shrivastava (Westat), Anil Battalahalli (Westat), Ekaterina Levitskaya (Coleridge Initiative), José Bayoán Santiago Calderón (Bureau of Economic Analysis), Ledia Guci (Bureau of Economic Analysis), Carol Robbins (National Science Foundation)

  • Timely Examination of Survey - Winnie Xu (Westat)

  • Text Analysis of Health Equity and Disparities in IRS Form 990 Schedule H - Emily Hadley (RTI International), Laura Marcial (RTI International), Wes Quattrone (RTI International), Georgiy Bobashev (RTI International)

  • Manufacturing Sentiment - Tomaz Cajner (Federal Reserve Board), Leland D. Crane (Federal Reserve Board), Christopher Kurz (Federal Reserve Board), Norman Morin (Federal Reserve Board), Paul E. Soto (Federal Reserve Board), Betsy Vrankovich (Federal Reserve Board)

  • Automating Text Cluster Naming with Large Language Models - Alexander Preiss (RTI International), John Bollenbacher (RTI International), Anthony Berghammer (RTI International), John McCarthy (RTI International), Caren Arbeit (RTI International)

  • Evaluation of a method for georeferencing farms - Robert Emmet (USDA, National Agricultural Statistics Service), Kevin Hunt (USDA, National Agricultural Statistics Service), Rachael Jennings (USDA, National Agricultural Statistics Service), Kara Daniel (USDA, National Agricultural Statistics Service), Denise A. Abreu (USDA, National Agricultural Statistics Service)

  • treecompareR: an open-source software package to visualize hierarchical chemical classifications - Paul Kruse (Oak Ridge Institute for Science and Education), Caroline L. Ring (Center for Computational Toxicology and Exposure, Office of Research and Development, United States Environmental Protection Agency)

  • Building a Modeling Platform for DC Government - Hersh Gupta (DC Office of the Chief Technology Officer), Gautam Chakravarty (DC Office of the Chief Technology Officer)

  • Is Place of Birth an Appropriate Measure of Childhood Exposure? - John Sullivan (US Census Bureau), Katie Genadek (US Census Bureau), Carlos Becerra (US Census Bureau)

  • Linking the 1980 Enumeration Sample to the Decennial Census - Kelsey Drotning (US Census Bureau), Katie Genadek (US Census Bureau)

  • Harnessing Open-Source Tools to Link Records with PII - M Daniel Brannock (RTI International), Ed Preble (RTI International), Lynn Langton (RTI International), Marguerite DeLiema (University of Minnesota)

  • Identification of Anomalous Data Entries in Repeated Surveys - Luca Sartore (NISS/NASS), Chen Lu (NISS/NASS), Justin van Wart (NASS), Andrew Dau (NASS), Valbona Bejleri (NASS)

  • Refining Disclosure Controls for the Census of Fatal Occupational Injuries - Danny Friel (US Bureau of Labor Statistics), Alyssa Gillen (US Bureau of Labor Statistics), Julie Krautter (US Bureau of Labor Statistics), Yvangelista Saastamoinen (US Bureau of Labor Statistics)

  • Expanding the R Shiny Apps in the growclusters Package - Randall Powers (Bureau of Labor Statistics), Terrance Savitsky (Bureau of Labor Statistics), Wendy Martinez (US Census Bureau)

  • Sampling Frames Simulation with Copulas in R - Eli Kravitz (New Light Technologies), Daphne Liu (US Census Bureau, Center for Economic Studies), Stephanie Coffey (US Census Bureau, Center for Economic Studies)

  • Tiling your Parquet Files for the Greatest Efficiency - Stas Kolenikov (NORC at the University of Chicago)

  • Exploring the Analytical Power of Input-Output Tables in the UK Context - Julen Grube Doiz (UK Office for National Statistics), Eric Crane (UK Office for National Statistics), Andrew Banks (UK Office for National Statistics)

Presentations on Thursday, June 15, 2023:

  • Workshop: ChatGPT: A First-Look at the Utility and Risks of Large Language Models (LLMs) for Federal Agencies - Benjamin Rogers (NCHS/CDC) and Travis Hoppe (NCHS/CDC)

  • On Principles for Government Open Data Curation - Jun Yan (University of Connecticut)

  • Designing Against Bias: Identifying and Mitigating Bias in Machine Learning & AI - David J Corliss (Peace-Work)

  • Developing Documented, Trustworthy R Packages for Official Survey Statistics - Benjamin Schneider (Westat)

  • Webscraping in Python - Matt Ring (Westat Insight)

  • UI Design Patterns for Code Reuse - Weihuang Wong (NORC at the University of Chicago), Kiegan Rice (NORC at the University of Chicago)

  • Toward a Tool for Automated Disclosure Risk Review of NCES Data Products - John Riddles (Westat), Michael Armesto (Sanametrix), Tom Krenzke (Westat), Jennifer Nielsen (National Center for Education Statistics)

  • Using NLP to Access to the Unemployment Insurance System - Siobhan Mills De La Rosa (American Institutes for Research), Meghan Coffee (American Institutes for Research), and Samia Amin (American Institutes for Research)

  • Incorporating Survey Weights into Structural Topic Models - Caroline Lancaster (NORC at the University of Chicago), Brandon Sepulvado (NORC at the University of Chicago), Josh Lerner (NORC at the University of Chicago), Evan Herring-Nathan (NORC at the University of Chicago), Stas Kolenikov (NORC at the University of Chicago)

  • Traffic tracker: A gentle introduction to Python, APIs, and GitHub actions - Emily Mitchell (AHRQ - Agency for Healthcare Research and Quality)

  • Analysis of Crisis Effects via Maximum Entropy - Tucker McElroy (US Census Bureau)

  • Developing a Parameterized and Enterprise Ready Survey Data Review Tool - Andrew J. Dau (USDA NASS), Michael Jacobsen (USDA NASS), Ryan E. Morton (Morton Analytics LLC), Jared M. Pratt (USDA NASS), Justin P. Van Wart (USDA NASS)

  • A Computational Pipeline Applied to a Nested Case-Control Study Simulation - Michelle Mellers (Uniformed Services University of the Health Sciences and Henry M Jackson Foundation), Celia Byrne (Uniformed Services University of the Health Sciences)

  • Deep Neural Network Based Mass Imputation for Data Integration - Sixia Chen (University of Oklahoma Health Sciences Center), Chao Xu (University of Oklahoma Health Sciences Center)

  • Weswgt: A New R tool for Survey Weight Adjustments - Jianru Chen (Westat), Benjamin Schneider (Westat), Minsun Riddles (Westat)

  • Python Dash for Visualization: Reusable Dashboards - John Lombardi (US Census Bureau)

  • A Data Viz Makeover: The Evolution of A Visualization Showing State SNAP Data - Brian Knop (US Census Bureau)

  • Developing a Data Quality Scorecard Dashboard to Assess Data's Fitness for Use - John Finamore (NCSES), Elizabeth Mannshardt (NCSES), Lisa B. Mirel (NCSES), Julie Banks (NORC at the University of Chicago), F. Jay Breidt (NORC at the University of Chicago), Benjamin R. Peck (NORC at the University of Chicago), Kiegan Rice (NORC at the University of Chicago), Zachary H. Seeskin (NORC at the University of Chicago), Lance A. Selfa (NORC at the University of Chicago), Grace Xie (NORC at the University of Chicago)

  • An insider's view of the history of open-source software for data science - Douglas Bates, Emeritus Professor of Statistics, University of Wisconsin – Madison