Data Scientist & Data Engineer
People Analytics | NASA
Aug. 2021 - Present
Full Time
APIs
Python
Airflow
SQL
Databricks
AWS
Linux
Ollama
Memgraph
Cypher
Weaviate
Git
Posit
Tableau
Automating data pipelines, modernizing analytics infrastructure, applying data science principles to artificial intelligence R&D projects, and exploring graph analytics for workforce insights are a few of the things we have in the works.
I currently lead the implementation of a cloud-based data and analytics infrastructure that underpins the Human Capital organization’s data science initiatives. Achievements include linking siloed cloud environments by removing firewall and authentication barriers, expanding how the agency accesses and shares data, implementing storage and processing solutions for analytics-ready data, transitioning a data pipeline to the cloud for automation and scalability (a pipeline that saves 40+ hours of work per week and earned an Early Career Achievement Medal), and establishing a platform for analytics, LLMs, and specialized databases. Through this work, I have learned about cloud architecture and engineering, the infrastructure components required to support high-performing, fast-moving data and analytics teams, and strategic considerations when designing for long-term scaling and cost optimization. Specific technical components of this work include AWS (S3, RDS, EC2, IAM, DNS, VPCs, etc.), SAP, APIs (REST, SAP, and otherwise), Databricks, Airflow, Python, SQL, R, Posit, SAML authentication, Ollama, Weaviate, Memgraph, dbt, Quarto, Linux, Git, and more.
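The analytics-ready storage step described above can be sketched in miniature. The table, columns, and cleaning rules below are illustrative assumptions, not the actual schema; an in-memory SQLite database stands in for the cloud store:

```python
import sqlite3

# Hedged sketch of an "analytics-ready" load step: raw records are
# cleaned in Python, then written to a relational store via SQL.
# Table and column names here are hypothetical, not a real schema.
def load_headcount(rows):
    conn = sqlite3.connect(":memory:")  # stands in for a cloud database
    conn.execute(
        "CREATE TABLE headcount (org TEXT, fiscal_year INTEGER, count INTEGER)"
    )
    cleaned = [
        (r["org"].strip().upper(), int(r["fy"]), int(r["count"]))
        for r in rows
        if r.get("count") is not None  # drop incomplete records
    ]
    conn.executemany("INSERT INTO headcount VALUES (?, ?, ?)", cleaned)
    conn.commit()
    return conn

conn = load_headcount([
    {"org": " jsc ", "fy": "2023", "count": "3100"},
    {"org": "hq", "fy": "2023", "count": None},  # filtered out
])
total = conn.execute("SELECT COUNT(*) FROM headcount").fetchone()[0]
# total is 1: the incomplete record was dropped during cleaning
```

The same clean-then-load pattern scales up in Airflow or Databricks; only the storage target and orchestration change.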
Currently, I am also mentoring interns and pursuing R&D projects on skills analysis with graph databases and on LLM applications in human capital.
In the past, I have had the opportunity to work with teams across the Human Capital Office and the agency to provide insights using people data. I have produced workforce-at-a-glance metrics in Tableau dashboards that help mission areas see their people data in a new way, and collaborated with supervisors and telework coordinators to collect, process, and visualize data in applications that assist with return-to-office decision making. Through this work, I've demonstrated my ability to synthesize complex, disparate datasets into data products that enable non-technical partners, even beyond the Human Capital Office, to take action.
Data Science Intern
People Analytics | NASA
Aug. 2020 - Aug. 2021
Full Time
Python
Web Scraping
NLP Techniques
APIs
Neo4j
Cypher
R
Shiny
This internship expanded a graph-driven skills analysis project and entailed:
- Collecting and processing unstructured and structured data with web scraping, available APIs, Python, and NLP techniques to produce clean, structured datasets.
- Designing a graph model based on collected data and implementing it in a Neo4j graph database by importing data with optimized Cypher queries in Python scripts.
- Conducting analyses with graph algorithms and graph data science methodologies, as well as NLP techniques including Doc2Vec, topic modeling, and text similarity.
- Developing a beta NLP “Text-to-Cypher” pipeline that converts a natural language question to a Cypher query, employing custom named entity recognition, entity linking, and relationship extraction techniques.
- Visualizing results of analyses and offering a platform to interact with the “Text-to-Cypher” pipeline by developing an R Shiny application.
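The batched Cypher import mentioned above can be sketched as a standard parameterized `UNWIND` load; the `Person`/`Skill` labels and batch size below are illustrative assumptions, not the project's exact model:

```python
from itertools import islice

def batches(records, size=1000):
    """Yield fixed-size chunks so each write transaction stays small."""
    it = iter(records)
    while chunk := list(islice(it, size)):
        yield chunk

# Parameterized UNWIND query: one round trip imports a whole batch,
# which is far faster than issuing one CREATE per record.
IMPORT_QUERY = """
UNWIND $rows AS row
MERGE (p:Person {id: row.id})
MERGE (s:Skill {name: row.skill})
MERGE (p)-[:HAS_SKILL]->(s)
"""

def import_rows(session, rows):
    # session would be a neo4j.Session from the official Python driver;
    # labels and properties here are hypothetical.
    for chunk in batches(rows):
        session.run(IMPORT_QUERY, rows=chunk)
```

`MERGE` keeps the load idempotent, so re-running an import does not duplicate nodes or relationships.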
Data Science Consultant
i2k Connect
Sep. 2022 - Sep. 2023
Freelance | Part Time
Python
Neo4j
OpenAI API
Cypher
As a data science consultant for this artificial intelligence company, I focused on improving the representation of and access to a proprietary dataset through automated data processing, graph databases, and large language models. By automating the processing and validation of tabular data, I improved data integrity and reduced the manual time spent reviewing records. After enhancing data quality, I established a knowledge graph that better modeled the relationship-rich data previously maintained in a tabular format. By creating a SME-informed graph data model and importing the tabular data into a Neo4j graph database using Python and Cypher, I provided more intuitive ways to analyze, interact with, and visualize the data. To further improve the experience of querying the graph database, I researched and tested how best to use ChatGPT’s large language model to convert natural language into dynamic Cypher queries that retrieve information from the knowledge graph without directly interacting with the graph or writing code. The research paper written to share these findings was published on OnePetro (under my maiden name, Gipson).
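One way to frame the natural-language-to-Cypher step is as prompt construction: give the model the graph schema and ask it to return a single query. The schema string, model name, and message layout below are a minimal sketch under assumptions, not the consulting project's actual prompts:

```python
# Hypothetical schema description handed to the model so it knows
# which labels, relationships, and properties exist.
SCHEMA = "(:Person {name})-[:HAS_SKILL]->(:Skill {name})"

def text_to_cypher_messages(question: str) -> list:
    """Build a chat-completion payload that asks for one Cypher query."""
    system = (
        "You translate questions into Cypher for this Neo4j schema:\n"
        f"{SCHEMA}\n"
        "Return only a single read-only Cypher query."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# The payload would then be sent with the OpenAI client, e.g.
# client.chat.completions.create(model=..., messages=...),
# and the returned query executed against the graph.
```

Constraining the model to read-only queries is one guardrail against a generated query mutating the knowledge graph.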