How Two Epidemiologists are Using Outbreak.info to Track Variants in Sonoma County

by Emily Haag

Julia Rubin and Stephen Johnston, epidemiologists at the County of Sonoma Department of Health in Northern California, are leading variant surveillance efforts in collaboration with Sonoma County Public Health laboratory team members Iryna Goraichuk, Lisa Critchett, Carlos Gonzalez, and lab director Rachel Rees. Their work informs Sonoma County health officials and, through Sonoma County’s COVID-19 Dashboard, the public. From the beginning of the pandemic, the epidemiologists have pored over data on COVID-19 cases, deaths, and hospitalizations. Since the introduction of the first SARS-CoV-2 variants in Sonoma County in March 2021, they have also worked ardently to compile and analyze genomic data. This disease surveillance is integral to guiding the county’s response to COVID-19. They also work closely with the California Department of Public Health to support state-wide data quality control and analysis. Beyond monitoring trends and guiding emergency response efforts, they help steer targeted public health communication to directly assist the people of Sonoma County.


The team uses Outbreak.info and the Outbreak.info R package to evaluate the prevalence of variants for the past 90 days in Sonoma County. They also use Outbreak.info to get a higher vantage point on California-wide trends over time. While 90-day localized trends help them understand what is currently circulating, temporal state-wide data helps them build predictive models.

90-day prevalence of top SARS-CoV-2 lineages in Sonoma County from outbreak.info

6-month prevalence of top SARS-CoV-2 lineages in California from outbreak.info
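For readers curious about what a 90-day prevalence figure represents, the calculation can be sketched roughly as follows. This is a minimal illustration in plain Python, not the Outbreak.info R package or its API: the function name `lineage_prevalence`, the record layout, and the sample records are all hypothetical and made up for the example.

```python
from collections import Counter
from datetime import date, timedelta


def lineage_prevalence(records, as_of, window_days=90):
    """Return {lineage: share of sequences} for samples collected
    in the window_days ending at as_of (hypothetical helper)."""
    start = as_of - timedelta(days=window_days)
    # Count only sequences collected inside the trailing window.
    counts = Counter(
        lineage for collected, lineage in records if start < collected <= as_of
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    # Most common lineages first, expressed as a fraction of all sequences.
    return {lineage: n / total for lineage, n in counts.most_common()}


# Made-up records for illustration only: (collection date, assigned lineage).
records = [
    (date(2022, 7, 5), "BA.5"),
    (date(2022, 7, 20), "BA.5"),
    (date(2022, 8, 1), "BA.2.12.1"),
    (date(2022, 8, 15), "BA.4"),
    (date(2022, 3, 1), "BA.1"),  # falls outside the 90-day window
]

prevalence = lineage_prevalence(records, as_of=date(2022, 9, 1))
for lineage, share in prevalence.items():
    print(f"{lineage}: {share:.0%}")
```

Real surveillance estimates also have to account for sampling bias and reporting delays, which this sketch ignores.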

These analyses are central to variant surveillance in the county, which in turn helps inform public health decisions and keep the citizenry informed. SARS-CoV-2 cases in Sonoma County, as in much of the world, have been dominated by newly emerging Omicron sublineages. The most pervasive lineage in the county, BA.5 and its sublineages, was designated a Variant of Concern by the CDC in late December 2021, but did not appear in the state until May 2022. Sonoma County saw its first case in early July 2022, and the lineage now accounts for nearly 20% of all lineages circulating in the county.

Most common lineages in Sonoma County over the past 90 days from outbreak.info

Tracking variants has been a challenge, particularly given how frequently new lineages are designated and how often naming conventions change as a result. The epidemiologists have had to change methodologies and adapt to dynamic variant categorization on the fly. Outbreak.info is also working to ease this challenge by optimizing its data pipelines.

Because sequencing data is biased by sampling techniques, the Sonoma County team also uses wastewater data to support their overall prevalence estimates for the county. This helps them validate what the county is experiencing with case rates and hospitalizations, and build a predictive model of what the county can expect in terms of overall COVID-19 cases.

Wastewater data does not provide granular insight into which lineages are prevalent, however. As Julia described:

"Wastewater is more of a community wide sample. [With the genomic data,] we are selecting from hospitalized patients, fatalities, and vulnerable populations and trying to look at severe outcomes. Any correlation between variants and more severe outcomes - we want to know that."


The genomic data collected by GISAID and aggregated by Outbreak.info provides insight into not only which lineages are present in the county, but also which other parts of the world are experiencing similar patterns. Outbreak.info’s Lineage|Mutation Tracker provides information about where lineages originate and which locations are experiencing high prevalence rates. This helps public health departments like Sonoma County's figure out how and from where variants are getting into the county. Examining trends from other places in the world where specific lineages have been prevalent also helps them anticipate what to expect. As Stephen added, "We're also looking at what's happening in the rest of the country, what's happening in Europe...and that's what we're really using to plan and to try to predict. It is useful to us to cross validate."

Worldwide prevalence of SARS-CoV-2 lineage BA.5, the most common lineage in Sonoma County over the past 90 days from outbreak.info

By identifying other locations around the world that have the same SARS-CoV-2 lineages, the epidemiologists can dive more deeply into rates of severe health outcomes over time based on what these other locations have experienced. As Stephen noted, “...in the context of other global information, it lets us know where we are in the curve.”

Evaluating county and state trends, and synthesizing this information with global genomic data and county wastewater data, helps the team paint a detailed picture of COVID-19 in Sonoma County. This analysis, in turn, provides the County of Sonoma Department of Health with vital tools needed to fight rising cases and improve health outcomes.
