The Host-Virus Model Database (HVMD)

Team Verena doesn’t believe in searching under streetlights. We try to keep a close eye on emerging scientific literature both so we can learn from other scientists, and so that we don’t keep repeating the same analyses without moving the field forward. In that spirit, the Hivemind Database tracks studies that use machine learning to predict host-virus associations, as well as a few other studies we think might be useful to scientists interested in that problem.

We keep track of who’s doing the work, what data they’re using, and what methods they’ve tried, so we know who to ask for help, what we should try next, and where the streetlamps end and the darkness starts.

Frequently Asked Questions

What kinds of studies do we keep in the dataset?

HVMD is a record of studies that predict associations between hosts and viruses. An "association" usually refers to the known or potential ability of a virus to infect a host and cause disease.

We categorize most studies into six types: general link prediction (any host-virus pairing), zoonotic risk (the probability a virus can infect humans), host identity (a virus's reservoir or vector species), parasite sharing (whether any two hosts share at least one virus), parasite richness (how many species of virus are associated with one host), and host range (how many hosts a given virus can infect). There's also a seventh type, "Emergence epidemiology," which is more of a catch-all and covers anything that happens after a virus makes the leap into a new host, including tropism, virulence, and transmissibility. Some studies don't fit neatly into any of these types; we label them an "honorable mention" if they contribute conceptually to the topic without explicitly modeling it.

Some of these papers aren't about viruses - why are they in the database?

We may include a study that introduces a useful methodology, or that covers viruses alongside other parasites, when we think researchers designing new studies would benefit from knowing about it.

Can I access the studies directly through the database?

No, but all studies have a PDF available on our backend. If there's something you desperately need and don't have access to through your institution, we're happy to email you a copy.

Is there a paper I can read that explains the database?

HVMD is intended to be a companion resource to Albery et al., in prep., "Predicting the host-virus network", which is in revision at a journal. The paper also introduces the broad taxonomy of models we use here, and discusses how this field has emerged and what we've learned so far.

Why are some studies blank?

Studies marked with a red bar on the left, grouped at the bottom of the spreadsheet, are ones we know belong in the database but whose data we haven't entered yet. Data entry takes a little time, so there's unfortunately a bit of a lag. But we'll get there!

Why is my study missing?

Because we missed it! Oops. Send an email about it to Greg Albery or Colin Carlson.