A simple case serves to illustrate here:
Libeskind-Hadas & Charleston 2010, Ovadia et al. 2011) already. The proof Ran Libeskind-Hadas and I did (he mostly did!) first allowed the host phylogeny to be a network, but the later proof by Ovadia et al. didn't. But it's an interesting problem, because there are hybridization events of host species, though not so much of individual genes.
The cophylogeny problem is best attacked as a mapping problem (yes, this is my opinion: there are others). Given P, H and leaf associations phi, what is the minimal-cost mapping of P into H that preserves phi and the structure of P and H, and which is interpretable in a well-defined way? (See the Cophylogeny blog for more details.)
We typically have four event types:
codivergence : a generalisation of cospeciation, where a node in P bifurcates at the same moment as does its host node in H;
duplication : where the parasite bifurcates without a corresponding host bifurcation, such as for a gene duplication;
host switch : a parasite/pathogen establishes on a new host lineage; and
loss : we fail to see a parasite/pathogen where we expected it, caused by "missing the boat" / lineage sorting, extinction (above) or sampling failure.
... but these cause new headaches.
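To make the "minimal cost" idea concrete, here is a minimal sketch in Python of how one candidate map might be scored once its events have been counted. The cost values, the name `map_cost` and the two candidate event lists are all made up for illustration; they are not taken from any particular method or data set.

```python
from collections import Counter

# Hypothetical event costs, chosen only for illustration;
# real analyses pick their own cost scheme.
EVENT_COSTS = {"codivergence": 0, "duplication": 1, "host_switch": 2, "loss": 1}


def map_cost(events, costs=EVENT_COSTS):
    """Sum the cost of every event implied by one candidate map of P into H."""
    counts = Counter(events)
    unknown = set(counts) - set(costs)
    if unknown:
        raise ValueError(f"unknown event types: {sorted(unknown)}")
    return sum(costs[event] * n for event, n in counts.items())


# Two invented candidate maps, each summarised by the events it implies.
candidate_a = ["codivergence", "codivergence", "host_switch", "loss"]
candidate_b = ["codivergence", "duplication", "loss", "loss", "loss"]

# The preferred map is the one with the lowest total cost.
best = min([candidate_a, candidate_b], key=map_cost)
```

Changing the relative costs can change which map wins, which is one reason the choice of cost scheme matters so much.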
Mike Steel once very usefully asked me, what are the desired properties of cophylogeny maps? In trying my best to answer him I realised that none of the properties really needs either phylogeny to be a tree. Nodes in P are mapped to nodes or edges in H, and the evolutionary history they imply is based on the route through H between the images of parent node p' and child node p of P. If H is a tree then this route is unique; otherwise there can be ambiguity and a potential explosion in the number of solutions. But it does mean that potentially we can solve the mapping problem moderately well so long as P and H are at least DAGs. While this is possible in principle, in practice it's pretty much impossible.
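The uniqueness point can be illustrated with a small sketch that counts directed routes between two host nodes. The function `count_routes` and both toy host graphs are hypothetical, invented here purely to show the contrast between a tree and a network; it assumes H is given as a DAG in the form of a parent-to-children dictionary.

```python
from functools import lru_cache


def count_routes(children, source, target):
    """Count distinct directed paths from source to target in a DAG."""

    @lru_cache(maxsize=None)
    def routes(node):
        if node == target:
            return 1
        # Nodes with no entry in the dictionary are leaves: no onward routes.
        return sum(routes(child) for child in children.get(node, ()))

    return routes(source)


# Toy host tree: every node has exactly one parent.
host_tree = {"r": ["a", "b"], "a": ["x", "y"], "b": ["z"]}

# Toy host network: node "h" is a hybrid with two parents, "a" and "b".
host_network = {"r": ["a", "b"], "a": ["x", "h"], "b": ["h", "z"], "h": ["y"]}

tree_routes = count_routes(host_tree, "r", "y")
network_routes = count_routes(host_network, "r", "y")
```

In the tree there is a single route from "r" to "y"; in the network the hybrid node opens a second one. Every extra hybrid node on the way can multiply the count again, which is where the explosion in solutions comes from.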
It's hard enough just with trees:
Figure from Ramsden et al. 2009, Supp. material
The figure above was cut from the manuscript and relegated to supplementary material because it was kind of unnecessary, but it's very pretty so it should see the light of day. It shows a consensus of 15 solutions / maps we found that could best explain the relationships between hosts and circoviruses; thicker lines for more frequently occurring components and different colours for different groups of maps.
The cophylogeny problem is a fun one that presents lots of modelling, computational and representational challenges.