Mapping Orthology

Initial thoughts:

An orthology is just another key space. It has a key space Table holding the (automatically generated?) "names" / identifiers of the orthologies, and a Link Table that lists the items belonging to each orthology. An orthology can therefore contain genes, other orthologies, or both.
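A minimal sketch of that two-table layout, using Python's built-in sqlite3 module. The table and column names (orthology, orthology_link, member_space, member_key) are assumptions made for illustration, not an agreed schema, and the rows are borrowed from the Rodent example below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Key space table: one row per orthology identifier (hypothetical layout).
    CREATE TABLE orthology (
        key_space TEXT NOT NULL,      -- e.g. 'Rodent'
        name      TEXT NOT NULL,      -- e.g. 'Rod00001'
        PRIMARY KEY (key_space, name)
    );
    -- Link table: one row per item that belongs to an orthology.
    -- member_space may name a gene key space or another orthology key space.
    CREATE TABLE orthology_link (
        key_space    TEXT NOT NULL,
        name         TEXT NOT NULL,
        member_space TEXT NOT NULL,
        member_key   TEXT NOT NULL,
        PRIMARY KEY (key_space, name, member_space, member_key),
        FOREIGN KEY (key_space, name) REFERENCES orthology (key_space, name)
    );
""")

conn.execute("INSERT INTO orthology VALUES ('Rodent', 'Rod00001')")
conn.executemany(
    "INSERT INTO orthology_link VALUES (?, ?, ?, ?)",
    [
        ("Rodent", "Rod00001", "Mouse", "Fubar"),
        ("Rodent", "Rod00001", "Rat", "Fabar"),
    ],
)

# List the members of one orthology.
members = conn.execute(
    "SELECT member_space, member_key FROM orthology_link "
    "WHERE key_space = ? AND name = ? ORDER BY member_space",
    ("Rodent", "Rod00001"),
).fetchall()
print(members)
```

Because a link row stores the member's key space next to its key, the same table can point at a gene or at another orthology without any change to the layout.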

An example:

  • Human: Gene: Foobar
  • Mouse: Gene: Fubar
  • Rat: Gene: Fabar

  • Rodent (an orthology): Orthology: Rod00001: Contains Mouse:Fubar & Rat:Fabar

  • Mammal (an orthology): Orthology: Mam00501: Contains Human:Foobar, Mouse:Fubar & Rat:Fabar
    or
  • Mammal (an orthology): Orthology: Mam00501: Contains Human:Foobar & Rodent:Rod00001
    or
  • Mammal (an orthology): Orthology: Mam00501: Contains Human:Foobar, Mouse:Fubar, Rat:Fabar & Rodent:Rod00001

I think the first one makes the most sense, but I'm entirely open to doing it either of the other two ways.

With the first or third representation, we can easily query for orthologies that contain certain genes (e.g. "Give me all orthologies in Rodent | Mammal that contain Mouse:Fubar and Rat:Fabar"). The third, with its inclusion of "sub orthologies", could make some searches faster or easier to design, but the larger Link Table could make all searches a tiny bit slower.
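To make the difference concrete, here is a small sketch of that query against the first and second representations, with the Link Table modelled as plain Python dictionaries. The names FLAT, NESTED, containing and expand are hypothetical helpers invented for this illustration.

```python
# Link table contents keyed by (key space, name); values are sets of members.
FLAT = {  # first representation: every orthology lists its genes directly
    ("Rodent", "Rod00001"): {("Mouse", "Fubar"), ("Rat", "Fabar")},
    ("Mammal", "Mam00501"): {("Human", "Foobar"), ("Mouse", "Fubar"), ("Rat", "Fabar")},
}
NESTED = {  # second representation: Mammal refers to the Rodent orthology
    ("Rodent", "Rod00001"): {("Mouse", "Fubar"), ("Rat", "Fabar")},
    ("Mammal", "Mam00501"): {("Human", "Foobar"), ("Rodent", "Rod00001")},
}


def containing(links, wanted):
    """Return the orthologies whose direct members include every wanted item."""
    wanted = set(wanted)
    return sorted(key for key, members in links.items() if wanted <= members)


def expand(links, key, seen=frozenset()):
    """Resolve an orthology to its genes, following nested orthologies."""
    seen = seen | {key}  # guards against orthologies that reference each other
    genes = set()
    for member in links[key]:
        if member in links:  # the member is itself an orthology
            if member not in seen:
                genes |= expand(links, member, seen)
        else:
            genes.add(member)
    return genes


wanted = {("Mouse", "Fubar"), ("Rat", "Fabar")}

flat_hits = containing(FLAT, wanted)      # finds Mammal and Rodent
nested_hits = containing(NESTED, wanted)  # finds Rodent only

# With the second representation, every orthology has to be expanded first.
expanded = {key: expand(NESTED, key) for key in NESTED}
expanded_hits = containing(expanded, wanted)  # finds Mammal and Rodent again

print(flat_hits, nested_hits, expanded_hits)
```

In this sketch the direct lookup on the second representation misses Mam00501, because Mouse:Fubar and Rat:Fabar are only reachable through Rodent:Rod00001; the recursive expansion step is the extra work that representation requires.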

The more I think about the second, the less I like it.

-- Main.gregd - 25 Jun 2007