OrthoDB v11 data dump consists of:

odb11v0_levels.tab.gz:         NCBI taxonomy nodes where OrthoDB orthologous groups (OGs) are calculated
odb11v0_species.tab.gz:        OrthoDB individual organism (aka species) ids based on NCBI taxonomy ids (mostly species level)
odb11v0_level2species.tab.gz:  correspondence between level ids and species ids
odb11v0_genes.tab.gz:          OrthoDB genes with some info
odb11v0_gene_xrefs.tab.gz:     UniProt, ENSEMBL, NCBI, GO and InterPro ids associated with OrthoDB gene
odb11v0_OGs.tab.gz:            OrthoDB orthologous groups
odb11v0_OG2genes.tab.gz:       OGs to genes correspondence
odb11v0_OG_xrefs.tab.gz:       OG associations with GO, COG and InterPro ids
odb11v0_OG_pairs.tab.gz:       OG parent-child (ancestral) associations
odb11v0_all_fasta.tab.gz:      AA sequence of the longest isoform for all genes, fasta formatted
odb11v0_all_og_fasta.tab.gz:   AA sequence of the longest isoform for all genes participating in OG, fasta formatted

The non-fasta files are in tab-separated format without column headers.
The fasta files have headers with the OrthoDB internal gene id as well as a public id.

-----------------------------------------------------------------

odb11v0_levels.tab:
1.  level NCBI tax id
2.  scientific name
3.  total non-redundant count of genes in all clustered species underneath
4.  total count of OGs built on it
5.  total non-redundant count of species underneath

odb11v0_species.tab:
1.  NCBI tax id
2.  OrthoDB individual organism id, based on NCBI tax id
3.  scientific name inherited from the most relevant NCBI tax id
4.  genome assembly id, when available
5.  total count of clustered genes in this species
6.  total count of the OGs it participates in
7.  mapping type, clustered(C) or mapped(M)

odb11v0_level2species.tab:
1.  top-most level NCBI tax id, one of {2, 2157, 2759, 10239}
2.  OrthoDB organism id
3.  number of hops between the top-most level id and the NCBI tax id associated with the organism
4.  ordered list of OrthoDB selected intermediate levels from the top-most level to the bottom one

odb11v0_genes.tab:
1.  OrthoDB unique gene id (not stable between releases)
2.  OrthoDB individual organism id
3.  protein original sequence id, as downloaded along with the sequence
4.  semicolon separated list of synonyms, evaluated by mapping
5.  UniProt id, evaluated by mapping
6.  semicolon separated list of ids from Ensembl, evaluated by mapping
7.  NCBI gid or gene name, evaluated by mapping
8.  description, evaluated by mapping

odb11v0_gene_xrefs.tab:
1.  OrthoDB gene id
2.  external gene identifier, either mapped or the original sequence id from Genes table
3.  external DB name, one of {GOterm, InterPro, NCBIproteinGI, UniProt, ENSEMBL, NCBIgid, NCBIgenename}

odb11v0_OGs.tab:
1.  OG unique id (not stable between releases)
2.  level tax_id on which the group was built
3.  OG name (the most common gene name within the group)

odb11v0_OG2genes.tab:
1.  OG unique id
2.  OrthoDB gene id

odb11v0_OG_xrefs.tab:
1.  OG unique id
2.  external DB or DB section
3.  external identifier
4.  number of genes in the OG associated with the identifier

odb11v0_OG_pairs.tab:
1.  OG unique id
2.  Parent OG unique id
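
Because the non-fasta files carry no header row, a reader has to supply column names itself. The following is a minimal Python sketch of one way to do that, using only the standard library. The column labels in COLUMNS and the helper names read_table, open_dump and genes_by_og are illustrative choices made here, not part of OrthoDB, and the identifiers in the sample data are invented placeholders rather than real OrthoDB ids.

```python
import csv
import gzip
import io
from collections import defaultdict

# Column labels are assumptions chosen for this sketch; the dump itself
# ships without headers. Order follows the column lists in this README.
COLUMNS = {
    "levels": ["level_tax_id", "scientific_name", "gene_count",
               "og_count", "species_count"],
    "species": ["tax_id", "organism_id", "scientific_name", "assembly_id",
                "clustered_gene_count", "og_count", "mapping_type"],
    "level2species": ["top_level_tax_id", "organism_id", "hops",
                      "intermediate_levels"],
    "genes": ["gene_id", "organism_id", "original_seq_id", "synonyms",
              "uniprot_id", "ensembl_ids", "ncbi_gid_or_name", "description"],
    "gene_xrefs": ["gene_id", "external_id", "external_db"],
    "OGs": ["og_id", "level_tax_id", "og_name"],
    "OG2genes": ["og_id", "gene_id"],
    "OG_xrefs": ["og_id", "external_db", "external_id", "gene_count"],
    "OG_pairs": ["og_id", "parent_og_id"],
}


def open_dump(path):
    """Open a gzip-compressed dump file as a text stream."""
    return gzip.open(path, "rt", encoding="utf-8", newline="")


def read_table(handle, table):
    """Yield one dict per row of a headerless tab-separated table."""
    names = COLUMNS[table]
    # QUOTE_NONE keeps any quote characters in descriptions as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield dict(zip(names, row))


def genes_by_og(og2genes_rows):
    """Group gene ids by OG id from OG2genes rows."""
    groups = defaultdict(list)
    for row in og2genes_rows:
        groups[row["og_id"]].append(row["gene_id"])
    return dict(groups)


# Small in-memory stand-in for odb11v0_OG2genes.tab (placeholder ids).
sample = io.StringIO("OG_A\tgene_1\nOG_A\tgene_2\nOG_B\tgene_3\n")
groups = genes_by_og(read_table(sample, "OG2genes"))
print(groups)
```

For a real file, the same helpers would be combined as read_table(open_dump("odb11v0_OG2genes.tab.gz"), "OG2genes"). Rows are yielded one at a time, so a large table is not held in memory unless the caller collects it.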