4. Reaction Properties

The division of the ME-model into MEReaction and ProcessData classes allows the user to essentially have access to the entire database of information used to construct the model. The following will show how this can be leveraged to easily query, edit and update aspects of the model reactions.

In [1]:
import pickle
from collections import defaultdict
from os.path import abspath, dirname, join

import pandas as pd
import cobra.test

import cobrame
import ecolime
/home/sbrg-cjlloyd/cobrapy/cobra/io/sbml3.py:24: UserWarning: Install lxml for faster SBML I/O
  warn("Install lxml for faster SBML I/O")
/home/sbrg-cjlloyd/cobrapy/cobra/io/__init__.py:12: UserWarning: cobra.io.sbml requires libsbml
  warn("cobra.io.sbml requires libsbml")
In [2]:
# Load E. coli ME-model
ecoli_dir = dirname(abspath(ecolime.__file__))
model_dir = join(ecoli_dir, 'me_models/iJL1678b.pickle')
with open(model_dir, 'rb') as f:
    me = pickle.load(f)

# Load E. coli M-model
iJO1366 = cobra.test.create_test_model('ecoli')

4.1. Metabolic Reactions

These are the ME-model representations of all reactions in the metabolic reconstruction, in this case iJO1366

In [3]:
print('number of reactions in iJO1366 (excluding exchange) = %i' %
      len([i.id for i in iJO1366.reactions if not i.id.startswith('EX_')]))
print('number of stoichiometric data objects = %i\n' %
      len([r.id for r in me.stoichiometric_data]))
print('number of metabolic reactions = %i' %
      len([r.id for r in me.reactions if type(r) == cobrame.MetabolicReaction]))
print('number of complex data objects = %i' %
      len([r.id for r in me.complex_data]))
number of reactions in iJO1366 (excluding exchange) = 2259
number of stoichiometric data objects = 2282

number of metabolic reactions = 5266
number of complex data objects = 1445

Through a MetabolicReaction, the user has direct access to the StoichiometricData, ComplexData and keff used to construct the reaction.

In [4]:
rxn = me.reactions.get_by_id('E4PD_FWD_GAPDH-A-CPLX')

# Access the StoichiometricData and ComplexData directly through reaction
stoich_data = rxn.stoichiometric_data
complex_data = rxn.complex_data

print(rxn.reaction + '\n')
print('This reactions is formed using the StoichiometricData (%s) and ComplexData (%s) with keff = %.2f' %
      (stoich_data.id, complex_data.id, rxn.keff))
3.20942472231275e-6*mu GAPDH-A-CPLX + e4p_c + h2o_c + nad_c --> 4per_c + 2.0 h_c + nadh_c

This reactions is formed using the StoichiometricData (E4PD) and ComplexData (GAPDH-A-CPLX) with keff = 86.55

As a best practice, the ComplexData and StoichiometricData themselves should not be changed. If these need changed then a new MetabolicReaction should be created.

4.1.1. Edit keffs

Further, the keff is an attribute of the MetabolicReaction itself and not of the ComplexData changing the ComplexData will not affect the form of the MetabolicReaction.

In [5]:
print('keff = %d: \n\t%s' % (rxn.keff, rxn.reaction))
rxn.keff = 65.
rxn.update()
print('keff = %d: \n\t%s' % (rxn.keff, rxn.reaction))
keff = 86:
        3.20942472231275e-6*mu GAPDH-A-CPLX + e4p_c + h2o_c + nad_c --> 4per_c + 2.0 h_c + nadh_c
keff = 65:
        4.27350427350427e-6*mu GAPDH-A-CPLX + e4p_c + h2o_c + nad_c --> 4per_c + 2.0 h_c + nadh_c

4.1.2. Edit stoichiometry

Aspects of the StoichiometricData, however, can be changed. This includes: - Reaction stoichiometry - Reaction upper & lower bounds

Currently the stoichiometry is:

In [6]:
stoich_data.stoichiometry
Out[6]:
{'4per_c': 1.0,
 'e4p_c': -1.0,
 'h2o_c': -1.0,
 'h_c': 2.0,
 'nad_c': -1.0,
 'nadh_c': 1.0}

This can be updated to, for instance, translocate a hydrogen by performing the following

In [7]:
stoich_data.stoichiometry['h_c'] = -1
stoich_data.stoichiometry['h_p'] = 1
rxn.update()
print('%s: \n\t%s' % (rxn.id, rxn.reaction))
E4PD_FWD_GAPDH-A-CPLX:
        4.27350427350427e-6*mu GAPDH-A-CPLX + e4p_c + h2o_c + h_c + nad_c --> 4per_c + h_p + nadh_c

This change can be usd to update both the forward and reverse reaction

In [8]:
rxn_rev = me.reactions.get_by_id(rxn.id.replace('FWD', 'REV'))
rxn_rev.update()
print('%s: \n\t%s' % (rxn_rev.id, rxn_rev.reaction))
E4PD_REV_GAPDH-A-CPLX:
        4per_c + 3.20942472231275e-6*mu GAPDH-A-CPLX + h_p + nadh_c --> e4p_c + h2o_c + h_c + nad_c

A simpler approach is to updated the parent reactions for StoichiometricData. This will update any instances of the reaction catalyzed by an isozyme.

In [9]:
stoich_data.stoichiometry['h_c'] = -2
stoich_data.stoichiometry['h_p'] = 2
for r in stoich_data.parent_reactions:
    r.update()
    print('%s: \n\t%s' % (r.id, r.reaction))
E4PD_FWD_ERYTH4PDEHYDROG-CPLX:
        0.0135143204065698*mu ERYTH4PDEHYDROG-CPLX + e4p_c + h2o_c + 2.0 h_c + nad_c --> 4per_c + 2.0 h_p + nadh_c
E4PD_FWD_GAPDH-A-CPLX:
        4.27350427350427e-6*mu GAPDH-A-CPLX + e4p_c + h2o_c + 2.0 h_c + nad_c --> 4per_c + 2.0 h_p + nadh_c
E4PD_REV_ERYTH4PDEHYDROG-CPLX:
        4per_c + 3.09490345954754e-6*mu ERYTH4PDEHYDROG-CPLX + 2.0 h_p + nadh_c --> e4p_c + h2o_c + 2.0 h_c + nad_c
E4PD_REV_GAPDH-A-CPLX:
        4per_c + 3.20942472231275e-6*mu GAPDH-A-CPLX + 2.0 h_p + nadh_c --> e4p_c + h2o_c + 2.0 h_c + nad_c

4.1.3. Edit upper and lower reaction bounds

The upper and lower bounds can be edited through the stoichiometric data and updated to the metabolic reaction

Important: do not change the upper and lower bounds of a MetabolicReaction directly. If this is done than the change will be overwritten when the update function is ran (shown below)

In [10]:
rxn.lower_bound = -1000
print('Lower bound = %d' %rxn.lower_bound)
rxn.update()
print('Lower bound = %d' %rxn.lower_bound)
Lower bound = -1000
Lower bound = 0

Editing the reaction bounds of the StoichiometricData, however, will edit the bounds of the forward and reverse reaction, as well as any instances of the reaction catalyzed by isozymes

In [11]:
stoich_data.lower_bound = 0.
print('Upper Bounds\n--------------------------------------')
for r in stoich_data.parent_reactions:
    direction = 'Forward' if r.reverse is False else 'Reverse'
    print('%s Before Update \n\t%s: %s' % (direction, r.id, r.upper_bound))
    r.update()
    print('%s After Update \n\t%s: %s' % (direction, r.id, r.upper_bound))
Upper Bounds
--------------------------------------
Forward Before Update
        E4PD_FWD_ERYTH4PDEHYDROG-CPLX: 1000.0
Forward After Update
        E4PD_FWD_ERYTH4PDEHYDROG-CPLX: 1000.0
Forward Before Update
        E4PD_FWD_GAPDH-A-CPLX: 1000.0
Forward After Update
        E4PD_FWD_GAPDH-A-CPLX: 1000.0
Reverse Before Update
        E4PD_REV_ERYTH4PDEHYDROG-CPLX: 1000.0
Reverse After Update
        E4PD_REV_ERYTH4PDEHYDROG-CPLX: 0
Reverse Before Update
        E4PD_REV_GAPDH-A-CPLX: 1000.0
Reverse After Update
        E4PD_REV_GAPDH-A-CPLX: 0

4.2. Transcription Reactions

In [12]:
print('number of transcription reactions = %i' %
      len([r.id for r in me.reactions if type(r) == cobrame.TranscriptionReaction]))
print('number of transcription data objects = %i' %  len(list(me.transcription_data)))
print('number of transcribed genes (RNA) = %i' %
      len([m.id for m in me.metabolites if type(m) == cobrame.TranscribedGene]))
number of transcription reactions = 1447
number of transcription data objects = 1447
number of transcribed genes (RNA) = 1679

4.2.1. TranscribedGene (RNA) metabolite properties

Transciption occurs via operons contained within the organisms genome or transcription unit (TU). This means that often, a transcribed region will code for multiple RNAs. The E. coli ME-model has 4 possible RNA types that can be transcribed: - mRNA - tRNA - rRNA - ncRNA (noncoding RNA)

mRNAs can then translated directly from the full transcribed TU, while rRNA, tRNA and ncRNA are spliced out of the TU by endonucleases. In these cases, in order to know which bases need excized, the RNA metabolites (TranscribedGene) themselves have to store information such as: - DNA strand, left and right genome position to identify which TU the RNA is a part of - RNA type to determine whether it needs excised from the TU - nucleotide sequence to determine bases that do/do not need excised if not mRNA and the RNA mass for biomass constraint

An example of a TranscribedGene’s attributes is shown below

In [13]:
pd.DataFrame({i: str(v) for i, v in me.metabolites.RNA_b3201.__dict__.items() if not i.startswith('_') and v},
             index=['Atribute Values']).T
Out[13]:
Atribute Values
RNA_type mRNA
formula C6890H7816N2720O5091P726
id RNA_b3201
left_pos 3341965
nucleotide_sequence ATGGCAACATTAACTGCAAAGAACCTTGCAAAAGCCTATAAAGGCC...
right_pos 3342691
strand +

4.2.2. TranscriptionReaction/TranscriptionData properties

Each TranscriptionReaction in a COBRAme ME-model is associated with exactly one TranscriptionData which includes everything necessary to define a reaction. This includes: - subreactions To handle enzymatic processes not performed by RNA polymerase - RNA Polymerase Different RNA polymerase metabolite for different sigma factors - RNA Products TUs often contain more than one RNA in sequence - Nucleotide sequence

The TranscriptionData for TU containing the gene above is shown below

In [14]:
rxn = me.reactions.transcription_TU_8398_from_RPOE_MONOMER
data = rxn.transcription_data
pd.DataFrame({i: str(v) for i, v in data.__dict__.items()}, index=['Atribute Values']).T
Out[14]:
Atribute Values
RNA_polymerase RNAPE-CPLX
RNA_products {'RNA_b3201', 'RNA_b3202'}
_model iJL1678b-ME
_parent_reactions {'transcription_TU_8398_from_RPOE_MONOMER'}
id TU_8398_from_RPOE_MONOMER
nucleotide_sequence ACAAACTCAGCCTTAATCTTGTGCTTGCCAGCTCACTTCTGGCCGC...
subreactions defaultdict(<class 'int'>, {'Transcription_nor...

This reaction currently uses a subreaction called Transcription_normal_rho_dependent to account for the elongation factors etc. associated with transcription. This TU also requires a rho factor to terminate transcription. These complexes can be removed from the reaction by running the following

In [15]:
print('with subreactions: \n' + rxn.reaction)
print('--------------------')
data.subreactions = {}
for r in data.parent_reactions:
    r.update()
print('\nwithout subreactions: \n' + rxn.reaction)
with subreactions:
4.27350427350427e-6*mu GreA_mono + 4.27350427350427e-6*mu GreB_mono + 4.27350427350427e-6*mu Mfd_mono_mod_1:mg2 + 4.27350427350427e-6*mu NusA_mono + 4.27350427350427e-6*mu NusG_mono + 0.0218585689888099*mu + 0.00855762975911906 RNAPE-CPLX + 4.27350427350427e-6*mu Rho_hexa_mod_3:mg2 + 4.27350427350427e-6*mu RpoZ_mono_mod_1:mg2 + 1020.0 atp_c + 1181 ctp_c + 1190 gtp_c + 3.0 h2o_c + 1186 utp_c --> RNA_b3201 + RNA_b3202 + 3.0 adp_c + 3.0 h_c + 691.702633 mRNA_biomass + 3.0 pi_c + 4574 ppi_c
--------------------

without subreactions:
0.0218585689888099*mu + 0.00855762975911906 RNAPE-CPLX + 1017 atp_c + 1181 ctp_c + 1190 gtp_c + 1186 utp_c --> RNA_b3201 + RNA_b3202 + 691.702633 mRNA_biomass + 4574 ppi_c

This poses a problem where, if RNA_b3201 and RNA_b3202 are not required in equal amounts, the model will become infeasible. To accound for this, all RNAs have a demand reaction associated with them. mRNA_biomass is consumed for each demand reaction with a coefficient equal to the molecular weight of each RNA (in kDa). This prevents the model from overproducing RNA to increase biomass production, and therefore growth rate, in some instances. More on the implications of the biomass constraint can be found in ME-Model Fundamentals

In [16]:
for rna in data.RNA_products:
    r = me.reactions.get_by_id('DM_' + rna)
    print('%s: %s' % (r.id, r.reaction))
DM_RNA_b3201: RNA_b3201 + 232.671391 mRNA_biomass -->
DM_RNA_b3202: RNA_b3202 + 459.03124199999996 mRNA_biomass -->

As is, this reaction produces two mRNAs so no nucleotides are excised. If one or both is changed to a stable RNA (rRNA, tRNA or ncRNA) bases will be excised.

In [17]:
me.metabolites.RNA_b3202.RNA_type = 'rRNA'
for r in data.parent_reactions:
    r.update()
    print(r.reaction)
0.0218585689888099*mu + 0.00855762975911906 RNAPE-CPLX + 1017 atp_c + 1181 ctp_c + 1190 gtp_c + 2414 h2o_c + 1186 utp_c --> RNA_b3201 + RNA_b3202 + 557 amp_c + 605 cmp_c + 613 gmp_c + 2414 h_c + 232.671391 mRNA_biomass + 4574 ppi_c + 459.03124199999996 rRNA_biomass + 639 ump_c

Changing RNA_b3202 to an rRNA and updating the transcription reaction causes both of the RNAs to now be excised from the TU, as indicated by the nucleotide monophosphates that appear in the products. This is not a complete picture because this process is catalyzes by an endonuclease, whose activity can be incorporated as ModificationData. Updating the reaction after adding these processes incorporates

In [18]:
data.subreactions['rRNA_containing_excision'] = len(data.RNA_products) * 2
data.subreactions['RNA_degradation_machine'] = len(data.RNA_products) * 2
data.subreactions['RNA_degradation_atp_requirement'] = sum(data.excised_bases.values())
for r in data.parent_reactions:
    r.update()
    print(r.reaction)
0.0218585689888099*mu + 0.00855762975911906 RNAPE-CPLX + 1.70940170940171e-5*mu RNA_degradosome + 1620.5 atp_c + 1181 ctp_c + 1190 gtp_c + 3017.5 h2o_c + 1.70940170940171e-5*mu rRNA_containing_excision_machinery + 1186 utp_c --> RNA_b3201 + RNA_b3202 + 603.5 adp_c + 557 amp_c + 605 cmp_c + 613 gmp_c + 3017.5 h_c + 232.671391 mRNA_biomass + 603.5 pi_c + 4574 ppi_c + 459.03124199999996 rRNA_biomass + 639 ump_c

4.3. Translation Reactions

In [19]:
print('number of translation reactions = %i' %
      len([r.id for r in me.reactions if type(r) == cobrame.TranslationReaction]))
print('number of translation data objects = %i' %  len(list(me.translation_data)))
print('number of translated genes (proteins) = %i' %
      len([m.id for m in me.metabolites if type(m) == cobrame.TranslatedGene]))
number of translation reactions = 1569
number of translation data objects = 1569
number of translated genes (proteins) = 1569

4.3.1. TranslatedGene (Protein) metabolite properties

For COBRAme ME-models, proteins are translated directly from mRNA metabolites not from TUs. This means that all information required to construct a TranslationReaction can be found in its TranslationData therefore no extra information is contained in a TranslatedGene object.

4.3.2. TranslationReaction/TranslationData properties

Each TranslationReaction in a COBRAme ME-model is associated with exactly one TranslationData which includes everything necessary to define the reaction. This includes: - subreactions To handle enzymatic processes not performed by ribosome and incorporate tRNAs - mRNA ID of mRAN being translated - term_enzyme Enzyme that catalyzes translation termination - Nucleotide sequence

The TranslationData for a TranslationReaction is shown below

In [20]:
rxn = me.reactions.translation_b2020
data = rxn.translation_data
pd.DataFrame({i: str(v) for i, v in data.__dict__.items()}, index=['Atribute Values']).T
Out[20]:
Atribute Values
_model iJL1678b-ME
_parent_reactions {'translation_b2020'}
id b2020
mRNA RNA_b2020
nucleotide_sequence ATGAGCTTTAACACAATCATTGACTGGAATAGCTGTACTGCGGAGC...
protein protein_b2020
subreactions defaultdict(<class 'int'>, {'met_addition_at_A...

The rest of the information required to define a translation reaction can be dynamically computed from these attributes.

For example, the amino acid sequence is calculated from the nucleotide sequence with:

In [21]:
str(data.amino_acid_count)
Out[21]:
"defaultdict(<class 'int'>, {'met__L_c': 7, 'ser__L_c': 34, 'phe__L_c': 12, 'asn__L_c': 13, 'thr__L_c': 30, 'ile__L_c': 22, 'asp__L_c': 22, 'trp__L_c': 2, 'cys__L_c': 7, 'ala__L_c': 63, 'glu__L_c': 28, 'gln__L_c': 23, 'arg__L_c': 21, 'leu__L_c': 39, 'pro__L_c': 22, 'val__L_c': 36, 'lys__L_c': 17, 'gly_c': 24, 'tyr__L_c': 7, 'his__L_c': 5})"

The codon count from the sequence can be used to determine the subreaction required for charged tRNA-mediated amino acid addition.

In [22]:
str(data.subreactions_from_sequence)
Out[22]:
"{'met_addition_at_AUG': 6, 'ser_addition_at_AGC': 12, 'phe_addition_at_UUU': 6, 'asn_addition_at_AAC': 10, 'thr_addition_at_ACA': 2, 'ile_addition_at_AUC': 9, 'ile_addition_at_AUU': 12, 'asp_addition_at_GAC': 8, 'trp_addition_at_UGG': 2, 'asn_addition_at_AAU': 3, 'cys_addition_at_UGU': 3, 'thr_addition_at_ACU': 7, 'ala_addition_at_GCG': 19, 'glu_addition_at_GAG': 12, 'gln_addition_at_CAA': 5, 'arg_addition_at_CGC': 12, 'gln_addition_at_CAG': 18, 'leu_addition_at_CUG': 23, 'leu_addition_at_UUA': 4, 'pro_addition_at_CCG': 17, 'ser_addition_at_UCC': 6, 'ala_addition_at_GCC': 25, 'ser_addition_at_UCU': 5, 'glu_addition_at_GAA': 16, 'thr_addition_at_ACC': 14, 'val_addition_at_GUU': 7, 'asp_addition_at_GAU': 14, 'leu_addition_at_CUC': 4, 'val_addition_at_GUG': 17, 'lys_addition_at_AAA': 12, 'ala_addition_at_GCA': 11, 'gly_addition_at_GGC': 12, 'arg_addition_at_CGG': 1, 'tyr_addition_at_UAC': 5, 'lys_addition_at_AAG': 5, 'thr_addition_at_ACG': 7, 'leu_addition_at_CUA': 2, 'val_addition_at_GUA': 7, 'phe_addition_at_UUC': 6, 'his_addition_at_CAC': 5, 'pro_addition_at_CCA': 2, 'arg_addition_at_CGU': 8, 'cys_addition_at_UGC': 4, 'val_addition_at_GUC': 5, 'ala_addition_at_GCU': 8, 'ser_addition_at_UCA': 6, 'gly_addition_at_GGG': 4, 'leu_addition_at_UUG': 3, 'tyr_addition_at_UAU': 2, 'pro_addition_at_CCU': 1, 'ser_addition_at_AGU': 1, 'leu_addition_at_CUU': 3, 'gly_addition_at_GGU': 8, 'pro_addition_at_CCC': 2, 'ser_addition_at_UCG': 4, 'ile_addition_at_AUA': 1}"

Changing the nucleotide sequence will automically update these values.

In [23]:
data.nucleotide_sequence = 'ATGAGCTTTAAC'
print('Elongation Subreactions')
print(str(data.subreactions_from_sequence))
print('\nOne of each start subreaction')
print(str(data.add_initiation_subreactions()))
print('\nNo termination subreaction (AAC) not valid stop codon')
print(str(data.add_termination_subreactions()))
Elongation Subreactions
{'ser_addition_at_AGC': 1, 'phe_addition_at_UUU': 1}

One of each start subreaction
None

No termination subreaction (AAC) not valid stop codon
None
/home/sbrg-cjlloyd/cobrame/cobrame/core/processdata.py:888 UserWarning: RNA_b2020 starts with 'AUG' which is not a start codon
/home/sbrg-cjlloyd/cobrame/cobrame/core/processdata.py:925 UserWarning: No termination enzyme for RNA_b2020

Stop codons are defined for the organism beforehand. Changing the sequence to a valid stop codon corrects this

In [24]:
data.nucleotide_sequence = 'ATGAGCTTTTAA'
print('Termination subreaction')
print(str(data.add_termination_subreactions()))
Termination subreaction
None
/home/sbrg-cjlloyd/cobrame/cobrame/core/processdata.py:925 UserWarning: No termination enzyme for RNA_b2020

4.4. ComplexFormation Reactions

In [25]:
print('number of complex formation reactions = %i' %
      len([r.id for r in me.reactions if type(r) == cobrame.ComplexFormation]))
print('number of complex data objects = %i' %  len(list(me.complex_data)))
print('')

print('number of complexes = %i' %
      len([m.id for m in me.metabolites if type(m) == cobrame.Complex]))
number of complex formation reactions = 1445
number of complex data objects = 1445

number of complexes = 1538

some complexes are modified in a metabolic process (e.g. Acyl Carrier Protein sidechain reactions) ### Complex Metabolite Properties Like TranslatedGenes, Complex metabolites do not need to store any additional information. ### ComplexFormation / ComplexData Properties Each ComplexFormation reaction in a COBRAme ME-model is associated with exactly one ComplexData which includes everything necessary to define the reaction. This includes: - modification To define the modifications by prosthetic groups or cofactors that can be required for the complex to catalyze cellular processes - stoichiometry Stoichiometry of the protein subunits

The ComplexData for a ComplexFormation reaction is shown below

In [26]:
rxn = me.reactions.get_by_id('formation_2OXOGLUTARATEDEH-CPLX_mod_mg2_mod_lipo')
data = me.process_data.get_by_id(rxn.complex_data_id)
pd.DataFrame({i: str(v) for i, v in data.__dict__.items()}, index=['Atribute Values']).T
Out[26]:
Atribute Values
_complex_id None
_model iJL1678b-ME
_parent_reactions {'AKGDH_FWD_2OXOGLUTARATEDEH-CPLX_mod_mg2_mod_...
id 2OXOGLUTARATEDEH-CPLX_mod_mg2_mod_lipo
stoichiometry defaultdict(<class 'float'>, {'protein_b0726':...
subreactions {'mod_mg2_c': 1.0, 'mod_lipo_c': 1.0}

This complex has two modification mod_lipo_c and mod_mg_c. You can view the properties of these modifications by accessing their ModificationData objects.

In [27]:
for mod in data.subreactions:
    mod_data = me.process_data.get_by_id(mod)
    print(mod_data.id)
    for key, value in mod_data.__dict__.items():
        if not key.startswith('_') and value:
            print('\t', key, value)
mod_mg2_c
         id mod_mg2_c
         stoichiometry {'mg2_c': -1}
         keff 65.0
mod_lipo_c
         id mod_lipo_c
         stoichiometry {'lipoamp_c': -1, 'amp_c': 1, 'h_c': 2}
         enzyme EG11796-MONOMER
         keff 65.0

The information in the ModificationData and ComplexData are assembled in the ComplexFormation reaction shown below

In [28]:
print(rxn.reaction)
4.27350427350427e-6*mu EG11796-MONOMER + lipoamp_c + mg2_c + 2.0 protein_b0116 + 12.0 protein_b0726 + 24.0 protein_b0727 --> 2OXOGLUTARATEDEH-CPLX_mod_mg2_mod_lipo + amp_c + 2.0 h_c + 0.21160733999999998 prosthetic_group_biomass