menu multiTFA 0.1.2 documentation

multiTFA Documentation

1. Getting Started

multiTFA is a python package for accurate thermodynamic flux analysis. multiTFA is built on COBRApy to take advantage of COBRA model structure. Installation instructions can be found in README.rst.

1.1. Building a multiTFA model

To create a multitfa model, we require

  1. A COBRA model

  2. Metabolite database identifiers, such as BIGG, KEGG, SEED etc. - This is required to match and retrieve thermodyanamic information against our database. Metabolite identifiers can be added to the multiTFA model later. For simplicity, we use the same nomneclature as equilibrator-api.

  3. Compartment specific pH and ionic strength information - To calculate Gibbs energy related to transmembrane transport, information about membrane potential is also required in the form of pandas Dataframes.

Optionally, intracellular metabolite concentrations can be provided to calculate the transformed Gibbs energy of reactions. If metabolite concentrations are not available, a broad metabolite concentration range is assumed.

NOTE: All the input information should be in S.I units, e.g: ionic strength, metabolite concentration in Moles.

To demonstrate the usage, we use E. Coli core model from BIGG database.

First step in creating a multitfa model is to get a COBRA model. For this example, we load E. coli core model. This is available in Examples directory.

[1]:
from cobra.io import load_model

# Load the Cobra model
cobra_model = load_model('e_coli_core')

Let us define the compartment properties such as pH, ionic strength (M) etc. For transport calculations, we will provide membrane potential in mV. For convenience, we use pandas DataFrame to define these properties.

[2]:
# compartment specific pH and Ionic strength information. Ionic strength units should be in M.
pH_I_dict = {
    "pH": {"c": 7.5, "e": 7, "p": 7},
    "I": {"c": 0.25, "e": 0, "p": 0},
}

# Membrane potentials between different compartments. Units for membrane potential is mV.
del_psi_dict = {
"c": {"c":0,"e": 0, "p": 150},
"e": {"c": 0,"e":0, "p": 0},
"p": {"c": -150, "e": 0, "p":0},
}

import pandas as pd
# Define membrane potential and compartment information in Dataframes
del_psi = pd.DataFrame.from_dict(data=del_psi_dict)
comp_info = pd.DataFrame.from_dict(data=pH_I_dict)

Now, it is time to list any reactions that needs to be excluded from thermodynamic analysis. For example, Biomass, demand reactions etc.

[3]:
# List of reaction ids that you would like to exclude from thermodynamic analysis, here we exclude demand reactions
Excl = [
    rxn.id
    for rxn in cobra_model.reactions
    if rxn.id.startswith("EX_") or rxn.id.startswith("DM_")
] + ["BIOMASS_Ecoli_core_w_GAM", "O2t", "H2Ot"]

Now, we have all the information we need. We can now create an instance of multitfa model.

Please note, when you use multitfa for the very first time, it downloads datafiles of about 3 GB. This is a one time thing only.

[4]:
# Now build the multiTFA model from COBRA model
from multitfa.core import tmodel
tfa_model = tmodel(
    cobra_model, Exclude_list=Excl, compartment_info=comp_info, membrane_potential=del_psi
)
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded

In order to access the thermodynamic information from equilibrator api, we need to add a external database identifier against all the metabolites in tfa model. This can be done by populating the ‘Kegg_id’ property of the metabolites. If you do not know the identifier for some metabolites, it is okay, you can simply add NA against it. For simplicity, we use the notation similar to equilibrator api. For the example below in the E. coli core model, we use bigg identifiers.

You can follow equilibrator api notation for other database identifiers, for example, if you are using KEGG identifier, simply use kegg:kegg identifier.

[5]:
for met in tfa_model.metabolites:
    kegg_id = 'bigg.metabolite:'+met.id[:-2]
    met.Kegg_id = kegg_id

tfa_model.update()

Now that we have setup the model and populated the thermodynamic constraints, its better to check if constraints are added properly.

[6]:
print("Number of constraints before update {}\n".format(len(cobra_model.constraints)))
print("Number of reactions in the model {}\n".format(len(tfa_model.reactions)))
print("Number of reactions excluded from thermodynamic analysis {}\n".format(len(tfa_model.Exclude_reactions)))
print("Number of mass balance constraints {}\n".format(len(tfa_model.metabolites)))
num_cons = 2*3*(len(tfa_model.reactions) - len(tfa_model.Exclude_reactions)) + len(tfa_model.metabolites)
print("Number of constraints to be present after update {}\n".format(num_cons))
print("Number of constraints present after update {}\n".format(len(tfa_model.constraints)))

Number of constraints before update 72

Number of reactions in the model 95

Number of reactions excluded from thermodynamic analysis 23

Number of mass balance constraints 72

Number of constraints to be present after update 504

Number of constraints present after update 504

1.2 Metabolites

Reactions and metabolites in multiTFA model can be accessed similar to a COBRA model. For demonstration, we will consider cytosolic ATP as our metabolite with atp_c identifier in t_model. We will access ATP and print it’s standard Gibbs energy of formation and Also, change its intracellular concentration lower and upper bounds to ‘2e-3’ and ‘5e-2’ M respectively from it’s default bounds. These changes will reflect in solver variables and constraints.

[7]:
atp = tfa_model.metabolites.get_by_id('atp_c')
print("Standard Gibbs energy of formation of ATP is {} KJ/mol".format(atp.delG_f))
print("initial concentrations")
print('{}\t{}\t{}'.format(atp.id, atp.concentration_min, atp.concentration_max))
atp.concentration_min = 2e-3
atp.concentration_max = 5e-2
print("Concentrations after modifying")
print('{}\t{}\t{}'.format(atp.id, atp.concentration_min, atp.concentration_max))
Standard Gibbs energy of formation of ATP is -2259.1882733696866 KJ/mol
initial concentrations
atp_c   1e-05   0.02

Selected objective sense:  MINIMIZE
Selected objective  name:  f3d52b45-5dba-11eb-bac2-c1c11d7a5b17
Selected RHS        name:  rhs
Selected bound      name:  bnd
Concentrations after modifying
atp_c   0.002   0.05

You can always change the Gibbs energy of formation value of a metabolite by resetting the delG_f property of the metabolite. Also, you can access additional information like, major pseudoisomer at given compartment conditions (major_ms), equilibrator accession of the metabolite.

1.3 Reactions

Reactions in multiTFA model are of type cobra.DictList. Similar to COBRA model, reactions can be accessed/retrieved by its index or by it’s name and properties/attributes can be accessed. Let us consider ATP synthase raction in our model by ‘ATPS4rpp’ and access it’s attributes.

[8]:
atps = tfa_model.reactions.get_by_id('ATPS4r')
print(atps.reaction)
adp_c + 4.0 h_e + pi_c <=> atp_c + h2o_c + 3.0 h_c

Access and change bounds of atps

[9]:
print("lower bound is {} and upper bound is {}\n".format(atps.lower_bound, atps.upper_bound))
atps.lower_bound = 0
print("updated lower and upper bounds are {} \t {}\n".format(atps.lower_bound, atps.upper_bound))

lower bound is -1000.0 and upper bound is 1000.0

updated lower and upper bounds are 0     1000.0

We can access the Gibbs energy of transport and transformed Gibbs energy of the reaction

[10]:
print("Gibbs energy of transport is {} KJ/mol".format(atps.delG_transport))
print("Transform Gibbs energy of is {} KJ/mol".format(atps.delG_prime))
Gibbs energy of transport is -4.605170185988092 KJ/mol
Transform Gibbs energy of is 32.39600409402766 KJ/mol

You can always reset the transport and transformed Gibbs energy of reactions. You can also access the stoichiometric matrix of the reaction, all the thermodynamic constraints associated with the reaction. Please refer to api for all the attributes of the model.

2. Solvers

multitfa model offers two different ways to solve the tMFA problem (Please refer to manuscript),

  1. Univariate type, where each group/component is allowed to vary in 95% confidence interval from respective mean. This is formulated as a mixed integer linear programming problem (MILP)

  2. Multivariate, where group/component are drawn from multivariate normal distribution subjected to the associated linear constraints. This is a mixed integer quadratic constraint program (MIQCP). In general, MIQCP are computationally expensive.

multitfa supports multiple solvers using optlang. We use this to solve the MILP problems mentioned above. Unfortunately, optlang doesnt support quadratic constraints yet, so we use Cplex and Gurobi to solve the MIQC type problems.

Lets see how to use different solvers with E. coli core model example

[1]:
import sys
sys.path.insert(0, "../../examples")
from paper_data_example import build_core_model
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded
[2]:
tfa_model = build_core_model()

We can check which solver tfa_model using now

[3]:
print(tfa_model.solver.interface)
<module 'optlang.cplex_interface' from '/home/moritz/.pyenv/versions/3.6.12/envs/tmfa/lib/python3.6/site-packages/optlang/cplex_interface.py'>

Now lets change to cplex. Changing the solver is as simple as

[4]:
tfa_model.solver = 'cplex'
print(tfa_model.solver.interface)
<module 'optlang.cplex_interface' from '/home/moritz/.pyenv/versions/3.6.12/envs/tmfa/lib/python3.6/site-packages/optlang/cplex_interface.py'>

tfa_model has two separate solver interfaces as properties.

  1. cplex_interface

  2. gurobi_interface

Depending on what solver is installed in your environment and what solver you have selected (as shown in previous section), one of the two interfaces above will be active. These are the Model containers of respective solvers. You can access them the following way,

[5]:
tfa_model.cplex_interface

Selected objective sense:  MINIMIZE
Selected objective  name:  aeb0183b-5dba-11eb-bac2-c1c11d7a5b17
Selected RHS        name:  rhs
Selected bound      name:  bnd
[5]:
<cplex.Cplex at 0x7f05ee229c88>

2.1 Solving a multiTFA model

Once, we populate the model with thermodynamic constraints as described in section 1.1, it is time to solve the model. We demonstrate how to solve the model using both univariate and multivariate methods.

First lets solve using univariate method, lets call it box method.

[6]:
solution_box = tfa_model.optimize(solve_method='mip') # Solve using MILP method
print(solution_box)
<Solution 0.874 at 0x7f05ef657ba8>

Solution for the optimization is stored in the solution_box, which is instance of the Solution class. You can access the attributes like fluxes, Gibbs_energies and metabolite_concentrations.

[7]:
print(solution_box.fluxes,"\n")
print(solution_box.Gibbs_energies)
PFK         7.477382
PFL         0.000000
PGI         4.860861
PGK       -16.023526
PGL         4.959985
             ...
NADH16     38.534610
NADTRHD     0.000000
NH4t        4.765319
O2t        21.799493
PDH         9.282533
Name: fluxes, Length: 95, dtype: float64

dG_err_glc__D_e            -1.565741
dG_err_gln__L_c            -2.667186
dG_err_gln__L_e            -2.667186
dG_err_glu__L_c            -2.189853
dG_err_glu__L_e            -2.189853
                              ...
dG_NADTRHD_reverse_49725    0.368593
dG_NH4t                     9.161904
dG_NH4t_reverse_551ee      -9.161904
dG_PDH                     -6.724673
dG_PDH_reverse_ca160        6.724673
Name: Gibbs_energies, Length: 216, dtype: float64

Now, lets solve the model with MIQC problem with cplex interface.

[8]:
solution_miqc =  tfa_model.optimize() # In this case we are solving the MIQCP
print(solution_miqc)
<Solution -0.874 at 0x7f05ef657a20>

Attributes of solution_miqc can be accessed as described above.

Please note, there is a bug in cplex that reads the model objective in reverse when creating a new model. For example,

max a is read as min -a

So, when reading cplex_interface solution please reverse the sign.

3. Thermodynamic variability analysis (TVA)

There exists multiple flux states that define the same optimum. TVA predicts the ranges of variables such as metabolic fluxes, Gibbs energy of reactions and metabolite concentrations by taking into account of thermodynamic uncertainities using multivariate approach.

As described previously, you can solve the TVA problem on either the MILP or MIQCP. Although, MILP is faster compared to solving MIQCP, MIQCP is thermodynamically consistent as they drawn from multivariate distribution.

There are three different functions for running TVA in multiTFA. One for the MILP, the remaining two for MIQCP, depending on the solver you use (either Gurobi or Cplex).

By default, TVA is performed on all the variables in a model unless variable_list parameter is specified. It is a list of variable names you want to perform TVA on. For example reactons id’s or names of Gibbs energy variables etc.

The example below demonstrates how to perform TVA on E. coli core model with constraints described in previous sections. We will use cplex for our calculations. In our observations Cplex ourperforms Gurobi.

[1]:
import sys
sys.path.insert(0, "../../examples")
from paper_data_example import build_core_model
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded
[2]:
tfa_model = build_core_model()

from multitfa import analysis
dg_vars = [rxn.delG_forward.name for rxn in tfa_model.reactions if rxn.id not in tfa_model.Exclude_reactions] # Gibbs energy variable names

Now let us perform TVA on tfa_model using univariate approch and calculate ranges on reaction flux variables

[3]:
tfa_model.solver = 'cplex'
ranges_box = analysis.variability(tfa_model, variable_list = dg_vars) # TVA using MILP
print(ranges_box)
               minimum    maximum
dG_PFK      -65.527361  26.205939
dG_PFL      -72.197674  33.074152
dG_PGI      -19.085713  24.931203
dG_PGK        0.000000  59.219274
dG_PGL      -40.671665  28.293843
...                ...        ...
dG_ME2      -70.983610  57.688660
dG_NADH16  -177.843571 -61.742679
dG_NADTRHD  -58.319625  31.641641
dG_NH4t     -13.742207  32.066015
dG_PDH     -116.749219  31.763632

[72 rows x 2 columns]

ranges_box is a pandas DataFrame which can be readily exported to a csv.

[4]:
ranges_box.to_csv('ranges.csv')

Now lets solve the TVA with multivariate approach using MIQCP. There are two functions one when using Gurobi and another for Cplex.

[5]:
ranges_MIQC_cplex = analysis.variability_legacy_cplex(tfa_model,variable_list = dg_vars) # TVA using MIQCP with cplex
print(ranges_MIQC_cplex)

Selected objective sense:  MINIMIZE
Selected objective  name:  510d6206-5dba-11eb-bac2-c1c11d7a5b17
Selected RHS        name:  rhs
Selected bound      name:  bnd

Selected objective sense:  MINIMIZE
Selected objective  name:  510d6206-5dba-11eb-bac2-c1c11d7a5b17
Selected RHS        name:  rhs
Selected bound      name:  bnd
                 minimum    maximum
dG_PFK     -6.349569e+01  24.174265
dG_PFL     -7.084883e+01  31.725309
dG_PGI     -1.957310e+01  25.255346
dG_PGK      4.296464e-09  56.021187
dG_PGL     -4.961171e+01  37.233887
...                  ...        ...
dG_ME2     -8.241713e+01  69.129417
dG_NADH16  -2.107758e+02 -33.736290
dG_NADTRHD -4.100270e+01  23.163226
dG_NH4t    -9.670293e+00  27.994102
dG_PDH     -1.200410e+02  31.809464

[72 rows x 2 columns]

Similarly, if you need to use Gurobi, you should use variability_legacy_gurobi function. Optionally, if you wish to warm start the optimization, you should specify warm_start parameter.

Please note, runnig TVA on genome scale model is computationally expensive, especially using MIQC problems. We have observed that relaxing the Gap parameter helped achieving faster run times.

3.1 Alternative way to solve MIQC

When Gurobi or Cplex is not available, and if you wanted to run MIQC problem, we developed a sampling based approach to solve the MIQCP. We sample the uncertainity variables on the surface of the ellipsoid and solve the subsequent MILP problem. This can be solved using MILP solvers. The exit criterion of the sampler can be chosen as either

  1. the number of samples since last improvement or

  2. a fixed number of samples followed by use of a generalized extreme value distribution to infer the maximum value.

[6]:
gev_ranges, gev_samples = analysis.gev_sampling(tfa_model, variable_list=dg_vars, cutoff = 100)
/home/moritz/.pyenv/versions/3.6.12/envs/tmfa/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:400: RuntimeWarning: invalid value encountered in double_scalars
  return m3 / np.power(m2, 1.5)
/home/moritz/.pyenv/versions/3.6.12/envs/tmfa/lib/python3.6/site-packages/scipy/stats/_continuous_distns.py:2666: RuntimeWarning: invalid value encountered in subtract
  -pex2+logpex2-logex2)
[7]:
print(gev_samples)
            minimum        maximum     minimum     maximum     minimum  \
dG_PFK          0.0  6.950160e-310  -57.126492   18.202297  -56.355503
dG_PFL          0.0  6.950167e-310  -57.487150   17.841639  -56.888039
dG_PGI          0.0  6.950162e-310  -15.744384   21.920011  -15.992455
dG_PGK          0.0  6.950167e-310    0.000000   50.801699    0.000000
dG_PGL          0.0  6.950167e-310  -33.529149   22.967443  -33.880045
...             ...            ...         ...         ...         ...
dG_ME2          0.0  6.950236e-310  -48.030535   46.130452  -57.819277
dG_NADH16       0.0  6.950166e-310 -165.244331 -101.608240 -157.179347
dG_NADTRHD      0.0  6.950166e-310  -37.952069   16.907125  -37.511270
dG_NH4t         0.0  6.950236e-310   -9.670293   27.994102   -9.670293
dG_PDH          0.0  6.950236e-310  -87.076334   14.224151  -92.323210

              maximum     minimum    maximum     minimum    maximum  ...  \
dG_PFK      18.973286  -58.369161  16.959629  -57.249094  18.079695  ...
dG_PFL      18.440751  -57.556992  17.771798  -57.588522  17.740267  ...
dG_PGI      21.671940  -16.329137  21.335258  -15.801123  21.863272  ...
dG_PGK      50.528925    0.000000  50.929908    0.000000  51.194993  ...
dG_PGL      22.616547  -32.791313  23.705279  -33.802262  22.694330  ...
...               ...         ...        ...         ...        ...  ...
dG_ME2      36.341710  -44.943975  49.217011  -49.060255  45.100732  ...
dG_NADH16  -93.221033 -146.633223 -83.709714 -154.574690 -91.433939  ...
dG_NADTRHD  18.001545  -38.347183  15.548170  -38.247234  16.037089  ...
dG_NH4t     27.994102   -9.670293  27.994102   -9.670293  27.994102  ...
dG_PDH       9.299499  -79.044267  20.157648  -89.783288  11.021858  ...

               minimum    maximum     minimum    maximum     minimum  \
dG_PFK      -57.361601  17.967189  -57.003212  18.325577  -57.404395
dG_PFL      -57.450801  17.877988  -59.125999  16.202791  -57.345710
dG_PGI      -15.155035  22.509359  -15.955762  21.708633  -16.048769
dG_PGK        0.000000  50.182441    0.000000  50.874615    0.000000
dG_PGL      -38.775775  17.720817  -37.063765  19.432827  -35.456340
...                ...        ...         ...        ...         ...
dG_ME2      -46.149320  48.011667  -55.544344  38.616642  -57.539017
dG_NADH16  -147.696089 -84.412259 -149.651005 -85.956494 -161.408033
dG_NADTRHD  -37.990377  16.290842  -37.853004  17.440327  -37.910998
dG_NH4t      -9.670293  27.994102   -9.670293  27.994102   -9.670293
dG_PDH      -84.435829  14.996814  -91.232542  10.126363  -93.275548

              maximum     minimum    maximum     minimum    maximum
dG_PFK      17.924394  -57.339358  17.989432  -57.041507  18.287283
dG_PFL      17.983080  -54.138492  21.190297  -55.180897  20.147892
dG_PGI      21.615625  -16.099379  21.565016  -15.495405  22.168989
dG_PGK      50.131498    0.000000  51.501463    0.000000  50.647421
dG_PGL      21.040252  -36.490842  20.005750  -35.674083  20.822509
...               ...         ...        ...         ...        ...
dG_ME2      36.621969  -56.770878  37.390109  -57.634138  36.526849
dG_NADH16  -98.454626 -148.205144 -85.254659 -154.115709 -90.590310
dG_NADTRHD  16.392663  -37.467504  17.599419  -37.682847  17.236616
dG_NH4t     27.994102   -9.670293  27.994102   -9.670293  27.994102
dG_PDH       7.342254  -93.586076   7.028803  -94.904108   6.285686

[72 rows x 202 columns]

gev_samples is the alternate set of Gibbs energy of reactions and gev_ranges is the max-min ranges of Gibbs energy of reactions. By specifying cutoff parameter, we specify how many samples we use to train the extreme value distribution (default, 1000). The more samples the better.

Alternatively, you could choose to use the other exit crieteria where if we don’t see improvement since a defined number of samples.

[8]:
cutoff_ranges, cutoff_samples, num_samples = analysis.cutoff_sampling(tfa_model, cutoff=10, variable_list=dg_vars)

Here, cutoff_ranges, cutoff_samples, num_samples represents max-min Gibbs energy ranges of reactions, alternate Gibbs energy of reactions and num of samples to achieve the exit.

[9]:
print(cutoff_samples)
            minimum    maximum     minimum    maximum     minimum    maximum  \
dG_PFK          0.0  19.994462  -57.381509  17.947280  -58.071537  17.257252
dG_PFL          0.0  21.678685  -60.857036  14.471753  -58.668320  16.660470
dG_PGI          0.0  22.843247  -16.019409  21.644985  -15.831312  21.833083
dG_PGK          0.0  52.777768    0.000000  51.082344    0.000000  51.522022
dG_PGL          0.0  25.614162  -36.254780  20.241812  -35.437215  21.059377
...             ...        ...         ...        ...         ...        ...
dG_ME2          0.0  50.162726  -53.493241  40.667746  -46.053466  48.107521
dG_NADH16       0.0 -50.696450 -161.222356 -98.560476 -146.618738 -84.279283
dG_NADTRHD      0.0  19.027818  -38.046908  15.441167  -37.461931  16.860187
dG_NH4t         0.0  27.993973   -9.670293  27.994102   -9.670293  27.994102
dG_PDH          0.0  19.494280  -93.291169   7.035106  -81.992453  17.703287

               minimum    maximum     minimum    maximum  ...     minimum  \
dG_PFK      -57.041093  18.287697  -57.075243  18.253547  ...  -56.378078
dG_PFL      -57.517947  17.810842  -53.692094  21.636696  ...  -58.208255
dG_PGI      -15.754052  21.910342  -16.727598  20.936797  ...  -16.119889
dG_PGK        0.000000  50.589288    0.000000  51.062350  ...    0.000000
dG_PGL      -32.623016  23.873576  -31.511479  24.985113  ...  -31.982957
...                ...        ...         ...        ...  ...         ...
dG_ME2      -54.088576  40.072410  -54.682189  39.478797  ...  -56.093864
dG_NADH16  -149.744292 -85.730297 -163.037216 -99.770295  ... -159.546298
dG_NADTRHD  -37.532229  17.851851  -38.441989  16.392084  ...  -37.486707
dG_NH4t      -9.670293  27.994102   -9.670293  27.994102  ...   -9.670293
dG_PDH      -88.726706  12.951684  -89.362655  11.568661  ...  -94.402224

              maximum     minimum    maximum     minimum    maximum  \
dG_PFK      18.950711  -56.928126  18.400663  -58.604627  16.724162
dG_PFL      17.120534  -57.489656  17.839133  -56.151032  19.177758
dG_PGI      21.544506  -15.870513  21.793882  -16.476337  21.188058
dG_PGK      50.387229    0.000000  49.334484    0.000000  50.463590
dG_PGL      24.513635  -37.281422  19.215170  -36.178770  20.317822
...               ...         ...        ...         ...        ...
dG_ME2      38.067123  -49.907013  44.253974  -50.748066  43.412921
dG_NADH16  -95.548451 -157.436641 -93.708334 -159.538517 -95.410680
dG_NADTRHD  17.803702  -37.723270  17.408852  -37.521470  17.786028
dG_NH4t     27.994102   -9.670293  27.994102   -9.670293  27.994102
dG_PDH       7.260018  -85.238851  15.998135  -85.434103  16.358129

               minimum    maximum     minimum    maximum
dG_PFK      -57.907485  17.421305  -56.873109  18.455681
dG_PFL      -56.184370  19.144420  -60.974034  14.354756
dG_PGI      -15.347801  22.316594  -16.130546  21.533848
dG_PGK        0.000000  50.097605    0.000000  50.037495
dG_PGL      -36.129072  20.367520  -32.519994  23.976598
...                ...        ...         ...        ...
dG_ME2      -52.093254  42.067733  -51.961487  42.199500
dG_NADH16  -151.118261 -87.355727 -158.860935 -95.815908
dG_NADTRHD  -38.440229  17.719414  -37.858607  17.706442
dG_NH4t      -9.670293  27.994102   -9.670293  27.994102
dG_PDH      -88.995090  12.431838  -89.969652  10.739770

[72 rows x 24 columns]

3.2 Network embeded thermodynamics (NET)

By using the TVA apparoches demonstrated above, one can perform NET type of analysis. For example, we can fix the directionalities of all the reactions and sample for the metabolite concentrations.

Reaction directionalities can be fixed by adjusting the lower or upper bounds of the reactions.

[10]:
atps = tfa_model.reactions.get_by_id('ATPS4r')
print("lower bound is {} and upper bound is {}\n".format(atps.lower_bound, atps.upper_bound))
atps.lower_bound = 0
print("updated lower and upper bounds are {} \t {}\n".format(atps.lower_bound, atps.upper_bound))
lower bound is -1000.0 and upper bound is 1000.0

updated lower and upper bounds are 0     1000.0

We can access the metabolite concentration variables as shown below,

[11]:
concentration_vars = [var.name for var in tfa_model.variables if var.name.startswith('lnc_')]
tfa_model.solver = 'cplex'
concentration_ranges = analysis.variability_legacy_cplex(tfa_model,variable_list = concentration_vars) # TVA using MIQCP with cplex
print(concentration_ranges)

Selected objective sense:  MINIMIZE
Selected objective  name:  510d6206-5dba-11eb-bac2-c1c11d7a5b17
Selected RHS        name:  rhs
Selected bound      name:  bnd
                minimum   maximum
lnc_glc__D_e -11.512925 -3.912023
lnc_gln__L_c -11.512925 -3.912023
lnc_gln__L_e -11.512925 -3.912023
lnc_glu__L_c -11.512925 -3.912023
lnc_glu__L_e -11.512925 -3.912023
...                 ...       ...
lnc_fru_e    -11.512925 -3.912023
lnc_fum_c    -11.512925 -3.912023
lnc_fum_e    -11.512925 -3.912023
lnc_g3p_c    -11.512925 -3.912023
lnc_g6p_c    -11.512925 -3.912023

[72 rows x 2 columns]

4. Debugging

It is often observed in genome scale models that adding thermodynamic constraints make the model infeasible. This might be due to variety of reasons, like ill-formatted covariance between groups/components, lumping of reactions etc. It is always not easy to find the exact cause of the problem. But, if you are using Gurobi/Cplex, you can find the constraints that render the model infeasible.

Please note that these methods give you random set of constraints that represent infeasibility. Also, a model can contain multiple infeasible sets. While modifying/removing these constraints, it is advisable to look for biological meaning.

Let us demonstrate the usage by breaking the model.

[1]:
import sys
sys.path.insert(0, "../../examples")
from paper_data_example import build_core_model
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded
[2]:
tfa_model = build_core_model()
Using license file /home/vishnu/gurobi.lic
Academic license - for non-commercial use only
Read LP format model from file /tmp/tmpdzvydb7y.lp
Reading time = 0.00 seconds
: 72 rows, 190 columns, 720 nonzeros

Now lets change the concentration atp and find the objective value. Then, we set the lower bound of objective variable higher than the solution to break the model.

[3]:
atp = tfa_model.metabolites.get_by_id('atp_c')
atp.concentration_min = 2e-3
atp.concentration_max = 5e-2
solution = tfa_model.optimize()
print(solution)

<Solution 0.374 at 0x7f3c6ad4dda0>

Now lets adjust the biomass lower bound to 0.5 and find out the IIS set of reactions using Gurobi

[4]:
biomass = tfa_model.reactions.get_by_id('BIOMASS_Ecoli_core_w_GAM')
biomass.lower_bound = 0.5
solution = tfa_model.optimize()
tfa_model.solver.problem.computeIIS() # model.solver.problem represents the solver interface model
for cons in tfa_model.solver.problem.getConstrs():
    if cons.IISConstr:
        print(cons)
<gurobi.Constr glc__D_e>
<gurobi.Constr gln__L_c>
<gurobi.Constr gln__L_e>
<gurobi.Constr glu__L_c>
<gurobi.Constr glu__L_e>
<gurobi.Constr icit_c>
<gurobi.Constr lac__D_c>
<gurobi.Constr lac__D_e>
<gurobi.Constr mal__L_c>
<gurobi.Constr mal__L_e>
<gurobi.Constr 13dpg_c>
<gurobi.Constr 2pg_c>
<gurobi.Constr 3pg_c>
<gurobi.Constr oaa_c>
<gurobi.Constr pep_c>
<gurobi.Constr 6pgc_c>
<gurobi.Constr 6pgl_c>
<gurobi.Constr pyr_c>
<gurobi.Constr pyr_e>
<gurobi.Constr r5p_c>
<gurobi.Constr ru5p__D_c>
<gurobi.Constr acald_c>
<gurobi.Constr s7p_c>
<gurobi.Constr acald_e>
<gurobi.Constr accoa_c>
<gurobi.Constr succ_c>
<gurobi.Constr succ_e>
<gurobi.Constr succoa_c>
<gurobi.Constr acon_C_c>
<gurobi.Constr xu5p__D_c>
<gurobi.Constr actp_c>
<gurobi.Constr adp_c>
<gurobi.Constr akg_c>
<gurobi.Constr akg_e>
<gurobi.Constr amp_c>
<gurobi.Constr cit_c>
<gurobi.Constr dhap_c>
<gurobi.Constr e4p_c>
<gurobi.Constr etoh_c>
<gurobi.Constr etoh_e>
<gurobi.Constr f6p_c>
<gurobi.Constr fdp_c>
<gurobi.Constr fru_e>
<gurobi.Constr fum_c>
<gurobi.Constr fum_e>
<gurobi.Constr g3p_c>
<gurobi.Constr g6p_c>
<gurobi.Constr directionality_ACONTb>
<gurobi.Constr ind_ACONTb>
<gurobi.Constr delG_ACONTb>
<gurobi.Constr directionality_ATPS4r>
<gurobi.Constr ind_ATPS4r>
<gurobi.Constr delG_ATPS4r>

This returned us a list of constraints that can be modified to make the model status optimal. One has to choose and modify the list of constraints to manipulate based on biological meaning.

multitfa package

Subpackages

multitfa.analysis package

Submodules
multitfa.analysis.sampling module
multitfa.analysis.sampling_util module
multitfa.analysis.variability module
Module contents

multitfa.core package

Submodules
multitfa.core.compound module
multitfa.core.reaction module
multitfa.core.solution module
multitfa.core.tmodel module
Module contents

multitfa.data package

Module contents

multitfa.util package

Submodules
multitfa.util.constraints module
multitfa.util.linalg_fun module
multitfa.util.thermo_constants module
multitfa.util.util_func module
Module contents

Submodules

multitfa.helpers module

Module contents

Indices and tables