3. Thermodynamic variability analysis (TVA)¶
There exists multiple flux states that define the same optimum. TVA predicts the ranges of variables such as metabolic fluxes, Gibbs energy of reactions and metabolite concentrations by taking into account of thermodynamic uncertainities using multivariate approach.
As described previously, you can solve the TVA problem on either the MILP or MIQCP. Although, MILP is faster compared to solving MIQCP, MIQCP is thermodynamically consistent as they drawn from multivariate distribution.
There are three different functions for running TVA in multiTFA. One for the MILP, the remaining two for MIQCP, depending on the solver you use (either Gurobi or Cplex).
By default, TVA is performed on all the variables in a model unless variable_list
parameter is specified. It is a list
of variable names you want to perform TVA on. For example reactons id’s or names of Gibbs energy variables etc.
The example below demonstrates how to perform TVA on E. coli core model with constraints described in previous sections. We will use cplex
for our calculations. In our observations Cplex
ourperforms Gurobi
.
[1]:
import sys
sys.path.insert(0, "../../examples")
from paper_data_example import build_core_model
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded
Downloading package metadata...
Fragments already downloaded
[2]:
tfa_model = build_core_model()
from multitfa import analysis
dg_vars = [rxn.delG_forward.name for rxn in tfa_model.reactions if rxn.id not in tfa_model.Exclude_reactions] # Gibbs energy variable names
Now let us perform TVA on tfa_model
using univariate approch and calculate ranges on reaction flux variables
[3]:
tfa_model.solver = 'cplex'
ranges_box = analysis.variability(tfa_model, variable_list = dg_vars) # TVA using MILP
print(ranges_box)
minimum maximum
dG_PFK -65.527361 26.205939
dG_PFL -72.197674 33.074152
dG_PGI -19.085713 24.931203
dG_PGK 0.000000 59.219274
dG_PGL -40.671665 28.293843
... ... ...
dG_ME2 -70.983610 57.688660
dG_NADH16 -177.843571 -61.742679
dG_NADTRHD -58.319625 31.641641
dG_NH4t -13.742207 32.066015
dG_PDH -116.749219 31.763632
[72 rows x 2 columns]
ranges_box
is a pandas DataFrame
which can be readily exported to a csv.
[4]:
ranges_box.to_csv('ranges.csv')
Now lets solve the TVA with multivariate approach using MIQCP. There are two functions one when using Gurobi
and another for Cplex
.
[5]:
ranges_MIQC_cplex = analysis.variability_legacy_cplex(tfa_model,variable_list = dg_vars) # TVA using MIQCP with cplex
print(ranges_MIQC_cplex)
Selected objective sense: MINIMIZE
Selected objective name: 510d6206-5dba-11eb-bac2-c1c11d7a5b17
Selected RHS name: rhs
Selected bound name: bnd
Selected objective sense: MINIMIZE
Selected objective name: 510d6206-5dba-11eb-bac2-c1c11d7a5b17
Selected RHS name: rhs
Selected bound name: bnd
minimum maximum
dG_PFK -6.349569e+01 24.174265
dG_PFL -7.084883e+01 31.725309
dG_PGI -1.957310e+01 25.255346
dG_PGK 4.296464e-09 56.021187
dG_PGL -4.961171e+01 37.233887
... ... ...
dG_ME2 -8.241713e+01 69.129417
dG_NADH16 -2.107758e+02 -33.736290
dG_NADTRHD -4.100270e+01 23.163226
dG_NH4t -9.670293e+00 27.994102
dG_PDH -1.200410e+02 31.809464
[72 rows x 2 columns]
Similarly, if you need to use Gurobi
, you should use variability_legacy_gurobi
function. Optionally, if you wish to warm start the optimization, you should specify warm_start
parameter.
Please note, runnig TVA on genome scale model is computationally expensive, especially using MIQC problems. We have observed that relaxing the Gap
parameter helped achieving faster run times.
3.1 Alternative way to solve MIQC¶
When Gurobi
or Cplex
is not available, and if you wanted to run MIQC problem, we developed a sampling based approach to solve the MIQCP. We sample the uncertainity variables on the surface of the ellipsoid and solve the subsequent MILP problem. This can be solved using MILP solvers. The exit criterion of the sampler can be chosen as either
the number of samples since last improvement or
a fixed number of samples followed by use of a generalized extreme value distribution to infer the maximum value.
[6]:
gev_ranges, gev_samples = analysis.gev_sampling(tfa_model, variable_list=dg_vars, cutoff = 100)
/home/moritz/.pyenv/versions/3.6.12/envs/tmfa/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:400: RuntimeWarning: invalid value encountered in double_scalars
return m3 / np.power(m2, 1.5)
/home/moritz/.pyenv/versions/3.6.12/envs/tmfa/lib/python3.6/site-packages/scipy/stats/_continuous_distns.py:2666: RuntimeWarning: invalid value encountered in subtract
-pex2+logpex2-logex2)
[7]:
print(gev_samples)
minimum maximum minimum maximum minimum \
dG_PFK 0.0 6.950160e-310 -57.126492 18.202297 -56.355503
dG_PFL 0.0 6.950167e-310 -57.487150 17.841639 -56.888039
dG_PGI 0.0 6.950162e-310 -15.744384 21.920011 -15.992455
dG_PGK 0.0 6.950167e-310 0.000000 50.801699 0.000000
dG_PGL 0.0 6.950167e-310 -33.529149 22.967443 -33.880045
... ... ... ... ... ...
dG_ME2 0.0 6.950236e-310 -48.030535 46.130452 -57.819277
dG_NADH16 0.0 6.950166e-310 -165.244331 -101.608240 -157.179347
dG_NADTRHD 0.0 6.950166e-310 -37.952069 16.907125 -37.511270
dG_NH4t 0.0 6.950236e-310 -9.670293 27.994102 -9.670293
dG_PDH 0.0 6.950236e-310 -87.076334 14.224151 -92.323210
maximum minimum maximum minimum maximum ... \
dG_PFK 18.973286 -58.369161 16.959629 -57.249094 18.079695 ...
dG_PFL 18.440751 -57.556992 17.771798 -57.588522 17.740267 ...
dG_PGI 21.671940 -16.329137 21.335258 -15.801123 21.863272 ...
dG_PGK 50.528925 0.000000 50.929908 0.000000 51.194993 ...
dG_PGL 22.616547 -32.791313 23.705279 -33.802262 22.694330 ...
... ... ... ... ... ... ...
dG_ME2 36.341710 -44.943975 49.217011 -49.060255 45.100732 ...
dG_NADH16 -93.221033 -146.633223 -83.709714 -154.574690 -91.433939 ...
dG_NADTRHD 18.001545 -38.347183 15.548170 -38.247234 16.037089 ...
dG_NH4t 27.994102 -9.670293 27.994102 -9.670293 27.994102 ...
dG_PDH 9.299499 -79.044267 20.157648 -89.783288 11.021858 ...
minimum maximum minimum maximum minimum \
dG_PFK -57.361601 17.967189 -57.003212 18.325577 -57.404395
dG_PFL -57.450801 17.877988 -59.125999 16.202791 -57.345710
dG_PGI -15.155035 22.509359 -15.955762 21.708633 -16.048769
dG_PGK 0.000000 50.182441 0.000000 50.874615 0.000000
dG_PGL -38.775775 17.720817 -37.063765 19.432827 -35.456340
... ... ... ... ... ...
dG_ME2 -46.149320 48.011667 -55.544344 38.616642 -57.539017
dG_NADH16 -147.696089 -84.412259 -149.651005 -85.956494 -161.408033
dG_NADTRHD -37.990377 16.290842 -37.853004 17.440327 -37.910998
dG_NH4t -9.670293 27.994102 -9.670293 27.994102 -9.670293
dG_PDH -84.435829 14.996814 -91.232542 10.126363 -93.275548
maximum minimum maximum minimum maximum
dG_PFK 17.924394 -57.339358 17.989432 -57.041507 18.287283
dG_PFL 17.983080 -54.138492 21.190297 -55.180897 20.147892
dG_PGI 21.615625 -16.099379 21.565016 -15.495405 22.168989
dG_PGK 50.131498 0.000000 51.501463 0.000000 50.647421
dG_PGL 21.040252 -36.490842 20.005750 -35.674083 20.822509
... ... ... ... ... ...
dG_ME2 36.621969 -56.770878 37.390109 -57.634138 36.526849
dG_NADH16 -98.454626 -148.205144 -85.254659 -154.115709 -90.590310
dG_NADTRHD 16.392663 -37.467504 17.599419 -37.682847 17.236616
dG_NH4t 27.994102 -9.670293 27.994102 -9.670293 27.994102
dG_PDH 7.342254 -93.586076 7.028803 -94.904108 6.285686
[72 rows x 202 columns]
gev_samples
is the alternate set of Gibbs energy of reactions and gev_ranges
is the max-min ranges of Gibbs energy of reactions. By specifying cutoff
parameter, we specify how many samples we use to train the extreme value distribution (default, 1000). The more samples the better.
Alternatively, you could choose to use the other exit crieteria where if we don’t see improvement since a defined number of samples.
[8]:
cutoff_ranges, cutoff_samples, num_samples = analysis.cutoff_sampling(tfa_model, cutoff=10, variable_list=dg_vars)
Here, cutoff_ranges
, cutoff_samples
, num_samples
represents max-min Gibbs energy ranges of reactions, alternate Gibbs energy of reactions and num of samples to achieve the exit.
[9]:
print(cutoff_samples)
minimum maximum minimum maximum minimum maximum \
dG_PFK 0.0 19.994462 -57.381509 17.947280 -58.071537 17.257252
dG_PFL 0.0 21.678685 -60.857036 14.471753 -58.668320 16.660470
dG_PGI 0.0 22.843247 -16.019409 21.644985 -15.831312 21.833083
dG_PGK 0.0 52.777768 0.000000 51.082344 0.000000 51.522022
dG_PGL 0.0 25.614162 -36.254780 20.241812 -35.437215 21.059377
... ... ... ... ... ... ...
dG_ME2 0.0 50.162726 -53.493241 40.667746 -46.053466 48.107521
dG_NADH16 0.0 -50.696450 -161.222356 -98.560476 -146.618738 -84.279283
dG_NADTRHD 0.0 19.027818 -38.046908 15.441167 -37.461931 16.860187
dG_NH4t 0.0 27.993973 -9.670293 27.994102 -9.670293 27.994102
dG_PDH 0.0 19.494280 -93.291169 7.035106 -81.992453 17.703287
minimum maximum minimum maximum ... minimum \
dG_PFK -57.041093 18.287697 -57.075243 18.253547 ... -56.378078
dG_PFL -57.517947 17.810842 -53.692094 21.636696 ... -58.208255
dG_PGI -15.754052 21.910342 -16.727598 20.936797 ... -16.119889
dG_PGK 0.000000 50.589288 0.000000 51.062350 ... 0.000000
dG_PGL -32.623016 23.873576 -31.511479 24.985113 ... -31.982957
... ... ... ... ... ... ...
dG_ME2 -54.088576 40.072410 -54.682189 39.478797 ... -56.093864
dG_NADH16 -149.744292 -85.730297 -163.037216 -99.770295 ... -159.546298
dG_NADTRHD -37.532229 17.851851 -38.441989 16.392084 ... -37.486707
dG_NH4t -9.670293 27.994102 -9.670293 27.994102 ... -9.670293
dG_PDH -88.726706 12.951684 -89.362655 11.568661 ... -94.402224
maximum minimum maximum minimum maximum \
dG_PFK 18.950711 -56.928126 18.400663 -58.604627 16.724162
dG_PFL 17.120534 -57.489656 17.839133 -56.151032 19.177758
dG_PGI 21.544506 -15.870513 21.793882 -16.476337 21.188058
dG_PGK 50.387229 0.000000 49.334484 0.000000 50.463590
dG_PGL 24.513635 -37.281422 19.215170 -36.178770 20.317822
... ... ... ... ... ...
dG_ME2 38.067123 -49.907013 44.253974 -50.748066 43.412921
dG_NADH16 -95.548451 -157.436641 -93.708334 -159.538517 -95.410680
dG_NADTRHD 17.803702 -37.723270 17.408852 -37.521470 17.786028
dG_NH4t 27.994102 -9.670293 27.994102 -9.670293 27.994102
dG_PDH 7.260018 -85.238851 15.998135 -85.434103 16.358129
minimum maximum minimum maximum
dG_PFK -57.907485 17.421305 -56.873109 18.455681
dG_PFL -56.184370 19.144420 -60.974034 14.354756
dG_PGI -15.347801 22.316594 -16.130546 21.533848
dG_PGK 0.000000 50.097605 0.000000 50.037495
dG_PGL -36.129072 20.367520 -32.519994 23.976598
... ... ... ... ...
dG_ME2 -52.093254 42.067733 -51.961487 42.199500
dG_NADH16 -151.118261 -87.355727 -158.860935 -95.815908
dG_NADTRHD -38.440229 17.719414 -37.858607 17.706442
dG_NH4t -9.670293 27.994102 -9.670293 27.994102
dG_PDH -88.995090 12.431838 -89.969652 10.739770
[72 rows x 24 columns]
3.2 Network embeded thermodynamics (NET)¶
By using the TVA apparoches demonstrated above, one can perform NET type of analysis. For example, we can fix the directionalities of all the reactions and sample for the metabolite concentrations.
Reaction directionalities can be fixed by adjusting the lower or upper bounds of the reactions.
[10]:
atps = tfa_model.reactions.get_by_id('ATPS4r')
print("lower bound is {} and upper bound is {}\n".format(atps.lower_bound, atps.upper_bound))
atps.lower_bound = 0
print("updated lower and upper bounds are {} \t {}\n".format(atps.lower_bound, atps.upper_bound))
lower bound is -1000.0 and upper bound is 1000.0
updated lower and upper bounds are 0 1000.0
We can access the metabolite concentration variables as shown below,
[11]:
concentration_vars = [var.name for var in tfa_model.variables if var.name.startswith('lnc_')]
tfa_model.solver = 'cplex'
concentration_ranges = analysis.variability_legacy_cplex(tfa_model,variable_list = concentration_vars) # TVA using MIQCP with cplex
print(concentration_ranges)
Selected objective sense: MINIMIZE
Selected objective name: 510d6206-5dba-11eb-bac2-c1c11d7a5b17
Selected RHS name: rhs
Selected bound name: bnd
minimum maximum
lnc_glc__D_e -11.512925 -3.912023
lnc_gln__L_c -11.512925 -3.912023
lnc_gln__L_e -11.512925 -3.912023
lnc_glu__L_c -11.512925 -3.912023
lnc_glu__L_e -11.512925 -3.912023
... ... ...
lnc_fru_e -11.512925 -3.912023
lnc_fum_c -11.512925 -3.912023
lnc_fum_e -11.512925 -3.912023
lnc_g3p_c -11.512925 -3.912023
lnc_g6p_c -11.512925 -3.912023
[72 rows x 2 columns]