{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 3. Thermodynamic variability analysis (TVA)\n", "\n", "There exists multiple flux states that define the same optimum. TVA predicts the ranges of variables such as metabolic fluxes, Gibbs energy of reactions and metabolite concentrations by taking into account of thermodynamic uncertainities using multivariate approach.\n", "\n", "As described previously, you can solve the TVA problem on either the MILP or MIQCP. Although, MILP is faster compared to solving MIQCP, MIQCP is thermodynamically consistent as they drawn from multivariate distribution. \n", "\n", "There are three different functions for running TVA in multiTFA. One for the MILP, the remaining two for MIQCP, depending on the solver you use (either Gurobi or Cplex).\n", "\n", "By default, TVA is performed on all the variables in a model unless `variable_list` parameter is specified. It is a `list` of variable names you want to perform TVA on. For example reactons id's or names of Gibbs energy variables etc.\n", "\n", "The example below demonstrates how to perform TVA on *E. coli* core model with constraints described in previous sections. We will use `cplex` for our calculations. In our observations `Cplex` ourperforms `Gurobi`. \n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading package metadata...\n", "Fragments already downloaded\n", "Downloading package metadata...\n", "Fragments already downloaded\n", "Downloading package metadata...\n", "Fragments already downloaded\n" ] } ], "source": [ "import sys\n", "sys.path.insert(0, \"../../examples\")\n", "from paper_data_example import build_core_model" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [] }, "outputs": [], "source": [ "tfa_model = build_core_model()\n", "\n", "from multitfa import analysis\n", "dg_vars = [rxn.delG_forward.name for rxn in tfa_model.reactions if rxn.id not in tfa_model.Exclude_reactions] # Gibbs energy variable names\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us perform TVA on `tfa_model` using univariate approch and calculate ranges on reaction flux variables" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " minimum maximum\n", "dG_PFK -65.527361 26.205939\n", "dG_PFL -72.197674 33.074152\n", "dG_PGI -19.085713 24.931203\n", "dG_PGK 0.000000 59.219274\n", "dG_PGL -40.671665 28.293843\n", "... ... ...\n", "dG_ME2 -70.983610 57.688660\n", "dG_NADH16 -177.843571 -61.742679\n", "dG_NADTRHD -58.319625 31.641641\n", "dG_NH4t -13.742207 32.066015\n", "dG_PDH -116.749219 31.763632\n", "\n", "[72 rows x 2 columns]\n" ] } ], "source": [ "tfa_model.solver = 'cplex'\n", "ranges_box = analysis.variability(tfa_model, variable_list = dg_vars) # TVA using MILP\n", "print(ranges_box)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " `ranges_box` is a `pandas DataFrame` which can be readily exported to a csv. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "ranges_box.to_csv('ranges.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now lets solve the TVA with multivariate approach using MIQCP. There are two functions one when using `Gurobi` and another for `Cplex`." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Selected objective sense: MINIMIZE\n", "Selected objective name: 510d6206-5dba-11eb-bac2-c1c11d7a5b17\n", "Selected RHS name: rhs\n", "Selected bound name: bnd\n", "\n", "Selected objective sense: MINIMIZE\n", "Selected objective name: 510d6206-5dba-11eb-bac2-c1c11d7a5b17\n", "Selected RHS name: rhs\n", "Selected bound name: bnd\n", " minimum maximum\n", "dG_PFK -6.349569e+01 24.174265\n", "dG_PFL -7.084883e+01 31.725309\n", "dG_PGI -1.957310e+01 25.255346\n", "dG_PGK 4.296464e-09 56.021187\n", "dG_PGL -4.961171e+01 37.233887\n", "... ... ...\n", "dG_ME2 -8.241713e+01 69.129417\n", "dG_NADH16 -2.107758e+02 -33.736290\n", "dG_NADTRHD -4.100270e+01 23.163226\n", "dG_NH4t -9.670293e+00 27.994102\n", "dG_PDH -1.200410e+02 31.809464\n", "\n", "[72 rows x 2 columns]\n" ] } ], "source": [ "ranges_MIQC_cplex = analysis.variability_legacy_cplex(tfa_model,variable_list = dg_vars) # TVA using MIQCP with cplex\n", "print(ranges_MIQC_cplex)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, if you need to use `Gurobi`, you should use `variability_legacy_gurobi` function. Optionally, if you wish to warm start the optimization, you should specify `warm_start` parameter." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Please note, runnig TVA on genome scale model is computationally expensive, especially using MIQC problems. We have observed that relaxing the `Gap` parameter helped achieving faster run times. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.1 Alternative way to solve MIQC\n", "\n", "When `Gurobi` or `Cplex` is not available, and if you wanted to run MIQC problem, we developed a sampling based approach to solve the MIQCP. We sample the uncertainity variables on the surface of the ellipsoid and solve the subsequent MILP problem. This can be solved using MILP solvers. The exit criterion of the sampler can be chosen as either\n", "\n", "(1) the number of samples since last improvement or \n", "(2) a fixed number of samples followed by use of a generalized extreme value distribution to infer the maximum value." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/moritz/.pyenv/versions/3.6.12/envs/tmfa/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:400: RuntimeWarning: invalid value encountered in double_scalars\n", " return m3 / np.power(m2, 1.5)\n", "/home/moritz/.pyenv/versions/3.6.12/envs/tmfa/lib/python3.6/site-packages/scipy/stats/_continuous_distns.py:2666: RuntimeWarning: invalid value encountered in subtract\n", " -pex2+logpex2-logex2)\n" ] } ], "source": [ "gev_ranges, gev_samples = analysis.gev_sampling(tfa_model, variable_list=dg_vars, cutoff = 100)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " minimum maximum minimum maximum minimum \\\n", "dG_PFK 0.0 6.950160e-310 -57.126492 18.202297 -56.355503 \n", "dG_PFL 0.0 6.950167e-310 -57.487150 17.841639 -56.888039 \n", "dG_PGI 0.0 6.950162e-310 -15.744384 21.920011 -15.992455 \n", "dG_PGK 0.0 6.950167e-310 0.000000 50.801699 0.000000 \n", "dG_PGL 0.0 6.950167e-310 -33.529149 22.967443 -33.880045 \n", "... ... ... ... ... ... \n", "dG_ME2 0.0 6.950236e-310 -48.030535 46.130452 -57.819277 \n", "dG_NADH16 0.0 6.950166e-310 -165.244331 -101.608240 -157.179347 \n", "dG_NADTRHD 0.0 6.950166e-310 -37.952069 16.907125 -37.511270 \n", "dG_NH4t 0.0 6.950236e-310 -9.670293 27.994102 -9.670293 \n", "dG_PDH 0.0 6.950236e-310 -87.076334 14.224151 -92.323210 \n", "\n", " maximum minimum maximum minimum maximum ... \\\n", "dG_PFK 18.973286 -58.369161 16.959629 -57.249094 18.079695 ... \n", "dG_PFL 18.440751 -57.556992 17.771798 -57.588522 17.740267 ... \n", "dG_PGI 21.671940 -16.329137 21.335258 -15.801123 21.863272 ... \n", "dG_PGK 50.528925 0.000000 50.929908 0.000000 51.194993 ... \n", "dG_PGL 22.616547 -32.791313 23.705279 -33.802262 22.694330 ... \n", "... ... ... ... ... ... ... \n", "dG_ME2 36.341710 -44.943975 49.217011 -49.060255 45.100732 ... \n", "dG_NADH16 -93.221033 -146.633223 -83.709714 -154.574690 -91.433939 ... \n", "dG_NADTRHD 18.001545 -38.347183 15.548170 -38.247234 16.037089 ... \n", "dG_NH4t 27.994102 -9.670293 27.994102 -9.670293 27.994102 ... \n", "dG_PDH 9.299499 -79.044267 20.157648 -89.783288 11.021858 ... \n", "\n", " minimum maximum minimum maximum minimum \\\n", "dG_PFK -57.361601 17.967189 -57.003212 18.325577 -57.404395 \n", "dG_PFL -57.450801 17.877988 -59.125999 16.202791 -57.345710 \n", "dG_PGI -15.155035 22.509359 -15.955762 21.708633 -16.048769 \n", "dG_PGK 0.000000 50.182441 0.000000 50.874615 0.000000 \n", "dG_PGL -38.775775 17.720817 -37.063765 19.432827 -35.456340 \n", "... ... ... ... ... ... \n", "dG_ME2 -46.149320 48.011667 -55.544344 38.616642 -57.539017 \n", "dG_NADH16 -147.696089 -84.412259 -149.651005 -85.956494 -161.408033 \n", "dG_NADTRHD -37.990377 16.290842 -37.853004 17.440327 -37.910998 \n", "dG_NH4t -9.670293 27.994102 -9.670293 27.994102 -9.670293 \n", "dG_PDH -84.435829 14.996814 -91.232542 10.126363 -93.275548 \n", "\n", " maximum minimum maximum minimum maximum \n", "dG_PFK 17.924394 -57.339358 17.989432 -57.041507 18.287283 \n", "dG_PFL 17.983080 -54.138492 21.190297 -55.180897 20.147892 \n", "dG_PGI 21.615625 -16.099379 21.565016 -15.495405 22.168989 \n", "dG_PGK 50.131498 0.000000 51.501463 0.000000 50.647421 \n", "dG_PGL 21.040252 -36.490842 20.005750 -35.674083 20.822509 \n", "... ... ... ... ... ... \n", "dG_ME2 36.621969 -56.770878 37.390109 -57.634138 36.526849 \n", "dG_NADH16 -98.454626 -148.205144 -85.254659 -154.115709 -90.590310 \n", "dG_NADTRHD 16.392663 -37.467504 17.599419 -37.682847 17.236616 \n", "dG_NH4t 27.994102 -9.670293 27.994102 -9.670293 27.994102 \n", "dG_PDH 7.342254 -93.586076 7.028803 -94.904108 6.285686 \n", "\n", "[72 rows x 202 columns]\n" ] } ], "source": [ "print(gev_samples)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`gev_samples` is the alternate set of Gibbs energy of reactions and `gev_ranges` is the max-min ranges of Gibbs energy of reactions. By specifying `cutoff` parameter, we specify how many samples we use to train the extreme value distribution (default, 1000). The more samples the better. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, you could choose to use the other exit crieteria where if we don't see improvement since a defined number of samples." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "cutoff_ranges, cutoff_samples, num_samples = analysis.cutoff_sampling(tfa_model, cutoff=10, variable_list=dg_vars)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, `cutoff_ranges`, `cutoff_samples`, `num_samples` represents max-min Gibbs energy ranges of reactions, alternate Gibbs energy of reactions and num of samples to achieve the exit." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " minimum maximum minimum maximum minimum maximum \\\n", "dG_PFK 0.0 19.994462 -57.381509 17.947280 -58.071537 17.257252 \n", "dG_PFL 0.0 21.678685 -60.857036 14.471753 -58.668320 16.660470 \n", "dG_PGI 0.0 22.843247 -16.019409 21.644985 -15.831312 21.833083 \n", "dG_PGK 0.0 52.777768 0.000000 51.082344 0.000000 51.522022 \n", "dG_PGL 0.0 25.614162 -36.254780 20.241812 -35.437215 21.059377 \n", "... ... ... ... ... ... ... \n", "dG_ME2 0.0 50.162726 -53.493241 40.667746 -46.053466 48.107521 \n", "dG_NADH16 0.0 -50.696450 -161.222356 -98.560476 -146.618738 -84.279283 \n", "dG_NADTRHD 0.0 19.027818 -38.046908 15.441167 -37.461931 16.860187 \n", "dG_NH4t 0.0 27.993973 -9.670293 27.994102 -9.670293 27.994102 \n", "dG_PDH 0.0 19.494280 -93.291169 7.035106 -81.992453 17.703287 \n", "\n", " minimum maximum minimum maximum ... minimum \\\n", "dG_PFK -57.041093 18.287697 -57.075243 18.253547 ... -56.378078 \n", "dG_PFL -57.517947 17.810842 -53.692094 21.636696 ... -58.208255 \n", "dG_PGI -15.754052 21.910342 -16.727598 20.936797 ... -16.119889 \n", "dG_PGK 0.000000 50.589288 0.000000 51.062350 ... 0.000000 \n", "dG_PGL -32.623016 23.873576 -31.511479 24.985113 ... -31.982957 \n", "... ... ... ... ... ... ... \n", "dG_ME2 -54.088576 40.072410 -54.682189 39.478797 ... -56.093864 \n", "dG_NADH16 -149.744292 -85.730297 -163.037216 -99.770295 ... -159.546298 \n", "dG_NADTRHD -37.532229 17.851851 -38.441989 16.392084 ... -37.486707 \n", "dG_NH4t -9.670293 27.994102 -9.670293 27.994102 ... -9.670293 \n", "dG_PDH -88.726706 12.951684 -89.362655 11.568661 ... -94.402224 \n", "\n", " maximum minimum maximum minimum maximum \\\n", "dG_PFK 18.950711 -56.928126 18.400663 -58.604627 16.724162 \n", "dG_PFL 17.120534 -57.489656 17.839133 -56.151032 19.177758 \n", "dG_PGI 21.544506 -15.870513 21.793882 -16.476337 21.188058 \n", "dG_PGK 50.387229 0.000000 49.334484 0.000000 50.463590 \n", "dG_PGL 24.513635 -37.281422 19.215170 -36.178770 20.317822 \n", "... ... ... ... ... ... \n", "dG_ME2 38.067123 -49.907013 44.253974 -50.748066 43.412921 \n", "dG_NADH16 -95.548451 -157.436641 -93.708334 -159.538517 -95.410680 \n", "dG_NADTRHD 17.803702 -37.723270 17.408852 -37.521470 17.786028 \n", "dG_NH4t 27.994102 -9.670293 27.994102 -9.670293 27.994102 \n", "dG_PDH 7.260018 -85.238851 15.998135 -85.434103 16.358129 \n", "\n", " minimum maximum minimum maximum \n", "dG_PFK -57.907485 17.421305 -56.873109 18.455681 \n", "dG_PFL -56.184370 19.144420 -60.974034 14.354756 \n", "dG_PGI -15.347801 22.316594 -16.130546 21.533848 \n", "dG_PGK 0.000000 50.097605 0.000000 50.037495 \n", "dG_PGL -36.129072 20.367520 -32.519994 23.976598 \n", "... ... ... ... ... \n", "dG_ME2 -52.093254 42.067733 -51.961487 42.199500 \n", "dG_NADH16 -151.118261 -87.355727 -158.860935 -95.815908 \n", "dG_NADTRHD -38.440229 17.719414 -37.858607 17.706442 \n", "dG_NH4t -9.670293 27.994102 -9.670293 27.994102 \n", "dG_PDH -88.995090 12.431838 -89.969652 10.739770 \n", "\n", "[72 rows x 24 columns]\n" ] } ], "source": [ "print(cutoff_samples)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.2 Network embeded thermodynamics (NET)\n", "\n", "By using the TVA apparoches demonstrated above, one can perform NET type of analysis. For example, we can fix the directionalities of all the reactions and sample for the metabolite concentrations. \n", "\n", "Reaction directionalities can be fixed by adjusting the lower or upper bounds of the reactions." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "lower bound is -1000.0 and upper bound is 1000.0\n", "\n", "updated lower and upper bounds are 0 \t 1000.0\n", "\n" ] } ], "source": [ "atps = tfa_model.reactions.get_by_id('ATPS4r')\n", "print(\"lower bound is {} and upper bound is {}\\n\".format(atps.lower_bound, atps.upper_bound))\n", "atps.lower_bound = 0\n", "print(\"updated lower and upper bounds are {} \\t {}\\n\".format(atps.lower_bound, atps.upper_bound))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can access the metabolite concentration variables as shown below," ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Selected objective sense: MINIMIZE\n", "Selected objective name: 510d6206-5dba-11eb-bac2-c1c11d7a5b17\n", "Selected RHS name: rhs\n", "Selected bound name: bnd\n", " minimum maximum\n", "lnc_glc__D_e -11.512925 -3.912023\n", "lnc_gln__L_c -11.512925 -3.912023\n", "lnc_gln__L_e -11.512925 -3.912023\n", "lnc_glu__L_c -11.512925 -3.912023\n", "lnc_glu__L_e -11.512925 -3.912023\n", "... ... ...\n", "lnc_fru_e -11.512925 -3.912023\n", "lnc_fum_c -11.512925 -3.912023\n", "lnc_fum_e -11.512925 -3.912023\n", "lnc_g3p_c -11.512925 -3.912023\n", "lnc_g6p_c -11.512925 -3.912023\n", "\n", "[72 rows x 2 columns]\n" ] } ], "source": [ "concentration_vars = [var.name for var in tfa_model.variables if var.name.startswith('lnc_')]\n", "tfa_model.solver = 'cplex'\n", "concentration_ranges = analysis.variability_legacy_cplex(tfa_model,variable_list = concentration_vars) # TVA using MIQCP with cplex\n", "print(concentration_ranges)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.12" } }, "nbformat": 4, "nbformat_minor": 2 }