Revision as of 09:40, 20 October 2017

Modeling

VOC Classification

Overview

The VOC device is designed to judge whether tobacco is healthy or infected. Because this is an exploratory experiment, data-analysis algorithms are widely used in our modeling. We perform data preprocessing, data analysis, and algorithm optimization on the data collected by the VOC device. Finally, we use logistic regression and detect infected tobacco with 91% confidence (score 0.91).

Data preprocessing

First we defragment the raw input data and reorganize it into a matrix. The 10 VOC factors serve as features, and the status (healthy or infected) serves as the label to be predicted.

Then we analyze the data using box plots and find that most records are normal, but some records contain outliers. Their box plots are shown below:

We remove the records with outlying values, and the remaining data follow a normal distribution.
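The page does not state which rule was used to flag outlying records. A minimal sketch, assuming the standard box-plot whisker rule (1.5 times the interquartile range) applied to each feature column, could look like the following. The function names and the sample readings are hypothetical.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) whisker limits of a standard box plot."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_records(records, k=1.5):
    """Keep only records whose every feature lies inside its column's whiskers."""
    columns = list(zip(*records))
    bounds = [iqr_bounds(col, k) for col in columns]
    return [
        row for row in records
        if all(lo <= x <= hi for x, (lo, hi) in zip(row, bounds))
    ]


# Hypothetical readings with 2 features per record (not measured data).
# The last record has an extreme value in its first feature.
readings = [(1.0, 10.0), (1.2, 11.0), (0.9, 9.5), (1.1, 10.5), (1.0, 10.2), (9.0, 10.1)]
cleaned = drop_outlier_records(readings)
```

A record is dropped as soon as any one of its features falls outside the whiskers, which matches the idea of removing whole records rather than single values.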

Data analysis

Our goal is to build a model that predicts the tobacco's status from the 10 input features. This is a classic binary classification problem, and several algorithms can solve it. We sample with cross-validation, and the scoring policy we apply is the ridit test.
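The cross-validation sampling mentioned above can be sketched as follows. This is an illustration under assumptions: the number of folds, the shuffling seed, and the helper names are not given on the page, and the `score` callback stands in for whatever scoring policy is used (the ridit test is not implemented here).

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_val_score(fit, score, X, y, k=5):
    """Average a user-supplied score over k held-out folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(score(model, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return sum(scores) / len(scores)
```

Each record is used for testing exactly once, so the averaged score reflects performance on data the model did not see during fitting.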

Decision Tree

First we use decision trees based on information theory. An ID3 decision tree chooses the split that maximizes the information gain, and a CART tree chooses the split that minimizes the Gini index. The performance of the two algorithms is almost the same. R = 0.83
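To make the two split criteria concrete, the sketch below computes the information gain (the ID3 criterion) and the weighted Gini index (the CART criterion) for one candidate split. The helper names and the toy labels are illustrative, not taken from the team's code.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_quality(parent, left, right):
    """Return (information gain, weighted Gini) for a binary split."""
    n = len(parent)
    w_left, w_right = len(left) / n, len(right) / n
    info_gain = entropy(parent) - (w_left * entropy(left) + w_right * entropy(right))
    weighted_gini = w_left * gini(left) + w_right * gini(right)
    return info_gain, weighted_gini


# Toy labels: a perfect split versus an uninformative one.
parent = ["healthy"] * 4 + ["infected"] * 4
pure_gain, pure_gini = split_quality(parent, parent[:4], parent[4:])
mixed_gain, mixed_gini = split_quality(parent, parent[::2], parent[1::2])
```

The perfect split has the highest gain and the lowest Gini index, so both criteria prefer it, which is consistent with the two trees performing almost the same.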

MLP

The second algorithm we apply is the Multi-Layer Perceptron (MLP), a type of neural network. In this model, we use more than 100 neurons in each layer, and the activation function is ReLU.

The MLP performs much better than the decision tree. R = 0.89
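For readers unfamiliar with the architecture, here is a rough sketch of the forward pass of such a network. The layer sizes (10 inputs, two hidden layers of 100 ReLU units, one sigmoid output) and the random, untrained weights are assumptions for illustration only. The page does not give the exact layer count or the trained parameters.

```python
import math
import random


def relu(v):
    """Element-wise ReLU activation."""
    return [max(0.0, x) for x in v]


def dense(v, weights, bias):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def mlp_forward(x, layers):
    """Apply the ReLU hidden layers, then a sigmoid output unit."""
    *hidden, (w_out, b_out) = layers
    for w, b in hidden:
        x = relu(dense(x, w, b))
    (z,) = dense(x, w_out, b_out)
    return 1.0 / (1.0 + math.exp(-z))


def random_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for shape only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
layers = [random_layer(10, 100, rng), random_layer(100, 100, rng), random_layer(100, 1, rng)]
p = mlp_forward([0.5] * 10, layers)  # output probability of the positive class
```

With roughly ten thousand weights between the hidden layers alone, it is hard to read off what each VOC factor contributes.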

Linear Model

Although the MLP already performs well, it is difficult to extract the knowledge learned by the algorithm, so its interpretability is weak. Why not try a simpler model with high interpretability? First we try the LDA (linear discriminant analysis) algorithm to compress the 10-dimensional data into 2 dimensions.

$J=\frac{||w^T\mu_0-w^T\mu_1||^2}{w^T\Sigma_0w+w^T\Sigma_1w}=\frac{w^T(\mu_0-\mu_1)(\mu_0-\mu_1)^Tw}{w^T\Sigma_0w+w^T\Sigma_1w}$

We define $S_w$ as the within-class scatter matrix

$S_w=\Sigma_0+\Sigma_1=\sum_{x\in X_0}(x-\mu_0)(x-\mu_0)^T+\sum_{x\in X_1}(x-\mu_1)(x-\mu_1)^T$

We define $S_b$ as the between-class scatter matrix

$S_b=(\mu_0-\mu_1)(\mu_0-\mu_1)^T$

So

$J=\frac{w^TS_bw}{w^TS_ww}$ , and the goal of LDA is to maximize $J$
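As a worked step, the maximizer of $J$ can be written in closed form. The derivation below is the standard textbook argument, not taken from the team's page, and assumes $S_w$ is invertible. Because $J$ does not change when $w$ is rescaled, we may fix $w^TS_ww=1$ and introduce a Lagrange multiplier $\lambda$:

```latex
\begin{aligned}
L(w,\lambda) &= w^T S_b w - \lambda\,(w^T S_w w - 1) \\
\frac{\partial L}{\partial w} = 0 \;&\Rightarrow\; S_b w = \lambda S_w w \\
S_b w = (\mu_0-\mu_1)\,\big[(\mu_0-\mu_1)^T w\big] \;&\Rightarrow\; S_b w \parallel (\mu_0-\mu_1) \\
&\Rightarrow\; w \propto S_w^{-1}(\mu_0-\mu_1)
\end{aligned}
```

The bracketed factor is a scalar, so $S_bw$ always points along $\mu_0-\mu_1$, and the optimal projection direction depends only on the class means and the within-class scatter.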

The result of the LDA algorithm is as follows, with $R=0.89$ :

This result suggests that the data are linearly separable, so we choose the logistic regression algorithm.

We define $\mathrm{logit}\,P=\ln\frac{y}{1-y}\in (-\infty,+\infty)$

$p(y=1|x)=\frac{e^{w^Tx+b}}{1+e^{w^Tx+b}}$

$p(y=0|x)=\frac{1}{1+e^{w^Tx+b}}$

$l(w,b)=\sum_{i=1}^{m}\ln p(y_i|x_i;w,b)$

Then we can apply the maximum likelihood method to estimate the parameters.
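The maximum likelihood estimate has no closed form, so it is found numerically. A minimal sketch, assuming plain gradient ascent on $l(w,b)$ with a fixed step size, is shown below. The optimizer, the learning rate, the epoch count, and the toy data are all assumptions; the page does not say which solver was used.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_likelihood(X, y, w, b):
    """l(w, b) = sum_i ln p(y_i | x_i; w, b)."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        total += math.log(p if yi == 1 else 1.0 - p)
    return total


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Gradient ascent on the log-likelihood, averaged over the samples."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            # d l / d w = sum_i (y_i - p_i) x_i, and d l / d b = sum_i (y_i - p_i)
            err = yi - sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad_w)]
        b += lr * grad_b / len(X)
    return w, b


# Toy two-feature data (hypothetical, not the VOC measurements).
X = [[0.0, 0.2], [0.3, 0.1], [0.2, 0.4], [1.0, 0.9], [0.8, 1.1], [1.2, 0.7]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
```

Because the log-likelihood is concave, a sufficiently small step size increases it at every epoch, and the fitted weights can be read directly as the direction and strength of each feature's influence.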

The result is as follows:


                            Weight:
                            [[ 0.1819504   0.38788225  0.01350023  0.39594948  0.17799418
                               0.42087034 -0.57733395 -0.23876003 -0.00532918 -0.46174515]]
                            Intercept:
                            [ 0.00937812]
                            Effect:
                            D    35.300735
                            B    22.596339
                            F    18.289277
                            E    10.265025
                            C     0.393225
                            I    -1.575564
                            A   -10.679026
                            H   -14.398440
                            G   -26.211964
                            J   -39.130542
                            dtype: float64
                            Score:
                            0.894333333333

Algorithm optimization

From the logistic regression result, factors such as C and I carry little weight, and these factors may disturb the classification. We try to remove the unimportant factors and simplify the model.

Finally, we retain 4 factors, with which we can predict the tobacco's status with 91% confidence while also simplifying the VOC device.


                    Weight:
                    [[ 0.53196697  0.3404023  -0.53555988 -0.45588715]]
                    Intercept:
                    [-0.01204088]
                    Effect:
                    D    33.217011
                    F    15.492680
                    G   -17.319760
                    J   -33.967849
                    dtype: float64
                    Score:
                    0.912444444444

Summary

In this model, we try different algorithms to obtain a robust, interpretable, and accurate solution that predicts whether the tobacco is infected from only 4 features with 91% confidence. Since 6 of the VOC sensors contribute little to this model, the device can also be simplified by removing them.

@@ Line 387: / Line 387: @@
 <!-- Docs master nav -->
 <!-- <h1><a class="navbar-brand" href="index.html">MuMei Lab</a></h1> -->
 <div class="container">
@@ Line 398: / Line 399: @@
                  <a href="https://2017.igem.org/Team:ZJU-China">
-                     <img style="margin-top:11px" class="navbar-brand"
+                     <img style="margin-top:11px" class="navbar-brand"  src="https://static.igem.org/mediawiki/2017/d/d5/ZJUChina_logo.png">
-                         src="https://static.igem.org/mediawiki/2017/7/77/ZJU_China_logo3.png">
                  </a>
@@ Line 409: / Line 409: @@
                  <ul class="nav navbar-nav navbar-right cl-effect-15">
                      <!-- Hidden li included to remove active class from about link when scrolled up past about section -->
-                     <li class="hidden"><a class="page-scroll" href="#page-top"></a></li>
+                     <li class="hidden"><a class="page-scroll" href="#page-top"></a> </li>
                      <li class="m_nav_item dropdown">
-                         <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Overview<b
+                         <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Overview<b class="caret"></b></a>
-                                class="caret"></b></a>
                          <ul class="dropdown-menu ">
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/overview">Project Description</a></li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Overview">Project Description</a></li>
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/overview/achievements">Achievements</a>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Achievements">Achievements</a></li>
-                             </li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/InterLab">InterLab</a></li>
+                            <li><a href="https://2017.igem.org/Team:ZJU-China/ImproveParts">Improve Parts</a></li>
                          </ul>
                      </li>
@@ Line 424: / Line 424: @@
                          <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Project<b class="caret"></b></a>
                          <ul class="dropdown-menu ">
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/Project">Project Home</a></li>
+                             <!--<li><a href="https://2017.igem.org/Team:ZJU-China/Project">Project Home</a></li>-->
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/Project/improvement">Improvement</a></li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Project/tp">Trichoderma Proof</a></li>
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/InterLab">Interlab</a></li>
+                            <li><a href="https://2017.igem.org/Team:ZJU-China/Project/voc">VOC sensors</a></li>
+                            <li><a href="https://2017.igem.org/Team:ZJU-China/Project/st">Signal Transduction</a></li>
+                            <li><a href="https://2017.igem.org/Team:ZJU-China/Project/ms">Mat Synthesis</a></li>
+                            <li><a href="https://2017.igem.org/Team:ZJU-China/Project/conclusion">Conclusion</a></li>
+                            <!--<li><a href="https://2017.igem.org/Team:ZJU-China/Project/improvement">Improvement</a></li>-->
+                             <!--<li><a href="https://2017.igem.org/Team:ZJU-China/InterLab">Interlab</a></li>-->
                              <li><a href="https://2017.igem.org/Team:ZJU-China/Notebook">Notebook</a></li>
+                        </ul>
+                    </li>
+                    <li class="m_nav_item dropdown" >
+                        <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Modelling<b class="caret"></b></a>
+                        <ul class="dropdown-menu ">
+                            <!--<li><a href="https://2017.igem.org/Team:ZJU-China/Model">Summery</a></li>-->
+                            <li><a href="https://2017.igem.org/Team:ZJU-China/Model">Coculture</a></li>
+                            <li><a href="https://2017.igem.org/Team:ZJU-China/Model/VOC">VOC analysis</a></li>
                          </ul>
                      </li>
                      <li class="m_nav_item dropdown">
-                         <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Modelling<b
+                         <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Parts<b class="caret"></b></a>
-                                class="caret"></b></a>
                          <ul class="dropdown-menu ">
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/Model">Summery</a></li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Parts">All Parts</a></li>
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/Model/coculture">Coculture</a></li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Parts/Basic">Basic Parts</a></li>
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/Model/voc">VOC analysis</a></li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Parts/Composite">Composite Parts</a></li>
+                            <li><a href="https://2017.igem.org/Team:ZJU-China/Parts/Collection">Parts Collection</a></li>
                          </ul>
                      </li>
-                     <li><a href="https://2017.igem.org/Team:ZJU-China/HP/Hardware">Hardware</a></li>
+                     <li><a href="https://2017.igem.org/Team:ZJU-China/Hardware">Hardware</a></li>
+                     <li class="m_nav_item dropdown" >
-                     <li class="m_nav_item dropdown">
                          <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Safety<b class="caret"></b></a>
                          <ul class="dropdown-menu ">
                              <li><a href="https://2017.igem.org/Team:ZJU-China/Safety">Environment</a></li>
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/Safety/lab">Laboratory</a></li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Safety/Lab">Laboratory</a></li>
                          </ul>
                      </li>
@@ Line 455: / Line 469: @@
                      <li class="m_nav_item dropdown">
-                         <a href="#" class="dropdown-toggle link" data-toggle="dropdown">HP<b
+                         <a href="#" class="dropdown-toggle link" data-toggle="dropdown">HP<b class="caret"></b></a>
-                                class="caret"></b></a>
                          <ul class="dropdown-menu ">
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/HP">Summary</a></li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Human_Practices">Summary</a></li>
                              <li><a href="https://2017.igem.org/Team:ZJU-China/HP/Silver">Silver</a></li>
                              <li><a href="https://2017.igem.org/Team:ZJU-China/HP/Gold_Integrated">Gold</a></li>
@@ Line 464: / Line 477: @@
                      </li>
-                    <li class="m_nav_item dropdown">
+                     <li class="m_nav_item dropdown" >
-                        <a href="#" class="dropdown-toggle link" data-toggle="dropdown">HP<b
-                                class="caret"></b></a>
-                        <ul class="dropdown-menu ">
-                            <li><a href="https://2017.igem.org/Team:ZJU-China/HP">Summary</a></li>
-                            <li><a href="https://2017.igem.org/Team:ZJU-China/HP/Silver">Silver</a></li>
-                            <li><a href="https://2017.igem.org/Team:ZJU-China/HP/Gold_Integrated">Gold</a></li>
-                        </ul>
-                    </li>
-                     <li class="m_nav_item dropdown">
                          <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Team<b class="caret"></b></a>
                          <ul class="dropdown-menu ">
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/team">Teammates</a></li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Team">Teammates</a></li>
-                             <li><a href="https://2017.igem.org/Team:ZJU-China/team/attribution">Attribution</a></li>
+                             <li><a href="https://2017.igem.org/Team:ZJU-China/Attributions">Attribution</a></li>
+                            <li><a href="https://2017.igem.org/Team:ZJU-China/Collaborations">Collaboration</a></li>
                          </ul>
                      </li>
@@ Line 518: / Line 522: @@
                          <p class="PP">Our target is to create a model and predict tobacco's status according to 10 input features. This is a classic two classification problem, and there are several algrithm to solve it. The sampling algorithm is cross validation and the scoring policy we apply is ridit test.</p>
                          <p class="PP"><strong>Decision Tree</strong></p>
-                         <p class="PP">First we use decision tree based on information theory. ID3 decision tree is used to reduce the most information gain, and CART tree is used to reduce the GINI index. The performance of these two algorithm is almost the same.</p>
+                         <p class="PP">First we use decision tree based on information theory. ID3 decision tree is used to reduce the most information gain, and CART tree is used to reduce the GINI index. The performance of these two algorithm is almost the same. <strong>R = 0.83</strong></p>
-                        <span class="MathJax_Preview"></span>
-                        <span class="MathJax_SVG_Display" style="text-align: center;">
-                            <span class="MathJax_SVG" id="MathJax-Element-1-Frame" tabindex="-1" style="font-size: 100%; display: inline-block;">
-                                <svg
-                                    xmlns:xlink="http://www.w3.org/1999/xlink" width="8.997ex" height="2.009ex"
-                                    viewBox="0 -755.5 3873.6 865.1" role="img" focusable="false"
-                                    style="vertical-align: -0.255ex;"><defs><path stroke-width="1" id="E1-MJMATHI-52"
-                                                                                  d="M230 637Q203 637 198 638T193 649Q193 676 204 682Q206 683 378 683Q550 682 564 680Q620 672 658 652T712 606T733 563T739 529Q739 484 710 445T643 385T576 351T538 338L545 333Q612 295 612 223Q612 212 607 162T602 80V71Q602 53 603 43T614 25T640 16Q668 16 686 38T712 85Q717 99 720 102T735 105Q755 105 755 93Q755 75 731 36Q693 -21 641 -21H632Q571 -21 531 4T487 82Q487 109 502 166T517 239Q517 290 474 313Q459 320 449 321T378 323H309L277 193Q244 61 244 59Q244 55 245 54T252 50T269 48T302 46H333Q339 38 339 37T336 19Q332 6 326 0H311Q275 2 180 2Q146 2 117 2T71 2T50 1Q33 1 33 10Q33 12 36 24Q41 43 46 45Q50 46 61 46H67Q94 46 127 49Q141 52 146 61Q149 65 218 339T287 628Q287 635 230 637ZM630 554Q630 586 609 608T523 636Q521 636 500 636T462 637H440Q393 637 386 627Q385 624 352 494T319 361Q319 360 388 360Q466 361 492 367Q556 377 592 426Q608 449 619 486T630 554Z"></path><path
-                                    stroke-width="1" id="E1-MJMAIN-3D"
-                                    d="M56 347Q56 360 70 367H707Q722 359 722 347Q722 336 708 328L390 327H72Q56 332 56 347ZM56 153Q56 168 72 173H708Q722 163 722 153Q722 140 707 133H70Q56 140 56 153Z"></path><path
-                                    stroke-width="1" id="E1-MJMAIN-30"
-                                    d="M96 585Q152 666 249 666Q297 666 345 640T423 548Q460 465 460 320Q460 165 417 83Q397 41 362 16T301 -15T250 -22Q224 -22 198 -16T137 16T82 83Q39 165 39 320Q39 494 96 585ZM321 597Q291 629 250 629Q208 629 178 597Q153 571 145 525T137 333Q137 175 145 125T181 46Q209 16 250 16Q290 16 318 46Q347 76 354 130T362 333Q362 478 354 524T321 597Z"></path><path
-                                    stroke-width="1" id="E1-MJMAIN-2E"
-                                    d="M78 60Q78 84 95 102T138 120Q162 120 180 104T199 61Q199 36 182 18T139 0T96 17T78 60Z"></path><path
-                                    stroke-width="1" id="E1-MJMAIN-38"
-                                    d="M70 417T70 494T124 618T248 666Q319 666 374 624T429 515Q429 485 418 459T392 417T361 389T335 371T324 363L338 354Q352 344 366 334T382 323Q457 264 457 174Q457 95 399 37T249 -22Q159 -22 101 29T43 155Q43 263 172 335L154 348Q133 361 127 368Q70 417 70 494ZM286 386L292 390Q298 394 301 396T311 403T323 413T334 425T345 438T355 454T364 471T369 491T371 513Q371 556 342 586T275 624Q268 625 242 625Q201 625 165 599T128 534Q128 511 141 492T167 463T217 431Q224 426 228 424L286 386ZM250 21Q308 21 350 55T392 137Q392 154 387 169T375 194T353 216T330 234T301 253T274 270Q260 279 244 289T218 306L210 311Q204 311 181 294T133 239T107 157Q107 98 150 60T250 21Z"></path><path
-                                    stroke-width="1" id="E1-MJMAIN-33"
-                                    d="M127 463Q100 463 85 480T69 524Q69 579 117 622T233 665Q268 665 277 664Q351 652 390 611T430 522Q430 470 396 421T302 350L299 348Q299 347 308 345T337 336T375 315Q457 262 457 175Q457 96 395 37T238 -22Q158 -22 100 21T42 130Q42 158 60 175T105 193Q133 193 151 175T169 130Q169 119 166 110T159 94T148 82T136 74T126 70T118 67L114 66Q165 21 238 21Q293 21 321 74Q338 107 338 175V195Q338 290 274 322Q259 328 213 329L171 330L168 332Q166 335 166 348Q166 366 174 366Q202 366 232 371Q266 376 294 413T322 525V533Q322 590 287 612Q265 626 240 626Q208 626 181 615T143 592T132 580H135Q138 579 143 578T153 573T165 566T175 555T183 540T186 520Q186 498 172 481T127 463Z"></path></defs><g
-                                    stroke="currentColor" fill="currentColor" stroke-width="0"
-                                    transform="matrix(1 0 0 -1 0 0)"><use xlink:href="#E1-MJMATHI-52" x="0" y="0"></use><use
-                                    xlink:href="#E1-MJMAIN-3D" x="1037" y="0"></use><g transform="translate(2093,0)"><use
-                                    xlink:href="#E1-MJMAIN-30"></use><use xlink:href="#E1-MJMAIN-2E" x="500"
-                                                                          y="0"></use><use xlink:href="#E1-MJMAIN-38"
-                                                                                           x="779" y="0"></use><use
-                                    xlink:href="#E1-MJMAIN-33" x="1279" y="0"></use></g></g></svg></span></span>
-                                    <script type="math/tex; mode=display" id="MathJax-Element-1">R=0.83</script>
                          <img class="textimg" src='https://static.igem.org/mediawiki/2017/3/30/ZJU_China_VOC_5.png' alt=''/>
@@ Line 551: / Line 530: @@
                      <p class="PP"><strong>MLP</strong></p>
                          <p class="PP">The second algorithm we apply is Multi-Layer Perception, also called neutral network. In this model, we use more than 100 neurons in each layer and the activation function is relu.</p>
-                         <p class="PP">The result of MLP is much better than decision tree.</p>
+                         <p class="PP">The result of MLP is much better than decision tree.<strong>R = 0.89</strong></p>
-                            <span
-                                class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
-                                                                     style="text-align: center;"><span
-                                class="MathJax_SVG" id="MathJax-Element-2-Frame" tabindex="-1"
-                                style="font-size: 100%; display: inline-block;"><svg
-                                xmlns:xlink="http://www.w3.org/1999/xlink" width="8.997ex" height="2.009ex"
-                                viewBox="0 -755.5 3873.6 865.1" role="img" focusable="false"
-                                style="vertical-align: -0.255ex;"><defs><path stroke-width="1" id="E2-MJMATHI-52"
-                                                                              d="M230 637Q203 637 198 638T193 649Q193 676 204 682Q206 683 378 683Q550 682 564 680Q620 672 658 652T712 606T733 563T739 529Q739 484 710 445T643 385T576 351T538 338L545 333Q612 295 612 223Q612 212 607 162T602 80V71Q602 53 603 43T614 25T640 16Q668 16 686 38T712 85Q717 99 720 102T735 105Q755 105 755 93Q755 75 731 36Q693 -21 641 -21H632Q571 -21 531 4T487 82Q487 109 502 166T517 239Q517 290 474 313Q459 320 449 321T378 323H309L277 193Q244 61 244 59Q244 55 245 54T252 50T269 48T302 46H333Q339 38 339 37T336 19Q332 6 326 0H311Q275 2 180 2Q146 2 117 2T71 2T50 1Q33 1 33 10Q33 12 36 24Q41 43 46 45Q50 46 61 46H67Q94 46 127 49Q141 52 146 61Q149 65 218 339T287 628Q287 635 230 637ZM630 554Q630 586 609 608T523 636Q521 636 500 636T462 637H440Q393 637 386 627Q385 624 352 494T319 361Q319 360 388 360Q466 361 492 367Q556 377 592 426Q608 449 619 486T630 554Z"></path><path
-                                stroke-width="1" id="E2-MJMAIN-3D"
-                                d="M56 347Q56 360 70 367H707Q722 359 722 347Q722 336 708 328L390 327H72Q56 332 56 347ZM56 153Q56 168 72 173H708Q722 163 722 153Q722 140 707 133H70Q56 140 56 153Z"></path><path
-                                stroke-width="1" id="E2-MJMAIN-30"
-                                d="M96 585Q152 666 249 666Q297 666 345 640T423 548Q460 465 460 320Q460 165 417 83Q397 41 362 16T301 -15T250 -22Q224 -22 198 -16T137 16T82 83Q39 165 39 320Q39 494 96 585ZM321 597Q291 629 250 629Q208 629 178 597Q153 571 145 525T137 333Q137 175 145 125T181 46Q209 16 250 16Q290 16 318 46Q347 76 354 130T362 333Q362 478 354 524T321 597Z"></path><path
-                                stroke-width="1" id="E2-MJMAIN-2E"
-                                d="M78 60Q78 84 95 102T138 120Q162 120 180 104T199 61Q199 36 182 18T139 0T96 17T78 60Z"></path><path
-                                stroke-width="1" id="E2-MJMAIN-38"
-                                d="M70 417T70 494T124 618T248 666Q319 666 374 624T429 515Q429 485 418 459T392 417T361 389T335 371T324 363L338 354Q352 344 366 334T382 323Q457 264 457 174Q457 95 399 37T249 -22Q159 -22 101 29T43 155Q43 263 172 335L154 348Q133 361 127 368Q70 417 70 494ZM286 386L292 390Q298 394 301 396T311 403T323 413T334 425T345 438T355 454T364 471T369 491T371 513Q371 556 342 586T275 624Q268 625 242 625Q201 625 165 599T128 534Q128 511 141 492T167 463T217 431Q224 426 228 424L286 386ZM250 21Q308 21 350 55T392 137Q392 154 387 169T375 194T353 216T330 234T301 253T274 270Q260 279 244 289T218 306L210 311Q204 311 181 294T133 239T107 157Q107 98 150 60T250 21Z"></path><path
-                                stroke-width="1" id="E2-MJMAIN-39"
-                                d="M352 287Q304 211 232 211Q154 211 104 270T44 396Q42 412 42 436V444Q42 537 111 606Q171 666 243 666Q245 666 249 666T257 665H261Q273 665 286 663T323 651T370 619T413 560Q456 472 456 334Q456 194 396 97Q361 41 312 10T208 -22Q147 -22 108 7T68 93T121 149Q143 149 158 135T173 96Q173 78 164 65T148 49T135 44L131 43Q131 41 138 37T164 27T206 22H212Q272 22 313 86Q352 142 352 280V287ZM244 248Q292 248 321 297T351 430Q351 508 343 542Q341 552 337 562T323 588T293 615T246 625Q208 625 181 598Q160 576 154 546T147 441Q147 358 152 329T172 282Q197 248 244 248Z"></path></defs><g
-                                stroke="currentColor" fill="currentColor" stroke-width="0"
-                                transform="matrix(1 0 0 -1 0 0)"><use xlink:href="#E2-MJMATHI-52" x="0" y="0"></use><use
-                                xlink:href="#E2-MJMAIN-3D" x="1037" y="0"></use><g transform="translate(2093,0)"><use
-                                xlink:href="#E2-MJMAIN-30"></use><use xlink:href="#E2-MJMAIN-2E" x="500" y="0"></use><use
-                                xlink:href="#E2-MJMAIN-38" x="779" y="0"></use><use xlink:href="#E2-MJMAIN-39" x="1279"
-                                                                                    y="0"></use></g></g></svg></span></span>
-                            <script type="math/tex; mode=display" id="MathJax-Element-2">R=0.89</script>
                          <p><img class="textimg" src='https://static.igem.org/mediawiki/2017/9/91/ZJU_China_VOC_7.png' alt=''/>
@@ Line 584: / Line 537: @@
                          <p class="PP">Although the performance of MLP has been good enough, it&#39;s difficult to extract konwledge
                              learn by algorithm, the interpretability is weak. Why don&#39;t we try a simple model with
-                             high interpretability? First we try LDA algorithm to compress the 10dimensions data into 2
+                             high interpretability? First we try LDA algorithm to compress the 10dimensions data into 2 dimensions.</p>
-                            dimensions.</p>
+                         <p class="PP" style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
-                         <p class="PP"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
                                                                        style="text-align: center;"><span
                                  class="MathJax_SVG" id="MathJax-Element-3-Frame" tabindex="-1"
@@ Line 708: / Line 660: @@
                              <script type="math/tex" id="MathJax-Element-4">S_w</script>
                              as <strong>within-class scatter matrix</strong></p>
-                         <p class="PP"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
+                         <p class="PP" style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
                                                                        style="text-align: center;"><span
                                  class="MathJax_SVG" id="MathJax-Element-5-Frame" tabindex="-1"
@@ Line 824: / Line 776: @@
                              <script type="math/tex" id="MathJax-Element-6">S_b</script>
                              as <strong>between-class scatter matrix</strong></p>
-                         <p class="PP"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
+                         <p class="PP"  style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
                                                                        style="text-align: center;"><span
                                  class="MathJax_SVG" id="MathJax-Element-7-Frame" tabindex="-1"
@@ Line 877: / Line 829: @@
                          </p>
                          <p class="PP">So</p>
-                         <p class="PP"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
+                         <p class="PP" style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
                                                                        style="text-align: center;"><span
                                  class="MathJax_SVG" id="MathJax-Element-8-Frame" tabindex="-1"
@@ Line 1,028: / Line 980: @@
                              </script>
                          </p>
-                         <p><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
+                         <p style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
                                                                        style="text-align: center;"><span
                                  class="MathJax_SVG" id="MathJax-Element-12-Frame" tabindex="-1"
@@ Line 1,094: / Line 1,046: @@
                              </script>
                          </p>
-                         <p><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
+                         <p style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
                                                                        style="text-align: center;"><span
                                  class="MathJax_SVG" id="MathJax-Element-13-Frame" tabindex="-1"
@@ Line 1,155: / Line 1,107: @@
                              </script>
                          </p>
-                         <p><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
+                         <p  style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display"
                                                                        style="text-align: center;"><span
                                  class="MathJax_SVG" id="MathJax-Element-14-Frame" tabindex="-1"
@@ Line 1,230: / Line 1,182: @@
                          <p class="PP">Then we can apply maximum likelihood method algorithm to estimate the paramaters.</p>
                          <p class="PP">The result is as following:</p>
+                        <figure class="highlight"><pre><code class="language-html" data-lang="html">
-                            <p class="PP">Weight:</p>
+                            Weight:
-                            <p class="PP">[[ 0.1819504   0.38788225  0.01350023  0.39594948  0.17799418</p>
+                            [[ 0.1819504 0.38788225 0.01350023 0.39594948 0.17799418
-                             <p class="PP">0.42087034</p>
+                             0.42087034
-                             <p class="PP">-0.57733395 -0.23876003 -0.00532918 -0.46174515]]</p>
+                             -0.57733395 -0.23876003 -0.00532918 -0.46174515]]
-                             <p class="PP">Intercept:</p>
+                             Intercept:
-                             <p class="PP">[ 0.00937812]</p>
+                             [ 0.00937812]
-                             <p class="PP">Effect:</p>
+                             Effect:
-                             <p class="PP">D    35.300735</p>
+                             D    35.300735
-                             <p class="PP">B    22.596339</p>
+                             B    22.596339
-                             <p class="PP">F    18.289277</p>
+                             F    18.289277
-                             <p class="PP">E    10.265025</p>
+                             E    10.265025
-                             <p class="PP">C     0.393225</p>
+                             C     0.393225
-                             <p class="PP">I    -1.575564</p>
+                             I    -1.575564
-                             <p class="PP">A   -10.679026</p>
+                             A   -10.679026
-                             <p class="PP">H   -14.398440</p>
+                             H   -14.398440
-                             <p class="PP">G   -26.211964</p>
+                             G   -26.211964
-                             <p class="PP">J   -39.130542</p>
+                             J   -39.130542
-                             <p class="PP">dtype: float64</p>
+                             dtype: float64
-                             <p class="PP">Score:</p>
+                             Score:
-                             <p class="PP">0.894333333333</p>
+                             0.894333333333
+                        </code></pre></figure>
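<p class="PP">As a rough illustration of how maximum likelihood estimation produces a weight vector, an intercept and a score like the ones reported, the following sketch fits a logistic regression by gradient ascent on the mean log-likelihood. It is only a sketch under stated assumptions: the data come from a hypothetical <code>make_toy_data</code> helper (synthetic, not the VOC measurements), the names <code>fit_logistic</code> and <code>score</code> are illustrative, and the score here is plain accuracy, which may differ from the scoring used for the reported results.</p>

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Estimate weights and intercept by gradient ascent on the mean log-likelihood."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log-likelihood for one sample is (y - p) * x.
            err = yi - sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj + lr * g / n for wj, g in zip(w, grad_w)]
        b += lr * grad_b / n
    return w, b


def score(X, y, w, b):
    """Fraction of samples whose predicted class matches the label (accuracy)."""
    hits = sum(
        (sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5) == bool(yi)
        for xi, yi in zip(X, y)
    )
    return hits / len(X)


def make_toy_data(n=200, seed=0):
    """Hypothetical synthetic data (NOT the VOC records): class 1 is shifted by +2."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
w, b = fit_logistic(X, y)
toy_score = score(X, y, w, b)
print("Weight:", w)
print("Intercept:", b)
print("Score:", toy_score)
```

<p class="PP">A library implementation would normally replace the hand-written loop; the loop is spelled out here only to make the likelihood gradient visible.</p>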
-                 <h2 class="H2Head">Algorithm optimization</h2>
+                 <h2 id="algorithmoptimization" class="H2Head">Algorithm optimization</h2>
                  <p class="PP">From the result of logistic regression, factors such as C and I carry little weight,
                      so they may disturb the classification. We try to remove unimportant factors and simplify the
@@ Line 1,258: / Line 1,210: @@
                  <p class="PP">Finally, we keep 4 factors (D, F, G and J), with which we can predict the tobacco's status with 91% confidence
                      and also simplify the VOC device.</p>
+            <figure class="highlight"><pre><code class="language-html" data-lang="html">
-                    <p class="PP">Weight:</p>
+                    Weight:
-                    <p class="PP">[[ 0.53196697  0.3404023  -0.53555988 -0.45588715]]</p>
+                    [[ 0.53196697  0.3404023  -0.53555988 -0.45588715]]
-                     <p class="PP">Intercept:</p>
+                     Intercept:
-                     <p class="PP">[-0.01204088]</p>
+                     [-0.01204088]
-                     <p class="PP">Effect:</p>
+                     Effect:
-                     <p class="PP">D    33.217011</p>
+                     D    33.217011
-                     <p class="PP">F    15.492680</p>
+                     F    15.492680
-                     <p class="PP">G   -17.319760</p>
+                     G   -17.319760
-                     <p class="PP">J   -33.967849</p>
+                     J   -33.967849
-                     <p class="PP">dtype: float64</p>
+                     dtype: float64
-                     <p class="PP">Score:</p>
+                     Score:
-                    <p class="PP">0.912444444444</p>
+                    0.912444444444
+                </code></pre></figure>
                  <img class="textimg" src='https://static.igem.org/mediawiki/2017/7/73/ZJU_China_VOC_9.png' alt=''/>
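<p class="PP">To show how the reduced model could be applied to a new measurement, the sketch below plugs the reported weights and intercept into the logistic function. Several points are assumptions rather than facts from the results: that the four weights are listed in the order D, F, G, J, that the inputs are scaled the same way as the training data, and that the returned value is the probability of whichever class was encoded as 1. The names <code>linear_score</code> and <code>predict_proba</code> are illustrative.</p>

```python
import math

# Values copied from the reported output of the 4-factor model.
# ASSUMPTION: the weights are ordered D, F, G, J.
FACTORS = ["D", "F", "G", "J"]
WEIGHTS = [0.53196697, 0.3404023, -0.53555988, -0.45588715]
INTERCEPT = -0.01204088


def linear_score(sample):
    """Weighted sum of the four factors plus the intercept."""
    return INTERCEPT + sum(w * sample[name] for name, w in zip(FACTORS, WEIGHTS))


def predict_proba(sample):
    """Probability of the class encoded as 1 (ASSUMPTION: encoding not stated in the source)."""
    return 1.0 / (1.0 + math.exp(-linear_score(sample)))


# Hypothetical reading with all four factors at zero, used only to exercise the function.
baseline = {"D": 0.0, "F": 0.0, "G": 0.0, "J": 0.0}
print("Probability at baseline:", predict_proba(baseline))
```

<p class="PP">Because the weights of D and F are positive and those of G and J are negative, raising D or F pushes the predicted probability up, while raising G or J pushes it down.</p>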
-                 <h2 class="H2Head">Summary</h2>
+                 <h2 id="summary" class="H2Head">Summary</h2>
                      <p class="PP">In this model, we try different algorithms to obtain a robust, interpretable, and accurate solution
                      that predicts whether the tobacco is infected from only 4 features with 91% confidence. Since
@@ Line 1,296: / Line 1,249: @@
                              <li><a href="#datapreprocessing">Data Preprocessing</a></li>
                              <li><a href="#dataanalysis">Data Analysis</a></li>
+                            <li><a href="#algorithmoptimization">Algorithm Optimization</a></li>
+                            <li><a href="#summary">Summary</a></li>
                          </ul>
                      </li>
-                    <li>
-                        <a href="#results">Results</a>
-                        <ul class="nav">
-                            <li><a href="#OD600">OD600</a></li>
-                            <li><a href="#fluorescein">Fluorescein</a></li>
-                            <li><a href="#discussion">Discussion</a></li>
-                            <li><a href="#reflection">Reflection</a></li>
-                        </ul>
-                    </li>
-                    <li>
-                        <a href="#feedback">Feedback</a>
-                        <ul class="nav">
-                            <li><a href="#advantages">Advantages</a></li>
-                            <li><a href="#improvements">Improvements</a></li>
-                        </ul>
-                    </li>
                  </ul>

Difference between revisions of "Team:ZJU-China/Model"