Zhiling Zhou (Talk | contribs) |
Zhiling Zhou (Talk | contribs) |
||
Line 387: | Line 387: | ||
<!-- Docs master nav --> | <!-- Docs master nav --> | ||
<!-- <h1><a class="navbar-brand" href="index.html">MuMei Lab</a></h1> --> | <!-- <h1><a class="navbar-brand" href="index.html">MuMei Lab</a></h1> --> | ||
+ | |||
<div class="container"> | <div class="container"> | ||
Line 398: | Line 399: | ||
<a href="https://2017.igem.org/Team:ZJU-China"> | <a href="https://2017.igem.org/Team:ZJU-China"> | ||
− | <img style="margin-top:11px" class="navbar-brand" | + | <img style="margin-top:11px" class="navbar-brand" src="https://static.igem.org/mediawiki/2017/d/d5/ZJUChina_logo.png"> |
− | + | ||
</a> | </a> | ||
Line 409: | Line 409: | ||
<ul class="nav navbar-nav navbar-right cl-effect-15"> | <ul class="nav navbar-nav navbar-right cl-effect-15"> | ||
<!-- Hidden li included to remove active class from about link when scrolled up past about section --> | <!-- Hidden li included to remove active class from about link when scrolled up past about section --> | ||
− | <li class="hidden"><a class="page-scroll" href="#page-top"></a></li> | + | <li class="hidden"><a class="page-scroll" href="#page-top"></a> </li> |
<li class="m_nav_item dropdown"> | <li class="m_nav_item dropdown"> | ||
− | <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Overview<b | + | <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Overview<b class="caret"></b></a> |
− | + | ||
<ul class="dropdown-menu "> | <ul class="dropdown-menu "> | ||
− | <li><a href="https://2017.igem.org/Team:ZJU-China/ | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Overview">Project Description</a></li> |
− | <li><a href="https://2017.igem.org/Team:ZJU-China/ | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Achievements">Achievements</a></li> |
− | </li> | + | <li><a href="https://2017.igem.org/Team:ZJU-China/InterLab">InterLab</a></li> |
+ | <li><a href="https://2017.igem.org/Team:ZJU-China/ImproveParts">Improve Parts</a></li> | ||
</ul> | </ul> | ||
</li> | </li> | ||
Line 424: | Line 424: | ||
<a href="#" class="dropdown-toggle link" data-toggle="dropdown">Project<b class="caret"></b></a> | <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Project<b class="caret"></b></a> | ||
<ul class="dropdown-menu "> | <ul class="dropdown-menu "> | ||
− | <li><a href="https://2017.igem.org/Team:ZJU-China/Project">Project Home</a></li> | + | <!--<li><a href="https://2017.igem.org/Team:ZJU-China/Project">Project Home</a></li>--> |
− | <li><a href="https://2017.igem.org/Team:ZJU-China/Project/improvement">Improvement</a></li> | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Project/tp">Trichoderma Proof</a></li> |
− | <li><a href="https://2017.igem.org/Team:ZJU-China/InterLab">Interlab</a></li> | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Project/voc">VOC sensors</a></li> |
+ | <li><a href="https://2017.igem.org/Team:ZJU-China/Project/st">Signal Transduction</a></li> | ||
+ | <li><a href="https://2017.igem.org/Team:ZJU-China/Project/ms">Mat Synthesis</a></li> | ||
+ | <li><a href="https://2017.igem.org/Team:ZJU-China/Project/conclusion">Conclusion</a></li> | ||
+ | <!--<li><a href="https://2017.igem.org/Team:ZJU-China/Project/improvement">Improvement</a></li>--> | ||
+ | <!--<li><a href="https://2017.igem.org/Team:ZJU-China/InterLab">Interlab</a></li>--> | ||
<li><a href="https://2017.igem.org/Team:ZJU-China/Notebook">Notebook</a></li> | <li><a href="https://2017.igem.org/Team:ZJU-China/Notebook">Notebook</a></li> | ||
+ | </ul> | ||
+ | </li> | ||
+ | |||
+ | <li class="m_nav_item dropdown" > | ||
+ | <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Modelling<b class="caret"></b></a> | ||
+ | <ul class="dropdown-menu "> | ||
+ | <!--<li><a href="https://2017.igem.org/Team:ZJU-China/Model">Summery</a></li>--> | ||
+ | <li><a href="https://2017.igem.org/Team:ZJU-China/Model">Coculture</a></li> | ||
+ | <li><a href="https://2017.igem.org/Team:ZJU-China/Model/VOC">VOC analysis</a></li> | ||
+ | |||
</ul> | </ul> | ||
</li> | </li> | ||
<li class="m_nav_item dropdown"> | <li class="m_nav_item dropdown"> | ||
− | <a href="#" class="dropdown-toggle link" data-toggle="dropdown"> | + | <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Parts<b class="caret"></b></a> |
− | + | ||
<ul class="dropdown-menu "> | <ul class="dropdown-menu "> | ||
− | <li><a href="https://2017.igem.org/Team:ZJU-China/ | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Parts">All Parts</a></li> |
− | <li><a href="https://2017.igem.org/Team:ZJU-China/ | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Parts/Basic">Basic Parts</a></li> |
− | <li><a href="https://2017.igem.org/Team:ZJU-China/ | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Parts/Composite">Composite Parts</a></li> |
+ | <li><a href="https://2017.igem.org/Team:ZJU-China/Parts/Collection">Parts Collection</a></li> | ||
</ul> | </ul> | ||
</li> | </li> | ||
− | <li><a href="https://2017.igem.org/Team:ZJU-China | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Hardware">Hardware</a></li> |
− | + | <li class="m_nav_item dropdown" > | |
− | <li class="m_nav_item dropdown"> | + | |
<a href="#" class="dropdown-toggle link" data-toggle="dropdown">Safety<b class="caret"></b></a> | <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Safety<b class="caret"></b></a> | ||
<ul class="dropdown-menu "> | <ul class="dropdown-menu "> | ||
<li><a href="https://2017.igem.org/Team:ZJU-China/Safety">Environment</a></li> | <li><a href="https://2017.igem.org/Team:ZJU-China/Safety">Environment</a></li> | ||
− | <li><a href="https://2017.igem.org/Team:ZJU-China/Safety/ | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Safety/Lab">Laboratory</a></li> |
</ul> | </ul> | ||
</li> | </li> | ||
Line 455: | Line 469: | ||
<li class="m_nav_item dropdown"> | <li class="m_nav_item dropdown"> | ||
− | <a href="#" class="dropdown-toggle link" data-toggle="dropdown">HP<b | + | <a href="#" class="dropdown-toggle link" data-toggle="dropdown">HP<b class="caret"></b></a> |
− | + | ||
<ul class="dropdown-menu "> | <ul class="dropdown-menu "> | ||
− | <li><a href="https://2017.igem.org/Team:ZJU-China/ | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Human_Practices">Summary</a></li> |
<li><a href="https://2017.igem.org/Team:ZJU-China/HP/Silver">Silver</a></li> | <li><a href="https://2017.igem.org/Team:ZJU-China/HP/Silver">Silver</a></li> | ||
<li><a href="https://2017.igem.org/Team:ZJU-China/HP/Gold_Integrated">Gold</a></li> | <li><a href="https://2017.igem.org/Team:ZJU-China/HP/Gold_Integrated">Gold</a></li> | ||
Line 464: | Line 477: | ||
</li> | </li> | ||
− | + | <li class="m_nav_item dropdown" > | |
− | <li class="m_nav_item dropdown"> | + | |
<a href="#" class="dropdown-toggle link" data-toggle="dropdown">Team<b class="caret"></b></a> | <a href="#" class="dropdown-toggle link" data-toggle="dropdown">Team<b class="caret"></b></a> | ||
<ul class="dropdown-menu "> | <ul class="dropdown-menu "> | ||
− | <li><a href="https://2017.igem.org/Team:ZJU-China/ | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Team">Teammates</a></li> |
− | <li><a href="https://2017.igem.org/Team:ZJU-China/ | + | <li><a href="https://2017.igem.org/Team:ZJU-China/Attributions">Attribution</a></li> |
+ | <li><a href="https://2017.igem.org/Team:ZJU-China/Collaborations">Collaboration</a></li> | ||
</ul> | </ul> | ||
</li> | </li> | ||
Line 518: | Line 522: | ||
<p class="PP">Our target is to create a model and predict tobacco's status according to 10 input features. This is a classic two classification problem, and there are several algrithm to solve it. The sampling algorithm is cross validation and the scoring policy we apply is ridit test.</p> | <p class="PP">Our target is to create a model and predict tobacco's status according to 10 input features. This is a classic two classification problem, and there are several algrithm to solve it. The sampling algorithm is cross validation and the scoring policy we apply is ridit test.</p> | ||
<p class="PP"><strong>Decision Tree</strong></p> | <p class="PP"><strong>Decision Tree</strong></p> | ||
− | <p class="PP">First we use decision tree based on information theory. ID3 decision tree is used to reduce the most information gain, and CART tree is used to reduce the GINI index. The performance of these two algorithm is almost the same.< | + | <p class="PP">First we use decision tree based on information theory. ID3 decision tree is used to reduce the most information gain, and CART tree is used to reduce the GINI index. The performance of these two algorithm is almost the same. <strong>R = 0.83</strong></p> |
<img class="textimg" src='https://static.igem.org/mediawiki/2017/3/30/ZJU_China_VOC_5.png' alt=''/> | <img class="textimg" src='https://static.igem.org/mediawiki/2017/3/30/ZJU_China_VOC_5.png' alt=''/> | ||
Line 551: | Line 530: | ||
<p class="PP"><strong>MLP</strong></p> | <p class="PP"><strong>MLP</strong></p> | ||
<p class="PP">The second algorithm we apply is Multi-Layer Perception, also called neutral network. In this model, we use more than 100 neurons in each layer and the activation function is relu.</p> | <p class="PP">The second algorithm we apply is Multi-Layer Perception, also called neutral network. In this model, we use more than 100 neurons in each layer and the activation function is relu.</p> | ||
− | <p class="PP">The result of MLP is much better than decision tree.< | + | <p class="PP">The result of MLP is much better than decision tree.<strong>R = 0.89</strong></p> |
<p><img class="textimg" src='https://static.igem.org/mediawiki/2017/9/91/ZJU_China_VOC_7.png' alt=''/> | <p><img class="textimg" src='https://static.igem.org/mediawiki/2017/9/91/ZJU_China_VOC_7.png' alt=''/> | ||
Line 584: | Line 537: | ||
<p class="PP">Although the performance of MLP has been good enough, it's difficult to extract konwledge | <p class="PP">Although the performance of MLP has been good enough, it's difficult to extract konwledge | ||
learn by algorithm, the interpretability is weak. Why don't we try a simple model with | learn by algorithm, the interpretability is weak. Why don't we try a simple model with | ||
− | high interpretability? First we try LDA algorithm to compress the 10dimensions data into 2 | + | high interpretability? First we try LDA algorithm to compress the 10dimensions data into 2 dimensions.</p> |
− | + | <p class="PP" style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" | |
− | <p class="PP"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" | + | |
style="text-align: center;"><span | style="text-align: center;"><span | ||
class="MathJax_SVG" id="MathJax-Element-3-Frame" tabindex="-1" | class="MathJax_SVG" id="MathJax-Element-3-Frame" tabindex="-1" | ||
Line 708: | Line 660: | ||
<script type="math/tex" id="MathJax-Element-4">S_w</script> | <script type="math/tex" id="MathJax-Element-4">S_w</script> | ||
as <strong>within-class scatter matrix</strong></p> | as <strong>within-class scatter matrix</strong></p> | ||
− | <p class="PP"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" | + | <p class="PP" style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" |
style="text-align: center;"><span | style="text-align: center;"><span | ||
class="MathJax_SVG" id="MathJax-Element-5-Frame" tabindex="-1" | class="MathJax_SVG" id="MathJax-Element-5-Frame" tabindex="-1" | ||
Line 824: | Line 776: | ||
<script type="math/tex" id="MathJax-Element-6">S_b</script> | <script type="math/tex" id="MathJax-Element-6">S_b</script> | ||
as <strong>between-class scatter matrix</strong></p> | as <strong>between-class scatter matrix</strong></p> | ||
− | <p class="PP"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" | + | <p class="PP" style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" |
style="text-align: center;"><span | style="text-align: center;"><span | ||
class="MathJax_SVG" id="MathJax-Element-7-Frame" tabindex="-1" | class="MathJax_SVG" id="MathJax-Element-7-Frame" tabindex="-1" | ||
Line 877: | Line 829: | ||
</p> | </p> | ||
<p class="PP">So</p> | <p class="PP">So</p> | ||
− | <p class="PP"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" | + | <p class="PP" style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" |
style="text-align: center;"><span | style="text-align: center;"><span | ||
class="MathJax_SVG" id="MathJax-Element-8-Frame" tabindex="-1" | class="MathJax_SVG" id="MathJax-Element-8-Frame" tabindex="-1" | ||
Line 1,028: | Line 980: | ||
</script> | </script> | ||
</p> | </p> | ||
− | <p><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" | + | <p style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" |
style="text-align: center;"><span | style="text-align: center;"><span | ||
class="MathJax_SVG" id="MathJax-Element-12-Frame" tabindex="-1" | class="MathJax_SVG" id="MathJax-Element-12-Frame" tabindex="-1" | ||
Line 1,094: | Line 1,046: | ||
</script> | </script> | ||
</p> | </p> | ||
− | <p><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" | + | <p style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" |
style="text-align: center;"><span | style="text-align: center;"><span | ||
class="MathJax_SVG" id="MathJax-Element-13-Frame" tabindex="-1" | class="MathJax_SVG" id="MathJax-Element-13-Frame" tabindex="-1" | ||
Line 1,155: | Line 1,107: | ||
</script> | </script> | ||
</p> | </p> | ||
− | <p><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" | + | <p style="text-align: center !important;"><span class="MathJax_Preview"></span><span class="MathJax_SVG_Display" |
style="text-align: center;"><span | style="text-align: center;"><span | ||
class="MathJax_SVG" id="MathJax-Element-14-Frame" tabindex="-1" | class="MathJax_SVG" id="MathJax-Element-14-Frame" tabindex="-1" | ||
Line 1,230: | Line 1,182: | ||
<p class="PP">Then we can apply maximum likelihood method algorithm to estimate the paramaters.</p> | <p class="PP">Then we can apply maximum likelihood method algorithm to estimate the paramaters.</p> | ||
<p class="PP">The result is as following:</p> | <p class="PP">The result is as following:</p> | ||
− | + | <figure class="highlight"><pre><code class="language-html" data-lang="html"> | |
− | + | Weight: | |
− | + | [[ 0.1819504 0.38788225 0.01350023 0.39594948 0.17799418 | |
− | + | 0.42087034 | |
− | + | -0.57733395 -0.23876003 -0.00532918 -0.46174515]] | |
− | + | Intercept: | |
− | + | [ 0.00937812] | |
− | + | Effect: | |
− | + | D 35.300735 | |
− | + | B 22.596339 | |
− | + | F 18.289277 | |
− | + | E 10.265025 | |
− | + | C 0.393225 | |
− | + | I -1.575564 | |
− | + | A -10.679026 | |
− | + | H -14.398440 | |
− | + | G -26.211964 | |
− | + | J -39.130542 | |
− | + | dtype: float64 | |
− | + | Score: | |
− | + | 0.894333333333 | |
− | + | </code></pre></figure> | |
− | <h2 class="H2Head">Algorithm optimization</h2> | + | <h2 id="algorithmoptimization" class="H2Head">Algorithm optimization</h2> |
<p class="PP">From the result of logistics regression, factor C and I and etc. are with less important weight, | <p class="PP">From the result of logistics regression, factor C and I and etc. are with less important weight, | ||
these factors maybe disturb the classifaction. We try to reduce unimportant factors and simplify the | these factors maybe disturb the classifaction. We try to reduce unimportant factors and simplify the | ||
Line 1,258: | Line 1,210: | ||
<p class="PP">Finally, we reserve 4 factors with which we can predict the tobacco in 91% confidence and also reduce | <p class="PP">Finally, we reserve 4 factors with which we can predict the tobacco in 91% confidence and also reduce | ||
the VOC device.</p> | the VOC device.</p> | ||
− | + | <figure class="highlight"><pre><code class="language-html" data-lang="html"> | |
− | + | Weight: | |
− | + | [[ 0.53196697 0.3404023 -0.53555988 -0.45588715]] | |
− | + | Intercept: | |
− | + | [-0.01204088] | |
− | + | Effect: | |
− | + | D 33.217011 | |
− | + | F 15.492680 | |
− | + | G -17.319760 | |
− | + | J -33.967849 | |
− | + | dtype: float64 | |
− | + | Score: | |
− | + | 0.912444444444 | |
+ | </code></pre></figure> | ||
<img class="textimg" src='https://static.igem.org/mediawiki/2017/7/73/ZJU_China_VOC_9.png' alt=''/> | <img class="textimg" src='https://static.igem.org/mediawiki/2017/7/73/ZJU_China_VOC_9.png' alt=''/> | ||
− | <h2 class="H2Head">Summary</h2> | + | <h2 id="summary" class="H2Head">Summary</h2> |
<p class="PP">In this model, we try different algorithm to abttain a robust, interpretable, and accurate solution | <p class="PP">In this model, we try different algorithm to abttain a robust, interpretable, and accurate solution | ||
to predict whether the tobacco is infected only according to 4 features in 91% confidence. Since | to predict whether the tobacco is infected only according to 4 features in 91% confidence. Since | ||
Line 1,296: | Line 1,249: | ||
<li><a href="#datapreprocessing">Transformation</a></li> | <li><a href="#datapreprocessing">Transformation</a></li> | ||
<li><a href="#dataanalysis">Data Analysis</a></li> | <li><a href="#dataanalysis">Data Analysis</a></li> | ||
+ | <li><a href="#algorithmoptimization">Algorithm Optimization</a></li> | ||
+ | <li><a href="#summary">Summary</a></li> | ||
</ul> | </ul> | ||
</li> | </li> | ||
</ul> | </ul> | ||
Revision as of 09:40, 20 October 2017
Modeling
VOC Classification
Overview
The VOC device is designed to judge whether a tobacco plant is healthy or infected. Since this is an exploratory experiment, data analysis algorithms are widely used in our modeling. We perform data preprocessing, data analysis, and algorithm optimization on the data collected by the VOC device. Finally, we use logistic regression and detect infected tobacco with 91% confidence.
Data preprocessing
First we clean up the fragmented raw input data and reorganize it into a matrix. The 10 VOC factors serve as features, and the status (healthy or infected) serves as the label to be predicted.
Then we analyze the data using box plots and find that most records are normal, but some contain outliers, whose box plots are shown below:
We remove the records with outlying values, and the remaining data follow a normal distribution:
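The outlier removal described above can be sketched as follows. This is only a minimal illustration: the text does not state the exact rule, so the usual box-plot whisker rule (1.5 times the interquartile range) is assumed, and the function names and sample readings are hypothetical.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) whisker limits used by a standard box plot."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def drop_outlier_records(records, k=1.5):
    """Keep only the records whose every feature lies inside its column's whiskers."""
    columns = list(zip(*records))
    bounds = [iqr_bounds(column, k) for column in columns]
    return [
        row
        for row in records
        if all(low <= value <= high for value, (low, high) in zip(row, bounds))
    ]


# Hypothetical two-feature readings; the last record has an extreme first value.
readings = [
    (1.0, 10.0), (1.2, 11.0), (0.9, 9.5),
    (1.1, 10.5), (1.0, 10.2), (9.0, 10.1),
]
cleaned = drop_outlier_records(readings)
print(len(readings), "->", len(cleaned))
```

A record is dropped as soon as any one of its features falls outside the whiskers, which matches the idea of removing whole records rather than single values.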
Data analysis
Our goal is to build a model that predicts the tobacco's status from the 10 input features. This is a classic binary classification problem, and several algorithms can solve it. We sample with cross-validation and score with the ridit test.
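As a rough sketch of the cross-validation sampling, the function below splits sample indices into k folds, each used once as the test set. The number of folds and the absence of shuffling are assumptions, since the text does not give these details, and `k_fold_indices` is a hypothetical helper name.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

In practice the records would normally be shuffled before splitting, so that each fold contains both healthy and infected samples.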
Decision Tree
First we use decision trees based on information theory. The ID3 tree chooses the split with the largest information gain, and the CART tree chooses the split with the smallest Gini index. The performance of the two algorithms is almost the same: R = 0.83.
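The two split criteria can be illustrated with a small sketch. The labels and candidate splits below are hypothetical toy data, not our measurements; the formulas are the standard definitions of entropy, information gain, and Gini impurity.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy reduction achieved by splitting parent into children (ID3 criterion)."""
    n = len(parent)
    return entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)


def weighted_gini(children):
    """Size-weighted Gini impurity of the children (CART criterion)."""
    n = sum(len(ch) for ch in children)
    return sum(len(ch) / n * gini(ch) for ch in children)


parent = ["healthy"] * 4 + ["infected"] * 4
perfect = [["healthy"] * 4, ["infected"] * 4]             # classes fully separated
useless = [["healthy", "healthy", "infected", "infected"]] * 2  # nothing learned

print(information_gain(parent, perfect), weighted_gini(perfect))
print(information_gain(parent, useless), weighted_gini(useless))
```

A good split has high information gain and low weighted Gini impurity, so both criteria prefer the first split here.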
MLP
The second algorithm we apply is the multi-layer perceptron (MLP), a type of neural network. In this model, we use more than 100 neurons in each layer, and the activation function is ReLU.
The result of the MLP is much better than that of the decision tree: R = 0.89.
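To show what a ReLU layer computes, here is a minimal forward pass through a network with one hidden layer. The layer sizes and weights are toy values chosen for illustration (our model uses more than 100 neurons per layer), and the sigmoid output is an assumption for the binary case.

```python
from math import exp


def relu(values):
    """Rectified linear unit: negative activations are clipped to zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """One hidden ReLU layer followed by a sigmoid output probability."""
    hidden = relu(dense(x, hidden_w, hidden_b))
    logit = dense(hidden, [out_w], [out_b])[0]
    return 1.0 / (1.0 + exp(-logit))


# Toy network: 2 inputs, 2 hidden neurons, 1 output.
hidden_w = [[1.0, -1.0], [-1.0, 1.0]]
hidden_b = [0.0, 0.0]
out_w = [2.0, -2.0]
out_b = 0.0

p = mlp_forward([3.0, 1.0], hidden_w, hidden_b, out_w, out_b)
print(round(p, 3))
```

Training such a network means adjusting all of these weights at once, which is why the learned knowledge is hard to read off afterwards.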
Linear Model
Although the performance of the MLP is good enough, it is difficult to extract the knowledge learned by the algorithm, so its interpretability is weak. Why not try a simple model with high interpretability? First we use the LDA algorithm to compress the 10-dimensional data into 2 dimensions.
We define S_w as the within-class scatter matrix.
We define S_b as the between-class scatter matrix.
So the target of LDA is to maximize the between-class scatter relative to the within-class scatter of the projected data.
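For reference, the standard two-class Fisher formulation of these quantities is sketched below. Only S_w and S_b are named in the text; the symbols X_i (samples of class i), \mu_i (their mean), w (the projection direction) and J (the objective) are assumptions introduced here for illustration.

```latex
\begin{aligned}
S_w &= \sum_{i \in \{0,1\}} \sum_{x \in X_i} (x - \mu_i)(x - \mu_i)^{T} \\
S_b &= (\mu_0 - \mu_1)(\mu_0 - \mu_1)^{T} \\
J(w) &= \frac{w^{T} S_b\, w}{w^{T} S_w\, w}, \qquad
w^{*} \propto S_w^{-1}(\mu_0 - \mu_1)
\end{aligned}
```

The closed-form direction w* is what makes LDA cheap to compute: it needs only the class means and one matrix inverse.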
The result of the LDA algorithm is as follows:
This result shows that the data are linearly separable, so we choose the logistic regression algorithm.
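The separability check can be sketched with a two-feature version of Fisher's discriminant: compute the projection direction from the class means and the within-class scatter, project every sample, and see whether the two classes overlap. The sample points are hypothetical, and restricting the sketch to 2 features (so the 2x2 inverse can be written by hand) is a simplification of our 10-feature data.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, mu):
    """2x2 scatter matrix: sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Direction proportional to inverse(S_w) times (mu0 - mu1)."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det], [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu0[0] - mu1[0], mu0[1] - mu1[1]]
    return [
        inv[0][0] * diff[0] + inv[0][1] * diff[1],
        inv[1][0] * diff[0] + inv[1][1] * diff[1],
    ]


healthy = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
infected = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.0, 6.0)]
w = fisher_direction(healthy, infected)


def project(point):
    """Scalar coordinate of a point along the Fisher direction."""
    return w[0] * point[0] + w[1] * point[1]


separable = min(map(project, healthy)) > max(map(project, infected))
print("linearly separable along w:", separable)
```

If the projected ranges of the two classes do not overlap, a single threshold on the projection separates them, which is the property that motivates a linear classifier.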
We define the logistic regression model and then apply the maximum likelihood method to estimate its parameters.
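In its usual form, which we assume here, the model and the log-likelihood to be maximized read as follows. The symbols w (weights), b (intercept), x_i (feature vector of sample i), y_i (its 0/1 label) and m (number of samples) are notation introduced for this sketch.

```latex
\begin{aligned}
p(y = 1 \mid x) &= \frac{1}{1 + e^{-(w^{T}x + b)}} \\
\ell(w, b) &= \sum_{i=1}^{m} \Big[ y_i \,(w^{T}x_i + b) - \ln\!\big(1 + e^{\,w^{T}x_i + b}\big) \Big]
\end{aligned}
```

Maximizing \ell over w and b gives the weight vector and intercept of the fitted model.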
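The result is as follows:
Weight: [[ 0.1819504 0.38788225 0.01350023 0.39594948 0.17799418 0.42087034 -0.57733395 -0.23876003 -0.00532918 -0.46174515]]
Intercept: [ 0.00937812]
Effect: D 35.300735, B 22.596339, F 18.289277, E 10.265025, C 0.393225, I -1.575564, A -10.679026, H -14.398440, G -26.211964, J -39.130542
Score: 0.894333333333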
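To show how a fitted logistic model turns sensor readings into a prediction, the sketch below plugs the weights and intercept reported for the 10-feature model into the sigmoid function. Two things are assumptions: that the weights are listed in the order A to J, and that inputs are on the same scale as the training data. The text also does not say which status is coded as 1, so the output is simply the probability of that class, and the input samples are hypothetical.

```python
from math import exp

# Reported weights, assumed to be listed in the order A..J.
WEIGHTS = {
    "A": 0.1819504, "B": 0.38788225, "C": 0.01350023, "D": 0.39594948,
    "E": 0.17799418, "F": 0.42087034, "G": -0.57733395, "H": -0.23876003,
    "I": -0.00532918, "J": -0.46174515,
}
INTERCEPT = 0.00937812


def predict_probability(sample):
    """Probability of the class coded as 1 for a dict of feature values."""
    z = INTERCEPT + sum(WEIGHTS[name] * sample[name] for name in WEIGHTS)
    return 1.0 / (1.0 + exp(-z))


baseline = {name: 0.0 for name in WEIGHTS}   # hypothetical all-zero reading
raised_d = dict(baseline, D=1.0)             # one unit more of factor D
raised_j = dict(baseline, J=1.0)             # one unit more of factor J

print(predict_probability(baseline))
print(predict_probability(raised_d), predict_probability(raised_j))
```

Factors with positive weights push the probability up and factors with negative weights push it down, which is what makes this model easy to interpret.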
Algorithm optimization
From the logistic regression result, factors such as C and I carry little weight, and they may disturb the classification. We try to remove the unimportant factors and simplify the model.
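One simple way to spot weak factors is to rank them by the magnitude of their reported effect, as in the sketch below. The effect values are those reported for the 10-feature model; ranking by absolute value is our assumption for illustration and is not necessarily the selection procedure we actually followed.

```python
# Effect values reported for the 10-feature logistic regression model.
EFFECT = {
    "D": 35.300735, "B": 22.596339, "F": 18.289277, "E": 10.265025,
    "C": 0.393225, "I": -1.575564, "A": -10.679026, "H": -14.398440,
    "G": -26.211964, "J": -39.130542,
}


def rank_by_magnitude(effects):
    """Return factor names sorted from weakest to strongest absolute effect."""
    return sorted(effects, key=lambda name: abs(effects[name]))


ranking = rank_by_magnitude(EFFECT)
weakest = ranking[:2]
print("weakest factors:", weakest)
print("full ranking:", ranking)
```

The two weakest factors under this ranking are C and I. Magnitude alone would place B ahead of F among the strongest four, so a final choice of factors presumably also depends on re-fitting and re-scoring the reduced model.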
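Finally, we keep 4 factors (D, F, G and J), with which we can predict the tobacco's status with 91% confidence and also simplify the VOC device.
Weight: [[ 0.53196697 0.3404023 -0.53555988 -0.45588715]]
Intercept: [-0.01204088]
Effect: D 33.217011, F 15.492680, G -17.319760, J -33.967849
Score: 0.912444444444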
Summary
In this model, we tried different algorithms to obtain a robust, interpretable, and accurate solution that predicts whether the tobacco is infected from only 4 features with 91% confidence. Since 6 VOC sensors contribute little to this model, the device can also be simplified by removing them.