Day 8: Transfer learning - Impact of the number of hidden layers 4
Yesterday, I found out that additional hidden layers do not have any noticeable impact on the time to process each batch. Today, I want to find out whether they have any impact on the test loss and accuracy.
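For reference, the setup can be sketched like this: a frozen pretrained feature extractor feeds a small classifier head, and only the number of hidden layers in the head varies. This is a minimal numpy sketch; the feature dimension (1280), hidden width (256), and initialization are my assumptions for illustration, not the actual settings of the experiment.

```python
import numpy as np

def build_head(n_hidden, feature_dim=1280, hidden_units=256, n_classes=2, seed=0):
    """Build weights for a classifier head with n_hidden hidden layers.

    Mimics the experiment: a frozen feature extractor produces
    feature_dim-sized vectors and only this head is trained.
    feature_dim and hidden_units are assumed values.
    """
    rng = np.random.default_rng(seed)
    dims = [feature_dim] + [hidden_units] * n_hidden + [n_classes]
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def forward(head, x):
    """ReLU hidden layers, linear output (softmax belongs in the loss)."""
    for W, b in head[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = head[-1]
    return x @ W + b

def n_params(head):
    """Total trainable parameters in the head."""
    return sum(W.size + b.size for W, b in head)

# Fake batch of 4 feature vectors standing in for the frozen extractor's output.
features = np.random.default_rng(1).standard_normal((4, 1280))
for n_hidden in (1, 2, 3, 5, 10):
    logits = forward(build_head(n_hidden), features)
    print(n_hidden, logits.shape, n_params(build_head(n_hidden)))
```

Each extra hidden layer only adds a 256-by-256 block of weights, which is small next to the first layer, consistent with the per-batch time barely changing.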
The result for one hidden layer,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.998 2.583 1.164 0.504
1/3 10 10.004 0.817 0.242 0.911
1/3 15 10.019 0.247 0.103 0.970
1/3 20 10.009 0.177 0.132 0.953
1/3 25 9.999 0.197 0.075 0.975
1/3 30 9.986 0.221 0.066 0.978
1/3 35 10.028 0.214 0.062 0.978
1/3 40 10.017 0.164 0.058 0.977
1/3 45 10.005 0.112 0.069 0.972
1/3 50 9.959 0.228 0.055 0.978
2/3 5 9.866 0.197 0.060 0.978
2/3 10 10.019 0.178 0.084 0.967
2/3 15 9.690 0.252 0.064 0.977
2/3 20 10.035 0.295 0.054 0.983
2/3 25 9.698 0.148 0.061 0.981
2/3 30 10.009 0.140 0.113 0.958
2/3 35 9.697 0.215 0.196 0.927
2/3 40 10.030 0.275 0.053 0.978
2/3 45 9.724 0.213 0.042 0.984
2/3 50 10.020 0.148 0.053 0.983
3/3 5 10.193 0.133 0.045 0.984
3/3 10 10.004 0.164 0.041 0.984
3/3 15 9.994 0.100 0.042 0.984
3/3 20 10.018 0.148 0.039 0.985
3/3 25 10.017 0.115 0.048 0.983
3/3 30 10.007 0.128 0.070 0.975
3/3 35 9.995 0.235 0.050 0.981
3/3 40 10.033 0.193 0.069 0.973
3/3 45 9.989 0.181 0.091 0.966
3/3 50 10.020 0.178 0.086 0.964
The result for two hidden layers,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.808 3.117 0.860 0.494
1/3 10 9.779 0.765 0.627 0.506
1/3 15 9.803 0.632 0.417 0.950
1/3 20 9.798 0.376 0.147 0.977
1/3 25 9.812 0.279 0.077 0.976
1/3 30 9.799 0.251 0.128 0.953
1/3 35 9.804 0.558 0.185 0.925
1/3 40 9.812 0.292 0.146 0.955
1/3 45 9.791 0.333 0.245 0.899
1/3 50 9.793 0.307 0.114 0.978
2/3 5 9.992 0.296 0.063 0.981
2/3 10 9.792 0.203 0.053 0.981
2/3 15 9.801 0.208 0.047 0.983
2/3 20 9.807 0.169 0.132 0.953
2/3 25 9.791 0.237 0.075 0.974
2/3 30 9.826 0.172 0.071 0.977
2/3 35 9.817 0.143 0.055 0.982
2/3 40 9.848 0.104 0.060 0.977
2/3 45 9.834 0.224 0.087 0.971
2/3 50 9.865 0.461 0.105 0.959
3/3 5 10.053 0.318 0.081 0.980
3/3 10 9.915 0.173 0.067 0.980
3/3 15 9.899 0.198 0.069 0.975
3/3 20 9.917 0.212 0.056 0.981
3/3 25 9.921 0.197 0.058 0.978
3/3 30 9.867 0.135 0.110 0.959
3/3 35 9.905 0.305 0.077 0.975
3/3 40 9.911 0.171 0.068 0.979
3/3 45 9.900 0.254 0.089 0.969
3/3 50 9.901 0.173 0.055 0.981
The result for three hidden layers,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.849 1.203 0.726 0.494
1/3 10 9.892 0.675 0.690 0.494
1/3 15 9.926 0.683 0.393 0.976
1/3 20 9.906 0.381 0.135 0.957
1/3 25 9.914 0.235 0.061 0.979
1/3 30 9.924 0.296 0.060 0.977
1/3 35 9.928 0.304 0.057 0.983
1/3 40 9.918 0.434 0.085 0.982
1/3 45 9.910 0.319 0.158 0.966
1/3 50 9.911 0.167 0.077 0.981
2/3 5 9.896 0.199 0.046 0.984
2/3 10 9.729 0.199 0.082 0.968
2/3 15 9.717 0.199 0.060 0.979
2/3 20 9.727 0.203 0.056 0.983
2/3 25 9.739 0.111 0.091 0.967
2/3 30 9.720 0.207 0.067 0.974
2/3 35 9.764 0.170 0.056 0.980
2/3 40 9.728 0.190 0.060 0.979
2/3 45 9.743 0.183 0.067 0.971
2/3 50 9.726 0.196 0.045 0.982
3/3 5 10.085 0.290 0.051 0.984
3/3 10 9.902 0.136 0.054 0.980
3/3 15 9.903 0.170 0.060 0.978
3/3 20 9.886 0.182 0.041 0.984
3/3 25 9.906 0.219 0.052 0.983
3/3 30 9.919 0.139 0.066 0.969
3/3 35 9.934 0.257 0.072 0.981
3/3 40 9.908 0.171 0.044 0.984
3/3 45 9.924 0.129 0.039 0.986
3/3 50 9.887 0.209 0.057 0.979
The result for five hidden layers,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.616 0.699 0.837 0.506
1/3 10 9.980 0.705 0.578 0.494
1/3 15 9.652 0.479 0.182 0.974
1/3 20 9.643 0.310 0.227 0.947
1/3 25 9.667 0.776 0.203 0.913
1/3 30 10.050 0.316 0.273 0.958
1/3 35 9.655 0.337 0.183 0.974
1/3 40 9.642 0.219 0.123 0.951
1/3 45 9.639 0.246 0.053 0.977
1/3 50 10.026 0.137 0.060 0.977
2/3 5 9.796 0.231 0.055 0.978
2/3 10 10.014 0.245 0.061 0.979
2/3 15 9.662 0.155 0.092 0.965
2/3 20 9.664 0.173 0.054 0.978
2/3 25 9.639 0.143 0.082 0.973
2/3 30 10.013 0.249 0.106 0.966
2/3 35 9.635 0.154 0.077 0.976
2/3 40 9.648 0.220 0.075 0.970
2/3 45 9.656 0.175 0.074 0.970
2/3 50 10.026 0.385 0.113 0.961
3/3 5 9.831 0.213 0.094 0.970
3/3 10 10.008 0.172 0.054 0.981
3/3 15 9.638 0.334 0.085 0.964
3/3 20 9.660 0.222 0.099 0.977
3/3 25 9.652 0.258 0.054 0.981
3/3 30 10.033 0.149 0.044 0.982
3/3 35 9.658 0.155 0.060 0.976
3/3 40 9.620 0.146 0.088 0.970
3/3 45 9.658 0.251 0.066 0.981
3/3 50 10.037 0.175 0.077 0.977
The result for ten hidden layers,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.891 0.728 0.694 0.494
1/3 10 9.858 0.697 0.686 0.494
1/3 15 9.832 0.785 0.693 0.494
1/3 20 9.852 0.693 0.676 0.506
1/3 25 9.866 0.676 0.540 0.506
1/3 30 9.856 0.650 0.634 0.845
1/3 35 9.859 0.556 0.309 0.947
1/3 40 9.848 0.301 0.106 0.955
1/3 45 9.861 0.199 0.099 0.961
1/3 50 9.853 0.585 0.147 0.921
2/3 5 9.858 0.558 0.360 0.956
2/3 10 9.676 0.780 0.065 0.978
2/3 15 9.719 0.385 0.316 0.943
2/3 20 9.685 0.516 0.491 0.693
2/3 25 9.689 0.566 0.286 0.960
2/3 30 9.705 0.319 0.074 0.981
2/3 35 9.689 0.311 0.154 0.962
2/3 40 9.696 0.340 0.136 0.978
2/3 45 9.689 0.192 0.052 0.981
2/3 50 9.737 0.318 1.108 0.846
3/3 5 10.054 0.538 0.372 0.766
3/3 10 9.860 0.504 0.406 0.761
3/3 15 9.880 0.439 0.141 0.951
3/3 20 9.843 0.258 0.158 0.970
3/3 25 9.866 0.178 0.058 0.977
3/3 30 9.853 0.165 0.092 0.955
3/3 35 9.884 0.189 0.086 0.982
3/3 40 9.892 0.221 0.057 0.983
3/3 45 9.824 0.144 0.065 0.978
3/3 50 9.878 0.194 0.050 0.983
The results show me that additional hidden layers slow down convergence. With only one hidden layer we reach higher than 90% accuracy more quickly than the other configurations, while ten hidden layers is the slowest to reach more than 90% accuracy.
I am thinking: is that because dog-and-cat classification has only two outputs? Is that why additional hidden layers are not beneficial to our training? What do you think?
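One plausible explanation for the slow start of the ten-layer model is that gradients shrink as they are backpropagated through many layers, so early updates are weak. A rough numpy sketch of that effect (the width of 256 and the small weight scale are illustrative assumptions, not the experiment's real values):

```python
import numpy as np

def grad_norm_after_depth(n_layers, width=256, scale=0.01, seed=0):
    """Norm of a gradient after backprop through n_layers linear layers.

    With a small weight scale each layer's Jacobian contracts the gradient,
    so deeper stacks begin training with much weaker updates.
    width and scale are assumed for illustration.
    """
    rng = np.random.default_rng(seed)
    g = np.ones(width)  # gradient arriving from the loss
    for _ in range(n_layers):
        W = rng.standard_normal((width, width)) * scale
        g = W.T @ g  # chain rule: multiply by the layer's Jacobian
    return float(np.linalg.norm(g))

for n in (1, 2, 3, 5, 10):
    print(n, grad_norm_after_depth(n))
```

With this scale the gradient norm drops by roughly an order of magnitude every couple of layers, which would show up exactly as the ten-layer run sitting near 0.494 accuracy for many steps before finally starting to learn.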