Day 8: Transfer learning - Impact of the number of hidden layers 4
Yesterday, I found out that additional hidden layers do not have any noticeable impact on the time to process each batch. Today, I want to find out whether they have any impact on the test loss and accuracy.
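For reference, the setup can be sketched like this: a frozen pretrained feature extractor feeds a small classifier head, and only the number of hidden layers in the head varies. This is a minimal numpy sketch; the feature dimension (1280), hidden width (256), and initialization are my assumptions for illustration, not the actual settings of the experiment.

```python
import numpy as np

def build_head(n_hidden, feature_dim=1280, hidden_units=256, n_classes=2, seed=0):
    """Build weights for a classifier head with n_hidden hidden layers.

    Mimics the experiment: a frozen feature extractor produces
    feature_dim-sized vectors and only this head is trained.
    feature_dim and hidden_units are assumed values.
    """
    rng = np.random.default_rng(seed)
    dims = [feature_dim] + [hidden_units] * n_hidden + [n_classes]
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def forward(head, x):
    """ReLU hidden layers, linear output (softmax belongs in the loss)."""
    for W, b in head[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = head[-1]
    return x @ W + b

def n_params(head):
    """Total trainable parameters in the head."""
    return sum(W.size + b.size for W, b in head)

# Fake batch of 4 feature vectors standing in for the frozen extractor's output.
features = np.random.default_rng(1).standard_normal((4, 1280))
for n_hidden in (1, 2, 3, 5, 10):
    logits = forward(build_head(n_hidden), features)
    print(n_hidden, logits.shape, n_params(build_head(n_hidden)))
```

Each extra hidden layer only adds a 256-by-256 block of weights, which is small next to the first layer, consistent with the per-batch time barely changing.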
The result for one hidden layer,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.998 2.583 1.164 0.504
1/3 10 10.004 0.817 0.242 0.911
1/3 15 10.019 0.247 0.103 0.970
1/3 20 10.009 0.177 0.132 0.953
1/3 25 9.999 0.197 0.075 0.975
1/3 30 9.986 0.221 0.066 0.978
1/3 35 10.028 0.214 0.062 0.978
1/3 40 10.017 0.164 0.058 0.977
1/3 45 10.005 0.112 0.069 0.972
1/3 50 9.959 0.228 0.055 0.978
2/3 5 9.866 0.197 0.060 0.978
2/3 10 10.019 0.178 0.084 0.967
2/3 15 9.690 0.252 0.064 0.977
2/3 20 10.035 0.295 0.054 0.983
2/3 25 9.698 0.148 0.061 0.981
2/3 30 10.009 0.140 0.113 0.958
2/3 35 9.697 0.215 0.196 0.927
2/3 40 10.030 0.275 0.053 0.978
2/3 45 9.724 0.213 0.042 0.984
2/3 50 10.020 0.148 0.053 0.983
3/3 5 10.193 0.133 0.045 0.984
3/3 10 10.004 0.164 0.041 0.984
3/3 15 9.994 0.100 0.042 0.984
3/3 20 10.018 0.148 0.039 0.985
3/3 25 10.017 0.115 0.048 0.983
3/3 30 10.007 0.128 0.070 0.975
3/3 35 9.995 0.235 0.050 0.981
3/3 40 10.033 0.193 0.069 0.973
3/3 45 9.989 0.181 0.091 0.966
3/3 50 10.020 0.178 0.086 0.964
The result for two hidden layers,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.808 3.117 0.860 0.494
1/3 10 9.779 0.765 0.627 0.506
1/3 15 9.803 0.632 0.417 0.950
1/3 20 9.798 0.376 0.147 0.977
1/3 25 9.812 0.279 0.077 0.976
1/3 30 9.799 0.251 0.128 0.953
1/3 35 9.804 0.558 0.185 0.925
1/3 40 9.812 0.292 0.146 0.955
1/3 45 9.791 0.333 0.245 0.899
1/3 50 9.793 0.307 0.114 0.978
2/3 5 9.992 0.296 0.063 0.981
2/3 10 9.792 0.203 0.053 0.981
2/3 15 9.801 0.208 0.047 0.983
2/3 20 9.807 0.169 0.132 0.953
2/3 25 9.791 0.237 0.075 0.974
2/3 30 9.826 0.172 0.071 0.977
2/3 35 9.817 0.143 0.055 0.982
2/3 40 9.848 0.104 0.060 0.977
2/3 45 9.834 0.224 0.087 0.971
2/3 50 9.865 0.461 0.105 0.959
3/3 5 10.053 0.318 0.081 0.980
3/3 10 9.915 0.173 0.067 0.980
3/3 15 9.899 0.198 0.069 0.975
3/3 20 9.917 0.212 0.056 0.981
3/3 25 9.921 0.197 0.058 0.978
3/3 30 9.867 0.135 0.110 0.959
3/3 35 9.905 0.305 0.077 0.975
3/3 40 9.911 0.171 0.068 0.979
3/3 45 9.900 0.254 0.089 0.969
3/3 50 9.901 0.173 0.055 0.981
The result for three hidden layers,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.849 1.203 0.726 0.494
1/3 10 9.892 0.675 0.690 0.494
1/3 15 9.926 0.683 0.393 0.976
1/3 20 9.906 0.381 0.135 0.957
1/3 25 9.914 0.235 0.061 0.979
1/3 30 9.924 0.296 0.060 0.977
1/3 35 9.928 0.304 0.057 0.983
1/3 40 9.918 0.434 0.085 0.982
1/3 45 9.910 0.319 0.158 0.966
1/3 50 9.911 0.167 0.077 0.981
2/3 5 9.896 0.199 0.046 0.984
2/3 10 9.729 0.199 0.082 0.968
2/3 15 9.717 0.199 0.060 0.979
2/3 20 9.727 0.203 0.056 0.983
2/3 25 9.739 0.111 0.091 0.967
2/3 30 9.720 0.207 0.067 0.974
2/3 35 9.764 0.170 0.056 0.980
2/3 40 9.728 0.190 0.060 0.979
2/3 45 9.743 0.183 0.067 0.971
2/3 50 9.726 0.196 0.045 0.982
3/3 5 10.085 0.290 0.051 0.984
3/3 10 9.902 0.136 0.054 0.980
3/3 15 9.903 0.170 0.060 0.978
3/3 20 9.886 0.182 0.041 0.984
3/3 25 9.906 0.219 0.052 0.983
3/3 30 9.919 0.139 0.066 0.969
3/3 35 9.934 0.257 0.072 0.981
3/3 40 9.908 0.171 0.044 0.984
3/3 45 9.924 0.129 0.039 0.986
3/3 50 9.887 0.209 0.057 0.979
The result for five hidden layers,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.616 0.699 0.837 0.506
1/3 10 9.980 0.705 0.578 0.494
1/3 15 9.652 0.479 0.182 0.974
1/3 20 9.643 0.310 0.227 0.947
1/3 25 9.667 0.776 0.203 0.913
1/3 30 10.050 0.316 0.273 0.958
1/3 35 9.655 0.337 0.183 0.974
1/3 40 9.642 0.219 0.123 0.951
1/3 45 9.639 0.246 0.053 0.977
1/3 50 10.026 0.137 0.060 0.977
2/3 5 9.796 0.231 0.055 0.978
2/3 10 10.014 0.245 0.061 0.979
2/3 15 9.662 0.155 0.092 0.965
2/3 20 9.664 0.173 0.054 0.978
2/3 25 9.639 0.143 0.082 0.973
2/3 30 10.013 0.249 0.106 0.966
2/3 35 9.635 0.154 0.077 0.976
2/3 40 9.648 0.220 0.075 0.970
2/3 45 9.656 0.175 0.074 0.970
2/3 50 10.026 0.385 0.113 0.961
3/3 5 9.831 0.213 0.094 0.970
3/3 10 10.008 0.172 0.054 0.981
3/3 15 9.638 0.334 0.085 0.964
3/3 20 9.660 0.222 0.099 0.977
3/3 25 9.652 0.258 0.054 0.981
3/3 30 10.033 0.149 0.044 0.982
3/3 35 9.658 0.155 0.060 0.976
3/3 40 9.620 0.146 0.088 0.970
3/3 45 9.658 0.251 0.066 0.981
3/3 50 10.037 0.175 0.077 0.977
The result for ten hidden layers,
epochs steps processing_time/batch training_loss test_loss accuracy
1/3 5 9.891 0.728 0.694 0.494
1/3 10 9.858 0.697 0.686 0.494
1/3 15 9.832 0.785 0.693 0.494
1/3 20 9.852 0.693 0.676 0.506
1/3 25 9.866 0.676 0.540 0.506
1/3 30 9.856 0.650 0.634 0.845
1/3 35 9.859 0.556 0.309 0.947
1/3 40 9.848 0.301 0.106 0.955
1/3 45 9.861 0.199 0.099 0.961
1/3 50 9.853 0.585 0.147 0.921
2/3 5 9.858 0.558 0.360 0.956
2/3 10 9.676 0.780 0.065 0.978
2/3 15 9.719 0.385 0.316 0.943
2/3 20 9.685 0.516 0.491 0.693
2/3 25 9.689 0.566 0.286 0.960
2/3 30 9.705 0.319 0.074 0.981
2/3 35 9.689 0.311 0.154 0.962
2/3 40 9.696 0.340 0.136 0.978
2/3 45 9.689 0.192 0.052 0.981
2/3 50 9.737 0.318 1.108 0.846
3/3 5 10.054 0.538 0.372 0.766
3/3 10 9.860 0.504 0.406 0.761
3/3 15 9.880 0.439 0.141 0.951
3/3 20 9.843 0.258 0.158 0.970
3/3 25 9.866 0.178 0.058 0.977
3/3 30 9.853 0.165 0.092 0.955
3/3 35 9.884 0.189 0.086 0.982
3/3 40 9.892 0.221 0.057 0.983
3/3 45 9.824 0.144 0.065 0.978
3/3 50 9.878 0.194 0.050 0.983
The results show me that additional hidden layers slow down convergence. With only one hidden layer we reach higher than 90% accuracy more quickly than the other configurations, while ten hidden layers is the slowest to reach more than 90% accuracy.
I am thinking: is that because dog-and-cat classification has only two outputs? Is that why additional hidden layers are not beneficial to our training? What do you think?
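One plausible explanation for the slow start of the ten-layer model is that gradients shrink as they are backpropagated through many layers, so early updates are weak. A rough numpy sketch of that effect (the width of 256 and the small weight scale are illustrative assumptions, not the experiment's real values):

```python
import numpy as np

def grad_norm_after_depth(n_layers, width=256, scale=0.01, seed=0):
    """Norm of a gradient after backprop through n_layers linear layers.

    With a small weight scale each layer's Jacobian contracts the gradient,
    so deeper stacks begin training with much weaker updates.
    width and scale are assumed for illustration.
    """
    rng = np.random.default_rng(seed)
    g = np.ones(width)  # gradient arriving from the loss
    for _ in range(n_layers):
        W = rng.standard_normal((width, width)) * scale
        g = W.T @ g  # chain rule: multiply by the layer's Jacobian
    return float(np.linalg.norm(g))

for n in (1, 2, 3, 5, 10):
    print(n, grad_norm_after_depth(n))
```

With this scale the gradient norm drops by roughly an order of magnitude every couple of layers, which would show up exactly as the ten-layer run sitting near 0.494 accuracy for many steps before finally starting to learn.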