Bank of Weight Filters for Deep CNNs

Suresh Kirthi Kumaraswamy, PS Sastry, Kalpathi Ramakrishnan
Proceedings of The 8th Asian Conference on Machine Learning, PMLR 63:334-349, 2016.

Abstract

Convolutional neural networks (CNNs) are extremely effective on many large-scale object recognition tasks. One reason for this is that they learn appropriate features directly from the training data: the convolutional layers of a CNN contain feature-generating filters whose weights are learnt. However, this entails learning millions of weights across the different layers, so training times are very long even on the best available hardware. Studies in transfer learning have observed that a network learnt on one task can be reused on another (with some fine-tuning). In this context, this paper presents a systematic study of the exchangeability of weight filters of CNNs across different object recognition tasks. It proposes the concept of a bank of weight filters (BWF), which consists of all the filter weights learnt by different CNNs on different tasks. The BWF can be viewed at multiple levels of granularity: network level, layer level and filter level. Through extensive empirical investigations we show that one can efficiently learn CNNs for new tasks by randomly selecting filters from the bank to initialize the convolutional layers of the new CNN. Our study is done at all three levels of granularity. The results show that the BWF offers an effective strategy for initializing filters when training CNNs. We also show that the dependency among the filters and layers of a CNN is not strict: one can initialize from any pre-trained filters rather than from a single fixed pre-trained network as a whole. This paper is a first step towards creating and characterizing a Universal BWF for efficient learning of CNNs.
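The filter-level reuse described in the abstract can be made concrete with a short sketch. The following is a minimal PyTorch illustration, not the authors' code: build_filter_bank and init_from_bank are hypothetical names, and the donor networks are stand-ins for CNNs trained on other tasks. Filters are pooled from pre-trained networks and drawn at random, matched by shape, to initialize the convolutional layers of a new network before fine-tuning.

    # A minimal sketch of filter-level initialization from a bank of
    # weight filters (hypothetical names; not the paper's released code).
    import random
    import torch
    import torch.nn as nn
    import torchvision.models as models

    def build_filter_bank(pretrained_nets):
        """Pool conv filters from several pre-trained nets, keyed by shape."""
        bank = {}
        for net in pretrained_nets:
            for m in net.modules():
                if isinstance(m, nn.Conv2d):
                    # m.weight has shape (out_channels, in_channels, kH, kW);
                    # iterating yields one (in_channels, kH, kW) filter at a time.
                    for f in m.weight.detach():
                        bank.setdefault(f.shape, []).append(f.clone())
        return bank

    def init_from_bank(new_net, bank):
        """Initialize each conv filter of new_net with a randomly drawn
        bank filter of matching shape; keep the default init otherwise."""
        with torch.no_grad():
            for m in new_net.modules():
                if isinstance(m, nn.Conv2d):
                    candidates = bank.get(m.weight.shape[1:], [])
                    if not candidates:
                        continue  # no shape match: fall back to random init
                    for i in range(m.weight.shape[0]):
                        m.weight[i].copy_(random.choice(candidates))

    # Usage: pool filters from two donor networks, then initialize a
    # fresh network for the new task and fine-tune it as usual.
    donors = [models.alexnet(pretrained=True), models.vgg11(pretrained=True)]
    bank = build_filter_bank(donors)
    target = models.alexnet(num_classes=10)  # new task with 10 classes
    init_from_bank(target, bank)

In the paper's terms, pooling filters across several donor networks corresponds to the filter-level view of the BWF; restricting the pool to a single donor network or to a single donor layer would correspond to the network-level and layer-level views, respectively.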

Cite this Paper


BibTeX
@InProceedings{pmlr-v63-kumaraswamy29,
  title     = {Bank of Weight Filters for Deep CNNs},
  author    = {Kumaraswamy, Suresh Kirthi and Sastry, PS and Ramakrishnan, Kalpathi},
  booktitle = {Proceedings of The 8th Asian Conference on Machine Learning},
  pages     = {334--349},
  year      = {2016},
  editor    = {Durrant, Robert J. and Kim, Kee-Eung},
  volume    = {63},
  series    = {Proceedings of Machine Learning Research},
  address   = {The University of Waikato, Hamilton, New Zealand},
  month     = {16--18 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v63/kumaraswamy29.pdf},
  url       = {https://proceedings.mlr.press/v63/kumaraswamy29.html}
}
Endnote
%0 Conference Paper
%T Bank of Weight Filters for Deep CNNs
%A Suresh Kirthi Kumaraswamy
%A PS Sastry
%A Kalpathi Ramakrishnan
%B Proceedings of The 8th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Robert J. Durrant
%E Kee-Eung Kim
%F pmlr-v63-kumaraswamy29
%I PMLR
%P 334--349
%U https://proceedings.mlr.press/v63/kumaraswamy29.html
%V 63
RIS
TY  - CPAPER
TI  - Bank of Weight Filters for Deep CNNs
AU  - Suresh Kirthi Kumaraswamy
AU  - PS Sastry
AU  - Kalpathi Ramakrishnan
BT  - Proceedings of The 8th Asian Conference on Machine Learning
DA  - 2016/11/20
ED  - Robert J. Durrant
ED  - Kee-Eung Kim
ID  - pmlr-v63-kumaraswamy29
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 63
SP  - 334
EP  - 349
L1  - http://proceedings.mlr.press/v63/kumaraswamy29.pdf
UR  - https://proceedings.mlr.press/v63/kumaraswamy29.html
ER  -
APA
Kumaraswamy, S.K., Sastry, P.S., & Ramakrishnan, K. (2016). Bank of Weight Filters for Deep CNNs. Proceedings of The 8th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 63:334-349. Available from https://proceedings.mlr.press/v63/kumaraswamy29.html.