A Convolutional Attention Network for Extreme Summarization of Source Code

Miltiadis Allamanis, Hao Peng, Charles Sutton
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2091-2100, 2016.

Abstract

Attention mechanisms in neural networks have proved useful for problems in which the input and output do not have fixed dimension. Often there exist features that are locally translation invariant and would be valuable for directing the model’s attention, but previous attentional architectures are not constructed to learn such features specifically. We introduce an attentional neural network that employs convolution on the input tokens to detect local time-invariant and long-range topical attention features in a context-dependent way. We apply this architecture to the problem of extreme summarization of source code snippets into short, descriptive function name-like summaries. Using those features, the model sequentially generates a summary by marginalizing over two attention mechanisms: one that predicts the next summary token based on the attention weights of the input tokens and another that is able to copy a code token as-is directly into the summary. We demonstrate our convolutional attention neural network’s performance on 10 popular Java projects showing that it achieves better performance compared to previous attentional mechanisms.
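The decoding step described in the abstract — convolutional attention features over the input tokens, then marginalizing a generation distribution with a copy distribution — can be sketched numerically. The following is a minimal, illustrative NumPy mock-up with random weights, a toy vocabulary, and a fixed copy gate; none of these shapes or parameters are the paper's actual hyperparameters, and this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy inputs (illustrative only): a short code-token sequence and vocabulary.
code_tokens = ["get", "file", "name", "return", "name"]
vocab = ["get", "file", "name", "return", "set", "UNK"]
d = 8                                          # embedding size (arbitrary)
E = rng.normal(size=(len(vocab), d))           # token embedding table
X = np.stack([E[vocab.index(t)] for t in code_tokens])   # (T, d)
T = len(code_tokens)

# 1-D convolution over the token sequence (window of 3) producing k
# position-wise attention feature maps -- the "convolutional attention" idea.
k, w = 4, 3
K = rng.normal(size=(w, d, k))                 # convolution kernels
Xp = np.pad(X, ((1, 1), (0, 0)))               # pad so output length stays T
feats = np.stack([np.einsum('wd,wdk->k', Xp[t:t + w], K)
                  for t in range(T)])          # (T, k)

# Attention weights over input tokens, derived from the conv features.
u = rng.normal(size=k)
alpha = softmax(feats @ u)                     # (T,)

# Generation mechanism: attention-weighted context scores the vocabulary.
context = alpha @ X                            # (d,)
gen = softmax(E @ context)                     # (|V|,)

# Copy mechanism: attention mass flows directly to the attended code tokens.
copy = np.zeros(len(vocab))
for t, tok in enumerate(code_tokens):
    copy[vocab.index(tok)] += alpha[t]

# Marginalize over the two mechanisms; lam would normally be predicted
# by the network, here it is fixed for illustration.
lam = 0.5
p_next = lam * copy + (1 - lam) * gen          # distribution for next summary token
```

In the paper the copy mechanism lets rare identifiers (e.g. a variable name) appear verbatim in the predicted function name even when they are out of the generation vocabulary; the fixed gate `lam` here stands in for that learned interpolation.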

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-allamanis16,
  title     = {A Convolutional Attention Network for Extreme Summarization of Source Code},
  author    = {Allamanis, Miltiadis and Peng, Hao and Sutton, Charles},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages     = {2091--2100},
  year      = {2016},
  editor    = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume    = {48},
  series    = {Proceedings of Machine Learning Research},
  address   = {New York, New York, USA},
  month     = {20--22 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v48/allamanis16.pdf},
  url       = {https://proceedings.mlr.press/v48/allamanis16.html}
}
Endnote
%0 Conference Paper
%T A Convolutional Attention Network for Extreme Summarization of Source Code
%A Miltiadis Allamanis
%A Hao Peng
%A Charles Sutton
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-allamanis16
%I PMLR
%P 2091--2100
%U https://proceedings.mlr.press/v48/allamanis16.html
%V 48
RIS
TY - CPAPER
TI - A Convolutional Attention Network for Extreme Summarization of Source Code
AU - Miltiadis Allamanis
AU - Hao Peng
AU - Charles Sutton
BT - Proceedings of The 33rd International Conference on Machine Learning
DA - 2016/06/11
ED - Maria Florina Balcan
ED - Kilian Q. Weinberger
ID - pmlr-v48-allamanis16
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 48
SP - 2091
EP - 2100
L1 - http://proceedings.mlr.press/v48/allamanis16.pdf
UR - https://proceedings.mlr.press/v48/allamanis16.html
ER -
APA
Allamanis, M., Peng, H. & Sutton, C. (2016). A Convolutional Attention Network for Extreme Summarization of Source Code. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:2091-2100. Available from https://proceedings.mlr.press/v48/allamanis16.html.