1. Bibliography

Arjovsky et al., 2017

Arjovsky, M., Chintala, S., & Bottou, L. (2017, January). Wasserstein GAN. arXiv:1701.07875 [cs, stat].

Atito et al., 2021

Atito, S., Awais, M., & Kittler, J. (2021, November). SiT: Self-supervised vIsion Transformer. arXiv:2104.03602 [cs].

Ba et al., 2016

Ba, J. L., Kiros, J. R., & Hinton, G. E. (2016, July). Layer Normalization. arXiv:1607.06450 [cs, stat].

Badrinarayanan et al., 2016

Badrinarayanan, V., Kendall, A., & Cipolla, R. (2016, October). SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. arXiv:1511.00561 [cs].

Bahdanau et al., 2016

Bahdanau, D., Cho, K., & Bengio, Y. (2016, May). Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473 [cs, stat].

Bi & Poo, 2001

Bi, G.-q., & Poo, M.-m. (2001). Synaptic Modification by Correlated Activity: Hebb's Postulate Revisited. Annual Review of Neuroscience, 24(1), 139–166. doi:10.1146/annurev.neuro.24.1.139

Bienenstock et al., 1982

Bienenstock, E. L., Cooper, L. N., & Munro, P. W. (1982, January). Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex. Journal of Neuroscience, 2(1), 32–48.

Binder et al., 2016

Binder, A., Montavon, G., Bach, S., Müller, K.-R., & Samek, W. (2016, April). Layer-wise Relevance Propagation for Neural Networks with Local Renormalization Layers. arXiv:1604.00825 [cs].

Bojarski et al., 2016

Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., … Zieba, K. (2016, April). End to End Learning for Self-Driving Cars. arXiv:1604.07316 [cs].

Brette & Gerstner, 2005

Brette, R., & Gerstner, W. (2005, November). Adaptive Exponential Integrate-and-Fire Model as an Effective Description of Neuronal Activity. Journal of Neurophysiology, 94(5), 3637–3642. doi:10.1152/jn.00686.2005

Caron et al., 2021

Caron, M., Touvron, H., Misra, I., Jégou, H., Mairal, J., Bojanowski, P., & Joulin, A. (2021, May). Emerging Properties in Self-Supervised Vision Transformers. arXiv:2104.14294 [cs].

Chollet, 2017a

Chollet, F. (2017). Deep Learning with Python. Manning Publications.

Chollet, 2017b

Chollet, F. (2017, April). Xception: Deep Learning with Depthwise Separable Convolutions. arXiv:1610.02357 [cs].

Chung et al., 2014

Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2014, December). Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv:1412.3555 [cs].

Clopath et al., 2010

Clopath, C., Büsing, L., Vasilaki, E., & Gerstner, W. (2010, March). Connectivity reflects coding: A model of voltage-based STDP with homeostasis. Nature Neuroscience, 13(3), 344–352. doi:10.1038/nn.2479

Dayan & Abbott, 2001

Dayan, P., & Abbott, L. F. (2001, September). Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. The MIT Press.

Demircigil et al., 2017

Demircigil, M., Heusel, J., Löwe, M., Upgang, S., & Vermet, F. (2017, July). On a model of associative memory with huge storage capacity. Journal of Statistical Physics, 168(2), 288–299. arXiv:1702.01929, doi:10.1007/s10955-017-1806-y

Devlin et al., 2019

Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019, May). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs].

Dosovitskiy et al., 2021

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., … Houlsby, N. (2021, June). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv:2010.11929 [cs].

Fukushima, 1980

Fukushima, K. (1980, April). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), 193–202. doi:10.1007/BF00344251

Gers & Schmidhuber, 2000

Gers, F. A., & Schmidhuber, J. (2000, July). Recurrent nets that time and count. Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium (pp. 189–194, Vol. 3). doi:10.1109/IJCNN.2000.861302

Gerstner et al., 2014

Gerstner, W., Kistler, W., Naud, R., & Paninski, L. (2014). Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge University Press.

Gerstner & Kistler, 2002

Gerstner, W., & Kistler, W. M. (2002, December). Mathematical formulations of Hebbian learning. Biological Cybernetics, 87(5), 404–415. doi:10.1007/s00422-002-0353-y

Girshick, 2015

Girshick, R. (2015, September). Fast R-CNN. arXiv:1504.08083 [cs].

Girshick et al., 2014

Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014, October). Rich feature hierarchies for accurate object detection and semantic segmentation. arXiv:1311.2524 [cs].

Glorot & Bengio, 2010

Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. AISTATS (p. 8).

Goodfellow et al., 2016

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.

Goodfellow et al., 2015

Goodfellow, I. J., Shlens, J., & Szegedy, C. (2015, March). Explaining and Harnessing Adversarial Examples. arXiv:1412.6572 [cs, stat].

Goodfellow et al., 2014

Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., … Bengio, Y. (2014, June). Generative Adversarial Networks. arXiv:1406.2661 [cs].

Gou et al., 2020

Gou, J., Yu, B., Maybank, S. J., & Tao, D. (2020, June). Knowledge Distillation: A Survey. arXiv:2006.05525 [cs, stat].

Guo et al., 2017

Guo, X., Liu, X., Zhu, E., & Yin, J. (2017). Deep Clustering with Convolutional Autoencoders. In D. Liu, S. Xie, Y. Li, D. Zhao, & E.-S. M. El-Alfy (Eds.), Neural Information Processing (pp. 373–382). Cham: Springer International Publishing. doi:10.1007/978-3-319-70096-0_39

Gupta et al., 2014

Gupta, S., Girshick, R., Arbeláez, P., & Malik, J. (2014, July). Learning Rich Features from RGB-D Images for Object Detection and Segmentation. arXiv:1407.5736 [cs].

Hannun et al., 2014

Hannun, A., Case, C., Casper, J., Catanzaro, B., Diamos, G., Elsen, E., … Ng, A. Y. (2014, December). Deep Speech: Scaling up end-to-end speech recognition. arXiv:1412.5567 [cs].

Haykin, 2009

Haykin, S. S. (2009). Neural Networks and Learning Machines, 3rd Edition. Pearson.

He et al., 2018

He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2018, January). Mask R-CNN. arXiv:1703.06870 [cs].

He et al., 2015a

He, K., Zhang, X., Ren, S., & Sun, J. (2015, December). Deep Residual Learning for Image Recognition. arXiv:1512.03385 [cs].

He et al., 2015b

He, K., Zhang, X., Ren, S., & Sun, J. (2015, February). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. arXiv:1502.01852 [cs].

Higgins et al., 2016

Higgins, I., Matthey, L., Pal, A., Burgess, C., Glorot, X., Botvinick, M., … Lerchner, A. (2016, November). beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. ICLR 2017.

Hinton & Salakhutdinov, 2006

Hinton, G. E., & Salakhutdinov, R. R. (2006 , July). Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786), 504–507. doi:10.1126/science.1127647

Hinton et al., 2015

Hinton, G., Vinyals, O., & Dean, J. (2015, March). Distilling the Knowledge in a Neural Network. arXiv:1503.02531 [cs, stat].

Hinton et al., 2006

Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006, July). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554. doi:10.1162/neco.2006.18.7.1527

Hochreiter & Schmidhuber, 1997

Hochreiter, S., & Schmidhuber, J. (1997, November). Long short-term memory. Neural Computation, 9(8), 1735–1780.

Hochreiter, 1991

Hochreiter, S. (1991). Untersuchungen zu dynamischen neuronalen Netzen (Diploma thesis). TU München.

Hopfield, 1982

Hopfield, J. J. (1982, April). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79(8), 2554–2558. doi:10.1073/pnas.79.8.2554

Hopfield et al., 1983

Hopfield, J. J., Feinstein, D. I., & Palmer, R. G. (1983, July). 'Unlearning' has a stabilizing effect in collective memories. Nature, 304(5922), 158–159. doi:10.1038/304158a0

Huang et al., 2018

Huang, G., Liu, Z., van der Maaten, L., & Weinberger, K. Q. (2018, January). Densely Connected Convolutional Networks. arXiv:1608.06993 [cs].

Intrator & Cooper, 1992

Intrator, N., & Cooper, L. N. (1992, January). Objective function formulation of the BCM theory of visual cortical plasticity: Statistical connections, stability conditions. Neural Networks, 5(1), 3–17. doi:10.1016/S0893-6080(05)80003-6

Ioffe & Szegedy, 2015

Ioffe, S., & Szegedy, C. (2015, February). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv:1502.03167 [cs.LG].

Isola et al., 2018

Isola, P., Zhu, J.-Y., Zhou, T., & Efros, A. A. (2018, November). Image-to-Image Translation with Conditional Adversarial Networks. arXiv:1611.07004 [cs].

Izhikevich, 2003

Izhikevich, E. M. (2003, November). Simple model of spiking neurons. IEEE Transactions on Neural Networks, 14(6), 1569–1572. doi:10.1109/TNN.2003.820440

Jaeger, 2001

Jaeger, H. (2001). The "Echo State" Approach to Analysing and Training Recurrent Neural Networks (GMD Report 148). German National Research Center for Information Technology.

Joshi & Triesch, 2009

Joshi, P., & Triesch, J. (2009, June). Rules for information maximization in spiking neurons using intrinsic plasticity. 2009 International Joint Conference on Neural Networks (pp. 1456–1461). doi:10.1109/IJCNN.2009.5178625

Karras et al., 2020

Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020, March). Analyzing and Improving the Image Quality of StyleGAN. arXiv:1912.04958 [cs, eess, stat].

Kendall et al., 2016

Kendall, A., Grimes, M., & Cipolla, R. (2016, February). PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization. arXiv:1505.07427 [cs].

Kheradpisheh et al., 2018

Kheradpisheh, S. R., Ganjtabesh, M., Thorpe, S. J., & Masquelier, T. (2018, March). STDP-based spiking deep convolutional neural networks for object recognition. Neural Networks, 99, 56–67. doi:10.1016/j.neunet.2017.12.005

Kim, 2014

Kim, Y. (2014, September). Convolutional Neural Networks for Sentence Classification. arXiv:1408.5882 [cs].

Kingma & Ba, 2014

Kingma, D., & Ba, J. (2014). Adam: A Method for Stochastic Optimization. Proc. ICLR (pp. 1–13).

Kingma & Welling, 2013

Kingma, D. P., & Welling, M. (2013, December). Auto-Encoding Variational Bayes. arXiv:1312.6114 [cs].

Krizhevsky et al., 2012

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems (NIPS).

Krotov & Hopfield, 2016

Krotov, D., & Hopfield, J. J. (2016, September). Dense Associative Memory for Pattern Recognition. arXiv:1606.01164 [cond-mat, q-bio, stat].

Laje & Buonomano, 2013

Laje, R., & Buonomano, D. V. (2013, July). Robust timing and motor patterns by taming chaos in recurrent neural networks. Nature Neuroscience, 16(7), 925–933. doi:10.1038/nn.3405

Lapuschkin et al., 2019

Lapuschkin, S., Wäldchen, S., Binder, A., Montavon, G., Samek, W., & Müller, K.-R. (2019, March). Unmasking Clever Hans predictors and assessing what machines really learn. Nature Communications, 10(1), 1096. doi:10.1038/s41467-019-08987-4

Le, 2013

Le, Q. V. (2013, May). Building high-level features using large scale unsupervised learning. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 8595–8598). Vancouver, BC, Canada: IEEE. doi:10.1109/ICASSP.2013.6639343

LeCun et al., 1998

LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 86(11), 2278–2324. doi:10.1109/5.726791

Li et al., 2018

Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018, November). Visualizing the Loss Landscape of Neural Nets. arXiv:1712.09913 [cs, stat].

Lillicrap et al., 2016

Lillicrap, T. P., Cownden, D., Tweed, D. B., & Akerman, C. J. (2016, November). Random synaptic feedback weights support error backpropagation for deep learning. Nature Communications, 7(1), 1–10. doi:10.1038/ncomms13276

Linnainmaa, 1970

Linnainmaa, S. (1970). The Representation of the Cumulative Rounding Error of an Algorithm as a Taylor Expansion of the Local Rounding Errors (Master's thesis). Univ. Helsinki.

Liu et al., 2016

Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., & Berg, A. C. (2016). SSD: Single Shot MultiBox Detector. Lecture Notes in Computer Science, 9905, 21–37. arXiv:1512.02325, doi:10.1007/978-3-319-46448-0_2

Maas et al., 2013

Maas, A. L., Hannun, A. Y., & Ng, A. Y. (2013). Rectifier Nonlinearities Improve Neural Network Acoustic Models. ICML (p. 6).

Maass et al., 2002

Maass, W., Natschläger, T., & Markram, H. (2002, November). Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation, 14(11), 2531–2560. doi:10.1162/089976602760407955

Malinowski et al., 2015

Malinowski, M., Rohrbach, M., & Fritz, M. (2015, October). Ask Your Neurons: A Neural-based Approach to Answering Questions about Images. arXiv:1505.01121 [cs].

McEliece et al., 1987

McEliece, R., Posner, E., Rodemich, E., & Venkatesh, S. (1987, July). The capacity of the Hopfield associative memory. IEEE Transactions on Information Theory, 33(4), 461–482. doi:10.1109/TIT.1987.1057328

McInnes et al., 2020

McInnes, L., Healy, J., & Melville, J. (2020, September). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv:1802.03426 [cs, stat].

Mikolov et al., 2013

Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013, September). Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781 [cs].

Mirza & Osindero, 2014

Mirza, M., & Osindero, S. (2014, November). Conditional Generative Adversarial Nets. arXiv:1411.1784 [cs].

Nowozin et al., 2016

Nowozin, S., Cseke, B., & Tomioka, R. (2016, June). f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization. arXiv:1606.00709 [cs, stat].

Oja, 1982

Oja, E. (1982, January). A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15(3), 267–273.

Olshausen & Field, 1997

Olshausen, B. A., & Field, D. J. (1997, December). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37(23), 3311–3325. doi:10.1016/S0042-6989(97)00169-7

Radford et al., 2015

Radford, A., Metz, L., & Chintala, S. (2015, November). Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv:1511.06434 [cs].

Ramsauer et al., 2020

Ramsauer, H., Schäfl, B., Lehner, J., Seidl, P., Widrich, M., Adler, T., … Hochreiter, S. (2020, December). Hopfield Networks is All You Need. arXiv:2008.02217 [cs, stat].

Razavi et al., 2019

Razavi, A., van den Oord, A., & Vinyals, O. (2019, June). Generating Diverse High-Fidelity Images with VQ-VAE-2. arXiv:1906.00446 [cs, stat].

Redmon et al., 2016

Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016, May). You Only Look Once: Unified, Real-Time Object Detection. arXiv:1506.02640 [cs].

Redmon & Farhadi, 2016

Redmon, J., & Farhadi, A. (2016, December). YOLO9000: Better, Faster, Stronger. arXiv:1612.08242 [cs].

Redmon & Farhadi, 2018

Redmon, J., & Farhadi, A. (2018, April). YOLOv3: An Incremental Improvement. arXiv:1804.02767 [cs].

Reed et al., 2016

Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., & Lee, H. (2016, June). Generative Adversarial Text to Image Synthesis. arXiv:1605.05396 [cs].

Ren et al., 2016

Ren, S., He, K., Girshick, R., & Sun, J. (2016, January). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv:1506.01497 [cs].

Ronneberger et al., 2015

Ronneberger, O., Fischer, P., & Brox, T. (2015, May). U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv:1505.04597 [cs].

Rossant et al., 2011

Rossant, C., Goodman, D. F. M., Fontaine, B., Platkiewicz, J., Magnusson, A. K., & Brette, R. (2011). Fitting Neuron Models to Spike Trains. Frontiers in Neuroscience, 5. doi:10.3389/fnins.2011.00009

Rumelhart et al., 1986

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986, October). Learning representations by back-propagating errors. Nature, 323(6088), 533–536. doi:10.1038/323533a0

Salimans et al., 2016

Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., & Chen, X. (2016, June). Improved Techniques for Training GANs. arXiv:1606.03498 [cs].

Simoncelli & Olshausen, 2001

Simoncelli, E. P., & Olshausen, B. A. (2001, March). Natural Image Statistics and Neural Representation. Annual Review of Neuroscience, 24(1), 1193–1216. doi:10.1146/annurev.neuro.24.1.1193

Simonyan & Zisserman, 2015

Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. International Conference on Learning Representations (ICLR), pp. 1–14.

Sohn et al., 2015

Sohn, K., Lee, H., & Yan, X. (2015). Learning Structured Output Representation using Deep Conditional Generative Models. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 28 (pp. 3483–3491). Curran Associates, Inc.

Springenberg et al., 2015

Springenberg, J. T., Dosovitskiy, A., Brox, T., & Riedmiller, M. (2015, April). Striving for Simplicity: The All Convolutional Net. arXiv:1412.6806 [cs].

Srivastava et al., 2014

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research, 15(56), 1929–1958.

Srivastava et al., 2015

Srivastava, R. K., Greff, K., & Schmidhuber, J. (2015, November). Highway Networks. arXiv:1505.00387 [cs].

Sussillo & Abbott, 2009

Sussillo, D., & Abbott, L. F. (2009, August). Generating coherent patterns of activity from chaotic neural networks. Neuron, 63(4), 544–557. doi:10.1016/j.neuron.2009.07.018

Sutskever et al., 2014

Sutskever, I., Vinyals, O., & Le, Q. V. (2014, December). Sequence to Sequence Learning with Neural Networks. arXiv:1409.3215 [cs].

Szegedy et al., 2015

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2015, December). Rethinking the Inception Architecture for Computer Vision. arXiv:1512.00567 [cs].

Taigman et al., 2014

Taigman, Y., Yang, M., Ranzato, M., & Wolf, L. (2014, June). DeepFace: Closing the Gap to Human-Level Performance in Face Verification. 2014 IEEE Conference on Computer Vision and Pattern Recognition (pp. 1701–1708). Columbus, OH, USA: IEEE. doi:10.1109/CVPR.2014.220

Tanaka et al., 2019

Tanaka, G., Yamane, T., Héroux, J. B., Nakane, R., Kanazawa, N., Takeda, S., … Hirose, A. (2019, July). Recent advances in physical reservoir computing: A review. Neural Networks, 115, 100–123. doi:10.1016/j.neunet.2019.03.005

Oord et al., 2016

van den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., … Kavukcuoglu, K. (2016, September). WaveNet: A Generative Model for Raw Audio. arXiv:1609.03499 [cs].

Vaswani et al., 2017

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017, June). Attention Is All You Need. arXiv:1706.03762 [cs].

Vincent et al., 2010

Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., & Manzagol, P.-A. (2010). Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. Journal of Machine Learning Research, 11, 3371–3408.

Vinyals et al., 2015

Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2015, April). Show and Tell: A Neural Image Caption Generator. arXiv:1411.4555 [cs].

Vogels et al., 2011

Vogels, T. P., Sprekeler, H., Zenke, F., Clopath, C., & Gerstner, W. (2011, December). Inhibitory Plasticity Balances Excitation and Inhibition in Sensory Pathways and Memory Networks. Science, 334(6062), 1569–1573. doi:10.1126/science.1211095

Wang et al., 2018

Wang, B., Zheng, H., Liang, X., Chen, Y., Lin, L., & Yang, M. (2018, September). Toward Characteristic-Preserving Image-based Virtual Try-On Network. arXiv:1807.07688 [cs].

Werbos, 1982

Werbos, P. J. (1982). Applications of advances in nonlinear sensitivity analysis. System Modeling and Optimization: Proc. IFIP. Springer.

Wu et al., 2020

Wu, N., Green, B., Ben, X., & O'Banion, S. (2020, January). Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case. arXiv:2001.08317 [cs, stat].

Wu et al., 2016

Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., … Dean, J. (2016, September). Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv:1609.08144 [cs].

Xu et al., 2015

Xu, K., Ba, J. L., Kiros, R., Cho, K., Courville, A., Salakhutdinov, R., … Bengio, Y. (2015). Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. Proceedings of the 32nd International Conference on Machine Learning - Volume 37 (pp. 2048–2057). JMLR.org.

Zhou & Tuzel, 2017

Zhou, Y., & Tuzel, O. (2017, November). VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection. arXiv:1711.06396 [cs].

Zhu et al., 2020

Zhu, Y., Gao, T., Fan, L., Huang, S., Edmonds, M., Liu, H., … Zhu, S.-C. (2020, February). Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense. Engineering. doi:10.1016/j.eng.2020.01.011