Involution: Inverting the Inherence of Convolution for Visual Recognition

Convolution: spatial-agnostic and channel-specific
- sharing one kernel across all positions deprives convolution of the ability to adapt to diverse visual patterns at different spatial positions
- the locality of the kernel restricts the receptive field, making it hard to capture long-range spatial interactions in a single shot
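
To make the two properties concrete, here is a minimal sketch of a multi-channel convolution on a 1-D signal, in plain Python with no dependencies. The function name `conv1d`, the nested-list layout, and the zero padding are assumptions chosen for illustration; they are not taken from the paper.

```python
def conv1d(x, weight):
    """Illustrative 1-D convolution (stride 1, zero padding).

    x:      input as nested lists, shape [C_in][L]
    weight: filters as nested lists, shape [C_out][C_in][K], K odd
    """
    c_in, length = len(x), len(x[0])
    k = len(weight[0][0])
    pad = k // 2
    out = []
    for w_out in weight:            # channel-specific: one filter per output channel
        row = []
        for i in range(length):     # spatial-agnostic: the same filter at every position
            acc = 0.0
            for c in range(c_in):
                for j in range(k):
                    p = i + j - pad
                    if 0 <= p < length:   # taps outside the signal count as zero
                        acc += w_out[c][j] * x[c][p]
            row.append(acc)
        out.append(row)
    return out


x = [[1, 2, 3, 4], [0, 1, 0, 1]]
weight = [
    [[0, 1, 0], [0, 0, 0]],   # output channel 0: copies input channel 0
    [[1, 1, 1], [0, 0, 0]],   # output channel 1: 3-tap sum over input channel 0
]
result = conv1d(x, weight)
```

The weights do not depend on `i`, so every position is filtered identically, and each output only sees `K` neighbouring positions.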
Involution: spatial-specific and channel-agnostic
- can summarize context over a wider spatial arrangement
- adaptively allocates the weights over different positions
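
The inverted design can be sketched the same way on a 1-D signal. This is a simplified illustration, not the paper's exact operator: it assumes the kernel at each position is produced by a single linear map of the features at that position, and that one kernel is shared by all channels (a single group). The names `involution1d` and `gen_weight` are hypothetical.

```python
def involution1d(x, gen_weight):
    """Illustrative 1-D involution (stride 1, zero padding, one group).

    x:          input as nested lists, shape [C][L]
    gen_weight: assumed kernel generator, shape [K][C], K odd; it maps the
                C features at a position to the K kernel taps for that position
    """
    channels, length = len(x), len(x[0])
    k = len(gen_weight)
    pad = k // 2
    out = [[0.0] * length for _ in range(channels)]
    for i in range(length):
        # spatial-specific: a fresh kernel is generated for every position
        kernel = [
            sum(gen_weight[j][c] * x[c][i] for c in range(channels))
            for j in range(k)
        ]
        for c in range(channels):   # channel-agnostic: same kernel for all channels
            acc = 0.0
            for j in range(k):
                p = i + j - pad
                if 0 <= p < length:   # taps outside the signal count as zero
                    acc += kernel[j] * x[c][p]
            out[c][i] = acc
    return out


x = [[1, 2, 3], [1, 0, 1]]
gen_weight = [[1, 0], [0, 1], [0, 0]]   # kernel at i = [x[0][i], x[1][i], 0]
result = involution1d(x, gen_weight)
```

In this sketch the learned parameters are the `K * C` entries of `gen_weight`, rather than the `C_out * C_in * K` entries of a convolution filter bank, so widening the kernel adds relatively few parameters.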
Primary contributions
- rethink the inherent properties of convolution
- connect involution with the emerging practice of incorporating self-attention into visual representation learning
- work universally well across a wide array of vision tasks