Author: Susang Kim

Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition

MetaFormer Is Actually What You Need for Vision

Offline Reinforcement Learning: From Algorithms to Practical Challenges

Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth

Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model

Relational Self-Attention: What’s Missing in Attention for Video Understanding

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

https://healess.github.io/paper/Paper-How-to-train-your-ViT/

Improved Multiscale Vision Transformers for Classification and Detection

A review of Improved Multiscale Vision Transformers for Classification and Detection, Facebook's MViT Version 2, posted to arXiv on December 2, 2021. https://healess.github.io/paper/Paper-Improved-Multiscale-Vision-Transformers/ Paper: https://arxiv.org/pdf/2111.01673v1.pdf

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?

https://healess.github.io/paper/Paper-TokenLearner/

Video Swin Transformer

https://healess.github.io/paper/Paper-Video-Swin-Transformer/
