Papers-Stack

Keeping track of research papers I've read.
… Guilty of not pushing the most recently read papers to the top!

Articles-Collection
Sheet



Keys -> || ✅ : Done reading || 📖 : In progress || 🚫 : Dropped ||

| # | Paper Name | Summary | Notes | Link | Year |
|---|------------|---------|-------|------|------|
| 1 | WaveNet: A Generative Model for Raw Audio | Causal conv. layers with dilation. Autoregressive model; sequential inference. (sketch below) | notes | arxiv | 2016 |
| 2 | Fast Wavenet Generation Algorithm | WaveNet improvement: O(2^L) -> O(L), though generation is still sequential. Uses queues to push & pop already-computed states at each layer. (sketch below) | notes | arxiv | 2016 |
| 3 | Parallel WaveNet: Fast High-Fidelity Speech Synthesis | Probability Density Distillation: teacher + student architecture. Marries the efficient training of WaveNet with the efficient sampling of an IAF; sampling is parallel, enabling real-time synthesis. <br> ✔️ medium - An Explanation of Discretized Logistic Mixture Likelihood <br> ✔️ vimeo - Parallel WaveNet | | arxiv | 2017 |
| 4 | 📕 Improved Variational Inference with Inverse Autoregressive Flow | ⭐ ✔️ Introduction to Normalizing Flows (ECCV 2020 Tutorial) - video | | arxiv | 2016 |
| 5 | Deep Unsupervised Learning (UC Berkeley lectures) | ✔️ L1 - Introduction: types of models: 1) generative, 2) self-supervised (01:10:00) <br> ✔️ L2 - Autoregressive Models: histograms, parameterized distributions; 1) RNN-based, 2) masking-based (2.1 MADE, 2.2 masked ConvNets) (02:27:23) <br> ✔️ L3 - Flow Models: the model does not output p_theta(x) directly; instead z = f_theta(x), where z follows a simple prior, and sampling applies the inverse x = f_theta^-1(z). Autoregressive flows: fast training, slow sampling. Inverse autoregressive flows: slow training, fast sampling. (01:56:53) (sketch below) | | course | |
| 6 | ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech | | | arxiv | 2018 |
| 7 | Deep Photo Enhancer: Unpaired Learning for Image Enhancement using GANs | CycleGAN extension; individual BN for x->y' & x'->y''; adaptive weighting for WGAN. | CVPR | arxiv | 2018 |
| 8 | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Vision Transformer (ViT): a sequence of image patches fed to a Transformer. Less computation than ResNets; training on large data trumps the inductive bias of CNNs, and ViT outperforms them. (sketch below) | Google Brain | arxiv | ICLR, 2021 |
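
Below are a few rough, self-contained sketches of ideas from the entries above. First, entry 1: a minimal NumPy sketch of a dilated causal convolution, the building block of WaveNet. The filter taps and the three-layer stack are made-up toy values, not the paper's architecture.

```python
# Minimal sketch (not the DeepMind implementation): a 1-D dilated *causal*
# convolution as used in WaveNet. The output at time t depends only on inputs
# at times <= t; stacking layers with dilations 1, 2, 4, ... grows the
# receptive field exponentially with depth.
import numpy as np

def dilated_causal_conv1d(x, w, dilation=1):
    """x: (T,) signal, w: (K,) filter taps. Returns y: (T,) where
    y[t] = sum_k w[k] * x[t - k*dilation], treating x[<0] as 0 (causal padding)."""
    T, K = len(x), len(w)
    y = np.zeros(T)
    for t in range(T):
        for k in range(K):
            idx = t - k * dilation
            if idx >= 0:
                y[t] += w[k] * x[idx]
    return y

# With kernel size 2 and dilations 1, 2, 4, ..., 2^(L-1), the receptive field
# is 2^L samples -- which is why naive sample-by-sample generation costs
# O(2^L) work per output before the Fast WaveNet caching trick (entry 2).
x = np.random.randn(16)
h = dilated_causal_conv1d(x, w=np.array([0.5, 0.5]), dilation=1)
h = dilated_causal_conv1d(h, w=np.array([0.5, 0.5]), dilation=2)
h = dilated_causal_conv1d(h, w=np.array([0.5, 0.5]), dilation=4)
print(h.shape)  # (16,)
```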
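
Entry 2's queue trick, sketched under the assumption of kernel size 2 and scalar activations: each layer keeps a FIFO of its past inputs with length equal to its dilation, so producing one new sample touches each layer once (O(L)) instead of recomputing the full receptive field (O(2^L)). The tanh unit and random weights are stand-ins, not the paper's gated residual blocks.

```python
# Minimal sketch of queue-based Fast WaveNet generation (kernel size 2).
from collections import deque
import numpy as np

dilations = [1, 2, 4, 8]
rng = np.random.default_rng(0)
# One (w_old, w_new) tap pair per layer -- random stand-ins for trained weights.
weights = [(rng.normal(), rng.normal()) for _ in dilations]
# Each layer's queue holds its last `dilation` inputs (initially silence).
queues = [deque([0.0] * d, maxlen=d) for d in dilations]

def generate_next(x_t):
    """Advance the whole stack by one time step using only cached activations."""
    h = x_t
    for (w_old, w_new), q in zip(weights, queues):
        h_old = q.popleft()   # this layer's input from `dilation` steps ago
        q.append(h)           # cache the current input for a later step
        h = np.tanh(w_old * h_old + w_new * h)
    return h

# Autoregressive loop: each new sample costs O(num_layers) work.
samples = [0.0]
for _ in range(32):
    samples.append(float(generate_next(samples[-1])))
print(len(samples))
```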
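
Entries 4-5: a toy sketch of why autoregressive flows evaluate densities (and hence train) in parallel but sample sequentially, and why inverse autoregressive flows flip that trade-off. The `conditioner` below is a hypothetical stand-in for a learned masked network such as MADE, not a real MAF/IAF implementation.

```python
import numpy as np

def conditioner(prefix):
    """Stand-in for a learned network mapping x_{<t} to a shift and log-scale."""
    return 0.1 * np.sum(prefix), 0.01 * np.sum(prefix)

def af_forward(x):
    """Autoregressive flow, x -> z. Every z[t] depends only on the given x,
    so all dimensions can be computed in parallel: density evaluation is fast."""
    z = np.empty_like(x)
    for t in range(len(x)):                 # parallelizable over t
        mu, log_s = conditioner(x[:t])
        z[t] = (x[t] - mu) * np.exp(-log_s)
    return z

def af_sample(z):
    """Inverse direction, z -> x. x[t] needs the already-generated x[:t], so
    sampling is inherently sequential. An IAF conditions on z[:t] instead,
    making sampling parallel and density evaluation sequential."""
    x = np.empty_like(z)
    for t in range(len(z)):                 # inherently sequential
        mu, log_s = conditioner(x[:t])
        x[t] = z[t] * np.exp(log_s) + mu
    return x

z = af_forward(np.array([0.3, -1.2, 0.7]))
x = af_sample(z)   # round-trips back to the original input
```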
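
Entry 8: a minimal sketch of the ViT front end, splitting a 224x224 image into 16x16 patches and linearly projecting each flattened patch into a token. The projection matrix and embedding width are illustrative only, and the class token and position embeddings are omitted.

```python
import numpy as np

def patchify(img, patch=16):
    """img: (H, W, C) -> (num_patches, patch*patch*C) of flattened patches."""
    H, W, C = img.shape
    rows, cols = H // patch, W // patch
    p = img[:rows * patch, :cols * patch].reshape(rows, patch, cols, patch, C)
    return p.transpose(0, 2, 1, 3, 4).reshape(rows * cols, patch * patch * C)

img = np.random.rand(224, 224, 3)
tokens = patchify(img)                      # (196, 768): 14x14 patches of 16x16x3
W_embed = np.random.randn(768, 512) * 0.02  # learned projection in the real model
embeddings = tokens @ W_embed               # (196, 512) sequence for the Transformer
print(embeddings.shape)
```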