Reciting the Menu

LoRA

HuggingGPT (could’ve chosen a better name, LinkedIn tier)

GPTEval

GPT4All

Vector DB

GPU Basics

HyDE

Generative Agent

Toolformer

ReAct

LLMA Decoding Acceleration

NOOO, THEY AUTOMATED L3s (KINDA!)

OpenAGI

4-bit quantization

AutoGPTs, Hmmmm

DeepSpeed

INT4 finetuning for LLMs

AutoGPTs, Hmmmm, Hmmmm

Chameleon

Evaluate Code +

Can the Foundation be just an LLM? If only Hari Seldon had read this paper

Iter-CoT

WizardLM

DECKARD - RL Agent that dreams

Training LLMs using AI generated dialogues

Automating Data Analysts [By Microsoft(™)]

Local PC Waifu

(FLARE) Active Retrieval Augmented Generation

I LOVE COMPUTERS!!!!!

Flash Attention

ALiBi

Hack to make inference faster (by HuggingFace)

Unlimiformer

Tree of Thoughts

Model Interpretation

QLoRA

Yes your models can memorize exact stuff

Voyager [Diamond-ranked AI Minecraft player]

Need to update doc

Activation-aware Weight Quantisation (AWQ)

SpQR (Sparse Quantised Representation)

GGML adds 2-bit quantisation

SOTA document bender for your company QA

Multimodal is hard

Insane alpha drop from kaiokendev

Skinny dip into GGML code base

Skip Decode

Multi-party chat

Lost in the Middle

GPT-4 Details Leaked

How to check fine tuning datasets’ quality?

DPO (Direct Preference Optimization)

Mixture of Experts

Switch Transformers

GLaM

ST-MoE

Multi-Query Attention

Symbol Rank (for coding LLMs)

ReLoRA

ZeRO++

Flash Attention 2

LIMA

RLHF

[Lora Hub] Wait, was I talking about being blessed with the Mandate of Heaven? Yes, I still have it

TinyStories

FNet

Scaling S3 is not easy [Not related to ML but also related to AI cause all data is in S3]

RetNet

MoE (by DeepMind) (It’s soft, not sparse)

Skill Issue Paper

BERT Primer

Estimate LLM FLOPs and Memory Requirements

RoPE

Speculative decoding

Cool paper - Topology of NN

How to reduce KV cache mem usage?

Hyena

VectorDB arc

Ok, I am going to become Vector DB expert this week

Lucene HNSW

FAISS

Annoy

Mixture of Experts: PEFT edition by Cohere

LLMs as Optimisers

Generative Recommenders - Cool paper by Google

Flamingo

Fusing Modalities - Chimera by Meta

PromptBreeder

LLaVA

LLaVA-1.5

IMPORTANT INTERPRETABILITY PAPER BY ANTHROPIC

SAM

Qwen-VL

SigLIP

ONE-PEACE

Make LLMs do Maths

Distil-Whisper

It’s not AGI (it’s just your data)

Insane ML Notes on Twitter with Q&A

Stable Video Diffusion (SVD)

Stable Diffusion Turbo (or How to distill a diffusion model 101)

I can’t hear the MUSIC*  !!!!!!! NEEEED TO GET BETTTTTTTER!!!

Images are Sentences

Videos are sentences

Sentences are predictable

Mamba - faster architecture (Reading cause Tri Dao is author)

Gemini

Mitigating LLM Hallucinations

LCMs

Use smol models to train large models faster

DoReMi

LLM Paper from Apple?? : That’s a rare sight

Multimodal paper from Apple???

Amazing paper to Learn about Dingboard

TDM edge Multimodal arc (I blame Vik)

MobileVLM

MathPile

Unified-IO 2

DocLLM

Microsoft broke MTEB

Reading List from AHM

Reading List from Yacine

LASERRRRR (for reasoning)

Embarrassing myself publicly arc (PHOTOMAKER)

Lumiere

DeepSeek Coder

IP-Adapter

How to create AGI?

ILYA’s READING LIST (For getting up to speed on today’s architectures)

Stream Diffusion - Brrrrr ImageGen at 100FPS

MLLM-Guided Image Editing (MGIE)

Matryoshka Embeddings

Generalising Length of Transformers

World Model

Diffusion Transformers

Stable Diffusion 3

DeepSeek-VL

Synth2

Fashion Diffusion (Make your waifu dress in Zara)

Another Apple LLM (this time it’s multimodal)

Quiet-STaR (Is it really the fabled OpenAI algo? Nope)

Transformers for time series (truly retarded)

GaLore

ORPO

MyVLM (Shitty name only Snapchat could think of)