Reciting the Menu (报菜名)

#untagged

HuggingGPT (could’ve chosen a better name, LinkedIn tier)

Generative Agent

LLMA Decoding Acceleration

NOOO, THEY AUTOMATED L3s (KINDA!)

4-bit quantization

AutoGPTs, Hmmmm

INT4 finetuning for LLMs

AutoGPTs, Hmmmm, Hmmmm

Evaluate Code +

Can the Foundation be just an LLM? If only Hari Seldon had read this paper

DECKARD - RL Agent that dreams

Training LLMs using AI generated dialogues

Automating Data Analysts [By Microsoft(™)]

(FLARE) Active Retrieval Augmented Generation

I LOVE COMPUTERS!!!!!

Flash Attention

Hack to make inference faster (by HuggingFace)

Tree of thoughts

Model Interpretation

Yes your models can memorize exact stuff

Voyager [Diamond ranked AI Minecraft player]

Need to update doc

Activation-aware Weight Quantisation (AWQ)

SpQR (Sparse Quantised Representation)

GGML adds 2-bit quantisation

SOTA document bender for your company QA

Multimodal is hard

Insane alpha drop from kaiokendev

Skinny dip into GGML code base

Multi-party chat

Lost in the Middle

GPT-4 Details Leaked

How to check fine tuning datasets’ quality?

DPO (Direct Preference Optimization)

Mixture of Experts

Switch transformers

Multi-Query Attention

Symbol Rank (for coding LLMs)

Flash attention 2

[Lora Hub] Wait, was I talking about being blessed with the Mandate of Heaven? Yes, I still have it

Scaling S3 is not easy [Not related to ML but also related to AI cause all data is in S3]

MoE (by DeepMind) (It’s soft, not sparse)

Skill Issue Paper

Estimate LLM Flops and Memory requirement

Speculative decoding

Cool paper - Topology of NN

How to reduce KV cache mem usage?

Ok, I am going to become Vector DB expert this week

Mixture of Experts: PEFT edition by Cohere

LLMs as Optimisers

Generative Recommenders - Cool paper by Google

Fusing Modalities - Chimera by Meta

IMPORTANT INTERPRETABILITY PAPER BY ANTHROPIC

Make LLM do Maths

It’s not AGI (it’s just your data)

Insane ML Notes on Twitter with Q&A

Stable Video Diffusion (SVD)

Stable Diffusion Turbo (or How to distill a diffusion model 101)

I can’t hear the MUSIC* !!!!!!! NEEEED TO GET BETTTTTTTER!!!

Images are Sentences

Videos are sentences

Sentences are predictable

Mamba - faster architecture (Reading cause Tri Dao is author)

Mitigating LLM Hallucinations

Use smol models to train large models faster

LLM Paper from Apple?? : That’s a rare sight

Multimodal paper from Apple???

Amazing paper to learn about Dingboard

TDM edge Multimodal arc (I blame Vik)

Microsoft broke MTEB

Reading List from AHM

Reading List from Yacine

LASERRRRR (for reasoning)

Embarrassing myself publicly arc (PHOTOMAKER)

How to create AGI?

ILYA’s READING LIST (For getting up to speed on today’s architectures)

Stream Diffusion - Brrrrr ImageGen at 100FPS

MLLM-Guided Image Editing (MGIE)

Matryoshka Embeddings

Generalising Length of Transformers

Diffusion Transformers

Stable Diffusion 3

Fashion Diffusion (Make your waifu dress in Zara)

Another Apple LLM (this time it’s multimodal)

Quiet-STaR (Is it really the fabled OpenAI algo? Nope)

Transformers for time series (truly retarded)

MyVLM (Shitty Name only Snapchat can think of)
