Reducing cost and improving performance while using LLMs

Sunday, May 21, 2023 8:18:53 AM UTC
Sunday, May 21, 2023 10:34:48 PM UTC


Note

Prompt Adaptation

We don't need all of the previous context when asking a new question; trimming the prompt to the relevant parts cuts the number of input tokens, and therefore the cost.
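A minimal sketch of this idea: keep only the most recent conversation turns that fit under a token budget. `count_tokens` here is a crude word-count stand-in; a real system would use the model's own tokenizer.

```python
def count_tokens(text: str) -> int:
    # Rough stand-in: one token per whitespace-separated word.
    # A real implementation would use the model's tokenizer.
    return len(text.split())

def adapt_prompt(history: list[str], question: str, budget: int = 50) -> str:
    """Drop the oldest turns until history + question fits the budget."""
    kept = []
    used = count_tokens(question)
    # Walk the history from newest to oldest, keeping turns that still fit.
    for turn in reversed(history):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    # Restore chronological order and append the new question.
    return "\n".join(list(reversed(kept)) + [question])
```

The newest turns are kept because they are usually the most relevant; smarter variants could rank turns by similarity to the question instead of by recency.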

LLM Approximation

Completion cache: reuse an answer generated before, so repeated (or sufficiently similar) queries never hit the model again.
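A sketch of an exact-match completion cache, assuming a hypothetical `call_model` function standing in for a real LLM API call. The paper's version could also match semantically similar prompts; this only catches identical ones.

```python
import hashlib
from typing import Callable

class CompletionCache:
    """Return the stored answer when the exact same prompt was seen
    before; otherwise call the model and remember its answer."""

    def __init__(self, call_model: Callable[[str], str]):
        self._call_model = call_model
        self._store: dict[str, str] = {}

    def complete(self, prompt: str) -> str:
        # Hash the prompt so long prompts make compact cache keys.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._call_model(prompt)
        return self._store[key]
```

Every cache hit saves one full model call, which is where the cost reduction comes from.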

LLM cascade

Pasted image 20230521082711.png
Pasted image 20230521082934.png
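The cascade in the figures above can be sketched as: query models from cheapest to most expensive, and return the first answer whose confidence score clears a threshold. The scorer and threshold below are toy assumptions; in the paper both are learned.

```python
from typing import Callable

Model = Callable[[str], str]
Scorer = Callable[[str, str], float]

def cascade(question: str,
            models: list[tuple[Model, Scorer]],
            threshold: float = 0.8) -> str:
    """Try each (model, scorer) pair in order of increasing cost;
    stop as soon as an answer looks reliable enough."""
    answer = ""
    for model, score in models:
        answer = model(question)
        if score(question, answer) >= threshold:
            return answer
    # No answer cleared the threshold: fall back to the last
    # (most capable) model's answer.
    return answer
```

Most queries stop at a cheap model, so the expensive model is only paid for when the scorer flags the cheap answer as unreliable.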

Note

In this paper, the first two methods are only sketched as concepts; the focus is on the third idea, the LLM cascade.