The mechanics and economics of calling models from your own code. Learn how LLM APIs are billed, what counts toward token cost, and how latency, streaming and rate limits work, then remember it with spaced repetition.
Using an LLM in an app is different from chatting with one. You call an API, you pay per token, and both your prompt and the model's response count toward the bill. Understanding that pricing model is the difference between a feature that scales and one that quietly burns money.
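To make the per-token billing concrete, here is a minimal sketch of the arithmetic. It assumes the provider quotes separate prices per million input and output tokens, which is common but not universal. The function name and the prices in the usage note are hypothetical, so check your provider's pricing page for real figures.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Estimate the cost of one request in the currency of the given prices.

    Both the prompt (input) and the response (output) are billed.
    The per-million-token prices are placeholders supplied by the caller.
    """
    input_cost = input_tokens * input_price_per_million
    output_cost = output_tokens * output_price_per_million
    return (input_cost + output_cost) / 1_000_000


# Hypothetical prices: 3.00 per million input tokens, 15.00 per million output tokens.
cost = estimate_cost(1_000, 500, 3.0, 15.0)
print(f"Estimated cost per request: {cost:.4f}")
```

Multiplying the per-request figure by expected daily traffic gives a rough budget before you ship.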
This track focuses on the foundations every LLM app rests on: how API usage is billed, what counts toward token cost, what latency and streaming mean for user experience, and how rate limits and truncation shape what your code has to handle.
It uses spaced repetition so these essentials stick. It pairs naturally with AI & LLM Fundamentals (how models work) and Prompt Engineering (getting good output).
Each module is a set of practice cards — 18 in total. Answer, review, and watch your knowledge grow from seed to full bloom.
The builder's view of using a model — token billing, latency, streaming, rate limits, and cost control
18 cards
A taste of the real cards. Pick an answer, then reveal the explanation.
How is LLM API usage usually billed?
Which parts of a request count toward token cost?
What does "streaming" a response mean?
What is a "rate limit" on an API?
Each card covers one practical concept as a multiple-choice question. Pick the option you think is right.
See the correct option plus a clear explanation, and a link to deeper docs when one is available.
A spaced-repetition engine (SM-2 or FSRS) resurfaces each card just before you would forget it.
Knowing exactly what you pay for — every token in and out — is how you keep an LLM feature affordable at scale.
Understanding latency and streaming lets you build responses that feel fast instead of frozen.
Rate limits and truncated responses are not edge cases — they are everyday behaviour your code must expect.
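One common way to treat rate limits as everyday behaviour is to retry with exponential backoff. The sketch below is illustrative only: `RateLimitError` is a hypothetical stand-in for whatever exception your API client raises when a limit is hit, and `call_with_backoff` is a hypothetical helper, not part of any particular SDK.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises on a rate limit."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(), retrying on RateLimitError with exponential backoff.

    The wait doubles on each retry (base_delay, 2x, 4x, ...) plus random
    jitter so that many clients do not all retry at the same instant.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

In use, you would wrap the function that sends your request, for example `call_with_backoff(lambda: send_request(prompt))`, where `send_request` is your own code. Truncated responses need a separate check on whatever stop or finish indicator your provider returns.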
This track is aimed at people building on top of model APIs, so some coding background helps — but the concepts (tokens, billing, latency) are explained plainly.
About 10 minutes a day. Spaced repetition means short, frequent sessions beat long cramming, so the essentials stick.
Yes, completely free. No registration or credit card is required, and all your progress is stored locally in your browser.
Plant your first seed today. Ten minutes a day is all it takes to grow the foundations real LLM apps rest on.