A rate limit caps how many requests (or tokens) you can send per time window. It's not a reply-length, model-size, or accuracy cap.