the maximum number of tokens the agent can use before termination. This can be useful in the case of local models where the cost is not directly measurable, but we don't want excessive work.