We implement rate limiting to keep the API responsive for all customers. The limits are set to allow substantial functionality while discouraging abuse. In most cases, you can avoid hitting a rate limit by using a different API call or another API feature (e.g. instead of polling for quotes, use the streaming API).
Two pieces of information to understand about our limits:
For example: With a rate limit of 120 requests per minute, if you make a /quotes request every second for a minute, you would still have 60 requests left in that minute before hitting the limit.
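The arithmetic in this example can be sketched in a few lines of Python. The function name and its parameters are illustrative only and are not part of the API.

```python
def remaining_requests(limit_per_minute: int, requests_made: int) -> int:
    """Requests left in a one-minute window after `requests_made` calls."""
    # Never report a negative budget once the limit has been reached.
    return max(limit_per_minute - requests_made, 0)


# One /quotes request per second for a minute is 60 requests.
print(remaining_requests(120, 60))  # prints 60
```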
Should you be concerned about rate limits?
Probably not. Polling for data, while not the best solution, is reasonably supported by our APIs. The limits leave enough headroom to get up-to-date data and still make other requests. The best way to know whether your application will hit the limits is to build it, then scale back its request rate if it does.
Each limit is enforced by the minute and on a per-access-token basis. As such, the limits are enforced on a per-app and per-user basis.
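If you want to stay under the limit proactively, one option is to keep a client-side count per access token. The sketch below is a rough illustration: the class name, its methods, and the assumption that a one-minute window opens with a token's first request are all hypothetical, and the server's own accounting remains authoritative.

```python
import time
from collections import defaultdict


class MinuteBudget:
    """Client-side count of requests per access token in one-minute windows.

    Assumption: a window opens with a token's first request and lasts
    60 seconds. This mirrors the documented limit only approximately.
    """

    def __init__(self, limit_per_minute, clock=time.monotonic):
        self.limit = limit_per_minute
        self.clock = clock
        self.window_start = {}        # token -> time the current window opened
        self.used = defaultdict(int)  # token -> requests in the current window

    def try_acquire(self, token):
        """Record one request for `token`; return False if the budget is spent."""
        now = self.clock()
        start = self.window_start.get(token)
        # Open a fresh window on first use or once 60 seconds have passed.
        if start is None or now - start >= 60:
            self.window_start[token] = now
            self.used[token] = 0
        if self.used[token] >= self.limit:
            return False
        self.used[token] += 1
        return True
```

A caller would check `try_acquire(token)` before each request and skip or delay the request when it returns False. Keying the count by token matches the per-access-token enforcement described above.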
Every response to a request that has a rate limit applied includes a series of headers. Use these headers to gauge your usage and decide when to throttle your application. For example:
X-Ratelimit-Allowed: 120
X-Ratelimit-Used: 1
X-Ratelimit-Available: 119
X-Ratelimit-Expiry: 1369168800001
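As a minimal sketch of how an application might act on these headers, the Python below reads them from a plain dictionary and works out how long to pause. It assumes that X-Ratelimit-Expiry is a Unix timestamp in milliseconds, which the sample value suggests but the text does not state. The function names are illustrative.

```python
import time


def parse_rate_limit(headers):
    """Extract the rate-limit headers into a dict of integers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "allowed": int(lowered["x-ratelimit-allowed"]),
        "used": int(lowered["x-ratelimit-used"]),
        "available": int(lowered["x-ratelimit-available"]),
        # Assumption: expiry is epoch time in milliseconds.
        "expiry_ms": int(lowered["x-ratelimit-expiry"]),
    }


def seconds_to_wait(headers, now_ms=None):
    """Return 0.0 while requests remain, else seconds until the expiry time."""
    info = parse_rate_limit(headers)
    if info["available"] > 0:
        return 0.0
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return max(info["expiry_ms"] - now_ms, 0) / 1000.0


example = {
    "X-Ratelimit-Allowed": "120",
    "X-Ratelimit-Used": "1",
    "X-Ratelimit-Available": "119",
    "X-Ratelimit-Expiry": "1369168800001",
}
print(seconds_to_wait(example))  # prints 0.0
```

With the sample headers, 119 requests are still available, so no pause is needed. Once X-Ratelimit-Available reaches 0, the function returns the time remaining until the expiry timestamp, which the application can pass to `time.sleep`.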