What happens when you go over your cluster usage?

Hi all.

I have a complex and unusual use case for an application I’m working on at the moment. It requires multiple queries per search request, and unfortunately some of those queries will run without an optimal index in place.

I currently use Serverless, but once this search update rolls out the costs will become far too high. I would rather sacrifice performance than pay for millions more RPUs, which leads me to the following question:

What happens when you overrun your cluster usage?

Example: M20 cluster (2000 IOPS, 3000 max connections)
If I had thousands of complex, unindexed queries running at the same time on this cluster (I won’t, but just as an example), would there eventually be a point where it threw an error, or would it just make each query wait until the prior ones have finished executing?

As mentioned, I am happy to have queries take longer, as long as it doesn’t throw an error. I just don’t want anything to break.
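
To make the "delay and wait" behaviour I'm hoping for concrete, here is a rough client-side sketch in Python. It is only an illustration of the idea, not a description of how Atlas actually behaves, and it makes no MongoDB calls. The names `run_query`, `run_all`, `peak_concurrency` and `MAX_IN_FLIGHT` are made up for this example, and the sleep is a stand-in for a slow, unindexed query.

```python
import asyncio

MAX_IN_FLIGHT = 3  # hypothetical cap chosen for the example, not an Atlas setting


async def run_query(query_id: int, limiter: asyncio.Semaphore, log: list) -> int:
    # Wait for a free slot instead of raising when the cap is reached.
    async with limiter:
        log.append(("start", query_id))
        await asyncio.sleep(0.01)  # stand-in for a slow, unindexed query
        log.append(("end", query_id))
        return query_id


async def run_all(n_queries: int, max_in_flight: int = MAX_IN_FLIGHT):
    # All queries are submitted at once; the semaphore makes the extras queue up.
    limiter = asyncio.Semaphore(max_in_flight)
    log: list = []
    results = await asyncio.gather(
        *(run_query(i, limiter, log) for i in range(n_queries))
    )
    return list(results), log


def peak_concurrency(log: list) -> int:
    # Replay the start/end events to find the most queries running at once.
    current = peak = 0
    for event, _ in log:
        current += 1 if event == "start" else -1
        peak = max(peak, current)
    return peak


if __name__ == "__main__":
    results, log = asyncio.run(run_all(10))
    print(len(results), "queries finished, peak in flight:", peak_concurrency(log))
```

In this sketch every query eventually completes and none of them errors out; they just take longer when the cap is hit. What I would like to know is whether the cluster itself does something comparable, or whether I would need to add this kind of throttling on my side.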

Any clarification would be appreciated. Thanks in advance.

Edit: just to add, the storage will never overrun its limit. I am aware that, if that were to happen, it would fail to write documents and cause an error.