Currently, task allocation in Koji is decentralized: each builder picks its next task from a list. The system prefers builders with higher available capacity through the algorithm that the builders themselves run. For a given task, a builder looks at the set of other ready builders in the task's channel-arch bin. If the builder's available capacity is below the median for that bin, it will not take the task until a waiting period (task_avail_delay) has passed. This delay gives higher-capacity hosts more of a chance to claim the task.
Unfortunately, if the hosts in a bin vary widely in capacity, the largest-capacity hosts might not be used as much as they should be, because this algorithm distinguishes no more finely than above or below the median.
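To make the coarseness concrete, here is a minimal Python sketch of the median rule described above. It is an illustration only, not the actual kojid code: the function name must_wait_median, the use of the lower median, and the choice to include the deciding host in the list are all assumptions.

```python
def must_wait_median(our_avail, bin_avail):
    """Return True if this host should wait task_avail_delay before claiming.

    bin_avail holds the available capacity of every ready host in the
    channel-arch bin (assumed here to include the deciding host itself).
    """
    ordered = sorted(bin_avail)
    # Lower median; which median the real code uses is an assumption.
    median = ordered[(len(ordered) - 1) // 2]
    return our_avail < median


# Hypothetical bin: two small hosts, two mid-sized hosts, one large host.
bin_avail = [2.0, 2.0, 4.0, 4.0, 16.0]
decisions = {cap: must_wait_median(cap, bin_avail) for cap in set(bin_avail)}
```

In this hypothetical bin, only the 2.0 hosts wait. The 4.0 hosts and the 16.0 host are treated identically, so the largest host gets no head start over the mid-sized ones.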
This change generalizes the task_avail_delay behavior so that the delay scales with the rank of the host within the channel-arch bin. The hosts with the highest capacity will take the task immediately, while hosts lower down will wait for a delay proportional to their rank. We calculate rank as a float between 0.0 and 1.0 and use it as a multiplier for the delay.
The end result will be that hosts with higher available capacity will be more likely to claim a task, resulting in better utilization of the highest capacity hosts.
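The rank-scaled delay described above could look roughly like the following Python sketch. The helper names capacity_rank and scaled_delay, the tie handling, and the 300-second delay value are assumptions made for illustration; the actual change may compute rank differently.

```python
def capacity_rank(our_avail, bin_avail):
    """Normalized rank within the bin.

    0.0 for the host with the most available capacity, rising to 1.0 for
    the host with the least when all capacities are distinct. Hosts with
    equal capacity share the better rank (an assumption of this sketch).
    """
    if len(bin_avail) < 2:
        return 0.0  # a lone host never needs to wait
    # Count hosts with strictly more available capacity than this one.
    ahead = sum(1 for cap in bin_avail if cap > our_avail)
    return ahead / (len(bin_avail) - 1)


def scaled_delay(our_avail, bin_avail, task_avail_delay):
    """Seconds this host should wait before claiming the task."""
    return capacity_rank(our_avail, bin_avail) * task_avail_delay


# Hypothetical bin of five hosts and an illustrative 300-second delay.
bin_avail = [1.0, 2.0, 4.0, 8.0, 16.0]
delays = [scaled_delay(cap, bin_avail, 300) for cap in bin_avail]
# delays == [300.0, 225.0, 150.0, 75.0, 0.0]
```

Each step down in rank adds an equal share of the delay, so the 16.0 host can claim the task immediately while the 1.0 host waits the full period.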