Goal of the Scheduler

Minimize the total delay of the gNB while considering the priority of each attached UE

In Terms of Usability

The scheduler’s design should be simple and easy to understand so that it can be reused with little effort

State (Input)

$$X = \begin{bmatrix} \overrightarrow{x_1} & \overrightarrow{x_2} & \cdots & \overrightarrow{x_n} \end{bmatrix}, \quad \text{where } \overrightarrow{x_i} = [\text{RNTI}_i, P_i, \text{HOL}_i]$$

Each element of the $i$-th UE’s state vector $\overrightarrow{x_i}$ is defined as follows:

Action (Output)

$W_t$ is the weight vector of the $n$ UEs; each element $w_i$ is the weight of the $i$-th UE.

$$W_t = [w_1, w_2, \cdots, w_n]$$

Reward

The reward $r_i$ is received for each UE’s weight $w_i$.

Option 1

$$r_i = -(100 - P_i) \times \text{HOL}_i$$
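As a quick numeric check of the Option 1 formula, the following minimal Python sketch evaluates it for two UEs. The function name `reward_option1` and the sample values are hypothetical, chosen only for illustration; the units of $\text{HOL}_i$ are left unspecified here.

```python
def reward_option1(priority: float, hol: float) -> float:
    """Option 1 reward: r_i = -(100 - P_i) * HOL_i."""
    return -(100 - priority) * hol


# Hypothetical sample values (not taken from a real gNB trace).
print(reward_option1(80, 5))  # -(100 - 80) * 5 = -100
print(reward_option1(20, 5))  # -(100 - 20) * 5 = -400
```

For the same $\text{HOL}_i$, a larger $P_i$ shrinks the penalty, and the reward is never positive as long as $P_i \le 100$ and $\text{HOL}_i \ge 0$.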

Advantages

Disadvantages

Option 2

$$r_i = \cfrac{1}{(100 - P_i) \times \text{HOL}_i}$$
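The Option 2 formula can be sketched the same way. The function name `reward_option2` and the sample values are hypothetical. The formula is undefined when $P_i = 100$ or $\text{HOL}_i = 0$; raising an error in that case is an assumption of this sketch, not something specified above.

```python
def reward_option2(priority: float, hol: float) -> float:
    """Option 2 reward: r_i = 1 / ((100 - P_i) * HOL_i)."""
    denominator = (100 - priority) * hol
    if denominator == 0:
        # Assumption: the zero-denominator case is not defined in the
        # design, so this sketch refuses to guess a value.
        raise ValueError("reward undefined when P_i == 100 or HOL_i == 0")
    return 1 / denominator


# Hypothetical sample values.
print(reward_option2(80, 5))  # 1 / (20 * 5) = 0.01
print(reward_option2(20, 5))  # 1 / (80 * 5) = 0.0025
```

Unlike Option 1, this reward is positive for $P_i < 100$ and $\text{HOL}_i > 0$, and it grows without bound as the denominator approaches zero.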

Advantages

Disadvantages

Total Reward

The total reward $R(X_t, W_t)$ of a gNB with the current input $X_t$ and output $W_t$ in time slot $t$ is as follows:

$$R(X_t, W_t) = \sum_{i=1}^n r_i$$
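The summation can be illustrated with a short Python sketch. The `UEState` tuple, the `total_reward` helper, and the RNTI and state values are all hypothetical. Because the per-UE formulas as written depend only on $P_i$ and $\text{HOL}_i$, the sketch takes the state alone and plugs in the Option 1 reward as an example.

```python
from typing import Callable, NamedTuple, Sequence


class UEState(NamedTuple):
    """One state vector x_i = [RNTI_i, P_i, HOL_i]."""
    rnti: int
    priority: float  # P_i
    hol: float       # HOL_i


def total_reward(
    states: Sequence[UEState],
    reward_fn: Callable[[float, float], float],
) -> float:
    """Sum the per-UE rewards r_i over all n UEs."""
    return sum(reward_fn(s.priority, s.hol) for s in states)


def option1(priority: float, hol: float) -> float:
    """Option 1 per-UE reward, used here only as an example."""
    return -(100 - priority) * hol


# Hypothetical state X_t for three UEs.
x_t = [
    UEState(rnti=0x4601, priority=80, hol=5),
    UEState(rnti=0x4602, priority=20, hol=5),
    UEState(rnti=0x4603, priority=50, hol=2),
]
print(total_reward(x_t, option1))  # -100 + -400 + -100 = -600
```

Passing the reward function as a parameter lets either reward option be swapped in without changing the summation.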