Optimal Control of Multi-Supplier Inventory Management with Lead Time

In the current global competition, companies are required to save money in order to survive. One of the expenses that can be reduced is the cost of inventory control. To minimize these costs, we require a proper planning and management of the inventory. Ordering supplies should be performed at a certain time period, especially with uncertain demand. As such, the company must determine when to order at the suppliers and how many should be ordered. So there will be no excess inventory in the warehouse because of too much ordering or because of the inventory cannot meet demand due to late or too little order to suppliers. Consequently, in this research, a quadratic cost functional is used as the objective function in multi-supplier inventory management problem with different lead time. Optimal control theory, LQR (Linear Quadratic Regulator) is used to solve this problem. According to the simulation, we conclude that the smaller weight resulted in more optimal inventory cost.


I. INTRODUCTION
I NVENTORY management is one of the operation management functions that is very important because inventory requires a lot of capital and affects to customer demand fulfilment. Inventory management has an impact on all of the business functions, especially operation, marketing, and finance. Inventory is a stock of materials used to make ease of production or to satisfy customer demand. Inventory specifically consists of raw materials, semi-finished goods (work in process), and finished goods.
Every company will strive to meet the consumers demand at the right time and exact amount. Loss of sale due to a shortage of inventory should be avoided by the company. Inventory shortages can decrease the sales, customers trust, and customers loyalty. However, overstock can raise the holding cost. Furthermore, there is a wasteful that occurs during storage such as unused inventory because of waiting to be produced. To solve the problems, a company requires proper planning and management of inventory.
One approach to solving inventory control problems is optimal control theory [1], [2], [3]. Research conducted by Ignaciuk and Bartoszewicz discussed the theory of discrete sliding-mode control that is used to design new supply strategies for periodic review inventory systems. In contrast to Manuscript  subchan@its.ac.id the classical, stochastic approaches, they focus on optimizing the inventory system dynamics [4]. Xiao-Jun, Ai-Ming, and Bao-Lin deal the approximation of optimal inventory control problem of the supply chain networks with lead time. The simulation examples show that the inventory replenishment strategy is effective to reduce bullwhip and thereby improve the performance of supply chain networks system [5]. Smagin, Koshkin, and Kim propose an algorithm for inventory control with incomplete information about the demand model [6]. Akikuni, Okuhara, and Fujisaki present the optimal control of multiple suppliers inventory systems with lead time and the order rate to each supplier is time varying [7]. Granin, Mandel and Vilms describe a multi-step discrete-time inventory control problem that occurs in the supply chain with several alternative suppliers [8]. Luthfi, Sutrisno, and Widowati formulate a dynamical linear system with random parameter in the matrix coefficient to solve the inventory control problem with imperfect delivery proves and apply the robust linear quadratic regulator (RLQR) to find the optimal decision [9]. Idayani, Sari, and Munawwir aim to minimize the cost of inventory procurement by using the optimal control theory. The necessary conditions of Pontryagin's Maximum Principle (PMP) and Karush-Kuhn-Tucker (KKT) are met in order to obtain the optimal raw material procurement cost [10].
Various methods have been used to solve supplier selection problems. Nguyen, Chen, and Wang propose a mixed integer linear programming (MILP) model which allows for optimizing supplier selection and transportation plan [11]. Nguyen and Chen present a two-stage stochastic programming model dealing with supplier selection to stabilize feedstock supply of a biomass supply chain in uncertain environments [12]. Hosseini et al propose a stochastic bi-objective mixed integer programming model to support the decision-making in how and when to use both proactive and reactive strategies in supplier selection and order allocation [13]. Memari et al present an intuitionistic fuzzy TOPSIS method to select the right sustainable supplier that concerns nine criteria and thirty sub-criteria for an automotive spare parts manufacturer [14].
This research used quadratic cost functional as an objective function in multi-suppliers inventory management with different lead times that have been discussed in [15]. Optimal control theory, Linear Quadratic Regulator (LQR) is used to find the minimum cost. The quadratic cost functional has been implemented to find the optimal solution in inventory management of rice in Bulog, national logistical supply organization. Then it is analyzed and simulated using Matlab.

II. PRELIMINARIES A. Inventory Management Model
1) The Goods Flow in Supply Chain: According to [15], generally, the goods flow in a supply chain is illustrated in Fig.  1. The distribution center in connection with the consumer or the retailers and suppliers are used to provide goods for production or to meet consumers demand in the distribution network. In this model, the effect of the delay is considered. Delay is the waiting time between ordering goods on suppliers and goods arrivals at the distribution center. The waiting time in fulfilling the order is referred to lead-time. Here is an overview of the inventory system with multiple suppliers.
In Fig. 2, we can see that a number of ordered goods is determined by the controller. The number of orders that is ordered from the supplier is denoted by u(kT ), where T is discrete time or period review and k = 1, 2, . . .. The on-hand stock is an amount of inventory or goods in the warehouse that is ready to be distributed to customers if there is demand at time kT denoted by y(kT ). The target stock denoted by y d is the amount of inventory that is relieved to meet demand, where y d > 0 is used to determine the number of orders u(kT ).
The customer demand denoted by d(kT ) is unknown before, where d(kT ) ≥ 0. Therefore, the number of demand d(kT ) should be limited, i.e. with maximal demand d max that is acceptable to the center distribution. The demand d(kT ) must be smaller or equal to the maximal demand d max . In other words the constraint of demand d(kT ) ie Goods sold to customers so that out of the distribution center denoted by h(kT ), where h(kT ) ≥ 0. If there is a customer demand d(kT ), then h(kT ) is smaller or equal to the demand. So it can be written as follows The order for a stock is done at regular interval kT based on the on-hand stock y(kT ), the number of target stocks y d , and the previous order. In the case of multi suppliers, the number of orders u(kT ) can be divided among m suppliers choices according to company strategy. As a result, in any review period k, γ p is the order allocated to supplier p (p = 1, 2, . . . , m) where γ p is a real number in the interval [0, 1] that satisfies m ∑ p=1 γ p = 1. In a certain case, when γ p = 1, then a single supplier p is selected to send all the orders, while γ p = 0 means no replenishment from option p.
The number of orders for each supplier p comes from the multiplication between the order allocation γ p and the total order u at kT − L p , i.e.
Any non-zero order placed on supplier p is realized after leadtime delay L p , which is assumed to be a multiple of the review period, i.e. L p = n p T , where n p is a constant positive integer that represents one round of inventory distribution. Then, it can be said that in one round of goods flow for each supplier, Round Trip Time (RTT) RT T p = L p = n p T . Supplier options for an order based on lead time of each supplier is The change of the on-hand stock is The on-hand stock for each k ≥ 0 can be expressed as follows 2) State Space and Objective Function of Inventory Control Model: Discrete-time system is described in the state space according to [15] as follows .. x n (kT )] T is the state vector with x 1 (kT ) = y(kT ) represents on-hand stock in period k and .., n equals to the delayed input signal u. The order for this system is n = n m + 1 = (L m /T ) + 1 depend on discrete period T and the longest round time on the goods flow in a supply chain. Matrix A A A is an n × n state-space matrix, B B B, V V V , and C C C is the n × 1 vector 1 a n−1 a n−2 . . .
The first line element in the matrix A A A is which indicates the allocation of several supplier choices with lead time jT ( j = 1, 2, . . . , n m ) in the number of total orders allocated by the controller. If there is no allocation for the supplier, then the appropriate allocation a j in the total order is zero. From condition m ∑ p=1 γ p = 1, we obtain n m ∑ j=1 a j = 1. Using (10), then (6) becomes Then we can present a multi-supplier system as follows . . .
where the desired system state is From the point of view of dynamic system optimization, the purpose of the control is to bring the system state (the current stock) to the desired value (target state) without any redundant control. In other words, we want to reduce the closed-loop system error e(kT ) close to zero by using the appropriate number of orders.
Therefore, we will look for an optimum controller u opt (kT ), which minimizes the following objective function where w is weight, the positive constant used to adjust the influence of controller command and the output variable at the value of the objective function.

B. Optimal Control
The main goal of optimal control is to determine control signals which affect a process to meet physical constraints. Then, at the same time can be determined the optimum value in accordance with performance index or objective function [16], [17]. Formulations in optimal control problem are as follows [16] • Describing the mathematical process means getting the mathematical method of controlling process (generally in the form of a state variable). • Specification of the objective function. • Determining the boundary conditions and physical constraints on the state and or control. In the following, we described the linear, time-variant, and discrete-time control systems.
is the control vector of size n, A A A(k) and B B B(k) respectively are n × n and n × r matrices. Generally, the problem of discrete-time optimal control can be formulated as follows. For the purpose of computing control u u u(t) that optimize objective function.

C. Linear Quadratic Regulator
Linear Quadratic Regulator (LQR) is used to produce the closed-loop optimal control of a linear system with a quadratic objective function J [16]. It leads to the matrix of Riccati differential equations. According to the goal, there is LQR whose the system leads to a tracking system or called by Linear Quadratic Tracking (LQT) system because of searching for the output or optimal state approaching the desired state. In other words, we will search for the smallest error or close to zero. The following is LQT formulation and its solution [16].
The linear state equation is described as and output relationships The objective function will be minimized is where x x x(k) is the n-sized state vector, u u u(k) is the r-sized control vector and y y y(k) is n-sized output vector. While F F F and Q Q Q are n × n symmetric positive semi-definite matrix and R R R is r × r symmetric positive semi-definite matrix. Initial condition is given by x x x(k 0 ) and the final condition x x x(k f ) is free with predetermined k f . The goal is to make the error as small as possible with minimal control, where z z z(k) is the n-sized target vector.

D. Difference Riccati Equation
Difference Riccati Equation (DRE) is the first order and non linear ordinary differential equation that is formed as follows with a(x) = 0 and c(x) = 0. Equation name is taken from Count Jacopo Francesco Riccati (1676-1754) [18], [19].

A. Determination of Optimal Tracking System
The following steps can be done to obtain a solution for optimal tracking system [16], [20].
Step 1: Formulation of the Hamiltonian Formulate the Hamiltonian as follows where λ λ λ (k +1) is a Lagrange multiplier that refers to co-state.
Step 2: Determination of the state and co-state system The state equation is The co-state equation is The control equation of the stationary condition is Step 3: Determination the open-loop optimal control From (25), the open-loop optimal control is By using optimal control (26) at state (23), the Hamiltonian System of state (23) and co-state (24) is Step 4: Determination of Riccati equation To find the closed-loop form for optimal control (26), it is assumed a transformation λ λ λ * (k) = P P P(k)x x x * (k) − g g g(k) (28) where the matrix P P P(k) and vector g g g(k) have not been determined. Eliminate co-state λ λ λ (k) from the Hamiltonian system (27) using transformation (28). Then we obtain and vector of linear differential equations ie Step 5: Determination of the closed-loop optimal control The optimal state trajectory is obtained from substituting equation (32) to equation (23), as follows

B. Determination of the Solution of Difference Riccati Equations Matrix
DRE matrix is DRE in matrix form. The solution of DRE matrix at (29) will be explained in the following example, with Q Q Q = I I I n×n and R = 1 [15], [21]. P P P(k) = A A A T P P P(k + 1) I + B B BB B B T P P P(k + 1) with DRE can be solved by a classical approach i.e. with backward iteration as in [16], [20], [22], [23] that can be used to complete DRE with predefined final dimension and state. But it does not apply to solve equation (34) because of it has n × n dimension and has no final state.
A suitable solution for (34) is to use iteration, analytic substitution of the general form of P P P to the right side of the equation and compare it to the left side so that at each step the amount of independent variables p i j will be reduced [15], [21]. Where p i j defines the Riccati matrix element on the i-th row and the j-th column.
The general P P P = P P P 0 = [p i j ] n×n is To start the iteration, substitute P P P 0 to the right side of the Riccati equation (34) along with the matrices A A A, B B B, C C C, and w. Then, look for the equation among the elements p i j so that can be obtained a certain form or pattern in the matrix P P P. In the first iteration, obtained an equation between elements in the first two rows and the first two columns of the matrix P P P, i.e. p 12 = p 11 − w, p 22 = p 11 − w, p 23 = p 13 , and so on until the n-th element p 2n = p 1n (Since P P P is a symmetric matrix then p i j = p ji ). From the first iteration, we obtain Then we repeat the substitution process and search for the pattern until all of the elements of P P P can be written in the form of the first element p 11 and n-size system. The final form of P P P can be written as (38).
Then we compute p 11 by substituting (38) to the right side of the equation (34) and comparing the first row of the first column element on the right side with the element on the left side so as to produce The roots of p 11 can be obtained from (39), that is where p + 11 ensures all elements on P P P are positive.

IV. RESULTS AND DISCUSSIONS
A. The Solution of Inventory Control Model 1) Optimal Control Solution Using LQR (Linear Quadratic Regulator): Before solving the inventory control model, substitute the output equation y y y(kT ) (7) into the objective function (14) to obtain To complete the inventory control model using LQR with LQT system, the first thing to do is determining the Hamiltonian function. From (22), we obtain the following Hamiltonian function After obtaining the Hamiltonian function H , from equation (23) the obtained optimum state is and from equation (24) the optimal co-state is obtained While the open-loop optimal control is obtained from stationary conditions (25) By substituting the optimal control at (45) to the optimal state equation (43), we obtained The Hamiltonian system of (27) is formed by combining state (46) and co-state (44) as follows Next, we will look for closed-loop optimal control on (45). From (28), we obtain λ λ λ * (kT ) = P P P(kT )x x x * (kT ) − g g g(kT ) where the matrix P P P(kT ) and g g g(kT ) will be searched in the next step. By substituting λ λ λ * (kT ) (48), state equation (46) becomes From (50) we obtained Riccati equation which is the symmetric matrix P P P n×n and g g g(kT ) + 2) The Solution of DRE Matrix: As a prefix, we define the general form of P P P = P P P 0 = [p i j ] n×n as In the first iteration, substitute P P P 0 to the right side of Riccati equation (51) along with the matrices A A A, B B B, Q Q Q, and weight w. Then, we search the equation among elements p i j so that we can obtained certain form or pattern in matrix P P P. In this way, we get the equation between elements in the first two rows and the first two columns of the matrix P P P, i.e. p 12 = a n−1 (p 11 −w), p 22 = a 2 n−1 (p 11 − w), p 23 = a n−1 p 13 , and so on until the nth element p 2n = a n−1 p 1n . (Because P P P is a symmetric matrix then p i j = p ji ). From this first iteration, we obtain P 1 (54). The next step is to substitute P P P 1 (54) on the right side of equation (51) and compare the result with the left side so as to obtain a particular pattern. Then repeat the substitution process and pattern searching until all the elements of P P P can be written in the form of the first element p 11 and the system size is n. The final form of P P P can be written as (55) (for efficiency, top element only).
In equation (55), the element p nn can be written as n−1 ∑ j=1 a n− j (p 11 − (n − j)w) because n−1 ∑ j=1 a n− j = 1. Substituting (55) to the right side of equation (51) and compare the firstrow element of the first column on the right side with the element on left side, so we obtain From (56), we can obtain the roots of p 11 , i.e.
where p + 11 ensures all elements of P P P are positive and the determinant is not negative. Then (55) and (57) will be used to find the optimal control and state.
3) Optimal Solution of Inventory Control Model: To obtain the optimal control and state, first will be searched feedback gain equation L L L(kT ) by substituting P P P in equation (55) to the equation L L L(kT ) at (52). We obtain L L L(kT ) = 1 a n−1 2 ∑ j=1 a n− j · · · n−2 ∑ j=1 a n− j 1 α. (58) Then, we will find the feedback forward gain equation L L L g (kT ) by substituting P P P in equation (55) to the equation L L L g (kT ) at (52). So that we obtain Subsequently, we compute g g g by substituting P P P in equation (55) to (52) and iterates until a particular form or pattern is obtained in matrix g g g. The general form of g g g = g g g 0 = [g i j ] n×1 is defined by equation (60). In the first iteration, substitute g g g 0 P P P 1 =        p 11 a n−1 (p 11 − w) p 13 · · · p 1n a n−1 (p 11 − w) a 2 n−1 (p 11 − w) a n−1 p 13 · · · a n−1 p 1n p 13 a n−1 p 13 p 33 · · · p 3n . . . . . . . . . . . . . . . p 1n a n−1 p 1n p 3n · · · p nn p 11 a n−1 (p 11 − w) 2 ∑ j=1 a n− j (p 11 − (3 − j)w) · · · n−1 ∑ j=1 a n− j (p 11 − (n − j)w) a n−1 (p 11 − w) a 2 n−1 (p 11 − w) a n−1 2 ∑ j=1 a n− j (p 11 − (3 − j)w) · · · a n−1 n−1 ∑ j=1 a n− j (p 11 − (n − j)w) p 13 p 23 2 ∑ j=1 a n− j 2 ∑ j=1 a n− j (p 11 − (3 − j)w) · · · 2 ∑ j=1 a n− j n−1 ∑ j=1 a n− j (p 11 − (n − j)w) to the right side of equation (52) along with matrices A A A, B B B, C C C, and P P P. Then, equation among elements g i j is determined so that we can obtain certain form or pattern on matrix g g g. From this first iteration, we obtain g g g 1 (60). The next step is to substitute g g g 1 (60) on the right side of the equation (52) and compare the result with the left side to obtain the certain form. Then repeat the substitution and search process for the form until all the g g g elements can be written in the form of the first element g 11 and the system size is n. The final form of g g g can be written as g g g (60).
g 11 a n−1 (g 11 − wy d ) g 11 a n−1 (g 11 − wy d ) 2 ∑ j=1 a n− j (g 11 − (3 − j)wy d ) In equation (60), the element g n1 can be written as g 11 − n−1 ∑ j=1 ja j wy d because n−1 ∑ j=1 a n− j = 1. Next, we compute g 11 by substituting (60) to the right side of equation (52) and compare the first-row element of the first column on the right side with elements on the left side, we obtain Next, we will find the optimal control u * (kT ) of equations (58), (59), and (60). To obtain u * (kT ), we must find L g (kT )g[(k + 1)T ] from equation (59) and (60). Then substitute g 11 and p 11 into it so that we obtain Substitute equation (58) and (62) to equation (52) to obtain optimal control u * (kT ) as follows In the previous section, it has been explained that x 1 (kT ) = y(kT ) and x j (kT ) can rewritten in the form of control generated in the n − 1 period, ie x j (kT ) = u[(k − n + j − 1)T ] for every j = 2, 3, . . . , n which can be seen in the equation (16). Then equation (63) becomes Then the optimal control equation (64) can be rewritten to In the optimal control (65), the number of orders for each period are comparable to the difference between the amount of on-hand stock and the desired amount reduced by an open order. An open order is the number of orders already ordered to the supplier but not yet reached the warehouse due to the waiting time. 4) Phase in the Inventory Control Model: In one period, the inventory control model can be divided into two phases, i.e. phase I (initial phase) in Table I and phase II in Table II.

B. Properties of Inventory Control Model Solution
1) The stock is always positive: If equation (65) is used in the system (7) and the desired stock (y d ) meets then the on-hand stock is always positive for k ≥ n m + 1. In other words, there will be no backorder. Backorder occurs if the on-hand stock cannot meet demand, so it needs some time to fulfill the demand. Fortunately, this condition does not occur because demand can be fulfilled by existing inventory or placing an order.
2) Optimal control is not negative and bounded: If equation (65) is used in system (7), then optimal control (number of order) is always non-negative and bounded, i.e.

V. IMPLEMENTATION ON RICE INVENTORY MANAGEMENT
A simulation model of multi-supplier inventory management with lead time in the inventory of rice in Bulog, national logistical supply organization, was used Matlab based on the lead time of each supplier (Table III) and demand in the distribution center (Fig. 3) for 6 months [24].
The first thing to do in this simulation is the determination of order allocation for each supplier (γ p ) based on lead time (L p ), with p = 1, 2, . . . , m are the suppliers. From Table III,, note that there are m = 34 suppliers will fulfill the order. Order allocation γ p (Table III) is determined by pairwise comparison [25].
The next step is to determine a j , the first-row elements of n ×n matrix A A A with j = 1, 2, . . . , n −1. To obtain the size of the matrix A A A, rank suppliers based on lead time from the smallest to largest thus satisfies the equation (4). It is assumed that period of discretization is T = 1 day. Then we obtain a round  trip time RT T p = L p = n p T , with n p = L p /T (Table III). A A A is an n × n matrix with n = n m + 1 = 3 + 1 = 4. The first-row elements of the matrix A A A is the sum of the order allocation γ p with the same lead time, i.e. Determining the most effective weights w in providing optimal value to the objective function (14). In this case, it will be simulated with weights w = 0, 1, w = 1, and w = 10. The desired stock y d can be obtained from equation (66) in first property by d max = 582.448 kg/month or d max = 19.415 kg/day. The parameters which will be used in the simulation are shown in Table IV. Then, γ, a j , and y d are used to search the on-hand stock y(kT ), optimal control u * (kT ), and objective function J. The on-hand stock y(kT ) is influenced by previous order that had come and amount of stock are used to meet the demand (h(kT )) as in equation (11). In this simulation, we performed three case studies with different weighting w as shown in Table IV.
On-hand stock obtained from the third case study is shown in Fig. 4 and Fig. 5. It can be seen that the number of stocks never worth less than zero. It means the demand is always mostly fulfilled. Therefore, no backorder cost to be incurred and customer dissatisfaction which usually arise because of the inability of the company to meet customer demand. So we can say maximum service has been done.
Optimal control (order) is influenced by the desired stock y d , on-hand stock y(kT ), and prior order that has not yet arrived (open order) as in equation (65). Optimal control obtained from the three case studies with different w can be seen in Fig.  6 and Fig. 7. It can be seen that the optimal control is always bounded as to the second property. The order never worth less than zero and not exceed max{β y d , d max }. It means that the order has a maximum for the minimum order still be able to meet customer demand.
Fulfilling demand can be obtained from on-hand stock or orders. In Fig. 1, demand trends occur from 100th to 120th days without oscillation and overshoot. In Fig. 5, high demand has an impact on stock levels not increasing and decreasing to 0. In Fig. 7, high demand has an impact on high order levels. This condition occurs because the on-hand inventory is always positive as in the second property (61). When the demand is high but the on-hand stock is low, demand fulfillment is obtained from high orders. So even though demand is high, inventory is always positive. This means that there is no backorder so that the customer is not dissatisfied by waiting for inventory and Bulog does not need to pay a backorder cost.
The amount of inventory y(kT ) and optimal control u(kT ) depend on weighting w. In Fig. 4 and 5 can be seen that along increasing the weight w, optimal control reacts faster toward demand changes d(kT ). Meanwhile, when the weight w decreases, the speed of reaction toward demand changes also declined. In addition, the greater the weight w, the smaller the on-hand stock. Conversely, the smaller the weight w, the greater the on-hand stock.
The change of weight w affects on objective function. When the weight w = 0, 1, the objective function value is 0.9886 × 10 9 . However, when the weight w = 1, the objective function rose 626% to 7.1737 × 10 9 and when the weight w = 10, the objective function rose 970% to 76.774 × 10 9 . The greater the weight w, the greater the objective function. Conversely, the smaller the weight w, the smaller the objective function. It can be concluded that the smaller weight w, the objective function is more optimal.

VI. CONCLUSIONS
The selection of order allocation for each supplier with lead time criteria can be used as a pairwise comparison with the sum of all allocations is 1. The problems in multi-suppliers inventory management with lead time can be solved with optimal control Linear Quadratic Regulator with demand as the controller that is based on on-hand stock, desired stock, and customer demand. When the weight w = 0.1, the objective function value is 0.9886 × 10 9 . However, when the weight w = 1, the objective function rose 626%, to 7.1737 × 10 9 and when weight w = 10, the objective function rose 970% to 76.774 × 10 9 . The greater the weight w, the greater the objective function. Conversely, the smaller the weight w, the smaller the objective function. It can be concluded that the smaller weight w, the objective function is more optimal.