Analysis of a Single Manufacturer Multi-Retailer Inventory Competition Using Supermodular Multi-Objective Games


Rubono Setiawan, Salmah, Irwan Endrayanto, Indarsih

Department of Mathematics, Universitas Gadjah Mada, Bulak Sumur, Yogyakarta 55281, Indonesia

Department of Mathematics Education, Universitas Sebelas Maret, Jln. Ir. Sutami No. 36 A, Surakarta 57126, Indonesia

7 April 2022
13 June 2022
30 June 2022



In this research, the optimum decision of a multi-objective inventory game problem is investigated under strategic complementarities in a multiplayer supply chain. The supply chain comprises a single manufacturer and multiple retailers under the synchronization process, the wholesale contract, and the buy-back contract policy. We apply results from supermodular multi-objective game theory to solve this problem. The optimal decision for all players takes the form of two equilibria, namely the Pareto equilibrium and the weighted Nash equilibrium. For the numerical results, a genetic algorithm and the dominance principle applied to the elements of the payoff matrix are used to obtain the two equilibria. The analytical and numerical results obtained with the supermodular multi-objective game concept can help solve supply chain competition problems in industrial engineering.


supermodular game, non-cooperative game, multi-objective game, inventory competition, multiplayer supply chain

1. Introduction

Game theory is a useful tool for the strategic analysis of competitive decision-making problems. One such problem in multiplayer supply chain analysis is how to obtain the optimum solution to the inventory problem. In the real world, the players in a competing supply chain can adopt various types of strategies to obtain their optimum moves, such as cooperating or not cooperating with the other players. Game theory can therefore be used as a mathematical tool to obtain the optimum solution under those conditions. Game-theoretic approaches can be more complicated than well-known methods such as the integrated method. However, with game theory the optimum solution can be analyzed from the strategic perspective of each player, which describes real situations more closely. One of the earliest results is the application of game theory to inventory problems with substitutable products [1]. Other results include supply chains with multiple retailers [2], the competitive newsboy [3], and newsvendor games in inventory problems [4]. In recent years, the application of game theory to inventory problems has been extended to the multiplayer supply chain [5-8]. Game theory has been used to solve multiplayer supply chain problems with linear multi-objectives [9]. Furthermore, multi-objective games with non-linear objectives have been analyzed for the zero-sum case, where most of the results focus on games with two players [10].

The ordered game is an extension of the class of traditional games and is characterized by strategic complementarities. In these games, when one player obtains a high payoff by taking a higher-ordered strategy, the other players also earn a higher marginal payoff if they increase their strategies as well. Ordered games (in maximization problems) with strategic complementarities are called supermodular games and are defined on a lattice. The theory of supermodular games and complementarity games was first explained for minimization games with a single payoff [11, 12]. These results have also been stated in supermodular form [13]. The properties of supermodular games and their equilibria were developed further [14-16]. Fixed-point theorems were then used to analyze supermodular games [17].

The concept of the supermodular game has recently been widely applied to economic problems, stock competition, and supply chain issues. One of the earlier relevant works is the application of supermodular games to the Cournot oligopoly problem [18]. Other results include supermodular games for tax competition [19], NTU supermodular games [20], supermodular games for stock competition [21], the relation between supermodular games and potential games [22], the newsboy problem using supermodular games [3], and multiplayer supply chain models using supermodular games [5]. However, those results are still limited to the single-objective case.

In this research, we present results on supermodular multi-objective games and apply them to analyze the optimum outcome of games with strategic complementarities in a multiplayer supply chain with multiple objectives. We formulate the game in a supply chain comprising a single manufacturer and multiple retailers. All retailers have multiple objectives and play complementary strategies. The retailers play the game under the synchronization process, the wholesale contract, and the buy-back contract. We apply a genetic algorithm and the dominance principle of the modified payoff matrix to obtain the two equilibria. This research is thus the first to discuss the application of supermodular multi-objective games to inventory problems. Using the supermodular multi-objective game concept, we can effectively explain the complementarity condition in multiplayer inventory games with multiple objectives, a condition that is only indirectly linked to the convexity properties used in optimum analysis. The analytical results, including new definitions and theorems, differ from the results in the references. In references [11-17], the theory of supermodular games is constructed for games with a single payoff. In this paper, these results are extended to multi-objective supermodular games. Our results also differ from the previous result on linear multi-objective games [9] and from the results on nonlinear multi-objective games, which focus on zero-sum and two-player cases [10]. In this paper, the formulation of the non-cooperative game is extended to non-cooperative supermodular multi-objective games for non-zero-sum cases (which can be linear or non-linear). The inventory game provided in this paper also differs from the other references on the application of game theory to inventory [2, 3, 5, 18-22], where the inventory model has a single objective. In our paper, multi-objective inventory problems are used and posed in supermodular form. Therefore, the results provided in this paper are new when compared with the references on supermodular games and their applications.

The rest of the paper is organized as follows. Section 2 discusses how each element of the supermodular game concept relates to the proposed inventory problem and proposes definitions and theorems on multi-objective supermodular games. Section 3 presents the inventory game, including its assumptions, general form, and properties. Section 4 presents the optimum analysis of the equilibria of the game. Section 5 gives a numerical example. Finally, the last section concludes the research and offers suggestions for future work.

2. Preliminaries

We propose a mathematical formulation of the competition between all players in the supply chain system when all players use strategic complementarities and have more than one payoff function. This formulation can be presented in supermodular game form. Therefore, our work is based on non-cooperative supermodular multi-objective games. Because the concept of these games has not been constructed before, we propose new definitions and theorems in this section to explain supermodular multi-objective games. Two equilibria are defined to describe the optimum solution of our proposed inventory game. Because each player has multiple objectives, the optimum solution for each player must be Pareto optimal. Furthermore, the players play a non-cooperative game with each other, so the optimum pure strategy must satisfy the Nash equilibrium condition. To account for the multiple objectives, we use the weighted Nash equilibrium. An existence theorem for the equilibrium is proved, which can be used to check whether a given game has an equilibrium.

Next, we provide the elements of the n-person non-cooperative supermodular multi-objective game (NC-SMOG) in the maximization case. We consider a supermodular game with a finite set of players $\rho=\{1, \ldots, n\}$ and denote the game by $\widehat{G}_{m}=\left(\rho,\left\{S_{i}\right\}_{i \in \mathbb{N}} ;\left\{\bar{F}_{i}\right\}_{i \in \mathbb{N}}\right)$. The strategy space of player-$i$ is $S_{i} \subseteq \mathbb{R}^{p_{i}}$, where $p_{i}$ is the dimension of the feasible strategy. The joint strategy space $S:=S_{1} \times S_{2} \times \cdots \times S_{n}$ is the finite Cartesian product of the $S_{i}$ from $i=1$ to $i=n$ and is denoted by $S=\times_{i=1}^{n} S_{i}$. The payoff $\bar{F}_{i}$ is a vector-valued function from $S$ into $\mathbb{R}^{k_{i}}$, where $k_{i}$ is the number of payoff components of player-$i$. A payoff component of player-$i$ is said to be supermodular on $\times_{i=1}^{n} S_{i}$ if it satisfies the following definition:

Definition 1. Let $\left(\times_{i=1}^{n} S_{i}, \leq\right)$ be a partially ordered set. A function $F_{i}^{r_{i}}: S_{i} \rightarrow \mathbb{R}$, where $r_{i} \in\left\{1, \ldots, k_{i}\right\}$, is said to be supermodular on $\times_{i=1}^{n} S_{i}$ if

$F_{i}^{r_{i}}\left(x_{i}^{\prime}\right)+F_{i}^{r_{i}}\left(x_{i}^{\prime \prime}\right) \leq F_{i}^{r_{i}}\left(x_{i}^{\prime} \vee x_{i}^{\prime \prime}\right)+F_{i}^{r_{i}}\left(x_{i}^{\prime} \wedge x_{i}^{\prime \prime}\right)$      (1)

for $x_{i}^{\prime}, x_{i}^{\prime \prime} \in S_{i}$ with $x_{i}^{\prime} \leq x_{i}^{\prime \prime}$.
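Inequality (1) can be checked numerically on a finite lattice. The following is a minimal sketch, assuming the componentwise order on a small integer grid; the helper names `join`, `meet`, and `is_supermodular` and the two test functions are illustrative and do not come from the paper. The check runs over all pairs of points: for comparable pairs the join and meet coincide with the two points, so (1) holds with equality, and the incomparable pairs are the informative ones.

```python
from itertools import product

def join(x, y):
    """Componentwise maximum (least upper bound in the product order)."""
    return tuple(max(a, b) for a, b in zip(x, y))

def meet(x, y):
    """Componentwise minimum (greatest lower bound in the product order)."""
    return tuple(min(a, b) for a, b in zip(x, y))

def is_supermodular(f, lattice, tol=1e-12):
    """Check inequality (1) for every pair of points of a finite lattice."""
    return all(
        f(x) + f(y) <= f(join(x, y)) + f(meet(x, y)) + tol
        for x, y in product(lattice, repeat=2)
    )

# Hypothetical finite lattice {0, 1, 2, 3}^2 with the componentwise order.
grid = list(product(range(4), repeat=2))

print(is_supermodular(lambda x: x[0] * x[1], grid))   # complementary actions: True
print(is_supermodular(lambda x: -x[0] * x[1], grid))  # substitutable actions: False
```

The product $x_1 x_2$ passes because raising one coordinate increases the marginal value of the other, which is the complementarity that supermodular games formalize.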

We use the notations $\mathbb{L}_{+}^{k_{i}}$ and $\mathbb{L}_{++}^{k_{i}}$ to denote the simplex of $\mathbb{R}_{+}^{k_{i}}$ and its relative interior, respectively, where:

$\mathbb{L}_{++}^{k_{i}}:=\left\{\mathbf{m}_{i}=\left(m_{i}^{1}, \ldots, m_{i}^{k_{i}}\right) \in \mathbb{R}_{++}^{k_{i}} \mid \sum_{j=1}^{k_{i}} m_{i}^{j}=1\right\}$      (2)

We extend the definition of the Pareto equilibrium by Ji et al. [9] which was first proposed in minimization cases into supermodular maximization games as follows:

Definition 2. A pure strategy profile $\mathbf{x}^{\star}=\left(x_{1}^{\star}, \ldots, x_{n}^{\star}\right)$ is called a Pareto equilibrium of the NC-SMOG $\widehat{G}_{m}$ in a lattice if, for each $i \in\{1, \ldots, n\}$, $x_{i}^{\star}$ is an optimal solution of the multi-objective problem of player-$i$ and there is no $x_{i} \in S_{i}$ such that:

$F_{i}^{r}\left(x_{1}^{\star}, \ldots, x_{i-1}^{\star}, x_{i}^{\star}, x_{i+1}^{\star}, \ldots, x_{n}^{\star}\right) \leqslant F_{i}^{r}\left(x_{1}^{\star}, \ldots, x_{i-1}^{\star}, x_{i}, x_{i+1}^{\star}, \ldots, x_{n}^{\star}\right)$      (3)

for $r=1, 2, \ldots, k_{i}$, with (3) holding strictly for at least one index $r$.

For each $\mathbf{x} \in S$, the weighted joint response function is defined as the real-valued function $k^{\bar{\delta}}: S \times S \rightarrow \mathbb{R}$:

$k^{\bar{\delta}}(\mathbf{x}, \mathbf{y}):=\sum_{i=1}^{n} \delta_{i} \cdot \bar{F}_{i}\left(x_{1}, \ldots, x_{i-1}, y_{i}, x_{i+1}, \ldots, x_{n}\right)$    (4)

where, $\bar{\delta}=\left(\delta_{1}, \delta_{2}, \ldots, \delta_{n}\right)$, with $\delta_{i}=\left(\delta_{i}^{1}, \ldots, \delta_{i}^{k_{i}}\right) \in \mathbb{L}_{++}^{k_{i}}$, $i=1, \ldots, n$. From $(4)$, we can define the following function.

Definition 3. Function $Z^{\bar{\delta}}: S \rightarrow \mathcal{P}(S)$, with:

$Z^{\bar{\delta}}(\boldsymbol{x}):=\left\{\boldsymbol{y}^{\prime} \in S \mid k^{\bar{\delta}}\left(\boldsymbol{x}, \boldsymbol{y}^{\prime}\right)=\max _{y \in S} k^{\bar{\delta}}(\boldsymbol{x}, \boldsymbol{y})\right\}, \forall \boldsymbol{x}\in S$      (5)

is called a weighted best joint response function of each player concerning the weight combination $\bar{\delta}=\left(\delta_{1}, \delta_{2}, \ldots, \delta_{n}\right)$ where $\delta_{i}=\left(\delta_{i}^{1}, \ldots, \delta_{i}^{k_{i}}\right) \in \mathbb{L}_{++}^{k_{i}}, i=1, \ldots, n$.
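For finite strategy sets, the functions in (4) and (5) can be evaluated by enumeration. The sketch below assumes each payoff is a Python callable that maps a joint strategy (a tuple) to a tuple of payoff components; the function names and the two-player toy game are hypothetical and serve only as an illustration.

```python
from itertools import product

def weighted_joint_response(payoffs, weights, x, y):
    """k(x, y) from (4): sum over i of delta_i . F_i(x_1, .., y_i, .., x_n)."""
    total = 0.0
    for i, (F_i, delta_i) in enumerate(zip(payoffs, weights)):
        deviated = x[:i] + (y[i],) + x[i + 1:]   # replace only player i's strategy
        total += sum(d * v for d, v in zip(delta_i, F_i(deviated)))
    return total

def weighted_best_joint_response(payoffs, weights, strategy_sets, x, tol=1e-12):
    """Z(x) from (5): every y in S that maximizes k(x, y)."""
    candidates = list(product(*strategy_sets))
    values = [weighted_joint_response(payoffs, weights, x, y) for y in candidates]
    best = max(values)
    return [y for y, v in zip(candidates, values) if abs(v - best) <= tol]

# Hypothetical two-player game, two payoff components each:
# a shared complementarity term x0 * x1 and a private cost term.
payoffs = [lambda x: (x[0] * x[1], -x[0]),
           lambda x: (x[0] * x[1], -x[1])]
weights = [(0.5, 0.5), (0.5, 0.5)]
strategy_sets = [(0, 1, 2), (0, 1, 2)]

print(weighted_best_joint_response(payoffs, weights, strategy_sets, (2, 2)))  # [(2, 2)]
print(weighted_best_joint_response(payoffs, weights, strategy_sets, (0, 0)))  # [(0, 0)]
```

In this toy game both $(0, 0)$ and $(2, 2)$ belong to their own best joint response sets, so both are fixed points of $Z^{\bar{\delta}}$.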

We then present our theorem regarding the existence of the Pareto equilibrium of NC-SMOG.

Theorem 1. Consider an NC-SMOG $\widehat{G}_{m}=\left(\rho,\left\{S_{i}\right\}_{i \in \mathbb{N}} ;\left\{\bar{F}_{i}\right\}_{i \in \mathbb{N}}\right)$ and a weight $\bar{\delta}=\left(\delta_{1}, \delta_{2}, \ldots, \delta_{n}\right)$ for $\bar{F}_{i}$, where $\delta_{i} \in \mathbb{L}_{++}^{k_{i}}$, $i=1, \ldots, n$. If each $S_{i}$, $i=1, \ldots, n$, is a nonempty compact lattice and $F_{i}^{r}\left(x_{-i}, y_{i}\right)$ is upper-semicontinuous in $y_{i}$ on $S_{i}\left(x_{-i}\right)$ for each $x_{-i}$ in $S_{-i}$ and each $r \in\left\{1, \ldots, k_{i}\right\}$, then a Pareto equilibrium exists for $\widehat{G}_{m}$.

Proof. If $S_{i}$ is a nonempty compact lattice for each $i \in\{1, \ldots, n\}$, then the weighted best joint response $Z^{\bar{\delta}}(\mathbf{x})$ for each strategy $\mathbf{x} \in S=\times_{i=1}^{n} S_{i}$ is a nonempty subcomplete sublattice of the lattice $S=\times_{i=1}^{n} S_{i}$. The section $Z_{\mathbf{x}}^{\bar{\delta}}$ is a sublattice of $S \times S$. The range of $Z^{\bar{\delta}}(\cdot)$ for each $\mathbf{x} \in S$ is a set with the induced set ordering $\subseteq$ on the nonempty power set of $S$, $\mathcal{P}(S) \backslash\{\varnothing\}$. We will prove that $Z^{\bar{\delta}}(\cdot)$ is an increasing function of $\mathbf{x} \in S$. The section $Z_{\mathbf{x}}^{\bar{\delta}}$ increases in $\mathbf{x}$ on the projection $\prod_{x} Z^{\bar{\delta}}$. The function $k^{\bar{\delta}}(\cdot)$ is supermodular in $\mathbf{y}$ on $S$ for each $\mathbf{x} \in S$. Therefore, $Z^{\bar{\delta}}(\cdot)$ is an increasing function of $\mathbf{x}$ on the projection $\prod_{x} Z^{\bar{\delta}}$. Hence, the set of fixed points of $Z^{\bar{\delta}}(\cdot)$ is a nonempty lattice with a largest and a smallest fixed point. Furthermore, taking any fixed point $\mathbf{x}^{\star}$ of $Z^{\bar{\delta}}(\cdot)$, we obtain:

$k^{\bar{\delta}}\left(\mathbf{x}^{\star}, \mathbf{y}\right) \preccurlyeq k^{\bar{\delta}}\left(\mathbf{x}^{\star}, \mathbf{x}^{\star}\right), \quad \forall \mathbf{y} \in S$,     (6)

which implies

$\delta_{i} \cdot \bar{F}_{i}\left(x_{1}^{\star}, \ldots, x_{i-1}^{\star}, y_{i}, x_{i+1}^{\star}, \ldots, x_{n}^{\star}\right) \preccurlyeq \delta_{i} \cdot \bar{F}_{i}\left(x_{1}^{\star}, \ldots, x_{i-1}^{\star}, x_{i}^{\star}, x_{i+1}^{\star}, \ldots, x_{n}^{\star}\right)$      (7)

where $\bar{F}_{i}=\left[F_{i}^{1}, F_{i}^{2}, \ldots, F_{i}^{k_{i}}\right]^{T}$, $y_{i} \in S_{i}$, $i=1, \ldots, n$. If the fixed point $\mathbf{x}^{\star}$ is not a Pareto equilibrium, then there exist $i_{0}$ and $y_{i_{0}}^{\star} \in S_{i_{0}}$ such that:

$F_{i_{0}}^{r}\left(x_{1}^{\star}, \ldots, x_{i_{0}-1}^{\star}, x_{i_{0}}^{\star}, x_{i_{0}+1}^{\star}, \ldots, x_{n}^{\star}\right) \leqslant F_{i_{0}}^{r}\left(x_{1}^{\star}, \ldots, x_{i_{0}-1}^{\star}, y_{i_{0}}^{\star}, x_{i_{0}+1}^{\star}, \ldots, x_{n}^{\star}\right)$       (8)

for $r=1, \ldots, k_{i_{0}}$, with strict inequality for at least one index $r=l_{i_{0}}$, $l_{i_{0}} \leq k_{i_{0}}$, so that:

$\delta_{i_{0}}^{l_{i_{0}}} F_{i_{0}}^{l_{i_{0}}}\left(x_{1}^{\star}, \ldots, x_{i_{0}-1}^{\star}, x_{i_{0}}^{\star}, x_{i_{0}+1}^{\star}, \ldots, x_{n}^{\star}\right) \prec \delta_{i_{0}}^{l_{i_{0}}} F_{i_{0}}^{l_{i_{0}}}\left(x_{1}^{\star}, \ldots, x_{i_{0}-1}^{\star}, y_{i_{0}}^{\star}, x_{i_{0}+1}^{\star}, \ldots, x_{n}^{\star}\right)$      (9)

where $\delta_{i_{0}}^{l_{i_{0}}} \in \mathbb{R}_{++}$. Summing the weighted inequalities (8) over $r$ and using the strict inequality (9) contradicts (7). Hence, $\mathbf{x}^{\star}$ is a Pareto equilibrium of $\widehat{G}_{m}$.

We then explain the second equilibrium of NC-SMOG in the following definition.

Definition 4. A joint strategy $\mathbf{x}^{*}$ is called a weighted Nash equilibrium of the NC-SMOG with the weight $\bar{\delta}=\left(\delta_{1}, \delta_{2}, \ldots, \delta_{n}\right)$, where $\delta_{i} \in \mathbb{R}_{++}^{k_{i}}$, $i=1, \ldots, n$, if for each player-$i$ and all $x_{i} \in S_{i}$ the following is satisfied:

$\delta_{i} \cdot \bar{F}_{i}\left(x_{1}^{*}, \ldots, x_{i-1}^{*}, x_{i}, x_{i+1}^{*}, \ldots, x_{n}^{*}\right) \leqslant \delta_{i} \cdot \bar{F}_{i}\left(x_{1}^{*}, \ldots, x_{i-1}^{*}, x_{i}^{*}, x_{i+1}^{*}, \ldots, x_{n}^{*}\right)$     (10)
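Condition (10) can be verified by brute force when the strategy sets are finite. The following sketch is not the dominance-principle procedure used in the paper; it simply enumerates unilateral deviations. The function names and the toy game (two players, two payoff components each) are assumptions made for illustration.

```python
from itertools import product

def is_weighted_nash(payoffs, weights, strategy_sets, x_star, tol=1e-12):
    """Check (10): no player raises its weighted payoff by deviating alone."""
    for i, (F_i, delta_i) in enumerate(zip(payoffs, weights)):
        def scalar(x, F_i=F_i, delta_i=delta_i):
            # Weighted sum delta_i . F_i(x) of player i's payoff components.
            return sum(d * v for d, v in zip(delta_i, F_i(x)))
        at_star = scalar(x_star)
        for x_i in strategy_sets[i]:
            deviated = x_star[:i] + (x_i,) + x_star[i + 1:]
            if scalar(deviated) > at_star + tol:
                return False
    return True

def weighted_nash_set(payoffs, weights, strategy_sets):
    """All joint strategies that satisfy (10), in enumeration order."""
    return [x for x in product(*strategy_sets)
            if is_weighted_nash(payoffs, weights, strategy_sets, x)]

# Hypothetical game with a complementarity term and a private cost term.
payoffs = [lambda x: (x[0] * x[1], -x[0]),
           lambda x: (x[0] * x[1], -x[1])]
weights = [(0.5, 0.5), (0.5, 0.5)]
strategy_sets = [(0, 1, 2), (0, 1, 2)]

print(weighted_nash_set(payoffs, weights, strategy_sets))
# [(0, 0), (1, 1), (2, 2)]
```

In this toy game the equilibrium set is a chain with smallest element $(0, 0)$ and largest element $(2, 2)$, which matches the lattice structure that the existence result below describes.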

We demonstrate the existence of the weighted Nash equilibrium in the following theorem:

Theorem 2. Consider an NC-SMOG $\widehat{G}_{m}=\left(\rho,\left\{S_{i}\right\}_{i \in \mathbb{N}} ;\left\{\bar{F}_{i}\right\}_{i \in \mathbb{N}}\right)$. If $S$ is a nonempty compact lattice and $F_{i}^{r}\left(x_{-i}, \cdot\right)$, $r=1, \ldots, k_{i}$, $i=1, \ldots, n$, is upper-semicontinuous in $y_{i}$ on $S_{i}\left(x_{-i}\right)$ for each $x_{-i} \in S_{-i}$, $i \in\{1, \ldots, n\}$, then the set of equilibria $\boldsymbol{x}^{*}=\left(x_{1}^{*}, \ldots, x_{n}^{*}\right)$ is a nonempty complete lattice, and the largest weighted Nash equilibrium $\overline{\boldsymbol{x}}^{* \prime}$ and the smallest weighted Nash equilibrium $\overline{\boldsymbol{x}}^{* \prime \prime}$ exist.

A supply chain comprising a single manufacturer and multiple retailers (also known as single vendor-multi buyer) has been analyzed by several authors [23-26]. In a single manufacturer-multi retailer model, there is a finite number of competing retailers. All retailers are assumed to order a single product from a single manufacturer. One of the main assumptions of the single manufacturer-multi retailer model is the synchronization process: the production cycle of the manufacturer should be synchronized with the ordering cycles of the retailers [23]. The total cost for each retailer is calculated from several marginal costs, such as ordering cost, transportation cost, and holding cost [26]. The manufacturer's inventory level can be obtained as the difference between the manufacturer's accumulated inventory and the retailers' accumulated inventory [24]. Strategic complementarities have been applied to the inventory problem [5]. Supermodularity between any two strategies is not linked directly to either convexity or even continuity [5]. In these games, increasing best response functions are the only major requirement for an equilibrium to exist [5]. In this paper, we use a single manufacturer-multi retailer model with additional assumptions, namely a wholesale price and a buy-back contract. We describe the relationship between the vendor and the retailers using strategic complementarities in a competitive setting.

3. Games Formulation

3.1 General assumptions

We consider a multiplayer supply chain system that comprises one manufacturer and multiple competing retailers. The manufacturer produces the items to fulfill the demand of the retailers and sets the terms and conditions of the supply chain to earn a profit from it. There is a finite number of competing retailers. The competition arises because the demand is allocated among the retailers according to their inventory, under the terms and conditions issued by the manufacturer. Each retailer can have more than one payoff function. The retailers maximize their payoffs by determining the optimum order quantity through a decentralized scheme, which is implemented as a multi-objective game. The game is played only among the retailers. The manufacturer is not directly involved in the game. However, the manufacturer is the coordinator of the supply chain system and can offer contracts to all retailers. Three kinds of contracts are presented as follows:

(1) Synchronization process. The production cycle of the manufacturer should be synchronized with the ordering cycles of the retailers. Based on the result of the synchronization process in the inventory problem [23-26], the synchronization is useful to reduce the total related cost for the entire supply chain.

(2) Wholesale contract. The manufacturer charges each retailer-i a wholesale price per unit purchased.

(3) Buy-back contract. The manufacturer charges the retailer the wholesale price, but pays each retailer-i a buy-back price per unit remaining at the end of the cycle. The manufacturer also charges the retailer a standard cost for handling the return of the remaining product. This standard cost can be reduced by an advance agreement with the retailers.

The game can start once all retailers have agreed to the contracts offered by the manufacturer. Under the synchronization process, the manufacturer uses the equilibrium of the game played by the retailers as the reference for determining its own optimum result. In the current research we discuss only two payoff functions and focus on exploring the strategic complementarities used by all players. We therefore work with a new perspective on multi-objective game problems, namely supermodular multi-objective games.

The order quantity $q_{i}$ is the decision variable of each retailer. For each $i \in\{1, \ldots, n\}$, $S_{i} \subseteq \mathbb{R}^{1}$ is the strategy space of retailer-$i$. We define the function $\bar{\pi}_{i}: S \rightarrow \mathbb{R}^{2}$ as the vector-valued payoff of retailer-$i$. The first payoff is the profit from selling the product to the respective consumers, while the second payoff is a reward function based on the reward rate from the manufacturer. Each player has the same preference ordering $\leq$ (a complete, transitive binary relation) over the feasible payoff outcomes, which form a subset of $\mathbb{R}^{k_{i}}$. In this research we use the lexicographic ordering induced by the standard ordering ≤ on the components of each vector. Therefore, for feasible outcomes $\boldsymbol{z}^{\prime}, \boldsymbol{z}^{\prime \prime} \in \mathbb{R}^{k_{i}}$, $\boldsymbol{z}^{\prime} \leq \boldsymbol{z}^{\prime \prime}$ if $\boldsymbol{z}^{\prime}=\boldsymbol{z}^{\prime \prime}$ or there exists $i^{0}$ with $1 \leq i^{0} \leq k_{i}$ such that $z_{i^{0}}^{\prime}<z_{i^{0}}^{\prime \prime}$ and $z_{i}^{\prime}=z_{i}^{\prime \prime}$ for each $i$ with $1 \leq i<i^{0}$. The preference ordering of all players also induces a preference on the joint strategy space $S=\times_{i=1}^{n} S_{i}$: for joint strategies $\mathbf{x}=\left(x_{1}, \ldots, x_{n}\right)$, $\mathbf{y}=\left(y_{1}, \ldots, y_{n}\right) \in S$, $\mathbf{x} \leq \mathbf{y}$ whenever $\bar{\pi}_{i}(\mathbf{x}) \leq \bar{\pi}_{i}(\mathbf{y})$. For a joint strategy $\mathbf{x}=\left(q_{i}, q_{-i}\right)$, the vector $\left(y_{i}, q_{-i}\right)$ denotes the joint strategy in which the strategy of player-$i$ is replaced by $y_{i}$ while the other components of $\mathbf{x}$ remain unchanged. Each player-$i$ obtains the payoff $\bar{\pi}_{i}(\mathbf{x})=\left(\pi_{i}^{1}(\mathbf{x}), \pi_{i}^{2}(\mathbf{x})\right)$, $i=1, \ldots, n$, when the joint strategy $\mathbf{x}=\left(q_{1}, q_{2}, \ldots, q_{n}\right) \in S$ is played. We assume that $S_{i}$, $i \in\{1, \ldots, n\}$, is a nonempty compact lattice. Therefore, $S$ is also a nonempty compact lattice.

3.2 Nomenclature and notations

Before the detailed discussion regarding the payoff function, we present the following notations used in this paper.

Notation for the manufacturer

$P$: Production rate per cycle
$A_{m}$: Setup cost per unit product
$h_{m}$: Holding cost per unit product per cycle
$I_{m}$: Inventory level per cycle
$c_{m}$: The marginal cost per unit product per cycle
$\pi_{m}$: Manufacturer’s payoff function

Notation for the retailers

$q_{i}$: Decision variables, product's order quantity
$D_{i}$: Retailer’s demand per cycle
$D$: Cumulative demand
$p_{i}^{r}$: Purchased cost per unit product
$w_{i}^{r}$: Wholesale price per unit product
$c_{i}^{r}$: The marginal cost per unit product per cycle
$A_{i}^{r}$: Ordering cost
$B_{i}^{r}$: Transportation cost
$b_{i}^{r}$: Buy-back cost per unit rest product
$h_{i}^{r}$: Holding cost per unit product per cycle
$I_{i}^{r}$: Retailer’s inventory level per cycle
$\bar{\pi}_{i}$: Retailer’s vector-valued payoff function
$c_{r t}$: The standard cost for handling a returning unsold product

3.3 Retailer’s payoff

The first payoff of every retailer is a profit function. Each retailer-i earns a profit from selling a single type of product to its own independent consumers, net of several costs. The retailers order this product from a single manufacturer well in advance of the selling period to meet their product needs, and demand is deterministic. The manufacturer starts production after receiving the retailers' orders, at a production rate $P$ that exceeds the total retail demand ($P>D$). The total retail demand is divided among the $n$ retailers in proportion to their stocking quantities, so that the demand $D_{i}$ of retailer-i is:

$D_{i}=\left(\frac{q_{i}}{q}\right) D$                  (11)
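The proportional allocation in (11) can be illustrated with a short sketch. The function name `allocate_demand` and the numbers are hypothetical and are not taken from the paper's numerical example.

```python
def allocate_demand(order_quantities, total_demand):
    """Split total demand D among retailers in proportion to q_i, as in (11)."""
    q = sum(order_quantities)  # total shipment quantity q = sum of all q_i
    if q <= 0:
        raise ValueError("total order quantity must be positive")
    return [q_i / q * total_demand for q_i in order_quantities]

# Two retailers ordering 100 and 300 units share a total demand of 800.
print(allocate_demand([100, 300], 800))  # [200.0, 600.0]
```

A retailer that raises its order quantity gains a larger share of the total demand, which is the channel through which the retailers compete.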

where the shipment quantity $q$ is the sum of the order quantities of all retailers $\left(q=\sum_{i=1}^{n} q_{i}\right)$. We denote by $q_{-i}=q-q_{i}$ the order quantity of all players except player-i. Under the synchronization contract, all retailers are replenished via a single shipment, so that the shipment cycle time of the manufacturer equals the common ordering cycle time of the retailers. The manufacturer ships the quantity $q$ to all retailers simultaneously to meet the total retail demand, and the shipment is completed at once. We consider the case without lead time and without shortages at any retailer to avoid complexity and to ensure fairness in the game. Thus, the manufacturer is assumed to have the resources and power to ensure that the replenishment is received by all retailers simultaneously. Deterministic demand generates a constant inventory pattern on the retailers' side. Therefore, the form of the sales function $R(\cdot): S_{i} \rightarrow \mathbb{R}$, which depends on the demand, is known to each player. The function $R(\cdot)$ is assumed to be twice differentiable in $q_{i} \in S_{i}$. We then explain the costs associated with the activities of the retailer. The first cost component is the marginal cost per unit per cycle $c_{i}^{r}$, which comprises ordering and transportation costs and is given by $c_{i}^{r}=A_{i}^{r}+B_{i}^{r}$. Each retailer-i also incurs a holding cost for storing unsold inventory, because selling the inventory and collecting payment take time. This cost is calculated from the on-hand inventory level until the end of the cycle. The inventory levels are illustrated in the following figure:

Figure 1. Inventory level for manufacturer and retailer i

Figure 1 indicates that the inventory level for each retailer-i is:

$I_{i}^{r}=\int_{0}^{T} D_{i} t \, d t=\left.\frac{1}{2} D_{i} t^{2}\right|_{0} ^{T}=\frac{1}{2} D_{i} T^{2}=\frac{1}{2} D_{i} \frac{q_{i}^{2}}{D_{i}^{2}}=\frac{1}{2} \frac{q_{i}^{2}}{D_{i}}$                    (12)
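As a quick check of (12), consider the hypothetical values $q_{i}=100$ and $D_{i}=200$, which are chosen only for illustration. The cycle length is $T=q_{i} / D_{i}$, so both forms of the expression give the same inventory level:

$T=\frac{q_{i}}{D_{i}}=\frac{100}{200}=0.5, \qquad I_{i}^{r}=\frac{1}{2} D_{i} T^{2}=\frac{1}{2} \cdot 200 \cdot 0.25=25, \qquad \frac{1}{2} \frac{q_{i}^{2}}{D_{i}}=\frac{100^{2}}{2 \cdot 200}=25$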

Therefore, the holding cost term is $h_{i}^{r} I_{i}^{r}$, where $h_{i}^{r} I_{i}^{r}=\frac{h_{i}^{r} q_{i}^{2}}{2 D_{i}}$. The last component of the retailers’ cost is the transfer payment $T_{i}^{r}$ under the buy-back contract. The manufacturer charges each retailer-$i$ the wholesale price $w_{i}^{r}$ per unit, but pays each retailer-$i$ the amount $b_{i}^{r}$ per unit of remaining inventory at the end of the cycle. The manufacturer should not profit from the remaining excess inventory in the newsvendor problem class; thus, $b_{i}^{r} \leq w_{i}^{r}$ [27]. In the deterministic case, the unit cost associated with the buy-back contract is no larger than the holding cost, that is, $b_{i}^{r} \leq h_{i}^{r}$ [28]. According to Eq. (12), the inventory level is obtained by integrating from $t=0$ to the end of the cycle period $T$ (equal to $\frac{q_{i}}{D_{i}}$). Furthermore, the transfer payment for the inventory left at the end of the cycle is calculated as $b_{i}^{r} I_{i}^{r}-c_{i}^{r} R\left(q_{i}\right)$. Hence, the first payoff for each retailer-i is the function $\pi_{i}^{1}: S_{-i} \times S_{i} \rightarrow \mathbb{R}$ with:

$\pi_{i}^{1}\left(q_{i}, q_{-i}\right)=\left(p_{i}^{r}-c_{i}^{r}\right) R\left(q_{i}\right)-\left(h_{i}^{r}-b_{i}^{r}\right) \frac{1}{2} \frac{q_{i}^{2}}{D_{i}}-w_{i}^{r} q_{i}$.                    (13)

We now explain the second payoff of each retailer-i. Under the buy-back contract, the manufacturer buys the product remaining at the retailers. The manufacturer therefore incurs a standard cost $c_{r t}$ for handling the return process (including transportation, packing, and other preparations), and this cost is charged to all retailers. However, the standard cost can be reduced for each retailer-i: each unit of product is multiplied by a unit price $g_{m}$, and the result is used to reduce the fixed cost $c_{r t}$. Therefore, each retailer must minimize the return process cost, which is given by the real-valued function $f_{i}: S_{-i} \times S_{i} \rightarrow \mathbb{R}$ with $f_{i}\left(q_{i}, q_{-i}\right):=c_{r t}-g_{m} q_{i}$. It is assumed that $c_{r t} \geq g_{m} q_{i}$; therefore, $f_{i}\left(q_{i}, q_{-i}\right) \geq 0$. To simplify the analysis of the game, minimizing $f_{i}\left(q_{i}, q_{-i}\right)$ is converted into the equivalent problem of maximizing $-f_{i}\left(q_{i}, q_{-i}\right)$, which each retailer takes as its second payoff. Hence, the second payoff function for each retailer-i, $\pi_{i}^{2}: S_{-i} \times S_{i} \rightarrow \mathbb{R}$, is formulated by:

$\pi_{i}^{2}\left(q_{i}, q_{-i}\right)=g_{m} q_{i}-c_{r t}$                 (14)
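The two retailer payoffs (13) and (14) can be evaluated directly once the parameters are fixed. The sketch below assumes a placeholder sales function $R(q_{i})=q_{i}$ (every ordered unit is sold), since the paper only requires $R$ to be twice differentiable; the function names and all parameter values are hypothetical.

```python
def retailer_payoffs(i, q, D, p, c, h, b, w, g_m, c_rt, sales=lambda q_i: q_i):
    """Return (pi_i^1, pi_i^2) for retailer i at the joint order vector q.

    `sales` stands in for R(.); the default R(q_i) = q_i is an assumption.
    """
    q_total = sum(q)
    D_i = q[i] / q_total * D                                  # Eq. (11)
    inventory = 0.5 * q[i] ** 2 / D_i                         # Eq. (12)
    profit = (p - c) * sales(q[i]) - (h - b) * inventory - w * q[i]  # Eq. (13)
    reward = g_m * q[i] - c_rt                                # Eq. (14)
    return profit, reward

def weighted_payoff(payoff_vector, delta):
    """Scalarize a payoff vector with weights delta (weighted sum method)."""
    return sum(d * v for d, v in zip(delta, payoff_vector))

# Hypothetical parameters satisfying b <= w, b <= h and c_rt >= g_m * q_i.
pi = retailer_payoffs(i=0, q=[100, 300], D=800, p=10, c=2,
                      h=1.5, b=0.5, w=4, g_m=0.25, c_rt=100)
print(pi)                               # (375.0, -75.0)
print(weighted_payoff(pi, (0.5, 0.5)))  # 150.0
```

The weighted sum at the end corresponds to the scalarization used for the weighted Nash equilibrium, with the weights treated as given.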

The optimal solution $q_{i}^{*}$ is also the equilibrium of the games. In other words, $q_{i}^{*}$ should be the weighted best response from each retailer-i. The weighted Nash equilibrium $q_{i}^{*}$ must be included in the Pareto set, and this value is used by the manufacturer as optimal decision variables $q_{i}^{*}$.

3.4 Manufacturer’s payoff

We also explain the payoff function of the manufacturer. The manufacturer does not take part in the game directly, because its payoff depends on the optimum solution chosen by the retailers. The manufacturer has one payoff function, which is a profit function. This profit is the revenue from the retailers' purchases less several costs, which include the costs of setup, production, holding, and shipment. Figure 1 shows that the inventory level of the manufacturer is:

$I_{m}=\frac{q^{2}}{2 P}$               (15)

Therefore, the holding cost term for the manufacturer is:

$h_{m} I_{m}=h_{m} \frac{q^{2}}{2 P}$                 (16)

Only one component is assumed for the marginal cost, that is, $c_{m}=A_{m}$. Hence, we obtain the payoff function $\pi_{m}: S_{m} \rightarrow \mathbb{R}$ of the manufacturer as follows:

$\pi_{m}(q)=\sum_{i=1}^{n}\left(w_{i}^{r}-b_{i}^{r}\right) q_{i}-c_{m} q-h_{m} \frac{q^{2}}{2 P}+\sum_{i=1}^{n}\left(c_{r t}-g_{m} q_{i}\right)$               (17)
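A minimal sketch of the manufacturer's payoff follows. It assumes that the return-handling term $c_{r t}-g_{m} q_{i}$ is collected from every retailer and therefore summed over $i$, which is one reading of (17); the function name and all parameter values are hypothetical.

```python
def manufacturer_payoff(q, w, b, c_m, h_m, P, c_rt, g_m):
    """Manufacturer profit at the retailers' joint order vector q.

    Assumption: the handling charge c_rt - g_m * q_i is summed over retailers.
    """
    q_total = sum(q)
    margin = sum((w_i - b_i) * q_i for w_i, b_i, q_i in zip(w, b, q))
    production = c_m * q_total                   # marginal cost c_m * q
    holding = h_m * q_total ** 2 / (2 * P)       # Eq. (16)
    handling = sum(c_rt - g_m * q_i for q_i in q)
    return margin - production - holding + handling

# Hypothetical parameters with P > D and c_rt >= g_m * q_i for every retailer.
profit = manufacturer_payoff(q=[100, 300], w=[4, 4], b=[0.5, 0.5],
                             c_m=1, h_m=0.25, P=1000, c_rt=100, g_m=0.25)
print(profit)  # 1080.0
```

Because the manufacturer is not a player in the game, this function would be evaluated only after the retailers' equilibrium order quantities are known.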

4. Optimum Analysis

In this section we consider the multi-objective inventory game with two payoffs $G P_{m}=\left(\rho,\left\{S_{i}\right\}_{i \in \mathbb{N}} ;\left\{\bar{F}_{i}\right\}_{i \in \mathbb{N}}\right)$ and analyze the equilibria of $G P_{m}$. First, we verify whether $G P_{m}$ is a supermodular game. Take any $q_{i}^{\prime}, q_{i}^{\prime \prime} \in S_{i}$ with $q_{i}^{\prime} \preccurlyeq q_{i}^{\prime \prime}$. Since $S_{i}$ is a nonempty lattice (and also a chain), $q_{i}^{\prime} \vee q_{i}^{\prime \prime}, q_{i}^{\prime} \wedge q_{i}^{\prime \prime} \in S_{i}$ and:

$\pi_{i}^{1}\left(q_{-i}, q_{i}^{\prime}\right)+\pi_{i}^{1}\left(q_{-i}, q_{i}^{\prime \prime}\right) \preccurlyeq \pi_{i}^{1}\left(q_{-i}, q_{i}^{\prime} \vee q_{i}^{\prime \prime}\right)+\pi_{i}^{1}\left(q_{-i}, q_{i}^{\prime} \wedge q_{i}^{\prime \prime}\right)$                   (18)

Therefore, $\pi_{i}^{1}$ is a supermodular function in $y_{i}$ on $S_{i}$ for each $q_{-i} \in S_{-i}$, and $\pi_{i}^{1}$ has increasing differences in $(q_{-i}, y_{i})$ on $S_{-i} \times S_{i}$. Similarly, if we take any $q_{i}^{\prime}, q_{i}^{\prime \prime} \in S_{i}$ with $q_{i}^{\prime} \leqslant q_{i}^{\prime \prime}$, then we obtain:

$\pi_{i}^{2}\left(q_{-i}, q_{i}^{\prime}\right)+\pi_{i}^{2}\left(q_{-i}, q_{i}^{\prime \prime}\right)=g_{m} q_{-i} q_{i}^{\prime}+g_{m} q_{-i} q_{i}^{\prime \prime} \leqslant \pi_{i}^{2}\left(q_{-i}, q_{i}^{\prime} \vee q_{i}^{\prime \prime}\right)+\pi_{i}^{2}\left(q_{-i}, q_{i}^{\prime} \wedge q_{i}^{\prime \prime}\right)$                   (19)

Therefore, $\pi_{i}^{2}\left(q_{-i}, .\right)$ is also supermodular in $y_{i}$ on $S_{i}$ for each $q_{-i} \in S_{-i}$ and has increasing differences in $(q_{-i}, y_{i})$ on $S_{-i} \times S_{i}$. Moreover, $\pi_{i}^{1}\left(q_{-i}, .\right)$ and $\pi_{i}^{2}\left(q_{-i}, .\right)$ are continuous (and hence upper-semicontinuous) functions in $y_{i}$ on $S_{i}$ for each $q_{-i} \in S_{-i}$. Hence, GPm is an NC-SMOG, and two types of equilibria exist for this game. By Definition 2.2, the Pareto equilibrium $\mathbf{x}^{\star}=\left(q_{1}^{\star}, q_{2}^{\star}, \cdots, q_{n}^{\star}\right)$ must satisfy two conditions. First, each $q_{i}^{\star}, i \in\{1, \ldots, n\}$, should be an optimal solution of the multi-objective problem of the respective player-i. Second, no $x_{i} \in S_{i}$ exists that satisfies (3) for all r=1, 2, …, $k_{i}$ with strict inequality for at least one index r.

We use the weighted sum method under an a priori assumption to solve the multi-objective problem associated with the payoffs of all players, converting the multi-objective problems into single-objective ones of the form (4). The weights in these functions must sum to one, that is, $\sum_{i=1}^{n} \sum_{r=1}^{k_{i}} \delta_{i}^{r}=1$. All players are assumed to know their weights. We then determine the weighted Nash equilibrium of GPm. Suppose the weights are $\bar{\delta}=\left(\delta_{1}, \delta_{2}\right)$, where $\delta_{i}=\left(\delta_{i}^{1}, \delta_{i}^{2}\right) \in \mathbb{R}_{++}^{2}$. We define the weighted best response of all of the players as:

$\mathrm{u}^{\bar{\delta}}(\mathbf{x}, \mathbf{y})=\sum_{i=1}^{n} \delta_{i}^{1}\left(\left(p_{i}^{r}-c_{i}^{r}\right) R\left(y_{i}\right)-\left(h_{i}^{r}-b_{i}^{r}\right) \frac{1}{2} \frac{q_{i}^{2}}{D_{i}}-w_{i}^{r} y_{i}\right)+\sum_{i=1}^{n} \delta_{i}^{2} g_{m} q_{i}-c_{r t}$                (20)

The function $u^{\bar{\delta}}(\mathbf{x}, .)$ in (20) is supermodular in $\mathbf{y}$ on $S$ for each $q_{-i}$ in $S_{-i}$ by the supermodularity of the functions in (13) and (14). The function $u^{\bar{\delta}}(\mathbf{x}, .)$ is defined on a chain; therefore, it has increasing differences in $(q_{-i}, y_{i})$ on $S_{-i} \times S_{i}$. If $S_{i}$, $i \in\{1, \ldots, n\}$, is a nonempty complete lattice, then GPm has a weighted Nash equilibrium. Let $q_{i}\left(q_{-i}\right)$ be the response function of retailer-i. The retailers have symmetric payoff functions; therefore, $q_{j}\left(q_{-j}\right)=q_{i}\left(q_{-i}\right), i \neq j$. If a set of order quantities $\mathbf{x}^{*}=\left\{q_{1}^{*}, q_{2}^{*}, \ldots, q_{n}^{*}\right\}$ is a weighted best response of all of the retailers, then $\mathbf{x}^{*}$ is the weighted Nash equilibrium, satisfying $q_{i}^{*}=q_{i}\left(q_{-i}^{*}\right)$ with $q_{-i}^{*}=q^{*}-q_{i}^{*}$ and $q^{*}=\sum_{j=1}^{n} q_{j}^{*}$. The weighted Nash equilibrium is a fixed point of $Z^{\bar{\delta}}(.)$ in Definition 2.3. Alternatively, Definition 2.5 can be used to obtain this equilibrium directly.
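The weighted-sum step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the integer grid, and the two payoff components `profit` and `saving` are hypothetical and chosen only to show how a player's payoff components are collapsed into one weighted payoff and how a weighted best response is then read off a finite strategy set.

```python
import numpy as np

def weighted_payoff(payoffs, weights, q_own, q_others):
    """Weighted-sum scalarization of one player's payoff components."""
    if any(w <= 0 for w in weights):
        raise ValueError("every weight must be strictly positive")
    return sum(w * f(q_own, q_others) for w, f in zip(weights, payoffs))

def weighted_best_response(payoffs, weights, grid, q_others):
    """Grid point that maximizes the weighted payoff against q_others."""
    values = [weighted_payoff(payoffs, weights, q, q_others) for q in grid]
    return grid[int(np.argmax(values))]

# Hypothetical payoff components, for illustration only.
def profit(q, q_others):
    return 10.0 * q - q ** 2        # concave profit

def saving(q, q_others):
    return 0.1 * q - 25.0           # same linear shape as (14)

grid = np.arange(1, 11)             # assumed candidate order quantities 1..10
best = weighted_best_response([profit, saving], [0.25, 0.25], grid, q_others=0)
print(best)
```

A strategy profile in which every player's quantity equals its own weighted best response to the others is then a fixed point in the sense described above.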

The Pareto equilibrium of an NC-SMOG can be obtained numerically by the same process used to determine Pareto optimal solutions of multi-objective problems. Many methods are available for multi-objective optimization; one of them is the genetic algorithm, which can solve difficult optimization problems [29]. In this research, we apply a genetic algorithm, NSGA II, in Python using the pymoo package [30]. Most numerical methods used to identify the Nash equilibrium are designed for two players. Here, we extend the dominance principle for the elements of the payoff matrix: we adapt the single-objective payoff matrix to the multi-objective case and obtain an algorithm that finds the weighted Nash equilibrium in the case of two players. Each player's entry in the payoff matrix is the sum of the products of the payoff components and their respective weights. We use the dominance principle to examine every element in each row and each column and determine whether the corresponding player could obtain a higher payoff from another element. The pseudocode of the algorithm is as follows.

Algorithm to obtain the weighted Nash equilibrium


[1]: Import NumPy as np.

[2]: Define the initial requirement values.

[3]:   Input: the weight values, one for each payoff component r of each player.

[4]:   Input: the lower and the upper bound of the strategic space.

[5]: Define the strategic space using np.array.

[6]: For each strategy in Player I’s strategic space:

[7]:   For each strategy in Player II’s strategic space:

[8]:    Input: the form of the weighted payoff for Player I, and check whether the weighted payoff is less than or equal to the upper bound.

[9]:    Input: the form of the weighted payoff for Player II, and check whether the weighted payoff is less than or equal to the upper bound.

[10]:        If steps 8 and 9 are both satisfied:

[11]:             Print the value of the weighted payoff.

[12]:        Otherwise:

[13]:             Print “0” as the weighted payoff.

[14]:  Input: the results from steps 11–13 in matrix form, where Player I is the row player and Player II is the column player.

[15]: For each element in each row of the weighted payoff matrix:

[16]:   For each element in each column of the weighted payoff matrix:

[17]:     Check whether the row player can gain a better payoff.

[18]:     Check whether the column player can gain a better payoff.

[19]:          If neither player can gain a better payoff:

[20]:              Set the element as a weighted Nash equilibrium.

[21]:          If either player can gain a better payoff:

[22]:              Set the element as not a weighted Nash equilibrium.
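The checking loop in steps 15 to 22 can be sketched in Python as follows. This is one possible reading of the pseudocode, not the authors' code: the function names and the small 2×2 component matrices are hypothetical, and the upper-bound check of steps 8 to 13 is omitted for brevity.

```python
import numpy as np

def weighted_matrix(components, weights):
    """Weighted sum of one player's payoff-component matrices."""
    return sum(w * np.asarray(c, dtype=float) for w, c in zip(weights, components))

def weighted_nash_equilibria(payoff_row, payoff_col):
    """Cells (i, j) where neither player gains by deviating unilaterally."""
    payoff_row = np.asarray(payoff_row, dtype=float)
    payoff_col = np.asarray(payoff_col, dtype=float)
    n_rows, n_cols = payoff_row.shape
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            # The row player deviates within column j,
            # the column player deviates within row i.
            row_can_improve = payoff_row[:, j].max() > payoff_row[i, j]
            col_can_improve = payoff_col[i, :].max() > payoff_col[i, j]
            if not row_can_improve and not col_can_improve:
                equilibria.append((i, j))
    return equilibria

# Hypothetical two-strategy game with two payoff components per player.
row_components = [[[2, 0], [0, 3]], [[1, 0], [0, 1]]]
col_components = [[[2, 0], [0, 3]], [[1, 0], [0, 1]]]
U_row = weighted_matrix(row_components, [0.25, 0.25])
U_col = weighted_matrix(col_components, [0.25, 0.25])
equilibria = weighted_nash_equilibria(U_row, U_col)
print(equilibria)
```

In this toy game both diagonal cells survive the check, which mirrors the situation of a least and a greatest equilibrium under strategic complementarities.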


5. Numerical Example

We provide a numerical example in this section to illustrate the equilibria of GPm. We consider the supermodular game GPm with two retailers, denoted by GPm1. The respective joint strategy space is the set:

$S=\left\{x \mid q_{1}=q_{2}, q_{2} \in[1,3)\right\} \cup([3,25] \times[3,25])$                 (21)

The sales function for the first retailer is $R_{1}: S_{1} \rightarrow \mathbb{R}$, where $R_{1}\left(q_{1}\right)=q_{1}^{3}-4 q_{1}^{2}+5$, and for the second retailer is $R_{2}: S_{2} \rightarrow \mathbb{R}$, with $R_{2}\left(q_{2}\right)=q_{2}^{3}-5 q_{2}^{2}+8$. Let the parameter values for the retailers be $p_{1}^{r}=100$, $p_{2}^{r}=95$, $c_{1}^{r}=10$, $c_{2}^{r}=8$, $h_{1}^{r}=5$, $h_{2}^{r}=6$, $b_{1}^{r}=20$, $b_{2}^{r}=20$, $w_{1}^{r}=55$, $w_{2}^{r}=55$, $D_{1}=30$, $D_{2}=25$, $g_{m}=0.1$, and $c_{r t}=25$. Therefore, the payoff functions of the first retailer are $\pi_{1}^{k_{1}}: S \rightarrow \mathbb{R}, k_{1} \in\{1,2\}$, where:

$\pi_{1}^{1}\left(q_{1}, q_{2}\right)=(90)\left(q_{1}^{3}-4 q_{1}^{2}+5\right)+\frac{1}{4} q_{1}^{2}-55 q_{1}=90 q_{1}^{3}-359.75 q_{1}^{2}-55 q_{1}+450$        (22)


$\pi_{1}^{2}\left(q_{1}, q_{2}\right)=0.1 q_{1}-25$                (23)

Similarly, the payoff functions of the second retailer are $\pi_{2}^{k_{2}}: S \rightarrow \mathbb{R}, k_{2} \in\{1,2\}$, where:

$\pi_{2}^{1}\left(q_{2}, q_{1}\right)=(87)\left(q_{2}^{3}-5 q_{2}^{2}+8\right)+\frac{7}{25} q_{2}^{2}-55 q_{2}=87 q_{2}^{3}-434.72 q_{2}^{2}-55 q_{2}+696$              (24)


$\pi_{2}^{2}\left(q_{2}, q_{1}\right)=0.1 q_{2}-25$               (25)
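The expanded coefficients in (22) and (24) can be checked with a few lines of Python. The sketch assumes the first payoff has the form $(p-c) R(q)-(h-b) \frac{q^{2}}{2 D}-w q$ used in (20), takes the sales polynomials as they appear inside the brackets of (22) and (24), and assumes a wholesale price of 55 for both retailers; the helper name is hypothetical.

```python
def retailer_profit_coefficients(p, c, h, b, w, D, sales_coeffs):
    """Coefficients (cubic, quadratic, linear, constant) of the first payoff.

    sales_coeffs = (a3, a2, a0) describes R(q) = a3*q**3 + a2*q**2 + a0.
    """
    a3, a2, a0 = sales_coeffs
    margin = p - c                        # unit margin p - c
    inventory = -(h - b) / (2.0 * D)      # q**2 coefficient of the holding/buy-back term
    return (margin * a3, margin * a2 + inventory, -w, margin * a0)

# Retailer 1: R(q) = q^3 - 4 q^2 + 5; retailer 2: R(q) = q^3 - 5 q^2 + 8.
coeffs_1 = retailer_profit_coefficients(100, 10, 5, 20, 55, 30, (1, -4, 5))
coeffs_2 = retailer_profit_coefficients(95, 8, 6, 20, 55, 25, (1, -5, 8))
print(coeffs_1)
print(coeffs_2)
```

Under these assumptions the cubic coefficient of the first retailer's payoff is 90, the same value that appears in (26) and (28).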

The joint strategic space S in (21) is a nonempty subcomplete sublattice of the lattice $\mathbb{R}^{2}$ with the lexicographic ordering relation; thus, S in (21) is a nonempty compact sublattice. The payoff functions (22), (23), (24), and (25) are upper-semicontinuous functions in $y_{i}$ on $S_{i q_{-i}}$ for each $q_{-i}$ in $S_{-i}$, $i \in\{1,2\}$. Hence, GPm1 has a Pareto equilibrium.

We then determine the Pareto equilibrium of GPm1 using Definition 2.2. Based on the form of S in (21), if one of the players plays a strategy $q_{1} \in[1,3)$, then the other must choose the same strategy. A player cannot increase his strategy as long as the other retains his choice. Therefore, every point in [1, 3) is a Pareto equilibrium. By contrast, based on the form of the joint strategy set S in (21), a strategy in [3, 25] can be chosen by each player independently of the other player's choice; that is, the players do not have to choose the same strategy. For example, when player 1 chooses q1=5, player 2 does not have to choose q2=5 and can freely choose any strategy in [3, 25]. The payoffs of the two players have different forms; however, each payoff in (22)-(25) depends only on the player's own strategy. Thus, each player will choose the strategy with the largest possible payoff (in this case, $q_{i}=25$). The Pareto equilibrium must be a solution of the multi-objective problem formed by the payoffs of all players. We use the weighted sum method under an a priori assumption: the optimization is performed on a new single objective function, obtained as the sum of the products of each payoff component and its respective weight. The weights must sum to one $\left(\delta_{1}^{1}+\delta_{1}^{2}+\delta_{2}^{1}+\delta_{2}^{2}=1\right)$. In this case, we simply use the weighted best joint response $k^{\bar{\delta}}(., .)$ for GPm1 such that:

$\begin{aligned} k^{\bar{\delta}}\left(q_{1}, q_{2}\right)= & \delta_{1}^{1}\left(90\left(q_{1}\right)^{3}-359.75\left(q_{1}\right)^{2}-55 q_{1}+450\right)+\delta_{1}^{2}\left(0.1 q_{1}-25\right) \\ & +\delta_{2}^{1}\left(87\left(q_{2}\right)^{3}-434.72\left(q_{2}\right)^{2}-55 q_{2}+696\right)+\delta_{2}^{2}\left(0.1 q_{2}-25\right) \end{aligned}$              (26)
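Because (26) is a sum of a term in $q_{1}$ only and a term in $q_{2}$ only, the lattice inequality for supermodularity holds with equality. The Python sketch below checks this numerically under the componentwise order on $\mathbb{R}^{2}$, using the second retailer's polynomial from (24) and equal weights of 0.25; the function name, the sampling box $[3,25]^{2}$, and the tolerances are assumptions made for illustration.

```python
import numpy as np

def k_weighted(q1, q2, d11=0.25, d12=0.25, d21=0.25, d22=0.25):
    """Weighted joint payoff built from the polynomial payoffs (22)-(25)."""
    pi_11 = 90.0 * q1**3 - 359.75 * q1**2 - 55.0 * q1 + 450.0
    pi_12 = 0.1 * q1 - 25.0
    pi_21 = 87.0 * q2**3 - 434.72 * q2**2 - 55.0 * q2 + 696.0
    pi_22 = 0.1 * q2 - 25.0
    return d11 * pi_11 + d12 * pi_12 + d21 * pi_21 + d22 * pi_22

rng = np.random.default_rng(0)
x = rng.uniform(3.0, 25.0, size=(1000, 2))   # random strategy pairs
y = rng.uniform(3.0, 25.0, size=(1000, 2))
join = np.maximum(x, y)                      # componentwise supremum
meet = np.minimum(x, y)                      # componentwise infimum

lhs = k_weighted(x[:, 0], x[:, 1]) + k_weighted(y[:, 0], y[:, 1])
rhs = k_weighted(join[:, 0], join[:, 1]) + k_weighted(meet[:, 0], meet[:, 1])

# Equality (up to rounding) means supermodular and submodular at once.
is_valuation = bool(np.allclose(lhs, rhs, rtol=1e-9, atol=1e-6))
print(is_valuation)
```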

As previously explained, if one of the players plays a strategy $q_{1} \in[1,3)$, then the maximum value of $k^{\bar{\delta}}(., .)$ is reached at all points contained in [1, 3). Therefore, all points in [1, 3) are fixed points of $Z^{\bar{\delta}}(.)$ associated with $k^{\bar{\delta}}(., .)$. Furthermore, the function $k^{\bar{\delta}}(., .)$ is twice differentiable on $\mathbb{R}^{2}$ and satisfies $\frac{\partial^{2} k^{\bar{\delta}}\left(q_{1}, q_{2}\right)}{\partial q_{2} \partial q_{1}}=0$ and $\frac{\partial^{2} k^{\bar{\delta}}\left(q_{1}, q_{2}\right)}{\partial q_{1} \partial q_{2}}=0$; thus, $k^{\bar{\delta}}(., .)$ is a valuation (that is, supermodular and submodular simultaneously). Therefore, the function $k^{\bar{\delta}}(\mathbf{x}, .)$ has increasing differences on $\mathbb{R}^{2}$ for each $\mathbf{x} \in \mathbb{R}^{2}$. The joint strategic space S is a chain and $k^{\bar{\delta}}(\mathbf{x}, .)$ is supermodular on S for $\mathbf{x} \in S$. These two conditions imply that $k^{\bar{\delta}}(\mathbf{x}, .)$ does not reach its maximum at a stationary point that is not an endpoint of the interval $S_{i}=[3,25]$. The function $k^{\bar{\delta}}(\mathbf{x}, .)$ is continuous (and hence upper semicontinuous) on S×S. Therefore, $k^{\bar{\delta}}(\mathbf{x}, .)$ reaches its maximum at the point $q_{i}=25$, i=1, 2. All of the weights are less than one; thus, the weighted payoff of each player is smaller than the unweighted payoff. The maximum total weight that can be taken by the first player and the second player is $\delta_{1}^{1}+\delta_{1}^{2}=0.5$ and $\delta_{2}^{1}+\delta_{2}^{2}=0.5$, respectively. Each player must set priorities when assigning weights to their payoff components. If they play rationally, then they will give as much weight as possible to the first payoff. However, assigning a zero weight to one of the payoff functions is not permitted. Hence, the Pareto equilibrium for GPm1 is:

$\mathbf{x}^{\star}=\left\{\left(q_{1}^{\star}, q_{2}^{\star}\right) \mid q_{1}^{\star}, q_{2}^{\star} \in[1,3) \cup\{25\}\right\}$                 (27)

with the optimum values of the weights satisfying $\delta_{1}^{1 \star}+\delta_{1}^{2 \star}=0.5$ and $\delta_{2}^{1 \star}+\delta_{2}^{2 \star}=0.5$. The second equilibrium is the weighted Nash equilibrium, which we determine using Definition 2.5. A strategy $q_{1}^{*} \in S_{1}$ of the first player is a weighted Nash strategy with respect to the optimum strategy of the second player if the following condition holds.

$\begin{aligned} & \delta_{1}^{1}\left(90\left(q_{1}\right)^{3}-359.75\left(q_{1}\right)^{2}-55 q_{1}+450\right)+\delta_{1}^{2}\left(0.1 q_{1}-25\right) \\ \leq & \delta_{1}^{1}\left(90\left(q_{1}^{*}\right)^{3}-359.75\left(q_{1}^{*}\right)^{2}-55 q_{1}^{*}+450\right)+\delta_{1}^{2}\left(0.1 q_{1}^{*}-25\right) \end{aligned}$          (28)

If $q_{2}^{*} \in[1,3)$, then the first player must choose the same strategy. For each $\delta_{1}^{1}, \delta_{1}^{2} \in \mathbb{R}_{++}$, any $q_{1} \in[1,3)$ satisfies (10). By contrast, if the strategies lie in [3, 25], then $q_{1}^{*}=25$ satisfies (10) for each $\delta_{1}^{1}, \delta_{1}^{2} \in \mathbb{R}_{++}$. Furthermore, a strategy $q_{2}^{*} \in S_{2}$ is a weighted Nash strategy with respect to the optimum strategy of the first player if the following condition holds.

$\begin{aligned} & \delta_{2}^{1}\left(87\left(q_{2}\right)^{3}-434.72\left(q_{2}\right)^{2}-55 q_{2}+696\right)+\delta_{2}^{2}\left(0.1 q_{2}-25\right) \\ \leq & \delta_{2}^{1}\left(87\left(q_{2}^{*}\right)^{3}-434.72\left(q_{2}^{*}\right)^{2}-55 q_{2}^{*}+696\right)+\delta_{2}^{2}\left(0.1 q_{2}^{*}-25\right) \end{aligned}$                (29)

Similar to the first player, for each $\delta_{2}^{1}, \delta_{2}^{2} \in \mathbb{R}_{++}$ and $q_{1}^{*} \in[1,3)$, any $q_{2} \in[1,3)$ satisfies (10). However, if the strategies lie in [3, 25], then $q_{2}^{*}=25$ satisfies (10) for each $\delta_{2}^{1}, \delta_{2}^{2} \in \mathbb{R}_{++}$. Hence, the set of equilibria (27) is also the set of weighted Nash equilibria. The largest equilibrium is $\left(q_{1}^{*}, q_{2}^{*}\right)=(25,25)$ and the least equilibrium is $\left(q_{1}^{*}, q_{2}^{*}\right)=(1,1)$.
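Conditions (28) and (29) can also be checked on a finite grid. The Python sketch below assumes the integer strategies 3, 4, …, 25 for both players and equal weights of 0.25, builds the two weighted payoff matrices with NumPy, and marks the cells where each player is already at a best response. It uses $q_{2}$ throughout the second retailer's polynomial, as in (24), and is a vectorized illustration rather than the authors' code.

```python
import numpy as np

q = np.arange(3, 26, dtype=float)       # assumed strategies 3, 4, ..., 25
w11 = w12 = w21 = w22 = 0.25            # assumed equal weights

# Weighted payoff of each player as a function of the own order quantity.
u1 = w11 * (90 * q**3 - 359.75 * q**2 - 55 * q + 450) + w12 * (0.1 * q - 25)
u2 = w21 * (87 * q**3 - 434.72 * q**2 - 55 * q + 696) + w22 * (0.1 * q - 25)

# Weighted payoff matrices: rows index q1, columns index q2.
U1 = np.repeat(u1[:, None], q.size, axis=1)
U2 = np.repeat(u2[None, :], q.size, axis=0)

# A cell is an equilibrium when the row player is maximal within its column
# and the column player is maximal within its row.
row_best = U1 == U1.max(axis=0, keepdims=True)
col_best = U2 == U2.max(axis=1, keepdims=True)
equilibria = [(float(q[i]), float(q[j])) for i, j in np.argwhere(row_best & col_best)]
print(equilibria)
```

On this grid the only cell that passes both checks is the largest strategy pair, in line with the analytical result for the interval [3, 25].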

Next, we use the algorithms to obtain the two equilibria numerically. For the Pareto equilibrium, only the criteria related to the multi-objective optimization are examined. The strategies in [1, 3) do not require a numerical test; therefore, the numerical test is performed only for the strategies in [3, 25]. We use the genetic algorithm NSGA II [30] to obtain solutions numerically, with a weight value of 0.25 for each payoff component. A genetic algorithm requires several input parameters, which are commonly described in terms borrowed from biology and genetics, such as population, crossover, and offspring. We choose values for the population size, the termination population, and the number of offspring as the initial values for the algorithm, and we take a crossover probability of 0.4 for the numerical test. The optimal results for several different parameter values are presented in Table 1.
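As a cross-check that does not depend on a genetic algorithm, the Pareto set can be enumerated by brute force on a coarse grid. The Python sketch below is not NSGA II and does not use pymoo: it treats the four payoff components (22)-(25) as objectives to be maximized, assumes integer strategies 3, 4, …, 25 for both players, and keeps only the non-dominated strategy pairs. The function names are hypothetical.

```python
import numpy as np

def objectives(q1, q2):
    """The four payoff components (22)-(25), all to be maximized."""
    return np.array([
        90.0 * q1**3 - 359.75 * q1**2 - 55.0 * q1 + 450.0,
        0.1 * q1 - 25.0,
        87.0 * q2**3 - 434.72 * q2**2 - 55.0 * q2 + 696.0,
        0.1 * q2 - 25.0,
    ])

def non_dominated(points, values):
    """Keep the points whose objective vector no other point dominates."""
    keep = []
    for i, v in enumerate(values):
        dominated = any(
            np.all(u >= v) and np.any(u > v)
            for j, u in enumerate(values) if j != i
        )
        if not dominated:
            keep.append(points[i])
    return keep

grid = range(3, 26)                                  # assumed integer grid
points = [(q1, q2) for q1 in grid for q2 in grid]
values = [objectives(q1, q2) for q1, q2 in points]
pareto = non_dominated(points, values)
print(pareto)
```

Exhaustive filtering scales quadratically with the number of grid points, so it is only practical for small strategy sets; a population-based method such as NSGA II is the natural choice for finer grids or more players.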

Based on Table 1, if we use a large initial population (more than 150) in NSGA II, then the equilibrium value tends to q1=25 and q2=25. Therefore, we obtain a single Pareto equilibrium $x^{\star}=\left(q_{1}^{\star}, q_{2}^{\star}\right)=(25,25)$. This Pareto equilibrium is the optimal solution for each player with respect to their multiple objectives. If one of the players chooses a strategy in [1, 3), then all of the points in [1, 3) are Nash equilibria for the other player based on the definition of S in (21). Furthermore, we determine the weighted Nash equilibrium when the players choose a strategy in the interval [3, 25] by applying the dominance principle. We take 0.25 as the weight for each component of the payoff function. Each player has 23 strategies, namely $q_{i}=3, 4, \ldots, 25$. Therefore, we obtain a weighted payoff matrix with 529 elements. Using the algorithm proposed in the previous section, we obtain the weighted Nash equilibrium $\left(q_{1}^{*}, q_{2}^{*}\right)=(25,25)$. If both players choose this strategy, player 1 earns a total payoff (sum of the first payoff and the second payoff) of 295,114.875 (in IDR 1000), while player 2 obtains 271,743.563 (in IDR 1000). Furthermore, each player can pay minimal costs for the returning process while increasing profits simultaneously by taking a higher strategy (qi>1). Because all players are assumed to use complementary strategies, we suggest that all players choose the highest equilibrium $\left(q_{1}^{*}, q_{2}^{*}\right)=(25,25)$. All retailers can then obtain the maximum profit and pay the minimum returning cost.
Therefore, the least and the highest weighted Nash equilibria are $\left(q_{1}^{*}, q_{2}^{*}\right)=(1,1)$ and $\left(q_{1}^{*}, q_{2}^{*}\right)=(25,25)$, respectively. The weighted Nash equilibrium of the game played by the retailers is used by the manufacturer to determine his optimal move. Let the parameter values for the manufacturer be cm=13, hm=4, and P=60. The manufacturer obtains one of two possible payoffs according to the optimum quantity ordered by the retailers.

(1) If each retailer chooses to play the least optimum strategy $\left(q_{1}^{*}, q_{2}^{*}\right)=(1,1)$, then the manufacturer will use q*=1+1=2 as his strategy and will earn πm=68.76 (in IDR 1000).

(2) If each retailer chooses to play the largest optimum strategy $\left(q_{1}^{*}, q_{2}^{*}\right)=(25,25)$, then the manufacturer will use q*=25+25=50 as his strategy and will earn πm=1,039.167 (in IDR 1000).

Hence, if each retailer takes the high-order quantities, then such a condition is profitable for the manufacturer.
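The two manufacturer payoffs can be reproduced from (17) with a short Python sketch. Two assumptions are made here and are not stated explicitly in the text: the wholesale price is 55 for both retailers, and the trailing term $g_{m} q_{i}$ of (17) is evaluated once, at the common order quantity of a single retailer. The function name is hypothetical.

```python
def manufacturer_payoff(q_orders, w, b, c_m=13.0, h_m=4.0, P=60.0,
                        c_rt=25.0, g_m=0.1):
    """Manufacturer profit following the structure of (17)."""
    q_total = sum(q_orders)
    revenue = sum((w_i - b_i) * q_i for w_i, b_i, q_i in zip(w, b, q_orders))
    holding = h_m * q_total**2 / (2.0 * P)      # holding cost term (16)
    # Assumption: the last term uses one retailer's quantity (symmetric case).
    return revenue - c_m * q_total - holding + c_rt - g_m * q_orders[0]

low = manufacturer_payoff([1, 1], w=[55, 55], b=[20, 20])
high = manufacturer_payoff([25, 25], w=[55, 55], b=[20, 20])
print(round(low, 3), round(high, 3))
```

Under these assumptions the sketch gives values close to the two payoffs reported in cases (1) and (2), and it makes the gap between the low-order and high-order scenarios easy to recompute for other parameter values.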

Table 1. Numerical test for Pareto equilibrium using NSGA II

[Table body not recovered; surviving column headers: Initial Population, Termination Population]

6. Conclusions

A new multi-objective inventory game for a single manufacturer and multiple retailers has been formulated under the synchronization process, the wholesale contract, and the buy-back contract. The manufacturer is the coordinator of the system and determines the details of the contract. The retailers play a non-cooperative game to determine their optimum decisions. This game is analyzed using the concepts of non-cooperative supermodular multi-objective games. The second objective of each retailer must first be converted into a maximization problem to obtain the second payoff. Each retailer can pay a minimal cost for the returning process while increasing profits simultaneously by taking the higher strategy. We thereby obtain the least and the highest weighted Nash equilibria, which are included in the Pareto equilibrium set. We use a genetic algorithm to obtain the Pareto equilibrium, and the dominance principle is applied to obtain the weighted Nash equilibrium.


This work is supported by BPPDN 2019 Ph.D. Scholarships from the Higher Education General Director (Dirjen Dikti) of the Ministry of Education, Culture, Research, and Technology (Kemendikbudristek) of the Republic of Indonesia.


[1] Parlar, M. (1988). Game theoretic analysis of the substitutable product inventory problem with random demands. Naval Research Logistics, 35: 397-409.

[2] Cachon, G.P. (2001). Stock wars: Inventory competition in a two-echelon supply chain with multiple retailers. Operations Research, 49: 558-674. 

[3] Lippman, S.A., McCardle, K.F. (1994). The competitive newsboy. Operations Research, 45: 54-65.

[4] Silbermayr, L. (2020). A review of non-cooperative news vendor games with horizontal inventory interactions. Omega, 92: 1-12.

[5] Cachon, G.P., Netessine, S. (2004). Game theory in supply chain with multiple retailers. In: Wu, S.D., Sen, Z.-J. (eds) Handbook of Quantitative Supply Chain Analysis: Modelling in E-Business Era, pp. 13-59.

[6] Mahajan, S., van Ryzin, G. (2001). Inventory competition under dynamic consumer choice. Operations Research, 49(5): 646-657.

[7] Yin, S., Nishi, T., Zhang, G. (2016). A game theoretic model for coordination of single manufacturer and multiple suppliers with quality variations under uncertain demands. International Journal of Systems Science: Operations & Logistics, 3(2): 79-91.

[8] Setiawan, R., Salmah, Widodo, Endrayanto, I., Indarsih. (2021). Nash equilibrium for manufacturer-retailer inventory model with perishable goods and time depending holding cost. In: The Third International Conference on Mathematics: Education, Theory, and Application, AIP Conference Proceedings, 2326: 1-5.

[9] Ji, Y., Liu, M., Qu, S. (2018). Multi-objective linear programming and applications in supply chain competition. Future Generation Computer Systems, 86: 591-597.

[10] Lozovanu, D., Solomon, D., Zelikovsky, A. (2005). Multi-objective games and determining Pareto-Nash equilibria. Buletinul Academiei de Stiinte a Republicii Moldova, Matematica, 3(49): 115-122.

[11] Topkis, D.M. (1978). Minimizing a submodular function on a lattice. Operations Research, 26(2): 305-321.

[12] Topkis, D.M. (1979). Equilibrium points in nonzero-sum n-person submodular games. SIAM Journal on Control and Optimization, 17(6): 773-787.

[13] Topkis, D.M. (1998). Supermodularity and Complementarity. Princeton University Press, New Jersey.

[14] Vives, X. (1990). Nash equilibrium with strategic complementarities. Journal of Mathematical Economics 19(3): 305-321.

[15] Milgrom, P., Roberts, J. (1990). Rationalizability, learning, and equilibrium in games with strategic complementarities. Econometrica, 58(6): 1255-1277.

[16] Milgrom, P., Shannon, C. (1994). Monotone comparative statics. Econometrica, 62(1): 157-180.

[17] d’Orey, V. (1996). Fixed point theorems for correspondences with values in a partially ordered set and extended supermodular games. Journal of Mathematical Economics, 25: 345-354.

[18] Amir, R. (1996). Cournot oligopoly and the theory of supermodular games. Games and Economic Behaviour, 15(2): 132-148.

[19] Graziosi, G.R. (2019). The supermodularity of the tax competition game. Journal of Mathematical Economics, 83: 25-35.

[20] Koshevoy, G., Suzuki, T., Talman, D. (2016). Supermodular NTU-games. Operations Research Letters, 44(4): 446-450.

[21] Chen, Y.J. (2009). Monotonicity in the stock competition game with consumer search. Operations Research Letters, 37(6): 430-432.

[22] Brânzei, L., Mallozi, L., Tijs, S. (2003). Supermodular games and potential games. Journal of Mathematical Economics, 39(1): 39-49.

[23] Hoque, M.A. (2008). Synchronization in the single-manufacturer multi-buyer integrated inventory supply chain. Production, Manufacturing, and Logistics, 188: 811-825.

[24] Jha, J., Shanker, K. (2013). Single-vendor multi-buyer integrated production-inventory model with controllable lead time and service level constraints. Applied Mathematical Modelling, 37(4): 1753-1767.

[25] Mandal, P., Giri, B.C. (2015). A single-vendor multi-buyer integrated model with controllable lead time and quality improvement through reduction in defective items. International Journal of Systems Science: Operations & Logistics, 2(1): 1-14.

[26] Setiawan, R., Salmah, Widodo, Endrayanto, I., Indarsih. (2021). Analysis of the single-vendor-multi-buyer-inventory model for imperfect quality with controllable lead time. IAENG Journal of Applied Mathematics, 51(3): 1-10.

[27] Pasternack, B.A. (2008). Optimal pricing and return policies for perishable commodities. Marketing Science, 27(1): 133-140.

[28] Wang, Y., Gerchak, Y. (2001). Supply chain coordination when demand is shelf-space dependent. Manufacturing & Service Operations Management, 3(1): 82-87.

[29] Sadrzadeh, A. (2012). A genetic algorithm with the heuristic procedure to solve the multi-line layout problem. Computers & Industrial Engineering, 62(4): 1055-1064.

[30] Blank, J., Deb, K. (2020). Multi-objective optimization in Python. IEEE Access, 8: 89497-89509.