#import "@preview/subpar:0.2.2"
#import "@preview/cetz:0.3.4": canvas, draw
#import "../utils/blocks.typ": *
#import "@preview/theorion:0.3.3": *
#import cosmos.simple: *
#import "notations.typ": *
#show: show-theorion

#set text(lang: "en")
#set math.equation(numbering: none)

#let cost = math.op("cost")
#let val = math.op("val")
#let rellr(l, r) = {
  math.op(eval("$-->_(" + l + ",R," + r + ")$"))
}

#let rell(l) = {
  math.op(eval("$-->_(" + l + ",R)$"))
}

#let relr(r) = {
  math.op(eval("$-->_(R," + r + ")$"))
}
#let rel() = math.op($-->_R$)

#let torel = math.op("rel")

#let downlr(l, r) = {
  math.op(eval("$-->_(" + l + ",D," + r + ")$"))
}

#let downl(l) = {
  math.op(eval("$-->_(" + l + ",D)$"))
}

#let downr(r) = {
  math.op(eval("$-->_(D," + r + ")$"))
}

#let down() = math.op(eval("$-->_D$"))
#let hide = math.op("hide")
#let rem = math.op("rem")
#let bay = math.op("Bay")
#let add = math.op("add")
#let free = math.op("free")


#set heading(numbering: "1.")










= Introduction <intro>

In this section we give all the formal definitions needed
to prove that the heuristic $L$ is worst-case optimal.
We suppose that $N,W,H in NN$ are given constants, which
are respectively the initial number of containers, the number of stacks
and the maximal height of the stacks. Note that we must have
$N < W dot H$ for the game to be feasible; otherwise no container
could be relocated.

To define the game, we first have to define the states
of both players: player $1$, who relocates
the containers, and player $2$, who chooses which container
to retrieve. Intuitively, a state of player $2$ is just a
configuration of the bay, which can be represented by a single
vector whose components are the heights of the stacks. We will sometimes prefer a set of coordinates, which is easier to manipulate. A state of player $1$ is a configuration of the bay together with the coordinates of the requested container. This container must be
high enough so that there is enough free space for player $1$ to move
every container above it. This is formally defined in @def_states.

#definition(title: "states and notations")[
  - $Q_2$, the set of states of player $2$, consists of every $s in [|0,H|]^W$.
  - For any $s in Q_2$ and $1 <= i <= W$, we denote by $s_i$ the $i$-th coordinate of $s$, i.e. the height of the $i$-th stack of $s$, and by $|s| = sum_(i=1)^W s_i$ the number of containers of $s$.
  - $Q_1$, the set of states of player $1$, consists of every $s^(i,j) = (s,i,j) in Q_2 times [|1,W|] times [|1,H|]$ such that $1 <= j <= s_i$ (player $2$ must request a container, not an empty space) and $(W - 1) dot H >= |s| - j$ (there must be enough space for player $1$ to move the containers above the requested one).
  - For any $s^(i,j) in Q_1$, the number of containers player $1$ has to relocate is independent of his decisions: it equals $s_i - j$ and is denoted by $torel(s, i, j)$.
] <def_states>
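
To make @def_states concrete, here is a minimal Python sketch in which a configuration is stored as the tuple of its heights. The function names `torel` and `is_valid_request` are chosen for this sketch only; stacks and heights are numbered from $1$, as in the text.

```python
from typing import Sequence


def torel(s: Sequence[int], i: int, j: int) -> int:
    """Number of containers above the requested container (i, j), i.e. s_i - j."""
    return s[i - 1] - j


def is_valid_request(s: Sequence[int], i: int, j: int, H: int) -> bool:
    """Check that (s, i, j) belongs to Q_1 for a bay of maximal height H."""
    W = len(s)
    if not (1 <= i <= W and 1 <= j <= s[i - 1]):
        return False  # the request must point at an actual container
    # Enough room must remain in the other stacks: (W - 1) * H >= |s| - j.
    return (W - 1) * H >= sum(s) - j


a1 = (2, 1, 2)  # heights of a bay with W = 3 stacks, used with H = 3
print(is_valid_request(a1, 1, 1, H=3), torel(a1, 1, 1))  # True 1
```

With heights $(5,4,4)$ and $H = 5$, the same check rejects the request $(1,2)$ and accepts $(1,3)$, since only two free positions remain in the other stacks.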

#example()[
  Let us consider $a_1 = {(1,1),(2,1),(3,1),(1,2),(3,2)}$ in a bay of width $W = 3$ and height $H = 3$; a representation of $a_1$ is given in @imex1a. If the bottom-left container is requested, we get the state $a_1^(1,1)$ given in @imex1b, where the requested container is the one with a dot in it.
]

#let imex1a = im_blocks((
  (
    (2, 1, 2),
    none-style(),
  ),
))

#let imex1b = im_blocks((
  (
    (1, 0, 0),
    request(),
  ),
  (
    (1, 1, 2),
    none-style(),
  ),
))

#let imex1c = im_blocks((
  (
    (2, 0, 0),
    none-style(stroke_ext: (dash: "dashed")),
  ),
  (
    (0, 1, 3),
    none-style(),
  ),
))

#subpar.grid(
  figure(imex1a, caption: [The state $a_1$]), <imex1a>, figure(imex1b, caption: [The state $a_1^(1,1)$]),
  <imex1b>, figure(imex1c, caption: [The state $a_2$]), <imex1c>,
  columns: (1fr, 1fr, 1fr),
  caption: [Representation of basic configurations],
  label: <ex1>,
)

We need some notation to manipulate movements of containers between the stacks more easily: adding a container, removing (retrieving) one, or relocating one from a stack to another.

#definition(title: "retrieve - relocation")[
  - For $i in [|1,W|]$ we call $e_i$ the vector whose components are all $0$ except for a $1$ at position $i$. Hence for $s$ in $Q_2$, if $s_i < H$, the state $s + e_i$ is the same configuration of containers as $s$ with an additional one on stack $i$.
  - Similarly, if $s_i > 0$, $s - e_i$ is the configuration obtained by retrieving a container from stack $i$.
  - For any $1 <= i,j <= W$ and $s in Q_2$, if $s_j,s_i < H$ we say that $s + e_i$ relocates to $s + e_j$, which we denote by $s + e_i rellr("i", "j") s + e_j$. Moreover, if $s_i > s_j$ we say that $s + e_i$ downs to $s + e_j$, which we denote by $s + e_i downlr("i", "j") s + e_j$.
]
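
The relations above can be sketched in Python as follows, again on tuples of heights. The helper names `relocate` and `is_down` are ours and purely illustrative. The state `u` plays the role of $s + e_i$, so the condition $s_i > s_j$ of the definition reads `u[i - 1] - 1 > u[j - 1]`.

```python
from typing import Tuple

State = Tuple[int, ...]


def relocate(u: State, i: int, j: int, H: int) -> State:
    """Move the top container of stack i onto stack j (stacks are 1-based)."""
    if u[i - 1] == 0:
        raise ValueError("stack i is empty")
    if i != j and u[j - 1] >= H:
        raise ValueError("stack j is full")
    v = list(u)
    v[i - 1] -= 1
    v[j - 1] += 1
    return tuple(v)


def is_down(u: State, i: int, j: int) -> bool:
    """Writing u = s + e_i, the move from i to j is a 'down' move when s_i > s_j."""
    return i != j and u[i - 1] - 1 > u[j - 1]


u = (3, 1, 1)
print(relocate(u, 1, 2, H=3), is_down(u, 1, 2))  # (2, 2, 1) True
```

A move between two positions of the same height, such as from stack $1$ to stack $2$ in $(2,1,1)$, is a relocation but not a down move.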

These notations help us define the transitions of the game. Intuitively, player $2$ just has to keep the same configuration and choose the coordinates of the container he requests, and player $1$ has to move the $torel(s, i, j)$ containers above the requested one and then (or before) retrieve it.

#definition(title: "CRP transition")[
  For any two states $s in Q_2$ and $t^(i,j) = (t,i,j) in Q_1$, we have:
  - $s --> t^(i,j)$ if and only if $s = t$
  - $t^(i,j) --> s$ if and only if $t - e_i rell("i")^(torel(t^(i,j))) s$
]
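
As a sanity check of the transition relation, the following Python sketch enumerates the configurations that player $1$ can reach from a state of $Q_1$. It assumes that relocated containers are placed on a stack other than stack $i$ and that only the final heights matter; the function name `successors` is ours.

```python
from itertools import combinations_with_replacement
from typing import Set, Tuple

State = Tuple[int, ...]


def successors(t: State, i: int, j: int, H: int) -> Set[State]:
    """All configurations player 1 can reach from the request (t, i, j)."""
    W = len(t)
    k = t[i - 1] - j                      # torel(t, i, j)
    base = list(t)
    base[i - 1] = j - 1                   # the requested container and those above it leave stack i
    others = [c for c in range(W) if c != i - 1]
    reachable = set()
    # A choice of player 1 is summarised by a multiset of k target stacks.
    for targets in combinations_with_replacement(others, k):
        s = base[:]
        for c in targets:
            s[c] += 1
        if max(s) <= H:                   # no stack may exceed the maximal height
            reachable.add(tuple(s))
    return reachable


print(sorted(successors((2, 1, 2), 1, 1, H=3)))  # [(0, 1, 3), (0, 2, 2)]
```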

#example()[
  We also consider the state $a_2 = add_3(rem_1^2 (a_1))$, which satisfies $a_1^(1,1) --> a_2$ and is represented in @imex1c; the previous positions of relocated or retrieved containers are dashed.
]


#definition(title: "CRP game")[
  Given $N,H,W in NN$, the CRP game $G$ is defined as the tuple $(Q_2,Q_1,-->)$. We also call $G_s$ the CRP game starting from the state $s in Q_2$.
] <thm:game>

The next step is to introduce strategies. In this section, the strategies considered for both players are adaptive, memoryless and deterministic.

#definition(title: "strategies")[
  A strategy for player $1$ is a mapping $sigma : Q_1 -> Q_2$ such that for any $s^(i,j) in Q_1$, $s^(i,j) --> sigma(s^(i,j))$. Similarly, a strategy for player $2$ is a mapping $pi : Q_2 -> Q_1$ such that for any $s in Q_2$, $s --> pi(s)$. We call $Sigma$ and $Pi$ the sets of strategies of players $1$ and $2$, respectively.
]

When both players fix a strategy and an initial state is given, a play is induced, which is a sequence of requests and retrievals. The cost of this play is the number of relocations made by player $1$. A play is necessarily finite and ends in the state with no containers, since every move of player $1$ removes one container.

#definition(title: "play, value of a play")[
  Given an initial state $s in Q_1 union Q_2$ and two strategies
  $sigma in Sigma, pi in Pi$, a play $s = s_0 --> s_1 --> ... --> s_K = emptyset$ is induced. The cost of this play is the number of relocations, i.e. $sum_(0 <= k <= K\ s_k in Q_1) torel(s_k)$. We call this value $val_(sigma,pi)(s)$.
]

#let imex2a = im_blocks((
  (
    (1, 0, 0),
    request(),
  ),
  (
    (2, 1, 1),
    none-style(),
  ),
))

#let imex2b = im_blocks((
  (
    (3, 0, 0),
    none-style(stroke_ext: (dash: "dashed")),
  ),
  (
    (0, 0, 1),
    request(),
  ),
  (
    (0, 2, 1),
    none-style(),
  ),
))

#let imex2c = im_blocks((
  (
    (0, 0, 2),
    none-style(stroke_ext: (dash: "dashed")),
  ),
  (
    (0, 1, 0),
    none-style(),
  ),
  (
    (0, 1, 0),
    request(),
  ),
  (
    (0, 1, 0),
    none-style(),
  ),
))

#let imex2d = im_blocks((
  (
    (1, 1, 1),
    none-style(),
  ),
  (
    (0, 2, 0),
    none-style(stroke_ext: (dash: "dashed")),
  ),
))

#subpar.grid(
  figure(imex2a, caption: [The state $b_1^(1,1)$]),
  <imex2a>,
  figure(imex2b, caption: [The state $b_2^(3,1)$]),
  <imex2b>,

  figure(imex2c, caption: [The state $b_3^(2,2)$]), <imex2c>, figure(imex2d, caption: [The state $b_4$]), <imex2d>,
  columns: (1fr, 1fr, 1fr, 1fr),
  caption: [Representation of a play of value $4$],
  label: <ex2>,
)

#example()[
  Let us consider $b_1 = {(1,1),(1,2),(1,3),(2,1),(3,1)}$ and the play $b_1 --> b_1^(1,1) --> b_2 --> b_2^(3,1) --> b_3 --> b_3^(2,2) --> b_4 --> ...$. From $b_1$ to $b_4$, the value of the play is obtained by counting the number of relocations, hence it is $torel(b_1^(1,1)) + torel(b_2^(3,1)) + torel(b_3^(2,2)) = 2 + 1 + 1 = 4$. By noticing that the induced cost from $b_4$ must be zero, and denoting by $sigma_0,pi_0$ some strategies inducing this play, we get $val_(sigma_0,pi_0)(b_1) = 4$.
]
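
The computation of the example can be replayed with a few lines of Python. This is only a sketch: the heights of $b_2$ and $b_3$ are read off @imex2b and @imex2c, and the name `play_cost` is ours.

```python
from typing import Iterable, Tuple

Request = Tuple[Tuple[int, ...], int, int]  # (heights, stack i, height j), 1-based


def play_cost(requests: Iterable[Request]) -> int:
    """Sum of torel(s, i, j) = s_i - j over the player-1 states of a play."""
    return sum(s[i - 1] - j for s, i, j in requests)


# Player-1 states of the example play, given by their heights.
example_play = [
    ((3, 1, 1), 1, 1),  # b_1^(1,1): two containers above the request
    ((0, 2, 2), 3, 1),  # b_2^(3,1): one container above the request
    ((0, 3, 0), 2, 2),  # b_3^(2,2): one container above the request
]
print(play_cost(example_play))  # 4
```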


















= Optimal strategies <optimal_strat>

In this section we describe the leveling strategy $L$ for player $1$ and the bottom strategy $F$ (from the French _fond_) for player $2$. The study of these strategies can be tedious; the following observations and conventions, which do not need a formal proof, make it more digestible:
- Since the stacks can be differentiated only by their height, two states whose heights are a permutation of each other have the same guarantee for both players, if they play optimally.
- Hence we can always consider the heights of the stacks in decreasing order.
- Depending on the context, $s_i$ can refer to the $i$-th stack of $s$ or to the height of this same stack with no ambiguity, as in the two following sentences: "Let us look at the stack $s_1$" or "$s_1$ containers have been relocated".

#definition(title: "strategies L & F")[
  - The strategy $L$ for player $1$ consists in relocating the containers one by one onto the smallest stack. When several stacks are available, it chooses the leftmost one.
  - The strategy $F$ for player $2$ consists in requesting the bottom container of the biggest stack. If this is not possible, it asks for the lowest possible container of this same stack. If several stacks have maximal height, it chooses the leftmost one.
]
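
A possible Python transcription of $L$ and $F$ is sketched below, on tuples of heights. It relies on two assumptions that are our reading of the definition: $L$ never puts a relocated container back on the stack of the requested container, and the lowest container that $F$ can request in a stack is at height $max(1, n - (W-1)H)$, which is the smallest $j$ allowed by @def_states.

```python
from typing import Tuple

State = Tuple[int, ...]


def F(s: State, H: int) -> Tuple[int, int]:
    """Request (i, j) of F: lowest admissible container of the leftmost tallest stack."""
    W, n = len(s), sum(s)
    if n == 0:
        raise ValueError("empty bay: nothing to request")
    i = max(range(W), key=lambda c: (s[c], -c))   # tallest stack, leftmost on ties
    j = max(1, n - (W - 1) * H)                   # smallest j with (W - 1) * H >= n - j
    return i + 1, j


def L(s: State, i: int, j: int, H: int) -> State:
    """Answer of L to the request (i, j): relocate one by one, then retrieve."""
    t = list(s)
    for _ in range(t[i - 1] - j):                 # torel(s, i, j) relocations
        t[i - 1] -= 1
        free = [c for c in range(len(t)) if c != i - 1 and t[c] < H]
        target = min(free, key=lambda c: (t[c], c))  # lowest stack, leftmost on ties
        t[target] += 1
    t[i - 1] -= 1                                 # retrieve the requested container
    return tuple(t)


c = (5, 4, 4)            # three stacks in a bay of height H = 5
print(F(c, H=5))         # (1, 3)
print(L(c, 1, 3, H=5))   # (2, 5, 5)
```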

#let imex3a = custom_im_blocks((
  (
    (2, 0, 0),
    none-style(),
  ),
  (
    (1, 0, 0),
    request(),
  ),
  (
    (2, 4, 4),
    none-style(),
  ),
))

#let imex3b = custom_im_blocks((
  (
    (4, 0, 4, 3),
    none-style(),
  ),
))

#let imex3c = custom_im_blocks((
  (
    (0, 0, 1, 0),
    request(),
  ),
  (
    (4, 0, 3, 1),
    none-style(),
  ),
))

#let imex3d = canvas(length: 0.9cm, draw_blocks((
  (
    (0, 0, 4, 0),
    none-style(stroke_ext: (dash: "dashed")),
  ),
  (
    (4, 2, 0, 2),
    none-style(),
  ),
)))





#subpar.grid(
  figure(imex3a, caption: [The state $F(c)$]), <imex3a>, figure(imex3b, caption: [The state $d$]), <imex3b>,
  figure(imex3c, caption: [The state $F(d)$]), <imex3c>, figure(imex3d, caption: [The state $L(F(d))$]), <imex3d>,
  columns: (1fr, 1fr, 1fr, 1fr),
  caption: [Beginning of a play induced by $L,F$],
  label: <ex3>,
)

#example()[
  Let us consider the state $c$ represented in @imex3a, without the request. The strategy $F$ chooses the first stack because it is the highest, then requests the lowest container it can, i.e. the third one: if $H = 5$, there are only two positions where player $1$ can relocate containers.
]

Before going into the details, one must notice that when $L$ has to place containers, it fills the empty space from left to right and from bottom to top. Hence, except for the stack containing the requested container, which will always be stack number $1$ when $F$ plays, the decreasing order of the heights of the stacks is maintained.

Given a state $s$ of size $n$ that is not almost full, i.e. such that $n <= (W-1)H + 1$, we know that $F$ will request the container $(1,1)$, so $L$ has to relocate $s_1 - 1$ containers. We would like to have more information about which stacks $L$ will choose. Since $L$ has the choice between $W-1$ stacks (and not $W$) to place the containers and aims for the lowest stacks, we have the following lemma.

#lemma()[
  Let $s in Q_2$ be of size $n$ such that $n <= (W-1)H + 1$. Then $F(s) = s^(1,1)$.
]

The behavior of these strategies is quite regular, hence it is possible to compute the number of relocations induced by $L$ and $F$ on any state. We first give an intuition of what happens during the play: it converges to a certain form of state, where one stack is empty and all the other stacks have the same height, up to a difference of $1$ container. This leads us to define the notion of leveled state, which will be useful later, and some other definitions to handle the side cases that can happen at the beginning of a play.

#definition(title: "class of states")[
  - Let $s in Q_2$ be of size $n = |s|$, and let $h_1 >= h_2 >= ... >= h_W$ be the heights of its stacks in decreasing order. $s$ is said to be $k$-leveled if it has an empty stack ($h_W = 0$), if $h_k > ceil((n-k+1) / (W-1))$ (a condition which is void when $k = 0$) and if for all $i > k, h_i <= ceil((n-i+1) / (W-1))$.
  - It is said to be almost full if $|s| > (W-1) dot H + 1$, i.e. when the opponent cannot ask for the deepest containers.
  - For $h in [|0,H|]$, it is said to be $h$-filled if it has a stack of height $h$ and all the other stacks are full (i.e. have a height of $H$). Note that if $s$ is $h$-filled for some $h >= 2$ then it is almost full.
  - Finally, $s$ is $k$-regular if it is not almost full, has no empty stack, and if $h_k > ceil((n-k+1) / (W-1))$ and for all $i > k, h_i <= ceil((n-i+1) / (W-1))$.
]
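
The classes above can be computed mechanically. In the Python sketch below, the function `classify` (a name of our own) returns the class of a state given by its heights, taking for $k$ the largest index $i$ such that $h_i > ceil((n-i+1) / (W-1))$, or $0$ if there is none. It assumes $W >= 2$ and treats the empty bay as $0$-leveled.

```python
from typing import Optional, Tuple


def ceil_div(a: int, b: int) -> int:
    """Ceiling of a / b for b > 0, using integer arithmetic only."""
    return -(-a // b)


def classify(s: Tuple[int, ...], H: int) -> Tuple[str, Optional[int]]:
    """Return ('almost full', None), ('leveled', k) or ('regular', k)."""
    W, n = len(s), sum(s)
    if n > (W - 1) * H + 1:
        return "almost full", None
    if n == 0:
        return "leveled", 0                        # convention of this sketch
    h = sorted(s, reverse=True)                    # h[i - 1] is h_i
    k = 0
    for i in range(1, W + 1):
        if h[i - 1] > ceil_div(n - i + 1, W - 1):
            k = i                                  # keep the largest such index
    return ("leveled" if h[-1] == 0 else "regular"), k


print(classify((3, 1, 1), H=3))  # ('regular', 0)
print(classify((0, 3, 0), H=3))  # ('leveled', 1)
print(classify((5, 4, 4), H=5))  # ('almost full', None)
```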

One can notice that a state $s$ cannot be $k$-leveled for some $k$ and almost full at the same time. Hence a state is either $k$-leveled for some $k$ or almost full or $k$-regular for some $k$. We now give the value induced by $F$ and $L$ for every one of these classes of states. We start with the leveled state, which is the most stable class and will be useful to tackle the other ones.

#proposition()[
  Let $s in Q_2$ and $k$ such that $s$ is $k$-leveled, and let $h_1 >= h_2 >= ... >= h_W$ be the heights of its stacks in decreasing order.
  $
    val_(L,F) (s) = sum_(i = 1)^k h_i + sum_(i = k+1)^n ceil((n-i+1) / (W-1))
  $
]<k_leveled_cost>

#proof()[
  By induction on $n$.
  - Let us suppose that $k = 0$. It means that $h_i <= ceil((n-i+1) / (W-1))$ for all $i in [|1,W-1|]$. Yet $n = sum_(i=1)^(W-1) h_i <= sum_(i=1)^(W-1) ceil((n-i+1) / (W-1)) = n$. It means that all these inequalities must be equalities, hence we get $ h_i = ceil((n-i+1) / (W-1)) $ Now that we know all the heights of the state $s$, we obtain that $F$ requests the bottom of a stack of size $ceil(n / (W-1))$, and after the move of $L$ we obtain the state $L(F(s))$ whose decreasing heights $h'_1 >= ... >= h'_W$ satisfy $h'_i = ceil((n-i) / (W-1))$. Hence $L(F(s))$ is $0$-leveled and we get
  $
    val_(L,F) (s) & = ceil(n / (W-1)) + val_(L,F) (L(F(s))) \
                  & = ceil(n / (W-1)) + sum_(i=1)^(n-1) ceil((n-i) / (W-1)) \
                  & = sum_(i=1)^(n) ceil((n-i+1) / (W-1))
  $
  - Now let us consider the case where $k > 0$. We have $h_1 >= h_2 >= ... >= h_k > ceil((n-k+1) / (W-1))$ and for every $i in [|k+1,W-1|], h_i <= ceil((n-i+1) / (W-1))$. $F$ will request the bottom of the biggest stack for a cost of $h_1$, then $L$ will move $h_1 - 1$ containers to the lowest locations, and we get the state $s' = L(F(s))$ with decreasing heights $h'_1 >= ... >= h'_W = 0$. One can notice that for every $i in [|1,W-1|], h'_i >= h_(i+1)$. Hence we have $h'_(k-1) >= h_k > ceil(((n-1) - (k-1) + 1) / (W-1))$, so $s'$ is at least $(k-1)$-leveled. \ It remains to show that for $i in [|k,W-1|], h'_i <= ceil((n-i) / (W-1))$. Let us first look at the total number of containers in all these stacks. Since $L$ fills the levels from left to right, except for the first stack which is emptied, the order between the heights is kept, hence the $h'_i$ containers of stack $i$ of $s'$ are the $h_(i+1)$ containers of stack $i+1$ of $s$, to which we add potential containers coming from the relocation. Adding everything up, since the number of relocated containers is exactly $h_1 - 1$, we get:
  $
    sum_(i=k)^(W-1) h'_i & <= h_1 - 1 + sum_(i=k+1)^(W) h_i \
                         & = n - 1 - sum_(i=2)^k h_i \
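                         & <= n - 1 - sum_(i=2)^k ceil((n-i+1) / (W-1)) \
                         & = sum_(i=1)^(W-1) ceil((n-i) / (W-1)) - sum_(i=2)^k ceil((n-i+1) / (W-1)) \
                         & = sum_(i = k)^(W-1) ceil((n-i) / (W-1))
  $
  The inequality on the third line holds because, for every $i in [|2,k|]$, we have $h_i >= h_k >= ceil((n-k+1) / (W-1)) + 1 >= ceil((n-i+1) / (W-1))$, the last step using $0 <= k - i < W-1$.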
  Let us recall that $h'_i$ is, before the relocation, composed of $h_(i+1)$ containers, and that $h_(i+1) <= ceil((n-i) / (W-1))$. Yet $L$ levels from left to right, so if it's not necessary (which is the case because of the above inequality)
  TODO We have $h'_i <= ceil((n-i) / (W-1))$ which finally gives us that $s'$ is $(k-1)$-leveled. \
  The last argument to use is that for $i in [|1,k-1|], h_(i+1) >= ceil((n-i) / (W-1))$, which means that after the retrieval and before the relocation, $h'_i >= ceil((n-i) / (W-1))$ and thus that $L$ won't relocate any container on this stack, so $h'_i = h_(i+1)$. Finally, the induction hypothesis gives us
  $
    val_(L,F) (s) & = h_1 + val_(L,F) (s') \
                  & = h_1 + sum_(i = 1)^(k-1) h'_i + sum_(i = k)^(n-1) ceil(((n-1)-i +1) / (W-1)) \
                  & = sum_(i = 1)^k h_i + sum_(i = k+1)^n ceil((n-i+1) / (W-1))
  $
]

#proposition()[
  Let $s in Q_2$ and $h > 0$ such that $s$ is $h$-filled. We have
  $
    val_(L,F) (s) = sum_(i = 1)^h (H-i+1) + sum_(i = 1)^((W-1)H) ceil(((W-1)H-i+1) / (W-1))
  $
]

#proof()[
  By induction on $n$.
  - If $h = 1$, then the heights of the state $s$ are $(H,H,...,H,1)$, so $F$ will request the bottom container of the first stack of size $H$, and then the heights of the obtained state $s' = L(F(s))$ will be $(H,H,...,H,0)$, therefore $s'$ is $0$-leveled and $|s'| = (W-1)H$ so we get
  $
    val_(L,F) (s) & = H + val_(L,F) (s') \
                  & = H + sum_(i = 1)^((W-1)H) ceil(((W-1)H-i+1) / (W-1))
  $
  - Else, the heights of the state $s$ are $(H,H,...,H,h)$, so $F$ will request the deepest valid container of the first stack of size $H$, which will be at height $h$, and then the heights of the obtained state $s' = L(F(s))$ will be $(H,H,...,H,h-1)$, therefore $s'$ is $(h-1)$-filled and $|s'| = (W-1)H + h - 1$ so we get
  $
    val_(L,F) (s) & = (H - h + 1) + val_(L,F) (s') \
                  & = (H - h + 1) + sum_(i = 1)^(h-1) (H - i + 1) + sum_(i = 1)^((W-1)H) ceil(((W-1)H-i+1) / (W-1))
  $
]

#proposition()[
  Let $s in Q_2$ be an almost full state of size $n$, and let $h_1$ be the height of its biggest stack. Then
  $
    val_(L,F) (s) = h_1 - n + (W-1)H + 1 + sum_(i = 1)^(n - 1 - (W-1)H) (H-i+1) + sum_(i = 1)^((W-1)H) ceil(((W-1)H-i+1) / (W-1))
  $
] <almost_full_cost>

#proof()[
  It follows immediately from the fact that the first move has a cost of $h_1 - n + (W-1)H + 1$ and that the state $L(F(s))$ is $h$-filled with $h = n-1-(W-1)H$.
]

#proposition()[
  Let $s in Q_2$ such that $s$ is $k$-regular, and let $h_1 >= h_2 >= ... >= h_W$ be the heights of its stacks in decreasing order.
  $
    val_(L,F) (s) = h_1 + sum_(i=2)^k h_i + sum_(i = k+1)^(n) ceil((n-i+1) / (W-1))
  $
] <k_regular_cost>

#proof()[
  TODO $0$-regular becomes $0$-leveled and $k$-regular becomes $(k-1)$-leveled
]

One can notice that the formulas obtained for $k$-regular and $k$-leveled states are very similar, hence we can generalise them as follows.

#corollary()[
  If $s in Q_2$ is not almost-full
  $
    val_(L,F) (s) = h_1 + sum_(i=2)^W max(h_i, ceil((n-i+1) / (W-1))) + sum_(i = W+1)^(n) ceil((n-i+1) / (W-1))
  $
] <k_bay_cost>
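
The formula of @k_bay_cost can be compared with a direct simulation of $L$ against $F$. The Python sketch below does so on three small states with $W = 3$ and $H = 3$. One assumption deserves attention: the recurrences used in the proofs charge $h_1$ for a request at the bottom of a stack of height $h_1$, that is, the relocations plus the retrieval itself, so the closed form is compared with the number of relocations plus $n$. The function names are ours.

```python
from typing import Tuple

State = Tuple[int, ...]


def ceil_div(a: int, b: int) -> int:
    """Ceiling of a / b for b > 0."""
    return -(-a // b)


def closed_form(s: State) -> int:
    """Right-hand side of the corollary, for a state that is not almost full."""
    W, n = len(s), sum(s)
    h = sorted(s, reverse=True)                           # h[i - 1] is h_i
    total = h[0]
    total += sum(max(h[i - 1], ceil_div(n - i + 1, W - 1)) for i in range(2, W + 1))
    total += sum(ceil_div(n - i + 1, W - 1) for i in range(W + 1, n + 1))
    return total


def relocations_LF(s: State, H: int) -> int:
    """Number of relocations of the play induced by L and F from s."""
    t, W, cost = list(s), len(s), 0
    while sum(t) > 0:
        i = max(range(W), key=lambda c: (t[c], -c))       # F: leftmost tallest stack
        j = max(1, sum(t) - (W - 1) * H)                  # F: lowest admissible height
        for _ in range(t[i] - j):                         # L: one relocation at a time
            t[i] -= 1
            free = [c for c in range(W) if c != i and t[c] < H]
            t[min(free, key=lambda c: (t[c], c))] += 1
            cost += 1
        t[i] -= 1                                         # retrieval, not counted here
    return cost


# Each line shows the same number twice: 9, then 5, then 6.
for s in [(3, 1, 1), (0, 3, 0), (2, 2, 0)]:
    print(s, closed_form(s), relocations_LF(s, H=3) + sum(s))
```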


We have not yet proved that this value is the value of the game, because nothing says a priori that $L$ and $F$ are optimal strategies for both players. To do so, we prove that player $1$ can ensure a value of at most $val_(L,F) (s)$ and that player $2$ can ensure a value of at least $val_(L,F) (s)$. The core of the proof is that leveled configurations are better for $L$ when playing against $F$.

#proposition()[
  Let $s,t in Q_2$ such that $s down() t$, we have $val_(L,F) (t) + 1 >= val_(L,F) (s) >= val_(L,F) (t)$
] <lower_better>

#proof()[
  Let us do a case analysis on the classes of states $s$ and $t$.
  - If $s$ is almost-full, then by definition $t$ is almost-full as well. Since they have the same size, by @almost_full_cost they have the same value up to the height of their highest stack. Since $s down() t$ we have $t_1 + 1 >= s_1 >= t_1$. Hence we have $val_(L,F) (t) + 1 >= val_(L,F) (s) >= val_(L,F) (t)$
  - Otherwise, $t$ is not almost full either, hence we are in the case of @k_bay_cost. Let $i < j$ be the indices such that the container is relocated from stack $i$ to stack $j$. Without loss of generality, we can suppose that $i$ is the rightmost stack of its height and $j$ is the leftmost stack of its height. This gives us the property that the decreasing order of stacks is the same in $s$ and $t$. Notice that if $i = 1$ the property is direct. Otherwise, we have $val_(L,F) (s) - val_(L,F) (t) = [max(s_i, ceil((n-i+1) / (W-1))) - max(s_i - 1, ceil((n-i+1) / (W-1)))] + [max(s_j, ceil((n-j+1) / (W-1))) - max(s_j + 1, ceil((n-j+1) / (W-1)))] = Delta_i - Delta_j$. Either $Delta_j = 0$ and the proof is done, or $Delta_j = 1$, which means that $s_j >= ceil((n-j+1) / (W-1))$. In that case we have $ceil((n-i+1) / (W-1)) <= ceil((n-j+1) / (W-1)) + 1 <= s_j + 1 <= s_i$, thus we get that $Delta_i = 1$, which concludes the proof.
]

Now, to prove the optimality of $L$ and $F$, it suffices to show that these strategies are optimal against each other.

#proposition(title: "L optimal against F")[
  For any $s in Q_2$ and $sigma in Sigma, val_(L,F) (s) <= val_(sigma,F) (s)$
]
#proof()[
  Let us do an induction on $|s|$.

  If $|s| = 0$ the property is trivial.
  Else, let $i,j$ be the request of $F$ on $s$, i.e. $F(s) = s^(i,j)$. Now let us take a look at $L(s^(i,j))$ and $sigma(s^(i,j))$. By definition of $L$, which relocates the containers to the lowest stacks at each step, we get that $sigma(s^(i,j)) down()^* L(s^(i,j))$. Here is a formal proof of this property:
  - Let $prec.eq$ be a total order on the coordinates of the bay such that $(a,b) prec.eq (c,d)$ if and only if $b < d or (b = d and a <= c)$. Let $t$ be the state $s$ where we removed all the containers above $(i,j)$ and let $k = torel(s, i, j)$, i.e. $t = s - k dot e_i$. Now seeing the states as sets of coordinates, we get that, by definition of $L$, $L(s^(i,j))$ is equal to $t union A$ where $A$ is the set of the $k$ smallest coordinates for the order $prec.eq$ that are not in $t$, and $sigma(s^(i,j))$ is equal to $t union B$ where $B$ is a set of $k$ coordinates that are not in $t$. Hence, we can find a bijection $r$ from $B$ to $A$ such that $r(x) prec.eq x$ for every $x in B$. This gives a sequence of relocations that are only downward and that goes from the state $sigma(s^(i,j))$ to the state $L(s^(i,j))$.

  By the induction hypothesis and @lower_better we have, denoting by $k$ the number of relocated containers,
  $
    val_(sigma,F) (s) & = k + val_(sigma,F) (sigma(s^(i,j))) \
                      & >= k + val_(L,F) (sigma(s^(i,j))) \
                      & >= k + val_(L,F) (L(s^(i,j))) \
                      & = val_(L,F) (s)
  $
  Which concludes the proof.
]

#lemma()[
  For any $s in Q_2$ and $1 <= r <= H$ and $1 <= i,j <= W$, if $s_i > s_j$ and noting $Delta = s_i - s_j$ we have
  $
    L(s^(i,r)) rel()^Delta L(s^(j,r))
  $
] <lemma_imp>

#proof()[
  By induction on $Delta$.
  If $Delta = 0$ then $s^(i,r)$ and $s^(j,r)$ are the same up to a permutation, hence they still are after $L$ plays.
  Else, suppose $s_i - s_j = Delta + 1$, and let $s'$ be the state obtained after relocating one container from $s_i$ according to the strategy $L$, so that there exists $u$ such that $s' = s - e_i + e_u$. We have $L(s^(i,r)) = L(s'^(i,r))$. By induction hypothesis, $L(s'^(i,r)) rel()^Delta L(s'^(j,r))$, and we have $L(s'^(j,r)) rel() L(s^(j,r))$, hence we finally get that $L(s^(i,r)) rel()^(Delta + 1) L(s^(j,r))$, which concludes the proof.
]

#proposition(title: "F optimal against L")[
  For any $s in Q_2$ and $pi in Pi, val_(L,pi) (s) <= val_(L,F) (s)$
]
#proof()[
  Let us do an induction on $|s|$.

  If $|s| = 0$ the property is trivial.
  Else, let $i,j$ be the request of $pi$ on $s$, i.e. $pi(s) = s^(i,j)$. Let us now consider a strategy $pi_F$ that plays the same move as $pi$ for the first request and then plays as $F$. By induction hypothesis, this strategy is at least as good as $pi$ for player $2$.

  If it is possible, consider the strategy $pi'_F$ that requests the container $(i,j-1)$ and then plays as $F$. We have $L(s^(i,j)) rel() L(s^(i,j-1))$, hence by @lower_better we have $val_(L,F) (s^(i,j)) - 1 <= val_(L,F) (s^(i,j-1))$. Hence, denoting $k = torel(s^(i,j))$, we have:
  $
    val_(L,pi) (s) & <= val_(L,pi_F) (s) \
                   & = k + val_(L,F) (L(s^(i,j))) \
                   & <= (k + 1) + val_(L,F) (L(s^(i,j-1))) \
                   & = val_(L,pi'_F) (s)
  $
  This means that player $2$ must play as deep as possible in the stack he is considering. It remains to prove that he has to choose the biggest stack.

  Let us consider two strategies, $pi_i$ and $pi_j$ with $s_i > s_j$. By using @lemma_imp we have $L(pi_i (s)) rel()^Delta L(pi_j (s))$ where $Delta = s_i - s_j$. Thus by @lower_better we have that $val_(L,F) (L(pi_i (s))) >= val_(L,F) (L(pi_j (s))) - Delta$. So, denoting by $k$ the number of relocated containers in the case of $pi_j$:
  $
    val_(L,pi_i) (s) & = Delta + k + val_(L,F) (L(pi_i (s))) \
                     & >= Delta + k + val_(L,F) (L(pi_j (s))) - Delta \
                     & = val_(L,pi_j) (s)
  $

  This means that player $2$ must play as deep as possible in the biggest stack, i.e. that $F$ is optimal against $L$.
]

#corollary(title: "opposing optimality")[for every $s in Q_2$
  $
    max_(pi in Pi) val_(L,pi) (s) = val_(L,F)(s) = min_(sigma in Sigma) val_(sigma,F) (s)
  $
] <opposing_opt>

#corollary(title: "value of the game")[
  Given $s in Q_2$, $G_s$ is determined and its value is
  $
    val_(L,F) (s) = min_(sigma in Sigma) max_(pi in Pi) val_(sigma,pi)(s) = max_(pi in Pi) min_(sigma in Sigma) val_(sigma,pi)(s)
  $
] <determined>

#proof()[
  Let $pi in Pi$, by @opposing_opt, $min_(sigma in Sigma) val_(sigma,pi)(s) <= val_(L,pi) (s)$. So $ max_(pi in Pi) min_(sigma in Sigma) val_(sigma,pi)(s) <= max_(pi in Pi) val_(L,pi)(s) = val_(L,F) (s) $
  We also have $ max_(pi in Pi) min_(sigma in Sigma) val_(sigma,pi)(s) >= min_(sigma in Sigma) val_(sigma,F)(s) = val_(L,F) (s) $
  Now let $sigma in Sigma$, still by @opposing_opt $max_(pi in Pi) val_(sigma,pi)(s) >= val_(sigma,F)(s)$. So $ min_(sigma in Sigma) max_(pi in Pi) val_(sigma,pi)(s) >= min_(sigma in Sigma) val_(sigma,F)(s) = val_(L,F) (s) $
  We also have $ min_(sigma in Sigma) max_(pi in Pi) val_(sigma,pi)(s) <= max_(pi in Pi) val_(L,pi)(s) = val_(L,F) (s) $
]
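
For very small bays, the value of $G_s$ can also be obtained by exhaustive backward induction, which gives an independent check of @determined. The Python sketch below counts relocations only, as in the definition of the value of a play, and assumes that relocated containers leave the stack of the requested container. The name `game_value` is ours, and the search is exponential, so it is only meant for tiny instances.

```python
from functools import lru_cache
from itertools import combinations_with_replacement
from typing import Tuple

State = Tuple[int, ...]


def game_value(s: State, H: int) -> int:
    """Value of G_s by exhaustive search, counting relocations only."""
    W = len(s)

    @lru_cache(maxsize=None)
    def value(state: State) -> int:
        n = sum(state)
        best_for_2 = 0                                   # the empty bay has value 0
        for i in range(W):
            for j in range(1, state[i] + 1):
                if (W - 1) * H < n - j:
                    continue                             # (state, i, j) is not in Q_1
                k = state[i] - j                         # forced number of relocations
                base = list(state)
                base[i] = j - 1
                others = [c for c in range(W) if c != i]
                best_for_1 = None
                for targets in combinations_with_replacement(others, k):
                    t = base[:]
                    for c in targets:
                        t[c] += 1
                    if max(t) > H:
                        continue                         # overfull stack: not a legal answer
                    v = value(tuple(t))
                    if best_for_1 is None or v < best_for_1:
                        best_for_1 = v                   # player 1 minimises
                best_for_2 = max(best_for_2, k + best_for_1)  # player 2 maximises
        return best_for_2

    return value(tuple(s))


print(game_value((2, 2, 0), H=2))  # 2
```

On the state $(2,2,0)$ with $H = 2$, the search returns $2$, which is also the number of relocations of the play induced by $L$ and $F$ from that state.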







= Non-adaptive adversary <non-adaptative>

It is important to look back at what this result implies for our original problem, and more precisely at the sets of allowed strategies. In @optimal_strat, player $2$ is able to react to what player $1$ is doing, because his strategy takes as input the position of the boxes. Hence we may come back to the context given by @ZEHENDNER201748, where the boxes have a retrieval order chosen by player $2$ at the beginning and unknown to player $1$, which is exactly the online framework. For the sake of brevity, we will not formalise this new game and only give a rough idea of its components.

#definition(title: "permutation CRP game")[
  The permutation CRP (PCRP) game is very similar to a CRP game. Here is a list of the changes:
  - The states $Q$, except the initial one, have numbered containers. The ids of the containers represent the retrieval order, hence they must be distinct.
  - Player $2$ plays only once, at the start of the game, by choosing the number (say, through a permutation of ${1,...,n}$) associated to each container.
  - Player $1$ then plays alone. At every step, the container with the smallest id is requested. He has to take a decision based only on the heights of the stacks: this is enforced by a projection $rem : Q -> Q_1$ which removes the ids, so that player $1$ moves the boxes one by one without seeing the ids.
]


From the point of view of player $1$, the game is exactly the same: he does not see the ids, and after each play he receives a request. For player $2$, it is quite different because he cannot adapt anymore: after he plays, his moves are fixed for the whole duration of the game.

#definition(title: "strategies, play, value")[
  Let us call $overline(Sigma) = Sigma$ and $overline(Pi)$ the sets of strategies of both players in the PCRP game. Given $overline(sigma) in overline(Sigma), overline(pi) in overline(Pi)$ and an initial state $s in [|0,H|]^W$, a play is induced and its value is defined the same way as for CRP; we denote it by $overline(val)_(overline(sigma),overline(pi)) (s)$.
]

#remark(title: "player 1 strategies")[
  Notice that the strategies of player $1$ are the same in both games, since even in PCRP he cannot see the ids of the containers. So for a strategy $overline(sigma) in overline(Sigma)$, we will also write $sigma$ for the same strategy considered in a CRP game.
]

This game exactly describes an online algorithm for the online CRP. We already proved that the heuristic $L$ is optimal for the previous game; the question is now whether it is still optimal when the opponent is restricted to permutation play. The following theorem gives more insight about this.

#theorem()[
  We have $ min_(overline(sigma) in overline(Sigma)) max_(overline(pi) in overline(Pi)) overline(val)_(overline(sigma),overline(pi)) (s) = val_(L,F)(s) >= max_(overline(pi) in overline(Pi)) min_(overline(sigma) in overline(Sigma)) overline(val)_(overline(sigma),overline(pi)) (s) $
  And the inequality is not always an equality. Hence PCRP is not determined in general.
] <PCRP_value>

#proof()[
  The core of the proof is to notice that for any $overline(sigma) in overline(Sigma)$, there exists $F_sigma in overline(Pi)$ such that $overline(val)_(overline(sigma),F_sigma) (s) = val_(sigma,F)(s)$. Let us fix $overline(sigma)$ and an initial state $s$; we will inductively build such a strategy $F_sigma$, which is an order on the containers.
  Simply choose the deepest box on the biggest stack of $s$ as container $1$. Then simulate one move of $overline(sigma)$. Next, consider the deepest container on the biggest stack, determine its initial position in $s$, give it the id $2$, and so on.
  Since $overline(sigma)$ does not see the ids, modifying the ids does not change its behavior, and by construction, this permutation acts (against $overline(sigma)$) as $F$.

  The second important argument is that $forall overline(sigma) in overline(Sigma), overline(pi) in overline(Pi)$, there exists $pi in Pi$ such that $val_(sigma,pi) (s) = overline(val)_(overline(sigma),overline(pi)) (s)$. Indeed, let us fix $overline(sigma) = sigma$ and $overline(pi)$. We can simulate the induced game between these two and at each step, if the current state is $s_t$, we consider $rem(s_t)$, i.e. the same state without ids, and choose $pi(rem(s_t))$ as the coordinate of the container requested by $overline(pi)$. By construction, $pi$ and $overline(pi)$ induce the same play against $sigma$.

  Hence $min_(overline(sigma) in overline(Sigma)) max_(overline(pi) in overline(Pi)) overline(val)_(overline(sigma),overline(pi)) (s) >= min_(overline(sigma) in overline(Sigma)) overline(val)_(overline(sigma),F_sigma) (s) = min_(overline(sigma) in overline(Sigma)) val_(sigma,F) (s) = val_(L,F) (s)$. Similarly $min_(overline(sigma) in overline(Sigma)) max_(overline(pi) in overline(Pi)) overline(val)_(overline(sigma),overline(pi)) (s) <= max_(overline(pi) in overline(Pi)) overline(val)_(L,overline(pi)) (s) <= max_(pi in Pi) val_(L,pi) (s) = val_(L,F) (s)$. The last inequality is given by the same argument : $max_(overline(pi) in overline(Pi)) min_(overline(sigma) in overline(Sigma)) overline(val)_(overline(sigma),overline(pi)) (s) <= max_(overline(pi) in overline(Pi)) overline(val)_(L,overline(pi)) (s) = val_(L,F) (s)$

  It remains to give an example where the last inequality is strict. Let us consider the initial state $s$ given in @counterexa. One can easily compute $val_(L,F) (s) = 5$, and then check that for every permutation of the numbered boxes, there is a way to complete the requests in at most $4$ moves. This is done by the case analysis illustrated in @counterex.
  - If container $1$ is on top of its stack, as in @counterexb, we simply retrieve it, and the final cost is at most $3$ by computing $val_(L,F)$, which is an upper bound on the cost by what we showed above.
  - Now suppose that $1$ is below. If $2$ is on top but not on $1$, as in @counterexc, we move the box $a$ to the third column, then retrieve $2$, and get a maximal cost of $4$ by computing $val_(L,F)$.
  - If $2$ is above $1$, as in @counterexd, we move it away and then retrieve it. We again get a maximal cost of $4$.
  - Now consider the case of @counterexe. If $b < a$ or $c < d$, we just put the container $a$ above $b$ and reach configuration @counterexf, which has cost $0$ or $1$. At this point we have already paid $3$, so the total cost is at most $4$.
  - In the same example but with $a < b$ and $d < c$, we put the container $a$ on $c$ and, after the second move, reach configuration @counterexg. At this point we have paid $2$.
    + If $a < d$, then the remaining cost is at most $2$, whether $c$ or $d$ is the next request.
    + If $a > d$, then $d$ is the next container to be retrieved since we supposed $a < b, d < c$. Hence we reach configuration @counterexh which has a cost of $0$ so the final cost is $4$.

  This gives an instance on which $max_(overline(pi) in overline(Pi)) min_(overline(sigma) in overline(Sigma)) overline(val)_(overline(sigma),overline(pi))(s) = 4 < 5 = val_(L,F)(s)$.

  #let imempty6 = im_blocks((
    (
      (2, 2, 2),
      none-style(fill: gray),
    ),
  ))

  #let imcase1 = im_blocks((
    (
      (2, 2, 2),
      permutation(id_list: ($star$, 1, $star$, $star$, $star$, $star$), fill: gray, size: 16pt),
    ),
  ))

  #let imcase2 = im_blocks((
    (
      (2, 2, 2),
      permutation(id_list: (1, $a$, $star$, 2, $star$, $star$), fill: gray, size: 16pt),
    ),
  ))

  #let imcase3 = im_blocks((
    (
      (2, 2, 2),
      permutation(id_list: (1, 2, $star$, $star$, $star$, $star$), fill: gray, size: 16pt),
    ),
  ))

  #let imcase4 = im_blocks((
    (
      (2, 2, 2),
      permutation(id_list: (1, $a$, 2, $b$, $d$, $c$), fill: gray, size: 16pt),
    ),
  ))

  #let imcase5 = im_blocks((
    (
      (2, 0, 2),
      permutation(id_list: ($a$, $b$, $d$, $c$), fill: gray, size: 16pt),
    ),
  ))

  #let imcase6 = im_blocks((
    (
      (1, 0, 3),
      permutation(id_list: ($b$, $d$, $c$, $a$), fill: gray, size: 16pt),
    ),
  ))

  #let imcase7 = im_blocks((
    (
      (2, 1, 0),
      permutation(id_list: ($b$, $a$, $c$), fill: gray, size: 16pt),
    ),
  ))

  #subpar.grid(
    figure(imempty6, caption: "teste"), <counterexa>, figure(imcase1, caption: "testse"), <counterexb>,
    figure(imcase2, caption: "teste"), <counterexc>, figure(imcase3, caption: "testse"), <counterexd>,
    figure(imcase4, caption: "testse"), <counterexe>, figure(imcase5, caption: "testse"), <counterexf>,
    figure(imcase6, caption: "testse"), <counterexg>, figure(imcase7, caption: "testse"), <counterexh>,
    columns: (1fr, 1fr, 1fr, 1fr),
    caption: [full caption],
    label: <counterex>,
  )

]

Before concluding, we see that on the example from @counterex, the optimal offline algorithm pays at most $4$ while any online algorithm pays at least $5$ on some permutation. Hence we get the following lower bound on the competitive ratio of any deterministic online algorithm.

#corollary(title: "lower bound on the competitive ratio")[
  For any deterministic online algorithm $A$ for the online CRP, the competitive ratio of $A$ is at least $5 / 4$.
]
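
As a worked instance of this bound, assume the usual definition of the competitive ratio as the supremum, over instances, of the online cost divided by the optimal offline cost (this definition is not restated here, and the subscript $"OPT"$ is notation introduced only for this illustration). Let $s$ be the state of @counterexa and let $p$ be a permutation on which $A$ pays at least $5$. Since the optimal offline algorithm pays at most $4$ on every permutation, we get
$
  sup_(s', p') (cost_(A,p') (s')) / (cost_("OPT",p') (s')) >= (cost_(A,p) (s)) / (cost_("OPT",p) (s)) >= 5 / 4
$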

@PCRP_value confirms that even against a weaker opponent, which can only choose a permutation of the containers, player $1$ can guarantee the same value as in the CRP game, with the same optimal strategy $L$.

#corollary(title: "optimality of L")[
  Given a configuration $s$ of the bay, for any deterministic algorithm $A$ and permutation $p in S_n$, we consider the induced number of relocations $cost_(A,p)(s)$ and the worst-case cost of $A$, namely $cost_A (s) = max_(p in S_n) cost_(A,p)(s)$. Then, denoting by $cal(A)$ the set of deterministic algorithms for the online CRP, we have
  $
    min_(A in cal(A)) cost_A (s) = cost_L (s)
  $
  i.e. $L$ is worst-case optimal among deterministic algorithms.
]
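
The two quantities in this corollary can be sketched in a few lines of Python, purely as an illustration. The sketch assumes that an algorithm is represented by a function mapping a permutation $p$ to its cost on a fixed state $s$, and it minimises over a finite family only. The names `worst_case_cost` and `best_worst_case` are hypothetical, and the two toy cost functions are not CRP relocation counts.

```python
from itertools import permutations
from typing import Callable, Dict, Tuple

Permutation = Tuple[int, ...]
# Hypothetical representation: p -> cost_{A,p}(s) for a fixed state s.
CostFn = Callable[[Permutation], int]


def worst_case_cost(cost_of: CostFn, n: int) -> int:
    """cost_A(s): maximum of cost_{A,p}(s) over all p in S_n."""
    return max(cost_of(p) for p in permutations(range(1, n + 1)))


def best_worst_case(algorithms: Dict[str, CostFn], n: int) -> Tuple[str, int]:
    """Name and worst-case cost of a minimiser over a finite family."""
    worst = {name: worst_case_cost(f, n) for name, f in algorithms.items()}
    best = min(worst, key=worst.get)
    return best, worst[best]


# Toy stand-ins for cost functions (assumptions, not relocation counts).
def inversions(p: Permutation) -> int:
    """Number of pairs i < j with p[i] > p[j]."""
    return sum(
        1
        for i in range(len(p))
        for j in range(i + 1, len(p))
        if p[i] > p[j]
    )


def descents(p: Permutation) -> int:
    """Number of positions i with p[i] > p[i + 1]."""
    return sum(1 for i in range(len(p) - 1) if p[i] > p[i + 1])


family = {"inversions": inversions, "descents": descents}
print(worst_case_cost(inversions, 3))
print(best_worst_case(family, 3))
```

The enumeration has $n!$ terms, so this is only usable for very small $n$; the corollary itself is what avoids such a brute-force search in the CRP setting.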

= Competitive ratio lower bound

The aim of this section is to determine a lower bound on the competitive ratio of any deterministic algorithm for the online CRP.

We first introduce a new problem which will be useful later.



#bibliography("works.bib")

