Exercise 1.2 (Box-Muller transform)
We first prove the converse statement: starting with a pair (R,\theta) with the announced distribution, we obtain two independent standard Gaussian random variables.
For any bounded measurable function f:{\mathbb R}^2\to {\mathbb R} we have
\begin{align*}
& \mathbb{E}(f(X,Y)) = \mathbb{E}(f(R\cos(\theta),R\sin(\theta))) \\
& = \iint_{[0,2\pi]\times\mathbb{R}^+}{ f(\sqrt{\rho}\cos(\phi), \sqrt{\rho}\sin(\phi))\dfrac12 e^{-\frac{\rho}{2}}\dfrac{1}{2\pi}{\rm d}\rho {\rm d}\phi } \\
& (\text{applying the change of variables } x=\sqrt{\rho}\cos(\phi),\ y=\sqrt{\rho}\sin(\phi), \text{ whose Jacobian gives } {\rm d}\rho\,{\rm d}\phi = 2\,{\rm d}x\,{\rm d}y) \\
& = \iint_{\mathbb{R}^2}{f(x,y)e^{-\frac{(x^2+y^2)}2}\dfrac{1}{2\pi} {\rm d}x{\rm d}y},
\end{align*}
which is the expectation w.r.t. the 2-dimensional Gaussian distribution with independent components.
Conversely, reading the above chain of equalities backwards shows that, starting from independent standard Gaussian components X and Y, the variables R and \theta are independent and distributed as claimed.
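As a quick illustration, here is a minimal Python/NumPy sketch of the Box-Muller sampler (the function name and sample sizes are ours): we draw \rho with density \frac12 e^{-\rho/2} via \rho = -2\log(U_1), \phi uniform on [0,2\pi], and return (\sqrt{\rho}\cos\phi, \sqrt{\rho}\sin\phi).

```python
import numpy as np

def box_muller(n_samples, rng=None):
    """Generate two arrays of n_samples i.i.d. N(0,1) variables via the Box-Muller transform."""
    rng = np.random.default_rng() if rng is None else rng
    u1 = rng.uniform(size=n_samples)      # used to build rho ~ Exp(1/2), i.e. R = sqrt(rho)
    u2 = rng.uniform(size=n_samples)      # used to build theta ~ U([0, 2*pi])
    rho = -2.0 * np.log(u1)               # rho has density (1/2) * exp(-rho/2) on R^+
    theta = 2.0 * np.pi * u2
    x = np.sqrt(rho) * np.cos(theta)
    y = np.sqrt(rho) * np.sin(theta)
    return x, y

x, y = box_muller(100_000)
print(x.mean(), x.std(), np.corrcoef(x, y)[0, 1])   # approximately 0, 1 and 0
```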
Exercise 1.4 (acceptance-rejection method)
The algorithm of the exercise repeatedly draws X \overset{d}{\rm =} {\cal E}xp(1) and U \overset{d}{\rm =} \mathcal{U}([-1,1]) (independent of X) until the acceptance condition
|U|e^{(X-1)^2/2} \leq 1
holds. Note that Z = X\,{\rm sgn}(U) then follows the Laplace distribution with density
\begin{align*} g(z) = \dfrac{e^{-|z|}}{2}. \end{align*}
Remark that |U| is uniform on [0,1] and independent of {\rm sgn}(U). The algorithm thus performs an acceptance-rejection scheme with the Laplace law as auxiliary distribution (see Proposition 1.3.2). Indeed, we simulate the pair (U,X) until |U|e^{(X-1)^2/2} \leq 1, which is equivalent to simulating (|U|, Z) = (|U|, {\rm sgn}(U)X) until |U|e^{(|Z|-1)^2/2} \leq 1. Conditionally on this acceptance event, Z = {\rm sgn}(U)X has the target distribution.
It remains to find the target distribution of this scheme. Denote its density by f(z), which has to satisfy
\dfrac{g(z)}{f(z)} = \dfrac{1}{c}e^{(|z|-1)^2/2},
for some constant c.
By a simple calculation we get
f(z) = c\dfrac{g(z)}{e^{(|z|-1)^2/2}} = \dfrac{c}{2}e^{-|z| - (|z|-1)^2/2} = \dfrac{c}{2}e^{-1/2}e^{-z^2/2} = c^{\prime}e^{-z^2/2},
since -|z| - (|z|-1)^2/2 = -z^2/2 - 1/2. This is, up to the normalizing constant, the density of the standard normal distribution: the target of the scheme is therefore \mathcal{N}(0,1).
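For illustration, a minimal Python/NumPy sketch of this acceptance-rejection scheme (the function name and sample size are ours):

```python
import numpy as np

def normal_via_laplace(n_samples, rng=None):
    """Sample N(0,1) by acceptance-rejection with the Laplace auxiliary law:
    draw X ~ Exp(1) and U ~ U([-1,1]) until |U| * exp((X-1)^2 / 2) <= 1,
    then return Z = sgn(U) * X."""
    rng = np.random.default_rng() if rng is None else rng
    out = []
    while len(out) < n_samples:
        x = rng.exponential(1.0)
        u = rng.uniform(-1.0, 1.0)
        if abs(u) * np.exp((x - 1.0) ** 2 / 2.0) <= 1.0:   # acceptance condition
            out.append(np.sign(u) * x)                      # Z = sgn(U) * X
    return np.array(out)

z = normal_via_laplace(50_000)
print(z.mean(), z.std())   # approximately 0 and 1
```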
Exercise 1.6 (ratio-of-uniforms method, Gamma distribution)
First note that \Gamma(\alpha,\theta) \overset{\rm d}= \theta\,\Gamma(\alpha,1), so we only consider the case \theta=1. The distribution \Gamma(\alpha,1) has density proportional to f(z) = z^{\alpha-1}e^{-z}\mathbb{1}_{z\geq 0}.
Note that the simulation algorithm below does not require knowledge of the normalizing constant of the density; this is precisely the interest of such a technique.
We apply Proposition 1.3.5 with r=d=1.
Following Lemma 1.3.6, we observe that (for \alpha\geq 1) z\mapsto f(z) and z\mapsto z^2 f(z) are bounded and each admits a unique maximum. Solving (\log(f(z)))^{\prime} = 0 we get \sup_z{f(z)} = f(\alpha-1); similarly \sup_z{z^2 f(z)} = (\alpha+1)^2f(\alpha+1).
Now using Lemma 1.3.6 we obtain
A_{f,1} \subseteq \tilde{A}_{f,1} = \left[0,\sqrt{f(\alpha-1)}\right]\times\left[0,(\alpha+1)\sqrt{f(\alpha+1)}\right],
where
A_{f,1} = \left\{ (u,v)\in \mathbb{R}^2: 0 < u \leq \sqrt{f\left(\dfrac{v}{u}\right)} \right\}.
Thus we may simulate uniform pairs (U,V) on the rectangle \tilde{A}_{f,1} until U \leq \sqrt{f\left(\dfrac{V}{U}\right)}; then V/U has the desired \Gamma(\alpha,1) distribution (and \theta\, V/U the \Gamma(\alpha,\theta) one).
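A possible Python/NumPy implementation of this ratio-of-uniforms sampler could look as follows (a sketch assuming \alpha \geq 1 so that the rectangle above is finite; the function name and test parameters are ours):

```python
import numpy as np

def gamma_ratio_of_uniforms(alpha, theta=1.0, size=1, rng=None):
    """Sample Gamma(alpha, theta), alpha >= 1, by the ratio-of-uniforms method
    applied to the un-normalized density f(z) = z^(alpha-1) * exp(-z)."""
    assert alpha >= 1.0, "the bounding rectangle used here assumes alpha >= 1"
    rng = np.random.default_rng() if rng is None else rng
    f = lambda z: z ** (alpha - 1.0) * np.exp(-z)
    u_max = np.sqrt(f(alpha - 1.0))              # sup_z sqrt(f(z)), attained at z = alpha - 1
    v_max = (alpha + 1.0) * np.sqrt(f(alpha + 1.0))  # sup_z z * sqrt(f(z)), attained at z = alpha + 1
    out = []
    while len(out) < size:
        u = rng.uniform(0.0, u_max)
        v = rng.uniform(0.0, v_max)
        if u > 0.0 and u <= np.sqrt(f(v / u)):   # (u, v) falls in A_{f,1}
            out.append(theta * v / u)            # scale by theta to get Gamma(alpha, theta)
    return np.array(out)

samples = gamma_ratio_of_uniforms(2.5, size=50_000)
print(samples.mean(), samples.var())   # approximately alpha = 2.5 and alpha = 2.5 (theta = 1)
```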
Exercise 1.9 (Archimedean copula)
We have to show that for any (u_1,\ldots,u_d)\in [0,1]^d we have
\mathbb{P}(U_1\leq u_1, \ldots, U_d\leq u_d) = C(u_1,\ldots,u_d) := \phi^{-1}(\phi(u_1)+\cdots+\phi(u_d)).
Recall that \phi^{-1}(u) = \mathbb{E}(e^{-uY}). Using the Dominated Convergence Theorem, we deduce that \phi^{-1} is continuous; moreover, since \mathbb{P}(Y>0)>0, \phi^{-1} is strictly decreasing. Note also that \phi^{-1}(0)=1 and \phi^{-1}(u) \underset{u\to +\infty}{\to} 0, using \mathbb{P}(Y>0)=1. Hence the function \phi(\cdot) is well-defined on (0,1] (with the convention \phi(0)=+\infty), continuous and strictly decreasing.
Using the independence of the X_i's (and their independence from Y), together with the monotonicity of \phi(\cdot) and \phi^{-1}(\cdot), we write
\begin{align*} \mathbb{P}(U_1\leq u_1, \ldots, U_d\leq u_d) &= \mathbb{P}\left( -\dfrac{1}{Y}\log(X_1)\geq \phi(u_1), \ldots,-\dfrac{1}{Y}\log(X_d)\geq \phi(u_d) \right) \\ & = \mathbb{P}\left(X_1\leq e^{-Y\phi(u_1)}, \ldots, X_d\leq e^{-Y\phi(u_d)} \right) \\ & = \mathbb{E}\left( \mathbb{P}(X_1\leq e^{-Y\phi(u_1)}, \ldots, X_d\leq e^{-Y\phi(u_d)} \mid Y ) \right) \\ & = \mathbb{E}( e^{-Y\phi(u_1)}\cdots e^{-Y\phi(u_d)}) \\ &= \phi^{-1}(\phi(u_1)+ \cdots + \phi(u_d)) = C(u_1, \ldots, u_d), \end{align*}
which proves the result.
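The proof also gives a sampling recipe: draw Y, then independent uniforms X_1,\ldots,X_d, and set U_i = \phi^{-1}(-\log(X_i)/Y). Below is a Python/NumPy sketch for one illustrative (not imposed by the exercise) choice, Y \overset{d}{=} \Gamma(1/a,1), whose Laplace transform \phi^{-1}(u) = (1+u)^{-1/a} yields the Clayton copula.

```python
import numpy as np

def clayton_copula_sample(n_samples, d, a, rng=None):
    """Sample a d-dimensional Clayton copula (parameter a > 0) via the construction
    of the exercise: U_i = phi^{-1}(-log(X_i)/Y), where phi^{-1}(u) = E(exp(-u*Y))
    = (1 + u)^(-1/a) is the Laplace transform of Y ~ Gamma(1/a, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    y = rng.gamma(shape=1.0 / a, scale=1.0, size=(n_samples, 1))   # the mixing variable Y > 0
    x = rng.uniform(size=(n_samples, d))                           # X_1,...,X_d i.i.d. U(0,1), independent of Y
    phi_inv = lambda t: (1.0 + t) ** (-1.0 / a)                    # Laplace transform of Y
    return phi_inv(-np.log(x) / y)                                 # U_i = phi^{-1}(-log(X_i)/Y)

u = clayton_copula_sample(100_000, d=2, a=2.0)
print(u.min(), u.max())                      # all components lie in (0, 1)
print(np.corrcoef(u[:, 0], u[:, 1])[0, 1])   # positive dependence induced by the common Y
```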
Exercise 2.2 (substitution method)
To make the problem non-trivial, we assume u\neq 0. For both estimators \phi_{1,M}(u) and \phi_{2,M}(u), the Law of Large Numbers implies a.s. convergence to \phi(u). Thus, to compare the two estimators, we calculate and compare their asymptotic variances, which are later used to construct confidence intervals.
- For \phi_{1,M}(u) we have
\begin{align*}\sqrt{M}(\phi_{1,M}(u) - \phi(u)) = \dfrac{1}{\sqrt{M}}\sum_{m=1}^M{(e^{uX_m} - \mathbb{E}(e^{uX}))}.\end{align*}
From the Central Limit Theorem \sqrt{M}(\phi_{1,M}(u) - \phi(u)) converges in distribution to a centered Gaussian variable, with variance {\rm Var}(e^{uX}). The latter is equal to
\begin{align*}\mathbb{E}(e^{2uX}) - \mathbb{E}(e^{uX})^2 = \phi(2u) - \phi(u)^2 = e^{2u^2\sigma^2} - e^{u^2\sigma^2} = e^{u^2\sigma^2}(e^{u^2\sigma^2} - 1).\end{align*}
- Now consider \phi_{2,M}(u). Since \sigma_M^2=\frac{1}{M}\sum_{m=1}^M X_m^2, the Central Limit Theorem ensures that \sqrt{M}(\sigma_M^2 - \sigma^2) converges in distribution to a centered Gaussian variable with variance {\rm Var}(X^2)=2\sigma^4.
Now we apply the substitution method (Theorem 2.2.2) to derive that, for any regular function f,
\begin{align*}\sqrt{M}(f(\sigma_M^2) - f(\sigma^2)) {\underset{M\to \infty}\Longrightarrow}{\cal N}\Big(0, (f^{\prime}(\sigma^2))^2 2\sigma^4 \Big).\end{align*}
So, using this for the case f(x) = e^{\frac{u^2}{2}x} as in \phi_{2,M}(u) we obtain
\begin{align*}\sqrt{M}(\phi_{2,M}(u) - \phi(u)) {\underset{M\to \infty}\Longrightarrow} \mathcal{N}\left(0, \left(\frac{u^2}{2}\right)^2 e^{u^2\sigma^2} 2\sigma^4\right).\end{align*}
Observe that (for any u\neq 0)
\begin{align*} e^{u^2\sigma^2}(e^{u^2\sigma^2} - 1)> e^{u^2\sigma^2}(u^2\sigma^2+\frac{ (u^2\sigma^2)^2 }{ 2})>\left(\frac{u^2}{2}\right)^2 e^{u^2\sigma^2} 2\sigma^4;\end{align*}
therefore the second estimator \phi_{2,M}(u) has a strictly smaller asymptotic variance and yields an asymptotically better (narrower) confidence interval than \phi_{1,M}(u).
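The variance comparison can be checked numerically; here is a short Python/NumPy sketch (the parameters \sigma, u, M and the number of repetitions are arbitrary choices of ours):

```python
import numpy as np

# Numerical check of the two asymptotic variances, with X ~ N(0, sigma^2).
rng = np.random.default_rng(0)
sigma, u, M, n_rep = 1.0, 1.0, 5_000, 1_000
x = rng.normal(0.0, sigma, size=(n_rep, M))            # n_rep independent samples of size M

phi1 = np.exp(u * x).mean(axis=1)                      # naive estimator phi_{1,M}(u)
phi2 = np.exp(u**2 * (x**2).mean(axis=1) / 2.0)        # substitution estimator phi_{2,M}(u)

print("true value phi(u)     :", np.exp(u**2 * sigma**2 / 2.0))
print("M*Var(phi1): empirical", M * phi1.var(), " theory", np.exp(2*u**2*sigma**2) - np.exp(u**2*sigma**2))
print("M*Var(phi2): empirical", M * phi2.var(), " theory", (u**2/2.0)**2 * np.exp(u**2*sigma**2) * 2.0 * sigma**4)
```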
Exercise 2.7 (concentration inequality, maximum of Gaussian variables)
- Let Y = (Y_1, \ldots, Y_d) be a vector of i.i.d. standard Gaussian random variables.
First the function f(y) = \sup_{1\leq i\leq d}{y_i} is 1-Lipschitz. Indeed, for any y = (y_1, \ldots, y_d) and y'= (y'_1, \ldots, y'_d) we have
\begin{align*} y_i &\leq y_{i}' + |y - y'| \quad \text{for each } i,\\ \sup_{1\leq i\leq d}y_i& \leq \sup_{1\leq i\leq d}y_i' + |y- y'|, \end{align*}
and, exchanging the roles of y and y',
\begin{align*} \left| \sup_{1\leq i\leq d}{y_i} - \sup_{1\leq i\leq d}{y'_i}\right| &\leq |y-y'|. \end{align*}
Then, using the concentration inequality (see Corollaries 2.4.13 and 2.4.16) we obtain
\begin{align*}& \mathbb{P}\left(\left|\sup_{1\leq i\leq d}{Y_i}-\mathbb{E}\left(\sup_{1\leq i\leq d}{Y_{i}}\right)\right| > \varepsilon \right) \leq 2\exp\left(-\dfrac{\varepsilon^2}{2}\right), \quad \forall\varepsilon \geq0.\end{align*}
More generally, Y can be represented as Y = L\tilde{Y}, where \tilde{Y} is a standard Gaussian vector (zero mean, identity covariance matrix) and L is a matrix such that LL^\top equals the covariance matrix of Y. Notice that \mathbb{E}(Y_i^2)=\sum_j L_{i,j}^2\leq \sigma^2. Moreover, f(y) = \sup_{1\leq i\leq d}{(Ly)_i} is \sigma-Lipschitz: this can be justified as in the i.i.d. case, using
\begin{align*}(Ly)_i \leq (Ly')_i + \sum_{j=1}^d{|L_{ij}|}|y_j - y'_j| &\leq(Ly')_i + \sqrt{\sum_{j=1}^d {|L_{ij}|^2}}\, |y - y'| \quad \text{(by the Cauchy-Schwarz inequality)},\\\left| \sup_{1\leq i\leq d}{(Ly)_i} - \sup_{1\leq i\leq d}{(Ly')_i}\right| &\leq \sigma|y-y'|.\end{align*}
Applying the concentration inequality to the \sigma-Lipschitz function f(y) = \sup_{1\leq i\leq d}{(Ly)_i} gives the announced result.
- Let \varepsilon>0. From the result of part i) we obtain
\begin{align*} \mathbb{P}\left(\dfrac{\sup_{1\leq i\leq d}{Y_i}}{\mathbb{E}(\sup_{1\leq i\leq d}{Y_i})} > 1+\varepsilon\right)& =\mathbb{P}\left(\sup_{1\leq i\leq d}{Y_i} - \mathbb{E}(\sup_{1\leq i\leq d}{Y_i}) > \varepsilon \mathbb{E}(\sup_{1\leq i\leq d}{Y_i}) \right) \\ & \leq \exp\left(- \dfrac{\varepsilon^2(\mathbb{E}(\sup_{1\leq i\leq d}{Y_i}))^2}{2\sup_{1\leq i\leq d}{\mathbb{E}(Y_i^2)}}\right)\to 0\end{align*}
using the assumptions as d\to+\infty. Similarly \mathbb{P}\left(\dfrac{\sup_{1\leq i\leq d}{Y_i}}{\mathbb{E}(\sup_{1\leq i\leq d}{Y_i})} < 1-\varepsilon\right) \to 0. This shows that \dfrac{\sup_{1\leq i\leq d}{Y_i}}{\mathbb{E}(\sup_{1\leq i\leq d}{Y_i})} converges to 1 in probability.
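A small Python/NumPy experiment (in the i.i.d. case, with our own choice of dimensions and replications) illustrating this concentration of the ratio around 1:

```python
import numpy as np

# Empirical illustration in the i.i.d. case: the ratio sup_i Y_i / E(sup_i Y_i)
# concentrates around 1 as the dimension d grows.
rng = np.random.default_rng(0)
n_rep = 1_000
for d in (10, 100, 10_000):
    max_y = rng.normal(size=(n_rep, d)).max(axis=1)    # n_rep realisations of sup_i Y_i
    ratio = max_y / max_y.mean()                       # max_y.mean() estimates E(sup_i Y_i)
    print(d, ratio.std())                              # dispersion of the ratio shrinks as d grows
```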
Exercise 3.3 (stratification, optimal allocation)
- We have (formula 3.2.2)
\begin{align*} {\rm Var}(I_{M_1, \ldots, M_k}^{strat.}) = \sum_{j=1}^k{p_j^2\dfrac{\sigma_j^2}{M_j}}, \quad \text{with } \sum_{j=1}^k{M_j} = M. \end{align*}
The minimisation may be done by the Lagrange multiplier method: we minimize the function
\begin{align*} L(M_1, \ldots, M_k, \lambda) = \sum_{j=1}^k{p_j^2\dfrac{\sigma_j^2}{M_j}} + \lambda \left( \sum_{j=1}^k{M_j} - M\right). \end{align*}
From the optimality condition \partial_{M_j}L(M_1, \ldots, M_k, \lambda) = -\frac{p^2_j\sigma^2_j}{M^2_j}+\lambda= 0 we get M_j^{\ast} = p_j\sigma_j/\sqrt{\lambda}. The constraint \sum_{j=1}^k{M_j^{\ast}} = M then implies
\begin{align*} M_j^{\ast} = M\dfrac{p_j\sigma_j}{\sum_{i=1}^k{p_i\sigma_i}}. \end{align*}
Another way to obtain the optimal M_j's is to re-interpret the variance {\rm Var}(I_{M_1, \ldots, M_k}^{strat.}) as an expectation under the probability distribution \mathbb{P}(J=j)=\frac{M_j}{M}:
\begin{align*} {\rm Var}(I_{M_1, \ldots, M_k}^{strat.}) = M\, \mathbb{E}\left[\dfrac{p_J^2\sigma_J^2}{M^2_J} \right]. \end{align*}
The Jensen inequality gives
\begin{align*} {\rm Var}(I_{M_1, \ldots, M_k}^{strat.}) \geq M\left(\mathbb{E}\left[\dfrac{p_J\sigma_J}{M_J} \right]\right)^2 = M\left(\sum_{j=1}^k \frac{M_j}{M}\, p_j\dfrac{\sigma_j}{M_j} \right)^2 = \frac{1}{M}\left(\sum_{j=1}^k p_j\sigma_j\right)^2.\end{align*}
The lower bound on the right-hand side is achieved when the Jensen inequality is an equality, that is when \dfrac{p_j\sigma_j}{M_j} is constant over j. We then retrieve the formula for M_j^{\ast}.
- We write
\begin{align*} & \sqrt{M}\left(I_{M^\ast_1, \ldots, M^\ast_k}^{strat.} - \mathbb{E}(X) \right) = \sqrt{M}\left(\sum_{j=1}^k{p_j\dfrac{1}{M_j^{\ast}}\sum_{m=1}^{M_j^{\ast}}{X_{j,m}}} - \sum_{j=1}^k{p_j\mathbb{E}(X|Z\in \mathcal{S}_j)} \right) \\ & = \sum_{j=1}^k x_j\sqrt{M_j^{\ast}}\left(\dfrac{1}{M_j^{\ast}}\sum_{m=1}^{M_j^{\ast}}{(X_{j,m} - \mathbb{E}(X|Z\in \mathcal{S}_j))} \right), \end{align*}
where x_j:=p_j \frac{( \sum_{i=1}^k p_i\sigma_i )^\frac{1}{2}}{(p_j\sigma_j)^\frac{1}{2}}. Now from the standard one-dimensional CLT, we deduce that for each j
\begin{align*} {\cal E}_j(M_j^{\ast}):=\sqrt{M_j^{\ast}}\left(\dfrac{1}{M_j^{\ast}} \sum_{m=1}^{M_j^{\ast}} {(X_{j,m} - \mathbb{E}(X|Z\in \mathcal{S}_j) )}\right) {\underset{M\to \infty}\Longrightarrow} \mathcal{N}(0, \sigma^2_j). \end{align*}
Since the variables (X_{j,m}:m\geq 1) are independent across j, it is easy to derive a CLT for the vector ({\cal E}_1(M_1^{\ast}),\dots,{\cal E}_k(M_k^{\ast})). More formally, one may use the Lévy theorem (Theorem A.1.3) to get, for any u\in \mathbb{R},
\begin{align*}\mathbb{E}(e^{iu \sqrt{M}(I_{M^\ast_1, \ldots, M^\ast_k}^{strat.} - \mathbb{E}(X) )})&= \mathbb{E}(e^{iu \sum_{j=1}^k x_j {\cal E}_j(M_j^{\ast})})\\ &=\prod_{j=1}^k\mathbb{E}(e^{iu x_j {\cal E}_j(M_j^{\ast})})\quad \text{(by independence between strata)}\\ &{\underset{M\to \infty}\longrightarrow}\prod_{j=1}^k e^{-\frac{1}{2}u^2 x^2_j \sigma^2_j}=e^{-\frac{1}{2}u^2 (\sum_{j=1}^k p_j\sigma_j)^2}. \end{align*}
This shows that \sqrt{M}\left(I_{M^\ast_1, \ldots, M^\ast_k}^{strat.} - \mathbb{E}(X) \right){\underset{M\to \infty}\Longrightarrow} \mathcal{N}\left(0, \Big(\sum_{j=1}^k p_j\sigma_j\Big)^2\right).
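As an illustration of the optimal allocation M_j^{\ast}\propto p_j\sigma_j, here is a Python/NumPy sketch on a toy problem of our own choosing (Z uniform on [0,1], X = e^{Z}, equiprobable strata), with the \sigma_j estimated by a pilot run:

```python
import numpy as np

# A minimal sketch of stratified sampling with (estimated) optimal allocation.
# Toy setting (our choice): Z ~ U(0,1), X = exp(Z), strata S_j = [(j-1)/k, j/k),
# so p_j = 1/k and Z | Z in S_j is uniform on S_j.
rng = np.random.default_rng(0)
k, M, M_pilot = 10, 100_000, 1_000
edges = np.linspace(0.0, 1.0, k + 1)
p = np.full(k, 1.0 / k)

# Pilot run: estimate the conditional standard deviations sigma_j.
sigma = np.array([np.exp(rng.uniform(edges[j], edges[j + 1], M_pilot)).std() for j in range(k)])

# Optimal allocation M_j* proportional to p_j * sigma_j.
M_j = np.maximum(1, np.round(M * p * sigma / np.sum(p * sigma)).astype(int))

# Stratified estimator: sum_j p_j * (mean of X over M_j draws from stratum j).
estimate = sum(
    p[j] * np.exp(rng.uniform(edges[j], edges[j + 1], M_j[j])).mean()
    for j in range(k)
)
print(estimate, np.e - 1.0)   # E[exp(Z)] = e - 1
```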
Exercise 3.6 (importance sampling, Gaussian vectors)
Let Y \overset{d}{=} \mathcal{N}(0, {\rm Id}) be a d-dimensional standard Gaussian vector and denote its density by p(x)=(2\pi)^{-d/2}\exp(-\frac 1 2 |x|^2). Consider a new measure under which Y has the distribution of SY + \theta under the initial one, where \theta is a d-dimensional vector and S is an invertible d\times d matrix. Then (following Proposition 3.4.8) the likelihood is given by
\begin{align*}
L& = \dfrac{1}{|\det(S)|}\dfrac{p(S^{-1}(Y - \theta))}{p(Y)} \\
&=
\dfrac{1}{|\det(S)|}\exp
\left(
- \frac12(Y-\theta)^{\top}(SS^{\top})^{-1}(Y-\theta) + \frac12 Y^{\top}Y
\right) \\
& = \dfrac{1}{|\det(S)|}\exp
\left(
\frac12 Y^{\top}\Big({\rm Id}-(SS^{\top})^{-1}\Big)Y + Y^{\top}(SS^{\top})^{-1}\theta - \frac12 \theta^{\top}(SS^{\top})^{-1}\theta
\right).
\end{align*}
When Y, S and \theta are scalar quantities, we retrieve the second formula of Corollary 3.4.9.
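As a sanity check of the likelihood formula, the following Python/NumPy sketch (our own toy dimensions, matrix S, vector \theta and test function) verifies by Monte Carlo the change-of-measure identity \mathbb{E}(f(Y)L) = \mathbb{E}(f(SY+\theta)), which follows from L being the ratio of the new density to the initial one.

```python
import numpy as np

# Numerical check (toy example): with L(Y) = (1/|det S|) * p(S^{-1}(Y - theta)) / p(Y),
# the change of measure gives E[f(Y) L(Y)] = E[f(S Y + theta)] for bounded f.
rng = np.random.default_rng(0)
d, M = 2, 1_000_000
S = np.array([[1.2, 0.3], [0.0, 0.8]])
theta = np.array([0.5, -0.2])
Sigma_inv = np.linalg.inv(S @ S.T)

def likelihood(y):
    # L = (1/|det S|) exp( 1/2 y^T (Id - (S S^T)^{-1}) y + y^T (S S^T)^{-1} theta
    #                      - 1/2 theta^T (S S^T)^{-1} theta )
    quad = 0.5 * np.einsum('ij,jk,ik->i', y, np.eye(d) - Sigma_inv, y)
    return np.exp(quad + y @ Sigma_inv @ theta - 0.5 * theta @ Sigma_inv @ theta) / abs(np.linalg.det(S))

f = lambda y: np.cos(y[:, 0] + 2.0 * y[:, 1])     # an arbitrary bounded test function
Y = rng.normal(size=(M, d))
print(np.mean(f(Y) * likelihood(Y)))              # E[f(Y) L(Y)]
print(np.mean(f(Y @ S.T + theta)))                # E[f(S Y + theta)], should be close
```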