Using machine learning to evaluate the impact of deep trade agreements

Holger Breinlich, Valentina Corradi, Nadia Rocha, João M.C. Santos Silva, Thomas Zylkin, 08 July 2022

Preferential trade agreements (PTAs) have become more frequent and increasingly complex in recent decades, making it important to assess how they affect trade and economic activity. Modern PTAs contain many provisions besides tariff reductions, in areas as diverse as services trade, competition policy, or public procurement. To illustrate this expansion of non-tariff provisions, Figure 1 shows the share of PTAs in force and notified to the WTO up to 2017 that cover selected policy areas. More than 40% of the agreements include provisions on areas such as investment, movement of capital, and technical barriers to trade, and more than two-thirds cover areas such as competition policy or trade facilitation.

Figure 1 Share of PTAs covering selected policy areas

Note: The figure shows the share of PTAs that cover a policy area. Source: Hofmann, Osnago and Ruta (2019).

Recent research has sought to go beyond estimating the overall impact of PTAs on trade and to establish the relative importance of individual PTA provisions (e.g. Kohl et al. 2016, Mulabdic et al. 2017, Dhingra et al. 2018, Regmi and Baier 2020). Such attempts are hampered by the fact that the number of provisions included in PTAs is much higher than the number of PTAs available for study (see Figure 2), making it difficult to separate their individual effects on trade flows.

Figure 2 Number of provisions in PTAs over time

Source: Mattoo et al. (2020).

Researchers have tried to address the growing complexity of PTAs in a variety of ways. For example, Mattoo et al. (2017) use the count of provisions in an agreement as a measure of its 'depth' and examine whether the increase in trade flows after a given PTA is related to this measure. Dhingra et al. (2018) group provisions into categories (e.g. services, investment, and competition provisions) and examine the impact of these 'provision bundles' on trade flows. Clearly, these approaches come at the cost of not allowing the effects of individual provisions within each group to be identified.

New methods

In recent work (Breinlich et al. 2022), we instead adapt a technique from the machine learning literature, the 'least absolute shrinkage and selection operator' (lasso), to the context of selecting the most important provisions and quantifying their impact. More precisely, we adapt the 'rigorous lasso' approach of Belloni et al. (2016) to the estimation of state-of-the-art gravity models for trade (e.g. Yotov et al. 2016, Weidner and Zylkin 2021).1

In contrast to traditional estimation methods such as least squares and maximum likelihood, which are based on optimising the in-sample fit of the estimated model, the lasso balances in-sample fit with parsimony to optimise out-of-sample fit, simultaneously selecting the more important regressors and estimating their effects on trade flows. In our context, the lasso works by shrinking the effects of individual provisions towards zero and progressively removing those that do not have a significant impact on the fit of the model (for an intuitive description, see Breinlich et al. 2021; for more details, see Breinlich et al. 2022). The rigorous lasso of Belloni et al. (2016), a relatively recent variant of the lasso, refines this approach by taking into account the idiosyncratic variance of the data and by keeping only variables that have a statistically large effect on the fit of the model.
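As a rough sketch of the trade-off just described, a lasso-penalised Poisson pseudo-likelihood for a gravity model with exporter-time, importer-time, and pair fixed effects can be written as below. The notation is our own assumption for illustration: $y_{ijt}$ denotes trade flows, $x_{ijt}$ the vector of provision dummies, $\lambda$ the overall penalty, and $\phi_k$ regressor-specific penalty weights. The exact specification is given in Breinlich et al. (2022).

```latex
\[
(\hat\beta,\hat\alpha,\hat\gamma,\hat\eta)
  = \arg\min_{\beta,\alpha,\gamma,\eta}\;
    \frac{1}{n}\sum_{i,j,t}\bigl(\mu_{ijt} - y_{ijt}\ln\mu_{ijt}\bigr)
    + \frac{\lambda}{n}\sum_{k=1}^{K}\phi_k\,\lvert\beta_k\rvert,
\qquad
\mu_{ijt} = \exp\bigl(\alpha_{it} + \gamma_{jt} + \eta_{ij} + x_{ijt}'\beta\bigr).
\]
```

The first term rewards in-sample fit, while the second penalises the absolute size of each provision coefficient. Because the penalty is on absolute values, a sufficiently large $\lambda$ sets some $\beta_k$ exactly to zero, which is what turns estimation into variable selection.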

Because the rigorous lasso tends to favour very parsimonious models, it may miss some important provisions. To address this problem, we introduce two methods to identify potentially important provisions that the rigorous lasso may have missed. One of the methods, which we call 'iceberg lasso', involves regressing each provision selected by the rigorous lasso on all other provisions, with the aim of identifying relevant variables that were initially missed because of their collinearity with the provisions selected in the initial step. The other method, called 'bootstrap lasso', augments the set of variables selected by the rigorous lasso with the variables selected when the rigorous lasso is bootstrapped.
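The two extensions can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in a plain linear lasso (coordinate descent, no fixed effects, a hand-picked penalty rather than the data-driven one) for the penalised gravity estimator, and the function names, penalty values, and simulated data are all hypothetical. The data contain one regressor that drives the outcome (`x0`), one that is strongly collinear with it (`x1`), and two pure-noise regressors.

```python
import random

def soft_threshold(z, lam):
    """Shrink z towards zero by lam; anything within [-lam, lam] becomes exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso(X, y, lam, max_iter=100, tol=1e-10):
    """Linear lasso by cyclic coordinate descent (no intercept).

    Minimises (1/2n) * sum((y - X b)^2) + lam * sum(|b_k|).
    X is a list of rows, y a list of outcomes.
    """
    n, p = len(X), len(X[0])
    cols = [[row[k] for row in X] for k in range(p)]
    scale = [sum(v * v for v in col) / n for col in cols]
    beta = [0.0] * p
    resid = list(y)
    for _ in range(max_iter):
        biggest_step = 0.0
        for k in range(p):
            if scale[k] == 0.0:
                continue
            # Covariance of regressor k with the residual that excludes its own fit.
            rho = sum(c * r for c, r in zip(cols[k], resid)) / n + scale[k] * beta[k]
            new = soft_threshold(rho, lam) / scale[k]
            step = new - beta[k]
            if step != 0.0:
                resid = [r - step * c for r, c in zip(resid, cols[k])]
                beta[k] = new
                biggest_step = max(biggest_step, abs(step))
        if biggest_step < tol:
            break
    return beta

def selected(beta):
    """Indices of regressors with a non-zero coefficient."""
    return {k for k, b in enumerate(beta) if b != 0.0}

def iceberg_lasso(X, y, lam_first, lam_second):
    """Regress each first-stage selection on all other regressors; keep the union."""
    first = selected(lasso(X, y, lam_first))
    chosen = set(first)
    p = len(X[0])
    for k in first:
        others = [j for j in range(p) if j != k]
        X_others = [[row[j] for j in others] for row in X]
        x_k = [row[k] for row in X]
        hits = selected(lasso(X_others, x_k, lam_second))
        chosen |= {others[i] for i in hits}
    return first, chosen

def bootstrap_lasso(X, y, lam, n_boot=20, seed=0):
    """Add to the full-sample selection anything selected in a bootstrap resample."""
    rng = random.Random(seed)
    n = len(X)
    chosen = selected(lasso(X, y, lam))
    for _ in range(n_boot):
        # Simplification: resample rows; a panel application would resample clusters.
        draw = [rng.randrange(n) for _ in range(n)]
        chosen |= selected(lasso([X[i] for i in draw], [y[i] for i in draw], lam))
    return chosen

# Simulated data (hypothetical): x0 drives y, x1 is collinear with x0, x2 and x3 are noise.
rng = random.Random(1)
X, y = [], []
for _ in range(200):
    x0 = rng.gauss(0, 1)
    x1 = 0.8 * x0 + 0.3 * rng.gauss(0, 1)
    X.append([x0, x1, rng.gauss(0, 1), rng.gauss(0, 1)])
    y.append(x0 + 0.1 * rng.gauss(0, 1))

first, iceberg = iceberg_lasso(X, y, lam_first=0.5, lam_second=0.3)
boot = bootstrap_lasso(X, y, lam=0.5)
print(sorted(first), sorted(iceberg), sorted(boot))
```

In this toy setting the first-stage lasso keeps only `x0`, and the iceberg step brings back the collinear `x1`, which is the kind of variable a very parsimonious first stage tends to drop. By construction the bootstrap set can only grow relative to the first-stage selection; how much it grows depends on how close the remaining variables are to the selection threshold.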

Results and caveats

We use the World Bank's database on deep trade agreements, in which we observe 283 PTAs and 305 'essential' provisions grouped into 17 policy areas (see Figure 1).2 The rigorous lasso selects eight provisions that are most strongly associated with increased trade flows after the implementation of the respective PTAs. As detailed in Table 1, these provisions are in the areas of anti-dumping, competition policy, technical barriers to trade, and trade facilitation.

Table 1 Provisions selected by the rigorous lasso

Building on these results, the iceberg lasso identifies a set of 42 provisions, and the bootstrap lasso identifies between 30 and 74 provisions that may affect trade, depending on how it is implemented. The iceberg lasso and bootstrap lasso therefore select sets of provisions that are small enough to be interpretable and large enough to give us some confidence that they include the more relevant provisions. In contrast, the more traditional implementation of the lasso based on cross-validation selects 133 provisions.

Reassuringly, the iceberg lasso and the bootstrap lasso select similar sets of provisions, mainly related to anti-dumping, competition policy, subsidies, technical barriers to trade, and trade facilitation. Therefore, although our results do not have a causal interpretation and we consequently cannot be certain which provisions matter most, we can be reasonably confident that provisions in these areas have a positive effect on trade.

In addition to identifying sets of provisions that may affect trade, our methods also provide estimates of the increase in trade flows associated with the selected provisions. We use these results to estimate the effects of various PTAs that have already been implemented. Table 2 summarises the estimated effects for selected PTAs obtained using the different methods we introduce. As in Baier et al. (2017, 2019), we find a wide variety of effects, ranging from very large impacts for agreements that include many of the selected provisions to no effect at all for agreements that include none of them.3
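To make the link between selected provisions and PTA-level effects concrete, the sketch below applies the usual transformation for exponential-mean (gravity-type) models: sum the coefficients of the selected provisions an agreement contains and convert the total to a percentage change, (exp(sum) - 1) x 100. We assume this is how a partial effect is formed here; the coefficient values, provision names, and example agreements are hypothetical and are not estimates from the column or the underlying paper.

```python
import math

# Hypothetical coefficients on selected provisions (illustrative values only,
# not estimates from the column or the underlying paper).
coefs = {
    "anti_dumping": 0.10,
    "competition_policy": 0.05,
    "trade_facilitation": 0.08,
}

def partial_effect_pct(coefs, provisions_in_pta):
    """Percentage change in trade implied by the selected provisions in a PTA.

    Provisions without an estimated coefficient (not selected) contribute zero.
    """
    index = sum(coefs.get(name, 0.0) for name in provisions_in_pta)
    return (math.exp(index) - 1.0) * 100.0

# Two hypothetical agreements: one with two selected provisions, one with none.
deep_pta = ["anti_dumping", "trade_facilitation", "public_procurement"]
shallow_pta = ["public_procurement"]

deep_effect = partial_effect_pct(coefs, deep_pta)        # exp(0.18) - 1, about 19.7%
shallow_effect = partial_effect_pct(coefs, shallow_pta)  # no selected provisions: 0%
print(round(deep_effect, 1), round(shallow_effect, 1))
```

The second agreement shows why some PTAs are assigned no effect at all: if none of their provisions is selected, the implied change in trade is zero regardless of how many other provisions they contain.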

Table 2 also shows that the different methods can lead to substantially different estimates, so these results need to be interpreted with care. As noted above, our results do not have a causal interpretation. The accuracy of the predicted effect of an individual PTA will therefore depend on whether the selected provisions have a causal effect on trade or serve as a signal of the presence of provisions that do. When this is the case, predictions based on this approach can be reasonably accurate, and in Breinlich et al. (2022) we report simulation results suggesting that this is so. However, it is possible to envisage situations in which predictions based on our approach fail dramatically; for example, a PTA may be incorrectly estimated to have zero effect despite including many of the truly causal provisions. Finally, we note that our results can also be used to predict the effects of new PTAs, but the same caveat applies.

Table 2 Partial effects for selected PTAs estimated by various methods


Conclusions

We have presented results from an ongoing research project in which we develop new methods for estimating the impact of individual PTA provisions on trade flows. By adapting techniques from the machine learning literature, we have developed data-driven methods to select the most important provisions and quantify their impact on trade flows. While our approach cannot fully resolve the fundamental problem of identifying the provisions with a causal effect on trade, we have been able to make considerable progress. In particular, our results show that provisions relating to anti-dumping, competition policy, subsidies, technical barriers to trade, and trade facilitation procedures are likely to enhance the trade-increasing effect of PTAs. Building on these results, we were able to estimate the effects of individual PTAs.

Authors' note: This column updates and extends Breinlich et al. (2021). See also Fernandes et al. (2021).


References

Baier, S L, Y V Yotov and T Zylkin (2017), “One size does not fit all: On the heterogeneous impact of free trade agreements”, VoxEU.org, 28 April.

Baier, S L, Y V Yotov and T Zylkin (2019), “On the widely differing effects of free trade agreements: Lessons from twenty years of trade integration”, Journal of International Economics 116: 206-228.

Belloni, A, V Chernozhukov, C Hansen and D Kozbur (2016), “Inference in high-dimensional panel models with an application to gun control”, Journal of Business & Economic Statistics 34: 590-605.

Breinlich, H, V Corradi, N Rocha, M Ruta, J M C Santos Silva and T Zylkin (2021), “Using machine learning to assess the effects of deep trade agreements”, in A M Fernandes, N Rocha and M Ruta (eds), The Economics of Deep Trade Agreements, CEPR Press.

Breinlich, H, V Corradi, N Rocha, M Ruta, J M C Santos Silva and T Zylkin (2022), “Machine learning in international trade research: Evaluating the impact of trade agreements”, CEPR Discussion Paper 17325.

Dhingra, S, R Freeman and E Mavroeidi (2018), “Beyond tariff reductions: What extra boost to trade from agreement provisions?”, LSE Centre for Economic Performance Discussion Paper 1532.

Fernandes, A, N Rocha and M Ruta (2021), “The economics of deep trade agreements: A new eBook”, VoxEU.org, 23 June.

Hofmann, C, A Osnago and M Ruta (2019), “The Content of Preferential Trade Agreements”, World Trade Review 18 (3): 365-398.

Kohl, T, S Brakman and H Garretsen (2016), “Do trade agreements stimulate international trade differently? Evidence from 296 trade agreements”, The World Economy 39: 97-131.

Mattoo, A, A Mulabdic and M Ruta (2017), “Trade creation and trade diversion in deep agreements”, Policy Research Working Paper Series 8206, World Bank, Washington, DC.

Mattoo, A, N Rocha and M Ruta (2020), Handbook of Deep Trade Agreements, Washington, DC: World Bank.

Mulabdic, A, A Osnago and M Ruta (2017), “Deep integration and UK-EU trade relations,” World Bank Policy Research Working Paper Series 7947.

Regmi, N and S Baier (2020), “Using machine learning methods to capture heterogeneity in free trade agreements”, mimeo.

Weidner, M and T Zylkin (2021), “Bias and consistency in three-way gravity models”, Journal of International Economics: 103513.

Yotov, Y V, R Piermartini, J A Monteiro and M Larch (2016), An Advanced Guide to Trade Policy Analysis: The Structural Gravity Model, Geneva: World Trade Organization.


Endnotes

1 Our approach complements the one adopted by Regmi and Baier (2020), who use machine learning tools to construct groups of provisions and then use these clusters in a gravity equation. The main difference between the two approaches is that Regmi and Baier (2020) use what is called an unsupervised machine learning method, which uses only information on the provisions to form the clusters. In contrast, we select provisions using a supervised method that also takes into account the impact of the provisions on trade.

2 Essential provisions in PTAs comprise the set of substantive provisions (those that require specific integration/liberalisation commitments and obligations) plus the disciplines on procedures, transparency, enforcement, or objectives that are required to achieve the substantive commitments (Mattoo et al. 2020).

3 It is worth noting that the lasso based on the traditional cross-validation approach leads to extremely dispersed estimates of the trade effects, some of which are clearly implausible. This further illustrates the advantage of our proposed methods.
