Matte Hartog

Background

Matte Hartog is a Research Fellow of the academic team at the Center for International Development's Growth Lab. The center works to advance the understanding of development and deploy breakthrough research, in collaboration with external stakeholders, to address the world’s most pressing challenges.

Teaching

The Tools and Methods of Economic Complexity Analysis (instructor, 2021 - present)
Aimed at second-year MPA/ID students, this module presents the methods of economic complexity as they can be used for the analysis of countries and regions, firms, products, occupations, technologies, scientific research and educational areas, as well as the interactions among them and their evolution over time. The course presumes previous exposure to linear algebra, statistics, economics and either R or Python. More info here.
- Latest session, 23 Feb 2023:
- in Python: https://colab.research.google.com/github/matteha/product-space-eci-works...
- in R: https://colab.research.google.com/github/matteha/product-space-eci-works...
- Lecture on agglomeration economies and development: here
Executive Education: Leading Economic Growth (teaching assistant, 2016 - 2019)
Guiding project teams using the Atlas of Economic Complexity and its corresponding Country Profiles. More info here.

Projects

Document intelligence and large scale probabilistic matching
We use deep learning for optical character recognition (a denoising CNN and an LSTM neural network for historical fonts) and natural language processing (fine-tuned BERT and Llama 1/2 for named-entity recognition), in combination with multimodal Transformer models (e.g. LayoutLMv3), to analyze ~500,000 book pages on patents and R&D labs from 1850 onwards. We also develop parallel computing libraries on Harvard’s supercomputing cluster to link these pages to billions of US Census records (using XGBoost), in order to study the professionalization of technological progress.
Output: work in progress; expected to finish June 2023.
The impact of COVID-19
Using global investment data to identify Ukraine's European economic integration opportunities
In collaboration with the World Bank, we use data on 120 million firms worldwide combined with international trade data to analyze Ukraine's opportunities to participate in European value chains. We apply traditional gravity models and economic complexity analysis to analyze trade and foreign direct investment (FDI).
Output: Hartog, M., Lopez-Cordova, J.E. & Neffke, F., (2021) Assessing Ukraine's Role in European Value Chains: A Gravity Equation-cum-Economic Complexity Analysis Approach
Media coverage: This paper and follow-up work were covered by Christian Science Monitor and Bloomberg.
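As a rough illustration of what a traditional gravity model estimates, the sketch below fits a log-linear gravity equation by ordinary least squares. It is not the specification used in the paper: the data are synthetic, and all variable names and coefficient values are assumptions chosen only for this example.

```python
import numpy as np

# Synthetic (made-up) country-pair data, already in logs.
rng = np.random.default_rng(0)
n_pairs = 500
log_gdp_origin = rng.normal(10.0, 1.0, n_pairs)
log_gdp_dest = rng.normal(10.0, 1.0, n_pairs)
log_distance = rng.normal(8.0, 0.5, n_pairs)

# Hypothetical "true" coefficients: intercept, origin GDP, destination GDP, distance.
true_beta = np.array([1.0, 0.9, 0.8, -1.1])

# Log-linear gravity equation:
# log(trade_ij) = b0 + b1*log(GDP_i) + b2*log(GDP_j) + b3*log(dist_ij) + error
design = np.column_stack(
    [np.ones(n_pairs), log_gdp_origin, log_gdp_dest, log_distance]
)
log_trade = design @ true_beta + rng.normal(0.0, 0.1, n_pairs)

# Ordinary least squares estimate of the gravity coefficients.
beta_hat, *_ = np.linalg.lstsq(design, log_trade, rcond=None)

# Residuals show which pairs trade more or less than the model predicts.
residuals = log_trade - design @ beta_hat

print("estimated coefficients:", np.round(beta_hat, 3))
```

In applied work the residuals are often the object of interest, since they flag country pairs whose trade or FDI is above or below what size and distance alone would predict.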
Exponential random graph models
We develop and apply exponential random graph models to analyze the evolution of collaboration networks.
Output:
Broekel, T. and Hartog, M. (2014), Determinants of cross-regional R&D collaboration networks: an application of exponential random graph models. In: Advances in Spatial Science. The geography of networks and R&D collaborations. New York: Springer, pp. 49-80.
Broekel, T. and Hartog, M. (2013), Explaining the structure of inter-organizational networks using exponential random graph models. Industry and Innovation, 20 (3), pp. 277-295. Available here.
GMM estimators for causal inference
We develop and apply GMM estimators for causal inference of the impact of industrial compositions on employment growth.
Output: Hartog, M., Boschma, R. and Sotarauta, M. (2012), The impact of related variety on regional employment growth in Finland 1993-2006: High-tech versus medium/low-tech. Industry and Innovation, 19 (6), pp. 459-476. Available here.
Proportional hazard models
We use survival models on all banks in the Netherlands that existed between 1850 and 1993 to explain their role in acquiring other banks and driving spatial clustering.
Output: Boschma, R. and Hartog, M. (2014), Merger and Acquisition Activity as Driver of Spatial Clustering: The Spatial Evolution of the Dutch Banking Industry, 1850-1993. Economic Geography, 90 (3), pp. 247 - 266. Available here.
Agents of structural transformation of regional economies
We use social security data and administrative records covering the full population of Sweden from 1974 onwards, accessible through Statistics Sweden's supercomputing environment, to study the agents (firms, entrepreneurs) of structural economic transformation of regions.
Output: Neffke, F., Hartog, M., Boschma, R. and Henning, M. (2018), Agents of Structural Change. The role of entrepreneurs and expanding firms in regional diversification. Economic Geography, 94 (1), pp. 23-48. Available here.
Mexican Ministry of Health: Atlas of Economic Complexity
Using data from the Mexican Ministry of Health, we created the Atlas of Economic Complexity of Mexico that serves as a diagnostic tool - for policy makers, entrepreneurs and firms - to analyze the productivity of departments, cities, and municipalities.
Output: Mexican Atlas of Economic Complexity at http://complejidad.datos.gob.mx/
The role of the diaspora in the internationalization of the Colombian economy
We use the ORBIS database, covering over 100 million establishments worldwide, in conjunction with other datasets to identify the Colombian diaspora, the resulting brain drain and its consequences.
Output: Nedelkoska, L., Assumpcao, A., Grisanti, A., Hartog, M., Hinz, J., Lu, J., Muhaj, D., Protzer, E., Saxenian, A., Hausmann, R. (2021) The Role of the Diaspora in the Internationalization of the Colombian Economy. CID Faculty Working Paper Series 2021.397, Harvard University, Cambridge, MA.
The importance of tenure and experience for wages in Saudi Arabia
Through on-site work in Riyadh, Saudi Arabia, with government officials and the statistical office, we analyze administrative data covering the Saudi population merged with Saudi export data. This allows us to track workers' job mobility and to analyze skill relatedness between industries and its impact on tenure and wages, using standard Topel models from labor economics.
Output: Work in progress.

Software / computing

Polars is a very promising data analysis framework (in Python / Rust) for handling large datasets, particularly in terms of memory efficiency and expressive syntax. I wrote its Emacs implementation for Jupyter consoles and notebooks here.
I also contribute to the EconGeo package (in R), which computes geospatial indices and complexity (network) metrics.
I wrote the following notebooks in Python and R as a step-by-step guide to creating matrices of the co-occurrence of activities and their proximities, and subsequently calculating and analyzing economic complexity and product complexity indices, revealed comparative advantage (RCA) indices and product space visualizations using D3Plus.
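The steps those notebooks walk through can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code from the notebooks: the function names (rca, proximity, eci) and the toy export matrix are made up for this example, RCA follows the standard Balassa definition, and the ECI is computed with the common eigenvector method.

```python
import numpy as np

def rca(exports):
    """Balassa RCA: a product's share in a country's exports divided by
    that product's share in world exports (rows: countries, cols: products)."""
    exports = np.asarray(exports, dtype=float)
    country_share = exports / exports.sum(axis=1, keepdims=True)
    world_share = exports.sum(axis=0, keepdims=True) / exports.sum()
    return country_share / world_share

def proximity(M):
    """Product-product proximity: co-occurrence count divided by the larger
    of the two ubiquities, i.e. the minimum of the conditional probabilities."""
    ubiquity = M.sum(axis=0)
    cooccurrence = M.T @ M
    phi = cooccurrence / np.maximum.outer(ubiquity, ubiquity)
    np.fill_diagonal(phi, 0.0)
    return phi

def eci(M):
    """Economic complexity index via the eigenvector method.
    Assumes every country and every product has at least one RCA >= 1 entry."""
    diversity = M.sum(axis=1)
    ubiquity = M.sum(axis=0)
    # Country-country matrix D^-1 M U^-1 M^T (row-stochastic).
    M_tilde = (M / diversity[:, None]) @ (M / ubiquity[None, :]).T
    eigvals, eigvecs = np.linalg.eig(M_tilde)
    second = np.argsort(-eigvals.real)[1]  # second-largest eigenvalue
    v = eigvecs[:, second].real
    # Sign convention: ECI should correlate positively with diversity.
    if np.corrcoef(v, diversity)[0, 1] < 0:
        v = -v
    return (v - v.mean()) / v.std()

# Toy export values (made up): 4 countries x 4 products.
X = np.array([
    [10, 10, 10, 10],
    [10, 10, 10, 0],
    [10, 10, 0, 0],
    [10, 0, 0, 0],
])
R = rca(X)
M = (R >= 1).astype(float)  # binary country-product matrix
phi = proximity(M)
complexity = eci(M)
print(np.round(complexity, 3))
```

The product complexity index follows the same recipe applied to the transposed matrix, and phi is the weighted adjacency matrix that a product space visualization would draw.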
With Frank Neffke, I wrote the following Stata .do files, which can be applied to large-scale social security and administrative data to identify agents of structural transformation of regional economies.
For network analysis purposes, Tom Broekel and I developed exponential random graph models, which can be used to study networks where longitudinal data is of poor quality. The corresponding software is included in the online appendices of these studies:
- Broekel, T. and Hartog, M. (2013), Explaining the structure of inter-organizational networks using exponential random graph models. Industry and Innovation, 20 (3), pp. 277-295.
- Broekel, T. and Hartog, M. (2014), Determinants of cross-regional R&D collaboration networks: an application of exponential random graph models. In: T. Scherngell (ed.) Advances in Spatial Science. The geography of networks and R&D collaborations. New York: Springer, pp. 49-80.
HPC (to be published): parallel computing libraries for CentOS / SLURM (e.g. PyOMP), developed on Harvard's supercomputing environment Cannon, in conjunction with Python libraries for OCR and NLP to analyze historical and/or handwritten documents.
