1. What is the main objective of this study?

The primary objective of our study is to analyze the evolution of empirical methods and causal narratives in economics over the past four decades. By leveraging a custom large language model (LLM) to process over 44,000 working papers from NBER and CEPR, we aim to understand how the complexity and structure of causal claims have changed over time. We also investigate how these factors influence publication outcomes and the credibility of economic research.

2. How was the dataset constructed?

We compiled a comprehensive dataset of 44,852 working papers from the National Bureau of Economic Research (NBER) and the Centre for Economic Policy Research (CEPR), spanning from 1980 to 2023.

Our machine learning pipeline involved several key steps:

1. Collecting the 44,852 working papers from the NBER and CEPR series.
2. Using a custom LLM to extract the causal claims made in each paper (see Question 3).
3. Mapping the extracted cause and effect variables to standardized JEL codes.
4. Constructing a causal graph for each paper and computing measures of narrative complexity (see Questions 13 and 14).

3. How did you extract causal claims from the papers?

We employed a custom large language model to analyze each paper and extract the causal relationships presented by the authors. The LLM identified cause and effect variables, determined the type of each causal relationship (e.g., direct effect, indirect effect), and recorded the causal inference method used (e.g., RCT, IV, DiD). The output is an edge list for each paper, in which each row represents a single causal claim; these edge lists form the basis for constructing causal graphs.
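
To make this concrete, the snippet below sketches what a single row of such an edge list could look like. This is a minimal illustration in Python; the field names and values are our own assumptions for exposition, not the actual schema used in the pipeline.

    # One illustrative row of a per-paper edge list of causal claims.
    # Field names and values are hypothetical, not the pipeline's actual schema.
    causal_claim = {
        "paper_id": "w12345",                  # hypothetical working paper ID
        "cause": "minimum wage increase",      # cause variable as stated by the authors
        "effect": "teen employment",           # effect variable
        "relationship_type": "direct effect",  # type of causal relationship
        "method": "DiD",                       # causal inference method reported
    }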

4. What are the key findings of the study?

We document several patterns. The use of advanced causal inference methods such as DiD, IV, and RCTs has grown substantially over the period we study. At the same time, the causal narratives in papers have become more complex, and papers with more complex narratives are more likely to appear in top journals. We also find evidence of underreporting of null results.

5. How does this study contribute to existing literature?

Our study provides a comprehensive, data-driven analysis of the evolution of empirical methods and causal narratives in economics. By constructing causal graphs for a vast corpus of papers and quantifying narrative complexity, we offer new insights into how research practices have changed over time. This work contributes to the broader discourse on the "credibility revolution" in economics and underscores the need for transparency and replicability in research.

6. What are the implications of this study for economic research and policy?

For researchers, our results show that narrative structure, and not only methodological rigor, is associated with publication outcomes. For journals and the wider policy community, they underscore the value of publishing simpler narratives and null results, and of sustained attention to transparency and replicability when research informs policy decisions.

7. How reliable are the methods used in this study?

We took several steps to ensure the reliability of our methods; these are described in detail in the Methods section of our paper (see Question 19).

However, we acknowledge limitations such as potential biases introduced by the LLM and the challenges of mapping nuanced concepts to standardized codes.

8. Were there any limitations in your study?

Yes, our study has several limitations:

- The LLM may introduce biases or errors when extracting causal claims.
- Mapping nuanced economic concepts to standardized JEL codes inevitably loses detail.
- Our corpus covers NBER and CEPR working papers, which may not be representative of all economic research.

9. How can future research build on this study?

Future research can:

- Extend the analysis to working papers and journals beyond the NBER and CEPR series.
- Refine the extraction pipeline to reduce LLM-related biases.
- Use our publicly available dataset to study further dimensions of research practice.

10. Is the dataset available for other researchers to use? 

Yes, we are making the aggregated dataset available for download under an Apache License 2.0. Researchers can access paper-level and causal claim-level data. For those interested in more detailed data or potential collaborations, please fill out our Data Access and Updates Form or contact us directly at prashant.garg@imperial.ac.uk.
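
Once downloaded, the files can be explored with standard tools. The sketch below assumes CSV files and column names that are purely illustrative; please check the download page for the actual file names and formats.

    import pandas as pd

    # Hypothetical file and column names; see the download page for actual ones.
    papers = pd.read_csv("papers.csv")           # one row per working paper
    claims = pd.read_csv("causal_claims.csv")    # one row per extracted causal claim

    # Example: tally the causal inference methods across all claims.
    print(claims["method"].value_counts())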

11. How does this study address concerns about transparency and replicability?

Our study contributes by:

- Releasing the aggregated dataset under an open license so that others can scrutinize and extend our results.
- Documenting the extraction and measurement pipeline in the Methods section of our paper.
- Quantifying practices, such as the underreporting of null results, that bear directly on the replicability of economic research.

12. How do the findings relate to the 'credibility revolution' in economics?

The "credibility revolution" refers to the shift towards more rigorous empirical methods focused on causal inference. Our findings show significant growth in the use of advanced empirical methods like DiD, IV, and RCTs, reflecting this movement. However, we also highlight challenges such as increased narrative complexity and underreporting of null results, suggesting that while methods have advanced, issues related to transparency and replicability remain.

13. What are causal graphs, and how are they used in your study? 

Causal graphs are visual representations of causal relationships, where nodes represent variables or concepts, and directed edges represent causal effects from one variable to another. In our study, we constructed causal graphs for each paper by mapping the extracted causal claims to JEL codes. This allowed us to analyze the complexity and structure of causal narratives systematically.
For more background, see Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press.
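
As a minimal sketch of the idea, the Python snippet below builds a directed graph from a toy edge list using the networkx library. The claims and JEL-style labels are invented for illustration and do not come from our dataset.

    import networkx as nx

    # Toy edge list: (cause, effect, method). Labels are invented examples.
    claims = [
        ("J38: minimum wage", "J21: employment", "DiD"),
        ("J21: employment", "I32: poverty", "IV"),
    ]

    # Each directed edge represents one causal claim made in a paper.
    G = nx.DiGraph()
    for cause, effect, method in claims:
        G.add_edge(cause, effect, method=method)

    print(G.number_of_nodes(), G.number_of_edges())  # 3 2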

14. How did you measure causal narrative complexity?

We developed several measures based on the causal graphs:

- The number of distinct variables (nodes) and causal claims (edges) in each paper's graph.
- The depth of the graph, i.e., the length of the longest causal chain.
- The density of connections among variables.

These measures help quantify the depth and interconnectedness of the causal narratives in each paper.
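
The snippet below sketches how such statistics can be computed from a directed graph with networkx. The exact definitions used in the paper are given in the Methods section; the formulas here are illustrative stand-ins.

    import networkx as nx

    def narrative_complexity(G: nx.DiGraph) -> dict:
        """Illustrative graph statistics, not the paper's exact definitions."""
        return {
            "n_variables": G.number_of_nodes(),   # distinct concepts
            "n_claims": G.number_of_edges(),      # distinct causal claims
            # Depth: length of the longest causal chain (for acyclic graphs).
            "depth": nx.dag_longest_path_length(G)
                     if nx.is_directed_acyclic_graph(G) else None,
            # Interconnectedness: share of possible directed edges present.
            "density": nx.density(G),
        }

    G = nx.DiGraph([("A", "B"), ("B", "C"), ("A", "C")])
    print(narrative_complexity(G))
    # {'n_variables': 3, 'n_claims': 3, 'depth': 2, 'density': 0.5}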

15. How does this study relate to the use of AI in economics research?

Our study demonstrates how AI, specifically large language models, can be utilized to process and analyze large volumes of academic text efficiently. By automating the extraction of structured information from tens of thousands of papers, we showcase the potential of AI to augment traditional research methods in economics, opening avenues for large-scale meta-analyses and insights into research trends.

16. Did you find any differences across different fields within economics?

Yes. Subfields differ both in how heavily they rely on causal inference methods such as DiD, IV, and RCTs and in the typical complexity of the causal narratives they construct.

These differences reflect how methodological choices and research practices vary across subfields.

17. How can your findings inform publication practices in economics?

Our findings suggest that top journals may favor papers with complex causal narratives. Recognizing this can help authors structure their research and may prompt journals to examine potential biases in their publication processes. Encouraging the publication of studies with simpler narratives or null results could enhance the diversity and transparency of published research.

18. How did you handle data privacy and ethical considerations in your research?

We adhered to ethical standards by analyzing only publicly available working papers and by releasing data in aggregated form at the paper and claim level, rather than redistributing full texts.

19. Where can I find more details about the methods used in this study?

A detailed explanation of our methods can be found in the Methods section of our paper, available as a preprint here. For an accessible overview, please visit the Methods page on our website. If you have further questions, feel free to contact us at team@causal.claims.

We hope this FAQ addresses your questions about our study. If you have additional inquiries, please do not hesitate to reach out to us.