Taking a Second Look: Post-Dissertation Reflections on My Dataset

21.05.2025, in Practices

After finishing my PhD dissertation, I was somewhat disappointed not to have exploited the full potential of my dataset. Months of work can produce valuable datasets that remain largely unexplored once our dissertations are complete, yet this data holds great potential for further analysis and insight. Scholars should therefore revisit their past work and data, not just for the initial purpose, but also to uncover new perspectives that might otherwise be missed.

A classic feature of a doctoral project is the constant narrowing of one’s thematic focus. While this allows us to carry out meaningful analyses, it often means that parts of our data go unused. It is hard not to feel frustrated, knowing how much useful data ends up gathering dust once a research project is completed.

However, data on asylum regimes is a rare commodity and often very difficult to obtain. I spent months building a dataset with the potential to answer far more than my main dissertation question, yet I have hardly seized this opportunity since submitting. Eventually, I moved on to new topics and projects, and I don’t know when, or if, I will ever look at my data again.

How I Started Off

The first stage of my PhD was to define a clear research topic. Once this was settled and the literature reviewed, I turned to thinking about how best to approach my research question. At that point, I had not collected any data and could not find any existing quantitative datasets that fit my needs. A purely qualitative research strategy, focused only on close readings, was not what I had in mind either.

So, inspired by Spirig (2018, 2023) and Gertsch (2021), I decided to build my own dataset. I scraped the decision database of the Federal Administrative Court (FAC) – the Swiss asylum appeals court – and compiled information from approximately 45,000 case files. This gave me the basis for quantitative analyses, allowing a “distant reading” (Livermore & Rockmore, 2019) of how the court handled controversial cases that had been dismissed by the first instance, the State Secretariat for Migration (SEM).
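For readers curious what such scraping involves in practice, here is a minimal Python sketch. The URL pattern, HTML markup, and field names are invented for illustration – the FAC’s actual website differs – but the overall shape (fetch listing pages, parse out docket numbers and decision dates) is the same:

```python
"""Minimal sketch of harvesting decision metadata from a court's public
decision database. URL and HTML structure are assumptions for illustration,
not the FAC's real interface."""
import re
import urllib.request

# Placeholder endpoint – not the court's actual URL scheme.
LIST_URL = "https://example.org/decisions?page={page}"

# Assumed markup: a link carrying the docket number, followed by a date span.
CASE_RE = re.compile(
    r'<a class="case" href="(?P<href>[^"]+)">(?P<docket>[DE]-\d+/\d{4})</a>\s*'
    r'<span class="date">(?P<date>\d{2}\.\d{2}\.\d{4})</span>'
)

def parse_decision_list(html: str) -> list[dict]:
    """Extract docket number, decision date, and link from one listing page."""
    return [m.groupdict() for m in CASE_RE.finditer(html)]

def fetch_page(page: int) -> str:
    """Download one listing page (network access required)."""
    with urllib.request.urlopen(LIST_URL.format(page=page)) as resp:
        return resp.read().decode("utf-8")
```

Separating the parsing from the download makes the extraction logic easy to test on saved pages before committing to a full crawl of tens of thousands of files.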

Working with the Data

Once the dataset was complete, it opened the door to a variety of possible analyses. In the end, though, as intended, I focused only on a relatively specific sub-category of the data – just 2% of the total cases. Still, I could not help but feel that the remaining 98% had more to offer. After all, the FAC is the second instance of the Swiss asylum system, and it is fair to say that its decisions relate to the most pressing and salient issues at any given time (Bolz, 2021).

While the FAC only deals with a subset of all asylum cases, its decisions function as a kind of legal testing ground. Its judges effectively decide how future cases will be assessed and how certain categories of asylum applications will be handled. That alone shows how interesting this data is – data that, in my opinion, has been analysed far too little using quantitative research strategies (Gertsch, 2021; Spirig, 2018, 2023).

Examples of What the Database Can Reveal

Even without downloading or scraping anything, the court’s website allows simple initial analyses that can serve as a starting point for research projects. For instance, users can filter cases according to various criteria, such as keywords and the year of the appeal. If we filter using the tag ‘Syria’ – that is, asking whether the country plays an important role in the case – and sort by the year the appeal was filed, we get the following results:

This statistic shows when Syria became prominent and controversial, and how many decisions were appealed – which in turn can serve as a proxy for the salience of the topic. Obviously, the events of 2014/2015 had an impact on the data. The escalating violence in the Syrian civil war – e.g., due to the offensives by the Islamic State – left its mark.

Today, on the other hand, only a few cases with the keyword Syria are dealt with in the second instance. This is also interesting when compared with other countries. Let’s take the example of Afghanistan:

For Afghanistan, the peak comes in 2022, after the Taliban seized power in 2021. The drop in 2024 may be explained by a more consolidated legal practice, though this would require further in-depth analysis. In any case, what is clear is that the drop does not reflect fewer asylum applications: the SEM recorded 7,054 cases from Afghanistan in 2022, 7,934 in 2023 and 8,627 in 2024 (SEM, 2025). This suggests that the court’s data is not representative of the overall number of cases, but rather shows which categories are legally contested.
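Once decisions are scraped, per-year tallies like those above are a simple aggregation. A minimal Python sketch, assuming each record carries a filing year and a set of keyword tags (field names and figures are illustrative, not real FAC data):

```python
from collections import Counter

def appeals_per_year(records, keyword):
    """Count appeals per filing year whose tag set contains `keyword`."""
    return Counter(r["year"] for r in records if keyword in r["tags"])

# Toy records for illustration – not real FAC figures.
records = [
    {"year": 2014, "tags": {"Syria"}},
    {"year": 2015, "tags": {"Syria"}},
    {"year": 2015, "tags": {"Syria"}},
    {"year": 2022, "tags": {"Afghanistan"}},
]

print(appeals_per_year(records, "Syria"))  # Counter({2015: 2, 2014: 1})
```

The same one-liner, pointed at different tags, reproduces the Syria/Afghanistan comparison directly from the scraped data rather than the website’s filter interface.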

A Valuable Resource Worth Returning To

With relatively little effort, we can use the database to find out which countries or topics are relevant in the Swiss asylum system at different times. Even the very brief examples above show how valuable the database can be for researchers. It should therefore be used more often. In particular, with the rise of AI tools that have greatly simplified the use of quantitative text analysis, there is an opportunity to apply sophisticated (and critical…) quantitative and computational methods (Drouhot et al., 2022; Engel et al., 2021; Törnberg, 2023; Törnberg & Uitermark, 2021) to this data to gain new insights and perspectives on the Swiss asylum system.
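As a toy illustration of the kind of step such methods automate, here is a dictionary-based topic coder in Python – a deliberately simple baseline that a zero-shot classifier in the spirit of Törnberg (2023) could replace. The topic labels and keyword lists are invented for illustration:

```python
# Illustrative keyword dictionary – categories and terms are invented,
# not an actual coding scheme for FAC decisions.
TOPIC_TERMS = {
    "military service": {"military service", "conscription", "desertion"},
    "non-refoulement": {"refoulement", "removal"},
}

def code_decision(text, topic_terms=TOPIC_TERMS):
    """Return the set of topics whose keywords appear in a decision text."""
    lowered = text.lower()
    return {topic for topic, terms in topic_terms.items()
            if any(term in lowered for term in terms)}

print(code_decision("The appellant fears conscription and removal to Syria."))
```

A learned classifier would replace the keyword lookup while keeping the same interface: text in, topic labels out – which is what makes decisions codable at the scale of tens of thousands of cases.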

Moreover, going beyond a surface-level analysis requires a lot of work and time – the kind of time and effort that goes into a PhD. And once the dissertation is finished, it is easy to leave the data behind and move on. Personally, I am not sure when, or if, I will use my data again. Many of us put months into collecting and structuring our data, and it is worth asking what else we could learn from it now. After all, our data is a valuable product of long work and reflection, and new perspectives on the subject often emerge once the final presentation is over.

Mathis Schnell holds a Master’s degree in Political Science (University of Zurich), a PhD in Social Sciences (University of Neuchâtel), as well as a Certificate in Migration and Mobility Studies from the nccr – on the move. He currently works as a chargé de recherche at the Research Centre for Political Action at the University of Lausanne. (LinkedIn).

A Special Note

Thanks to Janine Dahinden and Inka Sayed for their valuable feedback.

References:

–Bolz, S. (2021). Das Verfahren vor dem Bundesverwaltungsgericht. In Schweizerische Flüchtlingshilfe (Ed.), Handbuch zum Asyl- und Wegweisungsverfahren (3rd ed., pp. 395–446). Haupt Verlag.
–Drouhot, L. G., Deutschmann, E., Zuccotti, C. V., & Zagheni, E. (2022). Computational Approaches to Migration and Integration Research: Promises and Challenges. Journal of Ethnic and Migration Studies, 49(2), 1–19.
–Engel, U., Quan-Haase, A., Liu, S. X., & Lyberg, L. (2021). Handbook of Computational Social Science, Volume 1: Theory, Case Studies and Ethics. Routledge.
–Gertsch, G. (2021). Richterliche Unabhängigkeit und Konsistenz am Bundesverwaltungsgericht: Eine quantitative Studie. Schweizerisches Zentralblatt für Staats- und Verwaltungsrecht, 1, 1–23.
–Livermore, M. A., & Rockmore, D. N. (2019). Distant Reading the Law. In M. A. Livermore & D. N. Rockmore (Eds.), Law as data: Computation, text, & the future of legal analysis (pp. 3–20). SFI Press.
–SEM. (2025). Asylstatistik. Accessed 12.04.2025.
–Spirig, J. (2018). Like cases alike or asylum lottery? Inconsistency in judicial decision making at the Swiss Federal Administrative Court. University of Zurich.
–Spirig, J. (2023). When Issue Salience Affects Adjudication: Evidence from Swiss Asylum Appeal Decisions. American Journal of Political Science, 67(1), 55–70.
–Törnberg, P. (2023). ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning (arXiv:2304.06588). arXiv.
–Törnberg, P., & Uitermark, J. (2021). For a heterodox computational social science. Big Data & Society, 8(2), 1–13.