Introduction

About the project

MeWiKo (Medien und Wissenschaftliche Kommunikation | Media and Science Communication) was a BMBF-funded project that investigated the relationship between external reporting on scientific publications and their bibliometric and altmetric impact.

MeWiKo-Co is an extension of this project. Among other things, it aims to create a comprehensive, informative dataset of Covid-19-related preprints to help journalists, scholars, and bibliometricians find their way through the large volume of preprint publications. These efforts resulted in the generative dataset presented here.

What is a generative dataset?

A generative dataset is not published as ready-made data. Instead, it is published as a set of notebooks and scripts that create the dataset when they are run.

Why a generative approach?

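The licensing terms of Dimensions do not allow the distribution of datasets. Publishing the code that builds the dataset, rather than the data itself, lets anyone with their own access recreate it.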

And why Dimensions?

According to our preliminary research, Dimensions is the best aggregator of publication information that is free to use for scientific purposes.

How to use the generative dataset

graph LR 
    Start1((1)) --> A
    Start2((2)) --> C
    A[Query in Dimensions] --> B(Metadata)
    A --> C[List of DOIs] 
    C --> D[Networks]
    C --> E[Full texts]
    C --> F[Embeddings]
    C --> G[Citation intents]

If you have access to Dimensions, you can start at (1) and run a query that returns a list of Covid-19-related preprints with their metadata.
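To give a rough idea of what a query at entry point (1) might look like, the sketch below assembles a query string in the Dimensions Search Language. It is an illustration only, not the query used in this project: the helper name `build_preprint_query`, the search terms, the returned fields, and the limit are all assumptions, and the string still has to be sent to Dimensions with your own credentials.

```python
def build_preprint_query(terms, fields=("doi", "title", "year"), limit=100):
    """Assemble a hypothetical Dimensions query string for preprints.

    The terms, fields, and limit are illustrative assumptions, not the
    project's actual query.
    """
    # Wrap each term in escaped quotes so it is searched as a phrase,
    # then combine the phrases with OR.
    phrases = " OR ".join(f'\\"{term}\\"' for term in terms)
    return (
        f'search publications for "{phrases}" '
        f'where type = "preprint" '
        f'return publications[{"+".join(fields)}] limit {limit}'
    )


if __name__ == "__main__":
    print(build_preprint_query(["covid-19", "sars-cov-2"]))
```

The function only builds text, so it can be inspected and adjusted before any request is made.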

Without access to Dimensions metadata, you can start at (2) and compile a list of relevant publications and their DOIs from other sources.
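For entry point (2), one possible way to compile the list is to merge DOIs collected from several sources, normalise them, and drop duplicates. The following is a minimal sketch under stated assumptions: the function names `normalise_doi` and `compile_doi_list` are hypothetical, the DOIs in the usage example are made-up placeholders, and the pattern check is a simple plausibility test rather than full DOI validation.

```python
import re

# Rough plausibility check: "10.", a 4-9 digit registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written as a link or with a label.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw):
    """Return a bare lower-case DOI, or None if the input does not look like one."""
    doi = raw.strip().lower()  # DOIs are case-insensitive
    for prefix in PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi if DOI_PATTERN.match(doi) else None


def compile_doi_list(*sources):
    """Merge several iterables of raw DOI strings, keeping first-seen order."""
    seen = set()
    result = []
    for source in sources:
        for raw in source:
            doi = normalise_doi(raw)
            if doi is not None and doi not in seen:
                seen.add(doi)
                result.append(doi)
    return result


if __name__ == "__main__":
    # Placeholder DOIs for illustration only.
    from_server = ["https://doi.org/10.1101/2020.01.01.123456"]
    from_spreadsheet = ["doi:10.1101/2020.02.02.654321", "10.1101/2020.01.01.123456 "]
    print(compile_doi_list(from_server, from_spreadsheet))
```

Keeping the DOIs in one normalised form makes it easier to hand the same list to each of the later steps (networks, full texts, embeddings, citation intents).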