Journal Information

Introduction

Additional information about journals is available from the Source Titles source in Dimensions.

The Source Titles source is a curated database of publication ‘containers’, such as journals, preprint servers and book series. It is analogous to the ‘source titles’ facet available in the Dimensions web application.

Of all available fields in this source, two journal indicators are likely to be of high interest:

SJR indicator (SCImago Journal Rank). This indicator measures both the number of citations received by a journal and the importance or prestige of the journals where the citations come from.

SNIP indicator (Source Normalized Impact per Paper). This indicator measures the average citation impact of a journal's publications, normalized for the citation practices of the journal's subject field.

Querying additional information about journals

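Imports and Login

import dimcli
import pandas as pd
from clj import mapcat
from funcy import chunks  # splits a sequence into fixed-size batches
import json

# Log in to the Dimensions API and create a DSL client for running queries
dimcli.login()
dsl = dimcli.Dsl()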

We would like to get additional journal information for both the preprints and their resulting publications.

Define datasets

# INPUT_* and OUTPUT_* are CSV file paths, assumed to be defined earlier in the notebook
dfs_meta = {
    "preprints": {
        "name": "preprints",
        "input_file": INPUT_PREPRINTS,
        "output_file": OUTPUT_PREPRINTS,
        "df": pd.read_csv(INPUT_PREPRINTS)
    },
    "resulting_publications": {
        "name": "resulting publications",
        "input_file": INPUT_RESULTING_PUBLICATIONS,
        "output_file": OUTPUT_RESULTING_PUBLICATIONS,
        "df": pd.read_csv(INPUT_RESULTING_PUBLICATIONS)
    }
}

Query for source title information

CHUNK_SIZE = 300  # number of journal IDs sent per query

for k, v in dfs_meta.items():
    df = v["df"]
    # Each journal only needs to be looked up once
    unique_journal_ids = list(df["journal.id"].dropna().unique())
    print(v["name"], len(unique_journal_ids))

    data = []

    for ids in chunks(CHUNK_SIZE, unique_journal_ids):
        # json.dumps renders the IDs as a double-quoted list for the `in` filter
        q = "search source_titles where id in {} return source_titles [id + title + publisher + type + linkout + sjr + snip]" + f" limit {CHUNK_SIZE*2}"

        results_df = dsl.query(q.format(json.dumps(ids))).as_dataframe()

        data.append(results_df)

    # Combine all batches and rename the key so it matches the input files
    source_titles_df = pd.concat(data, ignore_index=True)
    source_titles_df = source_titles_df.rename(columns={"id": "journal.id"})
    source_titles_df.to_csv(v["output_file"], index=False)
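
To make the batching step easier to follow, here is a minimal sketch that only builds the query strings and does not call the API. The helpers `chunked` and `build_query` are hypothetical names introduced for illustration (the notebook itself uses `funcy.chunks`), and the limit is set to the batch size here, whereas the loop above uses `CHUNK_SIZE*2`.

```python
import json


def chunked(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_query(ids, limit):
    """Build a source_titles query string for one batch of journal IDs."""
    id_list = json.dumps(ids)  # e.g. ["jour.1019114", "jour.1014075"]
    return (
        f"search source_titles where id in {id_list} "
        "return source_titles [id + title + publisher + type + linkout + sjr + snip] "
        f"limit {limit}"
    )


# Three journal IDs taken from the results table, batched two at a time
journal_ids = ["jour.1019114", "jour.1014075", "jour.1113716"]
queries = [build_query(batch, limit=len(batch)) for batch in chunked(journal_ids, 2)]

for q in queries:
    print(q)
```

Each resulting string could then be passed to `dsl.query(...)` as in the loop above. Keeping the batches small keeps each query string short, and a limit at least as large as the batch size means no matching source title is cut off.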

Example: Top 20 journals for resulting publications by SJR

The journal indicators provided can be used to create rankings and evaluate whether, for example, a preprint has been published in a high-impact journal.

| | journal.id | title | publisher | type | sjr | snip | linkout |
|---|---|---|---|---|---|---|---|
| 279 | jour.1019114 | Cell | Cell Press | journal | 25.70 | 9.44 | NaN |
| 1191 | jour.1014075 | New England Journal of Medicine | New England Journal of Medicine | journal | 24.90 | 20.10 | NaN |
| 408 | jour.1113716 | Nature Medicine | Nature Publishing Company | journal | 24.20 | 12.20 | https://www.nature.com/nm/ |
| 402 | jour.1115214 | Nature Biotechnology | Nature America Publishing | journal | 20.10 | 9.58 | https://www.nature.com/nbt/ |
| 281 | jour.1018957 | Nature | Nature Publishing Group | journal | 17.90 | 11.30 | https://www.nature.com/nature/ |
| 102 | jour.1103138 | Nature Genetics | Nature Pub. Co. | journal | 16.50 | 7.51 | https://www.nature.com/ng/ |
| 2219 | jour.1077219 | The Lancet | Elsevier | journal | 15.70 | 33.80 | NaN |
| 32 | jour.1346339 | Science | American Association for the Advancement of Sc... | journal | 14.60 | 9.12 | NaN |
| 409 | jour.1112054 | Immunity | Cell Press | journal | 14.10 | 6.07 | NaN |
| 625 | jour.1381482 | The Lancet Microbe | Elsevier Ltd. | journal | 13.30 | 6.50 | NaN |
| 74 | jour.1154037 | Foundations and Trends® in Machine Learning | Now Publishers | journal | 13.20 | 21.10 | NaN |
| 548 | jour.1030471 | Annual Review of Astronomy and Astrophysics | Annual Reviews | journal | 12.80 | 7.17 | NaN |
| 1146 | jour.1033763 | Nature Methods | Nature Pub. Group | journal | 12.10 | 8.13 | https://www.nature.com/nmeth |
| 1608 | jour.1118362 | Nature Neuroscience | Nature Publishing Group | journal | 12.00 | 5.23 | https://www.nature.com/neuro/ |
| 677 | jour.1284335 | Science Immunology | American Association for the Advancement of Sc... | journal | 12.00 | 4.21 | NaN |
| 676 | jour.1284483 | The Lancet Public Health | Elsevier, Ltd. | journal | 11.40 | 11.90 | NaN |
| 2124 | jour.1327517 | The Lancet Respiratory Medicine | Elsevier | journal | 11.10 | 14.70 | NaN |
| 550 | jour.1030033 | The Lancet Infectious Diseases | Elsevier Science ;The Lancet Pub. Group | journal | 10.20 | 10.70 | NaN |
| 549 | jour.1030059 | Cancer Cell | Cell Press | journal | 9.97 | 5.38 | NaN |
| 92 | jour.1117828 | Molecular Cell | Cell Press | journal | 9.96 | 3.18 | NaN |
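
A ranking like the one above, and a check of whether a publication appeared in a high-ranking journal, could be produced roughly as follows. This is a sketch on made-up data, not the code used to generate the table: the journal IDs, titles and SJR values are placeholders, the `id` column of `publications_df` is assumed, and `SJR_THRESHOLD` is an arbitrary cut-off chosen for illustration.

```python
import pandas as pd

# Placeholder source title data (not real journals or indicator values)
source_titles_df = pd.DataFrame({
    "journal.id": ["jour.A", "jour.B", "jour.C"],
    "title": ["Journal A", "Journal B", "Journal C"],
    "sjr": [2.5, None, 7.1],
})

# Placeholder publications, keyed by the same "journal.id" column
publications_df = pd.DataFrame({
    "id": ["pub.1", "pub.2", "pub.3", "pub.4"],
    "journal.id": ["jour.C", "jour.A", "jour.B", None],
})

# Rank journals by SJR, ignoring those without a value, and keep the top 20
top_journals = (
    source_titles_df
    .dropna(subset=["sjr"])
    .sort_values("sjr", ascending=False)
    .head(20)
)

# Attach journal indicators to each publication; a left join keeps
# publications whose journal is missing or has no indicator
enriched = publications_df.merge(
    source_titles_df[["journal.id", "title", "sjr"]],
    on="journal.id",
    how="left",
)

SJR_THRESHOLD = 5.0  # arbitrary cut-off, for illustration only
enriched["high_sjr"] = enriched["sjr"] > SJR_THRESHOLD

print(top_journals)
print(enriched)
```

Comparisons against a missing SJR evaluate to False, so publications without an indicator are never flagged as high-ranking; whether that is the desired behaviour depends on the analysis.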