Imports and Login
import dimcli
import pandas as pd
from clj import mapcat
from funcy import chunks
import json
dimcli.login()
dsl = dimcli.Dsl()
We can get more information about journals using the Source Titles source from Dimensions:
The Source Titles source is a curated database of publication ‘containers’, such as journals, preprint servers and book series. It is analogous to the ‘source titles’ facet available in the Dimensions web application.
Of all available fields in this source, two journal indicators are likely to be of high interest:
SJR indicator (SCImago Journal Rank). This indicator measures both the number of citations received by a journal and the importance or prestige of the journals where the citations come from.
SNIP indicator (Source Normalized Impact per Paper). This indicator measures the average citation impact of the publications of a journal.
We would like to get additional journal information for both the preprints and the resulting publications.
# INPUT_* and OUTPUT_* are file paths that are not defined in this section.
dfs_meta = {
    "preprints": {
        "name": "preprints",
        "input_file": INPUT_PREPRINTS,
        "output_file": OUTPUT_PREPRINTS,
        "df": pd.read_csv(INPUT_PREPRINTS)
    },
    "resulting_publications": {
        "name": "resulting publications",
        "input_file": INPUT_RESULTING_PUBLICATIONS,
        "output_file": OUTPUT_RESULTING_PUBLICATIONS,
        "df": pd.read_csv(INPUT_RESULTING_PUBLICATIONS)
    }
}
CHUNK_SIZE = 300
for k, v in dfs_meta.items():
    df = v["df"]
    # Collect the distinct journal IDs, ignoring rows without a journal
    unique_journal_ids = list(df["journal.id"].dropna().unique())
    print(v["name"], len(unique_journal_ids))
    data = []
    # Query the API in batches of CHUNK_SIZE IDs
    for ids in chunks(CHUNK_SIZE, unique_journal_ids):
        q = "search source_titles where id in {} return source_titles [id + title + publisher + type + linkout + sjr + snip]" + f" limit {CHUNK_SIZE*2}"
        results_df = dsl.query(q.format(json.dumps(ids))).as_dataframe()
        data.append(results_df)
    # Combine all batches and align the ID column name with the input files
    source_titles_df = pd.concat(data)
    source_titles_df = source_titles_df.rename(columns={"id": "journal.id"})
    source_titles_df.to_csv(v["output_file"], index=False)
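To make the batching step easier to inspect without calling the API, the following is a minimal sketch of how the query strings are assembled. The helper names `chunked` and `build_query` are hypothetical and not part of dimcli or funcy; `chunked` is a plain-Python stand-in for `funcy.chunks`, and the tiny chunk size is chosen only for illustration.

```python
import json

# Same field list as in the query used above
FIELDS = "id + title + publisher + type + linkout + sjr + snip"


def chunked(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements.

    Hypothetical stand-in for funcy.chunks, written with the standard library.
    """
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_query(ids, limit):
    """Return the DSL query string for one batch of journal IDs (hypothetical helper)."""
    # json.dumps renders the Python list as a double-quoted, bracketed list
    return (
        f"search source_titles where id in {json.dumps(ids)} "
        f"return source_titles [{FIELDS}] limit {limit}"
    )


CHUNK_SIZE = 2  # illustrative only; the notebook uses 300
journal_ids = ["jour.1019114", "jour.1014075", "jour.1113716"]

queries = [
    build_query(batch, limit=CHUNK_SIZE * 2)
    for batch in chunked(journal_ids, CHUNK_SIZE)
]
for q in queries:
    print(q)
```

Printing the queries before sending them with `dsl.query` is a cheap way to check that the ID list is serialized as expected.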
The journal indicators provided can be used to create rankings and evaluate whether, for example, a preprint has been published in a high-impact journal.
index | journal.id | title | publisher | type | sjr | snip | linkout |
---|---|---|---|---|---|---|---|
279 | jour.1019114 | Cell | Cell Press | journal | 25.70 | 9.44 | NaN |
1191 | jour.1014075 | New England Journal of Medicine | New England Journal of Medicine | journal | 24.90 | 20.10 | NaN |
408 | jour.1113716 | Nature Medicine | Nature Publishing Company | journal | 24.20 | 12.20 | https://www.nature.com/nm/ |
402 | jour.1115214 | Nature Biotechnology | Nature America Publishing | journal | 20.10 | 9.58 | https://www.nature.com/nbt/ |
281 | jour.1018957 | Nature | Nature Publishing Group | journal | 17.90 | 11.30 | https://www.nature.com/nature/ |
102 | jour.1103138 | Nature Genetics | Nature Pub. Co. | journal | 16.50 | 7.51 | https://www.nature.com/ng/ |
2219 | jour.1077219 | The Lancet | Elsevier | journal | 15.70 | 33.80 | NaN |
32 | jour.1346339 | Science | American Association for the Advancement of Sc... | journal | 14.60 | 9.12 | NaN |
409 | jour.1112054 | Immunity | Cell Press | journal | 14.10 | 6.07 | NaN |
625 | jour.1381482 | The Lancet Microbe | Elsevier Ltd. | journal | 13.30 | 6.50 | NaN |
74 | jour.1154037 | Foundations and Trends® in Machine Learning | Now Publishers | journal | 13.20 | 21.10 | NaN |
548 | jour.1030471 | Annual Review of Astronomy and Astrophysics | Annual Reviews | journal | 12.80 | 7.17 | NaN |
1146 | jour.1033763 | Nature Methods | Nature Pub. Group | journal | 12.10 | 8.13 | https://www.nature.com/nmeth |
1608 | jour.1118362 | Nature Neuroscience | Nature Publishing Group | journal | 12.00 | 5.23 | https://www.nature.com/neuro/ |
677 | jour.1284335 | Science Immunology | American Association for the Advancement of Sc... | journal | 12.00 | 4.21 | NaN |
676 | jour.1284483 | The Lancet Public Health | Elsevier, Ltd. | journal | 11.40 | 11.90 | NaN |
2124 | jour.1327517 | The Lancet Respiratory Medicine | Elsevier | journal | 11.10 | 14.70 | NaN |
550 | jour.1030033 | The Lancet Infectious Diseases | Elsevier Science ;The Lancet Pub. Group | journal | 10.20 | 10.70 | NaN |
549 | jour.1030059 | Cancer Cell | Cell Press | journal | 9.97 | 5.38 | NaN |
92 | jour.1117828 | Molecular Cell | Cell Press | journal | 9.96 | 3.18 | NaN |
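A ranking like the one above, and the check for publication in a high-impact journal, can be sketched with pandas as follows. This is only an illustration under stated assumptions: the three journal rows are copied from the table, while the publication IDs in `pubs_df`, the `high_impact` column and the `SJR_THRESHOLD` cut-off are hypothetical and not taken from the original notebook.

```python
import pandas as pd

# Three rows copied from the table above
source_titles_df = pd.DataFrame({
    "journal.id": ["jour.1117828", "jour.1019114", "jour.1077219"],
    "title": ["Molecular Cell", "Cell", "The Lancet"],
    "sjr": [9.96, 25.70, 15.70],
    "snip": [3.18, 9.44, 33.80],
})

# Hypothetical publications; the last one has no journal assigned
pubs_df = pd.DataFrame({
    "id": ["pub.A", "pub.B", "pub.C"],
    "journal.id": ["jour.1117828", "jour.1019114", None],
})

SJR_THRESHOLD = 10.0  # assumption: an arbitrary cut-off for "high impact"

# Rank journals by SJR, highest first
ranking = source_titles_df.sort_values("sjr", ascending=False).reset_index(drop=True)

# Attach journal indicators to each publication; a left join keeps
# publications whose journal is missing or unmatched
enriched = pubs_df.merge(source_titles_df, on="journal.id", how="left")

# Comparisons against NaN evaluate to False, so unmatched rows are not flagged
enriched["high_impact"] = enriched["sjr"] >= SJR_THRESHOLD

print(ranking[["title", "sjr", "snip"]])
print(enriched[["id", "title", "sjr", "high_impact"]])
```

Sorting on `snip` instead of `sjr` gives a different order (The Lancet would come first), so the choice of indicator and threshold should be made deliberately.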