Open Scholarly Data @ SUB Göttingen
The Scholarly Communication Analytics team at the State and University Library in Göttingen maintains a publicly accessible Open Scholarly Data Warehouse, which is based on Google BigQuery.
The warehouse features monthly Crossref snapshots, as well as data from various other sources, including OpenAlex, Semantic Scholar and Unpaywall, and provides access to bibliometric data from the German Competence Network for Bibliometrics.
Google BigQuery is provided as part of the OCRE 2024 Framework, with support from the GWDG.
More info: https://subugoe.github.io/scholcomm_analytics/
Contact: Najko Jahn
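Because the warehouse is hosted on Google BigQuery, the datasets listed on this page can be explored with standard BigQuery tooling. The following is a minimal sketch, not an official access recipe: it builds a query against BigQuery's `INFORMATION_SCHEMA.TABLES` view to list the tables in one dataset. The warehouse's project ID is not stated on this page, so `project_id` is a placeholder to fill in, and the helper names `tables_query` and `list_tables` are hypothetical. Running the query assumes the `google-cloud-bigquery` package is installed and that you have your own Google Cloud credentials and read access.

```python
# Dataset names as they appear in the listing on this page.
DATASETS = (
    "cr_history",
    "cr_instant",
    "hoaddata",
    "oa2020",
    "openalex_walden",
    "openbib",
    "semantic_scholar",
)


def tables_query(project_id: str, dataset: str) -> str:
    """Build a SQL string that lists the tables of one warehouse dataset.

    `project_id` is an assumption: the page does not name the project,
    so the caller has to supply it.
    """
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return (
        "SELECT table_name "
        f"FROM `{project_id}`.{dataset}.INFORMATION_SCHEMA.TABLES "
        "ORDER BY table_name"
    )


def list_tables(project_id: str, dataset: str) -> list[str]:
    """Run the query and return the table names (needs credentials)."""
    # Imported lazily so tables_query() works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default credentials and project
    rows = client.query(tables_query(project_id, dataset)).result()
    return [row["table_name"] for row in rows]


if __name__ == "__main__":
    # "example-project" is a placeholder, not the real warehouse project.
    print(tables_query("example-project", "cr_instant"))
```

Calling `list_tables("<warehouse project ID>", "openbib")` would then return the table names in that dataset, provided your account has read access. Separating query construction from execution keeps the first function usable for inspection without any cloud setup.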
cr_history
cr_instant
hoaddata
oa2020
Estimating global publishing output by leading commercial publishers using open metadata.
Work carried out for the OA2020 Working Group on financial flows and future cost scenarios: https://oa2020.org/working-groups/openalex
openalex_walden
openbib
This dataset contains the most recent OPENBIB snapshot.
For more information, see: https://zenodo.org/records/18429476
semantic_scholar
This dataset contains a snapshot from Semantic Scholar.
–
Data Source & License
This dataset contains information from the Semantic Scholar Open Research Corpus, provided by the Allen Institute for Artificial Intelligence (AI2) and made available under the Open Data Commons Attribution License (ODC-By) v1.0.
The ODC-By license governs the database rights only — that is, the structure, organisation, and compilation of the data. It does not cover the rights in the individual contents of the database, such as paper titles, abstracts, or full texts, which may be subject to separate copyright or license terms held by their respective authors, publishers, or other rights holders. Users are responsible for ensuring their use of any such content complies with the applicable terms.
For scientific publications making use of this data, please also cite:
Kyle Lo, Lucy Lu Wang, Mark Neumann, Rodney Kinney, and Daniel Weld. 2020. S2ORC: The Semantic Scholar Open Research Corpus. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4969–4983, Online. Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.acl-main.447