Joachim Breitner's Homepage
Generating bibtex bibliographies from DOIs via DBLP
I sometimes write papers, and part of paper writing is assembling the bibliography. In my case, this is done using BibTeX. So when I need to add another citation, I have to find suitable data in BibTeX format.
Often I copy snippets from .bib files from earlier papers.
Or I search for the paper on DBLP, which in my experience has the highest-quality BibTeX entries and the best coverage of computer-science-related publications, copy the entry to my .bib file, and change the key to whatever I want to refer to the paper by.
But in the days of pervasive use of DOIs (digital object identifiers) for almost all publications, manually keeping the data in BibTeX files seems outdated. Instead, I’d rather just write down the two pieces of data I care about: the key that I want to use for citation, and the DOI. The rest I do not want to be bothered with.
So I wrote a small script that takes a .yaml file like
entries:
  unsafePerformIO: 10.1007/10722298_3
  dejafu: 10.1145/2804302.2804306
  runST: 10.1145/3158152
  quickcheck: 10.1145/351240.351266
  optimiser: 10.1016/S0167-6423(97)00029-4
  sabry: 10.1017/s0956796897002943
  concurrent: 10.1145/237721.237794
  launchbury: 10.1145/158511.158618
  datafun: 10.1145/2951913.2951948
  observable-sharing: 10.1007/3-540-46674-6_7
  kildall-73: 10.1145/512927.512945
  kam-ullman-76: 10.1145/321921.321938
  spygame: 10.1145/3371101
  cocaml: 10.3233/FI-2017-1473
  secrets: 10.1017/S0956796802004331
  modular: 10.1017/S0956796817000016
  longley: 10.1145/317636.317775
  nievergelt: 10.1145/800152.804906
  runST2: 10.1145/3527326
  polakow: 10.1145/2804302.2804309
  lvars: 10.1145/2502323.2502326
  typesafe-sharing: 10.1145/1596638.1596653
  pure-functional: 10.1007/978-3-642-14162-1_17
  clairvoyant: 10.1145/3341718
subs:
  - replace: Peyton Jones
    with: '{Peyton Jones}'
and turns it into a nice .bib file:
$ ./doi2bib.py < doibib.yaml > dblp.bib
$ head dblp.bib
@inproceedings{unsafePerformIO,
  author = {Simon L. {Peyton Jones} and
            Simon Marlow and
            Conal Elliott},
  editor = {Pieter W. M. Koopman and
            Chris Clack},
  title = {Stretching the Storage Manager: Weak Pointers and Stable Names in
           Haskell},
  booktitle = {Implementation of Functional Languages, 11th International Workshop,
               IFL'99, Lochem, The Netherlands, September 7-10, 1999, Selected Papers},
The subs section at the end of the .yaml file allows me to do some fine-tuning of the output, because unfortunately not even DBLP’s BibTeX entries are perfect, for example when an author has a two-part family name.
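To make the effect of such a substitution concrete, here is a minimal sketch that applies one replace/with pair to a made-up author field. The helper name apply_subs and the sample string are only for illustration and are not part of the script. Note that the replace value is handed to re.sub, so it is treated as a regular expression, not as a plain string.

```python
import re

def apply_subs(bib, subs):
    """Apply each replace/with pair in order (hypothetical helper)."""
    for sub in subs:
        # 'replace' is interpreted as a regular expression by re.sub
        bib = re.sub(sub['replace'], sub['with'], bib)
    return bib

# Same shape as the subs section of the .yaml file
subs = [{'replace': 'Peyton Jones', 'with': '{Peyton Jones}'}]

# Made-up input line, without the protective braces
raw = "author = {Simon L. Peyton Jones and Simon Marlow},"
fixed = apply_subs(raw, subs)
print(fixed)
```

The extra braces tell BibTeX to treat "Peyton Jones" as a single family name instead of taking "Peyton" as a middle name.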
Now I have fewer moving parts to worry about, and a more consistent bibliography.
The script is rather small, so I’ll just share it here:
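#!/usr/bin/env python3
import sys
import yaml
import requests
import requests_cache
import re

# Cache HTTP responses in a local SQLite file, so repeated runs do not query DBLP again
requests_cache.install_cache(backend='sqlite')

data = yaml.safe_load(sys.stdin)

for key, doi in data['entries'].items():
    # Fetch the BibTeX entry that DBLP has for this DOI
    bib = requests.get(f"https://dblp.org/doi/{doi}.bib").text
    # Replace the DBLP-generated citation key with my own
    bib = re.sub('{DBLP.*,', '{' + key + ',', bib)
    # Apply the manual fix-ups from the subs section
    for subs in data['subs']:
        bib = re.sub(subs['replace'], subs['with'], bib)
    print(bib)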
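The script above does not check whether DBLP actually knows a given DOI. As a hedged sketch of how that could be added, the following variant separates the key rewriting from the fetching and skips entries whose request fails. The names rekey, fetch_entry and FakeResponse, the sample entry and its DBLP key are all made up for illustration, and I am assuming that DBLP answers with a non-200 status for a DOI it cannot resolve. The get parameter stands in for requests.get, so the example runs without network access.

```python
import re
import sys

def rekey(bib, key):
    """Replace the DBLP-generated key of the first entry with `key`."""
    # count=1 limits the change to the entry header; the lambda keeps
    # backslashes in `key` from being read as escape sequences
    return re.sub(r'\{DBLP[^,\n]*,', lambda m: '{' + key + ',', bib, count=1)

def fetch_entry(doi, key, get):
    """Fetch one entry via `get` (e.g. requests.get); None on failure."""
    resp = get(f"https://dblp.org/doi/{doi}.bib")
    if resp.status_code != 200:
        # Assumption: an unknown DOI yields a non-200 status
        print(f"warning: no DBLP entry for {key} ({doi})", file=sys.stderr)
        return None
    return rekey(resp.text, key)

class FakeResponse:
    """Stand-in for a requests response, for offline demonstration."""
    def __init__(self, status_code, text=""):
        self.status_code = status_code
        self.text = text

# Made-up entry in the general shape of a DBLP record
sample = "@article{DBLP:journals/example/Doe20,\n  author = {Jane Doe},\n}\n"

found = fetch_entry("10.0000/known", "doe", lambda url: FakeResponse(200, sample))
missing = fetch_entry("10.0000/unknown", "nobody", lambda url: FakeResponse(404))
print(found)
```

In real use, get would simply be requests.get, and the main loop would print only the entries that are not None.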
There are similar projects out there, e.g. dblpbibtex in C++ and dblpbib in Ruby. These allow direct use of \cite{DBLP:rec/conf/isit/BreitnerS20} in LaTeX, which is also nice, but for now I like to choose more descriptive citation keys myself.
Have something to say? You can post a comment by sending an email to me at <mail@joachim-breitner.de>, and I will include it here.