Subword embeddings trained on scientific texts

subword-vectors is a repository for downloading (or training) subword embeddings trained on the arXiv dataset of 1.7M+ scholarly papers.

A Manually Annotated Test Collection for Citation Recommendation

acm-cr is a repository containing a test collection for (context-aware) citation recommendation, built from bibliographic records and open-access papers collected from the ACM Digital Library.