ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications

This repository contains data for our paper "ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications" and a small utility class to work with it.

HuggingFace datasets

You can also load ACLSum with Hugging Face datasets (dataset link). This is convenient if you want to train transformer models on our dataset.

To load it, run:

from datasets import load_dataset
dataset = load_dataset("sobamchan/aclsum")

Our utility class

If you want to inspect the data more closely, the example below shows how to use our utility class.

The library ships with the dataset and can be installed via pip:

pip install aclsum

Then you can load the dataset from your Python code:

from aclsum import ACLSum

# Load per split ("train", "val", "test")
train = ACLSum("train")

# One data sample (= paper)
document = train[0]

# Three summaries on each aspect (dict[aspect, summary])
document.summaries

# Get all the sentences from the paper (list[str]).
# We only work with the abstract, introduction, and conclusion sections.
document.get_all_sentences()

# You can specify sections to extract sentences from
document.get_all_sentences(["abstract", "conclusion"])

# Get highlight labels (list[0 or 1])
document.get_all_highlights()

# Get highlighted sentences (list[str])
document.get_all_highlighted_sentences()
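
The three calls above return sentences, 0/1 highlight labels, and highlighted sentences. The following is a minimal sketch of how these lists may relate to each other. It assumes the labels are aligned one-to-one with the sentences, which this README does not state explicitly. The helper names `pair_sentences_with_labels` and `select_highlighted` are hypothetical and not part of the `aclsum` package, and the sample data is a stand-in rather than real dataset content.

```python
from typing import List, Tuple


def pair_sentences_with_labels(
    sentences: List[str], labels: List[int]
) -> List[Tuple[str, int]]:
    """Pair each sentence with its 0/1 highlight label.

    Assumption: both lists have the same length and the same order.
    """
    if len(sentences) != len(labels):
        raise ValueError("sentences and labels must have the same length")
    return list(zip(sentences, labels))


def select_highlighted(sentences: List[str], labels: List[int]) -> List[str]:
    """Keep only the sentences whose label is 1."""
    pairs = pair_sentences_with_labels(sentences, labels)
    return [sentence for sentence, label in pairs if label == 1]


# Stand-in data (hypothetical, not taken from ACLSum).
sample_sentences = [
    "We study aspect-based summarization.",
    "Existing datasets lack expert annotations.",
    "Our experiments cover several baselines.",
]
sample_labels = [0, 1, 0]

highlighted = select_highlighted(sample_sentences, sample_labels)
print(highlighted)
```

With the real library you would pass `document.get_all_sentences()` and `document.get_all_highlights()` instead of the stand-in lists. If the alignment assumption holds, the result should correspond to `document.get_all_highlighted_sentences()`.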
