ROUGE score does not match Pythonrouge when stemming=True #20

Hi @icoxfog417. Thank you for providing a great tool!
I found that `Pythonrouge` and `RougeCalculator` give different results when `stemming=True`.
Here is the test code to reproduce it (only the `stemming` option is changed):

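```python
import json
import os
import unittest

# Import paths below are assumed from each package's documented usage.
from pythonrouge.pythonrouge import Pythonrouge
from rougescore import rouge_n
from sumeval.metrics.rouge import RougeCalculator


class TestRouge(unittest.TestCase):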
    DATA_DIR = os.path.join(os.path.dirname(__file__), "data/rouge")

    def load_test_data(self):
        test_file = os.path.join(self.DATA_DIR, "ROUGE-test.json")
        with open(test_file, encoding="utf-8") as f:
            data = json.load(f)
        return data

    def test_rouge_with_stemming(self):
        data = self.load_test_data()
        rouge = RougeCalculator(stopwords=False, stemming=True)
        for eval_id in data:
            summaries = data[eval_id]["summaries"]
            references = data[eval_id]["references"]
            for n in [1, 2]:
                for s in summaries:
                    baseline = Pythonrouge(
                                summary_file_exist=False,
                                summary=[[s]],
                                reference=[[[r] for r in references]],
                                n_gram=n, recall_only=False,
                                length_limit=False,
                                stemming=True, stopwords=False)
                    b1_v = baseline.calc_score()
                    b2_v = rouge_n(rouge.tokenize(s),
                                   [rouge.tokenize(r) for r in references],
                                   n, 0.5)
                    v = rouge.rouge_n(s, references, n)
                    self.assertLess(abs(b2_v - v), 1e-5)
                    self.assertLess(abs(b1_v["ROUGE-{}-F".format(n)] - v), 1e-5) # noqa
```

Is this difference expected?
If so, is there a way to make the results match?
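
For illustration only, here is a minimal, self-contained sketch of how two different stemmers can change a ROUGE-N F1 score for the same pair of texts. It is an assumption, not a confirmed diagnosis, that the gap reported here comes from a stemmer difference. `rouge_n_f1`, `toy_stem_a`, and `toy_stem_b` are hypothetical helpers written for this example; they are not part of `sumeval` or `Pythonrouge`.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(summary_tokens, reference_tokens, n):
    """Single-reference ROUGE-N F1 using clipped n-gram counts (hypothetical helper)."""
    summary = ngrams(summary_tokens, n)
    reference = ngrams(reference_tokens, n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((summary & reference).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(summary.values())
    recall = overlap / sum(reference.values())
    return 2 * precision * recall / (precision + recall)


def toy_stem_a(token):
    """Hypothetical stemmer A: strips a trailing 's' only."""
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def toy_stem_b(token):
    """Hypothetical stemmer B: strips 'ing' as well as a trailing 's'."""
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    return toy_stem_a(token)


summary = "the cats are jumping".split()
reference = "the cat will jump".split()

# Same texts, same metric, different stemming rules.
score_a = rouge_n_f1([toy_stem_a(t) for t in summary],
                     [toy_stem_a(t) for t in reference], 1)
score_b = rouge_n_f1([toy_stem_b(t) for t in summary],
                     [toy_stem_b(t) for t in reference], 1)

print(score_a, score_b)  # prints: 0.5 0.75
```

With stemmer A only `the` and `cat` overlap, while stemmer B also maps `jumping` to `jump`, so the unigram F1 moves from 0.5 to 0.75. Comparing the stemmed tokens from both tools for a single failing summary would be one way to check whether something similar is happening here.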
