I noticed substantial differences in results between textdescriptives and textstat.


Different readability scores between textdescriptives and textstat

rnckp commented on June 9, 2024

Different readability scores between textdescriptives and textstat

from textdescriptives.

Comments (3)

rnckp commented on June 9, 2024

@HLasse @KennethEnevoldsen

Thank you very much for your comprehensive and thoughtful explanations and comments! This indeed answers my question. This is very helpful, I appreciate that.

All the best!


KennethEnevoldsen commented on June 9, 2024

Removing the bug label as it isn't clear that it is a bug.

Thanks for posting this issue. Our formulas for the calculations are available here, and the implementation generally follows that of the package spacy-readability. However, do note that all of these metrics rely on estimated properties. For instance, determining the average sentence length for flesch_reading_ease requires detecting sentence boundaries (using a different underlying model in textdescriptives will yield different results in edge cases, but generally it is fairly robust). This also seems to be supported by the fact that the mean sentence length is not a perfect match. The difference, however, seems too big... (in cases of disagreement, it seems like spaCy is typically better than textstat)

Actually, it seems like textstat uses a different constant for Flesch reading ease than our implementation (probably the main cause). The source is unclear, but from googling it seems to stem from a German thesis. So we assume a language-agnostic constant (which is fitted on English). It might be better to use a language-specific constant instead. Hmm, there might also be a better default constant than the English one (@HLasse, @LudvigOlsen, what are your thoughts?). There also seems to be a difference in the syllable threshold for what constitutes hard words: textstat seems to use a default of 2 for German, whereas we use 3 (the same as for English), which is probably too low for German. It should probably be higher because of compound words.
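To make the effect of the constants concrete, here is a minimal sketch of Flesch reading ease with swappable constants. The English values are the standard ones; the German values are the adaptation usually attributed to Amstad, and it is an assumption here that these are the ones textstat applies for German. The word, sentence, and syllable counts are made up for illustration, and `FleschConstants` and `flesch_reading_ease` are hypothetical names, not part of either library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleschConstants:
    base: float
    sentence_length_weight: float
    syllables_per_word_weight: float


# Standard English constants for Flesch reading ease.
ENGLISH = FleschConstants(206.835, 1.015, 84.6)
# German adaptation usually attributed to Amstad.
# Assumption: these match what textstat uses for German.
GERMAN = FleschConstants(180.0, 1.0, 58.5)


def flesch_reading_ease(n_words, n_sentences, n_syllables, constants):
    """Flesch reading ease computed from raw counts and a set of constants."""
    avg_sentence_length = n_words / n_sentences
    avg_syllables_per_word = n_syllables / n_words
    return (
        constants.base
        - constants.sentence_length_weight * avg_sentence_length
        - constants.syllables_per_word_weight * avg_syllables_per_word
    )


# Hypothetical counts for one text: identical input, different constants.
counts = dict(n_words=100, n_sentences=5, n_syllables=180)
english_score = flesch_reading_ease(**counts, constants=ENGLISH)
german_score = flesch_reading_ease(**counts, constants=GERMAN)
print(f"English constants: {english_score:.1f}")
print(f"German constants:  {german_score:.1f}")
```

With identical counts, the two sets of constants give scores that are roughly 20 points apart in this example, which is far more than one would expect from tokenization differences alone.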


HLasse commented on June 9, 2024

Actually, IIRC, we based our implementation off of textstat so the values should be fairly similar (except for the differences in sentence detection etc., that you bring up, Kenneth).

+1 for Kenneth's comment: the minor deviations are most likely due to different sentence boundary detection and tokenization methods and are fairly negligible and tend to even out with longer texts.
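As a toy illustration of how sentence boundary detection changes the average sentence length, the sketch below compares a naive punctuation-based splitter with one that knows a single abbreviation. Neither is the splitter used by spaCy, textdescriptives, or textstat; both are hypothetical stand-ins, and the abbreviation list is an assumption made for this example.

```python
import re

text = "Dr. Smith arrived late. He left early."
n_words = len(text.split())

# Naive splitter: any '.', '!' or '?' followed by whitespace ends a sentence.
naive_sentences = re.split(r"(?<=[.!?])\s+", text)

# Slightly more careful splitter: merge a piece back into the previous one
# if the previous piece ends with a known abbreviation (hypothetical list).
ABBREVIATIONS = ("Dr.",)
careful_sentences = []
for piece in naive_sentences:
    if careful_sentences and careful_sentences[-1].endswith(ABBREVIATIONS):
        careful_sentences[-1] += " " + piece
    else:
        careful_sentences.append(piece)

naive_avg = n_words / len(naive_sentences)
careful_avg = n_words / len(careful_sentences)
print(len(naive_sentences), round(naive_avg, 2))
print(len(careful_sentences), round(careful_avg, 2))
```

On a two-sentence text a single wrong split shifts the average noticeably, but in longer texts such edge cases make up a small share of all boundaries, which is consistent with the deviations evening out.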

Re. flesch reading ease, the implementations and constants are identical between textdescriptives and textstat for the English language. My initial suspicion was that we use different modules for hyphenation (counting syllables), but both use pyphen. So, there is likely to be a bug or at least an implementation difference in the way syllables are counted between the two libraries (textstat implementation, textdescriptives implementation). We should definitely look into this. Update: hyphenation works the same across textstat and textdescriptives. The difference is because of the different constants for the German language that textstat uses.

Re. gunning fog: This boils down to what Kenneth says: textstat uses a different threshold for hard words for German (2 syllables) than for English (3 syllables), where we use the threshold of 3 syllables regardless of language.
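The sensitivity to the hard-word threshold can be sketched with the standard Gunning fog formula, 0.4 × (average sentence length + percentage of hard words). The function below is a hypothetical helper, not the code of either library. It assumes a word is "hard" when it has at least `hard_word_threshold` syllables, and the syllable counts are invented for illustration.

```python
def gunning_fog(syllables_per_word, n_sentences, hard_word_threshold=3):
    """Gunning fog index from per-word syllable counts.

    Assumption: a word is "hard" if it has at least
    `hard_word_threshold` syllables.
    """
    n_words = len(syllables_per_word)
    n_hard = sum(1 for s in syllables_per_word if s >= hard_word_threshold)
    avg_sentence_length = n_words / n_sentences
    percent_hard_words = 100 * n_hard / n_words
    return 0.4 * (avg_sentence_length + percent_hard_words)


# Hypothetical syllable counts for a ten-word, two-sentence text.
syllables = [1, 2, 1, 3, 2, 1, 4, 2, 1, 1]
fog_threshold_3 = gunning_fog(syllables, n_sentences=2, hard_word_threshold=3)
fog_threshold_2 = gunning_fog(syllables, n_sentences=2, hard_word_threshold=2)
print(fog_threshold_3, fog_threshold_2)
```

Lowering the threshold from 3 to 2 more than doubles the score here, because every two-syllable word now counts as hard.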

Re. your last point on the reliability of metrics for languages besides English: We have sought to implement metrics that are broadly reliable across all languages. What we mean by this is that the rank-ordering of texts in terms of any of the metrics will correctly order them by reading ease. However, some metrics (e.g., Flesch-Kincaid Grade) have constants that have been derived through modelling English text. The grade level will therefore likely only be reliable for English, but the metric is still useful for other languages for ranking the difficulty of texts (and is likely not that far off).
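A small sketch of using the metric for ranking rather than as an absolute grade: the function below applies the standard Flesch-Kincaid Grade formula (with its English-derived constants) to three hypothetical texts and sorts them from easiest to hardest. The text names and counts are invented, and `flesch_kincaid_grade` is an illustrative helper rather than either library's API.

```python
def flesch_kincaid_grade(n_words, n_sentences, n_syllables):
    """Flesch-Kincaid Grade with the standard English-derived constants."""
    avg_sentence_length = n_words / n_sentences
    avg_syllables_per_word = n_syllables / n_words
    return 0.39 * avg_sentence_length + 11.8 * avg_syllables_per_word - 15.59


# Hypothetical counts for three texts of equal length.
texts = {
    "children_story": dict(n_words=120, n_sentences=15, n_syllables=150),
    "news_article": dict(n_words=120, n_sentences=8, n_syllables=190),
    "legal_text": dict(n_words=120, n_sentences=4, n_syllables=230),
}

grades = {name: flesch_kincaid_grade(**counts) for name, counts in texts.items()}
# Sort from lowest (easiest) to highest (hardest) grade.
ranking = sorted(grades, key=grades.get)
print(ranking)
```

Even if the absolute grade values are not calibrated for a given language, the resulting order can still be used to compare texts within that language.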

Hope this answers your question, @rnckp!

