norden.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
Moin! This is the Mastodon instance for northerners ("Nordlichter"), chatterboxes ("Schnacker"), and everything in between. Follow the lighthouse.


#ModelEvaluation

Replied in thread

@data @datadon 🧵

How to assess a statistical model?
How to choose between variables?

Pearson's #correlation is of little use if you suspect that the relationship is not linear, because it only measures the strength of a straight-line association.

If the relationship is monotonic:
"#Spearman’s rho is particularly useful for small samples where weak correlations are expected, as it can detect subtle monotonic trends." It is "widespread across disciplines where the measurement precision is not guaranteed".
"#Kendall’s Tau-b is less affected [than Spearman’s rho] by outliers in the data, making it a robust option for datasets with extreme values."
Ref: statisticseasily.com/kendall-t

LEARN STATISTICS EASILY · Kendall Tau-b vs Spearman: Which Correlation Coefficient Wins?
Discover why Kendall Tau-b vs Spearman Correlation is crucial for your data analysis and which coefficient offers the most reliable results.
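
To make the comparison in the post above concrete, here is a minimal sketch in plain Python (standard library only) of the three coefficients it mentions. The function names and the toy data are assumptions made for illustration; they do not come from the thread or the linked article. Tau-b is computed with the usual tie-corrected formula, and Spearman's rho as Pearson's r applied to average ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r: strength of the linear association between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined (division by zero) if x or y is constant


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b: pairwise concordance, corrected for ties."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: counted nowhere
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / sqrt((pairs + ties_x) * (pairs + ties_y))


if __name__ == "__main__":
    # Hypothetical toy data: a monotonic but strongly non-linear relationship.
    x = list(range(1, 9))
    y_cubic = [v ** 3 for v in x]
    # Same series with the last observation replaced by an extreme low value.
    y_outlier = y_cubic[:-1] + [0]

    for label, y in (("monotonic", y_cubic), ("with outlier", y_outlier)):
        print(label, pearson(x, y), spearman(x, y), kendall_tau_b(x, y))
```

On the monotonic series, the two rank-based coefficients should report a perfect association while Pearson's r falls short of 1. On the series with one extreme value, comparing the printed numbers gives a feel for how differently each coefficient reacts. A single toy dataset is not evidence for a general claim, so treat this only as an illustration of the quoted statements.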
Continued thread

4/

Below, key points:

- "lack of #ModelEvaluation"

- statistical #uncertainty & "#robustness of event attribution results"

#References

[4] Seneviratne et al., 2021. Chapter 11: Weather and climate extreme events in a changing climate. In: Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. IPCC, Geneva, Switzerland, pp. 1513–1766. purl.org/INRMM-MiD/z-ED8RQFV5

ipcc.ch/report/ar6/wg1/downloa

Replied in thread

Anyway, I keep meaning to write up a blog post on “falsehoods I have believed about measuring model performance” touching on #AppliedML issues related to #modelEvaluation, #metrics, #monitoring, #observability, and #experiments (#RCTs). The cool kids would call this #AIAlignment in their VC pitch decks, but even us #NormCore ML engineers have to wrestle with how to measure and optimize the real-world impact of our models.