Lemmy requires active users to manually search for communities and discover content. Instances can choose to defederate from other instances, but I want instances that block as few other users as possible so I can decide for myself what content I see.

I want to add a column to this script, which analyzes Lemmy instances, to identify instances that have high user activity but block few users.

Initially I was thinking of adding a column that calculates the ratio of:

(active users) / (total blocked users)

However, this runs into a divide by zero error if there are no blocked users.

I’ve thought of a few ways to handle the ZeroDivisionError case, but there could be a better metric entirely that avoids this issue or gives a good measure of high activity + low blocking.

Does anyone have ideas for a better metric or ratio to use here?

Some context on what the data looks like:

  • “active users” = number of active users in the past month
  • “total blocked users” = sum of active users from all instances blocking or being blocked by this instance

Let me know if you have any suggestions! I’m open to different formulas or metrics beyond a simple ratio.

Appreciate any help!

  • odium@programming.dev
    8 months ago

    Just add 1 to the denominator.

    Simple is best.

    The max(1,total_blocked) method will make instances with 1 blocked and 0 blocked appear to be equal.
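To make the difference concrete, here is a minimal Python sketch comparing the two denominators. The function names and the sample numbers are made up for illustration and are not taken from the script.

```python
def ratio_plus_one(active_users: int, total_blocked: int) -> float:
    """Activity ratio with 1 added to the denominator."""
    return active_users / (total_blocked + 1)


def ratio_max_one(active_users: int, total_blocked: int) -> float:
    """Activity ratio with the denominator clamped to at least 1."""
    return active_users / max(1, total_blocked)


# Hypothetical instance with 100 active users and 0, 1, or 2 blocked users.
for blocked in (0, 1, 2):
    print(blocked, ratio_plus_one(100, blocked), ratio_max_one(100, blocked))
```

With the +1 version, 0 blocked and 1 blocked get different scores. With max(), both divide by 1 and tie.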

    • dumples@kbin.social
      8 months ago

      Also note that if you don’t want to significantly change the proportions, you can add 1 to both the top and the bottom. That removes the divide-by-zero error and won’t significantly alter the ratios. It’s often used in data science to avoid this problem.
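A rough Python sketch of the add-1-to-both-sides idea, which is similar in spirit to add-one (Laplace) smoothing. The function name and the example values are hypothetical, not from the script.

```python
def smoothed_ratio(active_users: int, total_blocked: int) -> float:
    """(active + 1) / (blocked + 1): defined even when nothing is blocked."""
    return (active_users + 1) / (total_blocked + 1)


# Hypothetical instances: (active users, total blocked users)
samples = [(100, 0), (100, 1), (5000, 250), (0, 0)]
for active, blocked in samples:
    print(active, blocked, smoothed_ratio(active, blocked))
```

For large counts the result stays close to the plain ratio. For small counts the shift is more noticeable, so it is worth eyeballing a few small instances before trusting the ranking.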