vitoplantamura/HackerNewsRemovals: List of stories ... - GitHub
Introduction
We read every piece of feedback, and take your input very seriously. To see all available qualifiers, see our documentation.
Project Overview
List of stories removed from the Hacker News Front Page, updated in real-time.
UPDATE (February 4, 2024): This is the discussion about this project on HN: here. Please specifically read dang's comment regarding the core assumption of this project: here. On a personal note, the number of Stories removed yesterday (Saturday, February 3, 2024) was the lowest ever recorded by the service. This includes 2 duplicate Stories.
As a side note, in the list always check whether a Story is a duplicate or not: this is a very reasonable reason for removal and unfortunately I have no way of automatically determining it in the service!
Purpose of the Project
The purpose of this project is to try to understand the type and scale of the moderation of the Hacker News Front Page.
NOTE: I love Hacker News. I try to read it every day. In the case of OnnxStream (here for example), 95% of the comments were helpful and intelligent. I also understand that moderating a site with huge traffic and where users are basically anonymous must be a very difficult task.
Moderation Tools
Returning to the purpose of this project, from what I have been able to see, the "public" (i.e. observable from the outside) moderation of the Front Page consists of two main tools: modification of the title of a Story (voluntarily or involuntarily influencing its growth in terms of rank) or directly its removal.

Regarding the first type of moderation, an excellent site is already available that tracks changes to Story titles. Here instead I will focus on the second type.
Project Development
For the reasons explained in the "Why?" section below, I have developed a small application that logs all the Stories that are removed from the Front Page, for personal use.

I later discovered that there is no tool/website that provides this type of information and I decided to make it public here. It was a difficult decision but my rationale is: is it better to have more transparency or less transparency?
If you know of a tool/website similar to this, please let me know: I will archive this repo or set it to private.
Possible Outcomes
A possible very positive outcome for this project could be to have a list similar to this, but available directly among the HN lists. Or even to notify a user when a Story is penalized on the Front Page, perhaps indicating the number of flags and/or the reason, for example.
Case Studies
A friend of mine posted two Stories on Hacker News related to OnnxStream (31 days apart), the first related to SDXL Turbo support and the second related to TinyLlama and Mistral 7B support.
In the case of the first, the Story was among the first on the Front Page, until its title was changed from "Stable Diffusion Turbo on a Raspberry Pi Zero 2 generates an image in 29 minutes" to "OnnxStream: Stable Diffusion XL 1.0 Base on a Raspberry Pi Zero 2". This effectively "killed" the Story.
In the case of the second, the Story was in third place on the Front Page, less than an hour after the submission. In this case, it was simply removed from the Front Page.
Conclusion
Having discovered this, perplexed, I sent an email to the moderator. dang, who was very kind and quick in his response, explained to me that the Story had been flagged by users even without being explicitly [flagged], and that he could, therefore, only hypothesize the causes of the flag.
While I have no reason to doubt Daniel's good faith, it's hard to believe that HN users would be tired of LLM-related news.
So I decided to develop a small console application to determine the frequency of this phenomenon. I subsequently discovered that there were no tools/websites that monitored this specific phenomenon and I, therefore, decided to make it public here.
Monitoring Process
Using the official HN API, the service fetches 90 Top Stories every minute and makes a comparison with the first 30 Top Stories (i.e. the Front Page) fetched the previous minute. It logs all missing Stories here.
Title and URL are those from when the Story first appeared in the top 30. The number of points and comments and the rank are those from when the Story was removed from the Front Page.
NOTE: always check whether a Story is a duplicate or not: this is a very reasonable reason for removal and unfortunately I have no way of automatically determining it in the service!
List of stories removed from the Hacker News Front Page, updated in real-time.




















