Content quality is suffering (as of late 2023)

Lately, I've come to understand why GPT models avoided digesting content from later than September 2021 [1]: the number of artificially padded articles and videos is staggering. There are probably also more fully AI-generated cases, but I want to focus on human-originated ones. For me, it is more than just a matter of wasting time on whole sentences or paragraphs that are devoid of content or that repeat the same point for the sixth time in different words. The form is also poorer, which makes the content harder to understand (and that fails one of the main purposes of writing: to communicate something!).

Let me mention two examples:

  • It feels like 95% of YouTube channel content nowadays is artificially extended to fit in extra ad slots. As a specific example, a channel that some friends like watching seldom publishes videos shorter than 30 minutes, and some run as long as 90 minutes. The "short" ones usually cover content that could be explained in less than 5 minutes. For the longer ones, one could remove at least a third of the length (interviews that get too pedantic, "in-depth reports" that go over each topic multiple times...). I ran a test on one specific video (~25 minutes), and they mentioned a single relevant term more than 20 times. This is why I experimented with building a YouTube summarizer using ChatGPT, and I use a fine-tuned variant whenever I check YouTube (which is rare).
  • When reading content in English, I give some leeway to texts that initially look wrong: I am not an expert in the language, so I might be misinterpreting them. But when I read articles in Spanish, I am way stricter, and quality is degrading a lot outside the main media sites: half a page of buzzwords; bold text overused on almost every sentence (probably because not even the author can find the relevant information otherwise); sentences often cut in half, which breaks the natural structure and forces you to do extra cognitive work (maybe to give the impression of a longer text?); and, at times, nonsense words that look suspiciously like other, more appropriate ones (an LLM mistake?). Worst of all, any reasonable human being who rereads the text a few times will notice that something is wrong.
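
The repeated-term test from the first example can be approximated in a few lines of Python. This is only a minimal sketch, not necessarily how the original test was done: it assumes the video transcript is already available as plain text, and the function names count_term and mentions_per_minute are made up for illustration.

```python
import re


def count_term(transcript: str, term: str) -> int:
    """Count whole-word, case-insensitive occurrences of `term`.

    Assumption: `transcript` is plain text obtained elsewhere
    (for example, exported subtitles); fetching it is out of scope.
    """
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, transcript, flags=re.IGNORECASE))


def mentions_per_minute(transcript: str, term: str, duration_minutes: float) -> float:
    """Normalize the count by video length to compare videos of different sizes."""
    return count_term(transcript, term) / duration_minutes


if __name__ == "__main__":
    # Hypothetical toy transcript, just to show the calls.
    sample = "The widget is great. Widgets? No, widget!"
    print(count_term(sample, "widget"))
    print(mentions_per_minute(sample, "widget", 25))
```

Matching on whole words keeps plurals and longer words from inflating the count; whether that is the right choice depends on the term being tracked.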

Everybody is free to write however they want. I have friends who run their texts through ChatGPT to make them sound more formal, and I've used it to generate more cheerful messages for Slack announcements, so I'm not against using these tools; quite the opposite [2]. But people are forgetting the basics: make your content readable and understandable first, and then, if you want, make it longer. Some books could be perfectly summarized in less than a single page, so padding is nothing new!

For example, I currently use Grammarly (premium) to proofread my blog posts and some emails. At work, I use LanguageTool, which serves the same function [3]. Both have browser plugins and offer spelling corrections plus suggestions about tone, false friends, partial or complete sentence rewordings for clarity...

Nothing beats careful human proofreading, but if you don't have the time or desire (I often don't, and simply click publish once the grammar check is OK), at least try using suitable tools to improve readability. It would also be nice to avoid "dark patterns" when writing online content; the tools I mentioned will probably flag most of them.

[1] Although I read somewhere that this might change soon. They have probably devised a sufficiently accurate way to detect and discard artificial content.

[2] The summaries on this blog's landing page are also made by GPT (but revised by me), and I use ChatGPT very often for both general questions and as a Swedish tutor/teacher. I sincerely think that LLMs are a technical revolution in many ways.

[3] I'm considering switching to LanguageTool's premium plan, as they don't use your data for ML training. But if you are okay with that, Grammarly is excellent.

Tags: Offtopic
