Yeah, fair point. Yes, obviously taxes on businesses are borne by the consumer. Saying that automatically makes them an "own goal" though is... quite the take.
From my knowledge (albeit limited) of the way LLMs are set up, they most definitely can include guardrails on what can't be produced. ChatGPT has responses to some prompts that stop users from proceeding.
And X specifically: there have been many cases of X adjusting Grok where Grok was not following a particular narrative on political issues (won't get into specifics here). But it was very clear and visible. Grok had certain outputs. Outcry from certain segments. Grok posts deleted. Trying the same prompts gave a different result.
From my (admittedly also limited) understanding, there’s no bulletproof way to say “do NOT generate X” as it’s non-deterministic and you can’t reverse engineer and excise the CSAM-generating parts of a model. “AI jailbreak prompts” are a thing.
Well it’s certainly horrible that they’re not even trying, but not surprising (I deleted my X account a long time ago).
I’m just wondering if from a technical perspective it’s even possible to do it in a way that would 100% solve the problem, and not turn it into an arms race to find jailbreaks. To truly remove the capability from the model, or in its absence, have a perfect oracle judge the output and block it.
Again, I'm not the most technical, but I think we need to step back and look at this holistically. Given Grok's integration with X, there could be other methods of limiting the production and dissemination of CSAM.
For argument's sake, let's assume Grok can't reliably have guardrails in place to stop CSAM. There could be second- and third-order review points where, before an image is posted by Grok, another system could scan the image to verify whether it's CSAM or not, and if the confidence is low, then human intervention could come into play.
I think the end goal here is prevention of CSAM production and dissemination, not just guardrails in an LLM and calling it a day.
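To make the review-point idea concrete, here is a minimal sketch of the routing step only. It assumes a separate detection system (not shown) that returns a score between 0 and 1; the `Thresholds` values and the names `route` and `Decision` are made up for illustration, not anything X or Grok is known to use.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PUBLISH = "publish"
    HUMAN_REVIEW = "human_review"
    BLOCK = "block"


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical cutoffs; real values would need tuning against labeled data.
    block_at: float = 0.90
    clear_below: float = 0.05


def route(score: float, t: Thresholds = Thresholds()) -> Decision:
    """Route one generated image given a classifier score in [0, 1].

    `score` is assumed to come from a separate detection system;
    higher means more likely to be prohibited content.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if score >= t.block_at:
        return Decision.BLOCK
    if score < t.clear_below:
        return Decision.PUBLISH
    # Anything the classifier is unsure about goes to a person.
    return Decision.HUMAN_REVIEW
```

The point of the middle band is that the automated check never has to be a perfect oracle; it only has to be confident at the extremes and hand everything else to a human.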
Given how spectacular the failure of EVERY attempt to put guardrails on LLMs has been, across every single company selling LLM access, I'm not sure that's a reasonable belief.
The guardrails have mostly worked. They have never ever been reliable.
At some point in the past I read that serif fonts are better for readability, as the supports at the base of the letters form a line and help the eye stay “on track”. This is never mentioned in TFA, so I assume it’s an urban legend? Personally I much prefer serif fonts when reading longer texts.
This was definitely true in the days before hi-res screens and good anti-aliasing, simply because the serifs get lost or become noise in lower-resolution settings. It’s probably less true today.
Of course, in terms of accessibility, there are any number of reasons why someone might prefer to read content in any number of typefaces. Certain typefaces are better for folks with dyslexia. Others may be better for certain folks with ADHD. People with low vision may just prefer a larger typeface.
We have these amazing machines we’ve invented that can display the same text in any number of different ways. At this point, it seems ridiculous to need to mandate a specific typeface for electronic usage. Sure, pick a well-regarded default, but if we want to mandate something, it should be that software provides tools to allow users to adjust textual elements of documents they are reading to suit their own needs.
Thanks for that. I thought the same as phantom784 and never updated my opinion for hi-res screens.
Related to choosing defaults: I like these tips for evaluating the legibility of a body typeface: https://prowebtype.com/selecting-body-text/ They mention one serif advantage, that "most serif typefaces are often ideal choices for reading text due to the noticeable strokes in their ascenders and descenders."
Are hi-res screens really that commonplace? Resolutions have gone up, but so have screen sizes. I don't think Windows has seen the need to change their default DPI assumption from 96 in at least 20 years.
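For a rough sense of scale, pixel density is just the diagonal resolution divided by the diagonal size. A quick sketch, with monitor sizes picked purely as illustrations (they are not figures from this thread):

```python
import math


def pixels_per_inch(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixel density from resolution and physical diagonal size."""
    if diagonal_in <= 0:
        raise ValueError("diagonal must be positive")
    return math.hypot(width_px, height_px) / diagonal_in


# Illustrative monitors only.
for name, w, h, d in [
    ('24" Full HD', 1920, 1080, 24.0),
    ('32" Full HD', 1920, 1080, 32.0),
    ('27" 4K', 3840, 2160, 27.0),
]:
    print(f"{name}: {pixels_per_inch(w, h, d):.0f} ppi")
```

That comes out to roughly 92, 69 and 163 ppi, so a 24" Full HD panel sits close to the old 96 DPI assumption while a 32" one falls well below it.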
Yeah right. Times New Roman rendered using late 1990s software on monitors of that era certainly looked awful. These days text on screens can reasonably look like print.
As indicated in the article, serifs come from how original letters were carved into stones: they were an artifact of the tool in use.
Calligraphy developed similar traits by virtue of using a tool that produced an oval shape, and because you had to take care not to leave marks when the pen/feather left the paper.
With the printing press, when we became able to put many books out, we also started doing some research about what makes a book easy to read. Not least because we could now easily put many characters on a single line and print it in the thousands.
Serif or cursive fonts were the default "content" type, and sans-serif was reserved for titles, shop names and other "short texts" as a more "modern", cleaner look: serifs do indeed allow one to more easily track a single long line of text, as you can more clearly see the "baseline" and not accidentally skip into the line above or below.
The next challenge is switching to the next line once you are done with the one you are on: while serifs help there too, the more important thing is the line length. Thus the famous (is it? :) 60-70 character limit for a line, and why you also see many web pages that only take like 20% of our modern 32"+ screens when browsers are made full screen.
Now, columnar layout as popularized by newspapers does not really come from the same desire: like TNR (Times New Roman), it actually comes from the desire to fit more on one page to save on costs. With a wider column of text, all the last lines of paragraphs would average out at being half-empty, which is quite a bit with a wide column.
Sure, low resolution screens made sans-serif inevitable even for documents, but compare that with the earliest segmented LCD screens: font was what you could get rendered with as little electronics as possible :)
But today, serif fonts on high resolution screens (though there are still 32" Full HD screens which are not really high-resolution), or with the use of subpixel rendering (antialiasing is no match, as you can see by connecting a modern Mac to a non-high res screen) are a great choice if you want to limit the space you use and maintain great readability.
However, sans-serif fonts can work as well, and you may only need to go with a larger line spacing or shorter lines. The trick is to aim for a number of words/letters, and not "pixels", though modern CSS treats them as a scalable unit.
(Sorry for all of this being a bit rambly, just wanted to share a bit of the history along with how we can best apply it today)
What’s super depressing about Meetup.com are those modal popups that want you to sign up for Pro. You can’t dismiss them. It’s like they’re intentionally destroying their product to squeeze the last remaining dollars from their users, who I assume are becoming fewer and fewer.
I think the people interacting with this post are just more likely to appreciate the raw craftsmanship and talent of an individual like Bellard, and coincidentally might be more critical of the machinery that in their perception devalues it. I count myself among them, but didn’t downvote, as I generally think your content is of high quality.
Short-form content (if you can call it that) is a weapon of mass attention span destruction. IMHO the doom-scrolling loop it creates should be illegal, regardless of the audience.
For email, I've had some luck just modifying the page with JS that's either indirect or obfuscated enough that the address can't be pulled directly from it - e.g. "var email" is the address encrypted with a fixed key, the JS decrypts it and then alters the HTML.
It can obviously be bypassed by using a JS runner, but it seems to be enough of a hurdle that few spammers bother. "You don't have to outrun the bear", as it were.
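Roughly what that looks like, as a sketch rather than my actual code. The decode step would really live in the page's JS; it is written in Python here only so the round trip can be shown in one place. XOR with a fixed key stands in for "encrypted", and `KEY`, `obfuscate` and `deobfuscate` are names made up for this example.

```python
import base64
from itertools import cycle

# Hypothetical fixed key. It ships with the page, so this is a hurdle
# for scrapers, not real secrecy.
KEY = b"not-a-secret"


def _xor(data: bytes, key: bytes) -> bytes:
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def obfuscate(address: str, key: bytes = KEY) -> str:
    """Build step: produce the opaque string to embed as `var email`."""
    return base64.b64encode(_xor(address.encode("utf-8"), key)).decode("ascii")


def deobfuscate(blob: str, key: bytes = KEY) -> str:
    """What the page's script would do before altering the HTML."""
    return _xor(base64.b64decode(blob), key).decode("utf-8")


blob = obfuscate("someone@example.com")
print(blob)               # what a scraper sees in the page source
print(deobfuscate(blob))  # what the script writes into the page
```

Because the embedded string is base64, it never contains an "@", so a naive regex over the raw HTML has nothing to match.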
Nice. It's a pretty low-traffic site, so something that doesn't require an external service but is still capable enough to defeat 90% of spammers sounds like a good compromise. I imagine drawing the email address to a canvas instead of a textual HTML element could be more effective, though alas not accessible.
I have my email in plain text on every page of my site. I get about 1 spam per day that I see in my inbox on Gmail. I suppose Gmail filters even more silently. It's been working fine for over a decade. Is there some scale of site popularity where it becomes a problem?