> It's like somebody set out to do what the 90s Geocities couldn't, using modern tech.
This style is basically a sort of nostalgiacore, and that's exactly what it's going for. It's heavily influenced by Web 1.0 and the time-smear of 1990s-2000s early Web culture.
Early on, Bitcoin was thought to be pseudonymous. Like sure, it's all public, but what's public is "bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh", not "John Smith, age 43, living in Florida".
Then two things happened. First, people figured out that it's actually very easy to connect the dots, particularly if somebody ever posts something like "donate here: (hash)".
And second, Bitcoin is hard to get into. As soon as difficulty went up, mining some for yourself went out of the window. Now you have to buy it. And its characteristics mean that anyone selling it online wants to be really, really sure of your identity. Thus nearly everyone ends up creating accounts at Coinbase or wherever with very accurate identity verification, and now we've got real names connected to those random-looking numbers.
Agree with that interpretation. Perhaps "Microbe extracts oxygen from the water in Martian soils" would be short and intriguing enough to click on, while still being correct and interesting!
> I just put it in a plastic bag in the freezer for 15 minutes, and it works.
What's that supposed to do for an SSD?
It was a trick for hard disks because on ancient drives the heads could get stuck to the platter, and that might help sometimes. But even for HDDs that's dubiously useful these days.
> It was a trick for hard disks because on ancient drives the heads could get stuck to the platter, and that might help sometimes.
Stuck heads were/are part of the freezing trick.
The other part of that trick has to do with printed circuit boards and their myriad connections -- you know, the stuff that both HDDs and SSDs have in common.
Freezing them makes things on the PCB contract, sometimes at different rates, and sometimes that change makes things work well enough, for long enough, to retrieve the data.
I've recovered data by freezing a few (non-ancient) hard drives that weren't stuck at all. Before being frozen, they'd spin up fine at room temperature and sometimes would even work well enough to get some data off of them (while logging a ton of errors). After being frozen, they became much more cooperative.
A couple of them would die again after warming back up, and only really behaved while they were continuously frozen. But that was easy enough, too: Just run the USB cable from the adapter through the door seal on the freezer and plug it into a laptop.
This would work about the same for an SSD, in that: If it helps, then it is helpful.
Semiconductors generally work better the colder they are. Extreme overclockers don't use liquid nitrogen primarily to keep chips at room temperature at extreme power consumption, but to actually run them at temperatures far below freezing.
Complex issue: analog NAND doesn't work anything like the logic in CPUs.
Far more often it's simply letting a device sit unpowered that 'fixes' the issue. Speculation on what changed invariably goes on indefinitely.
So on the off-chance that there's a firmware engineer in here, how does this actually work?
Like, does an SSD do some sort of refresh on power-on, or every N hours, or do you have to access the specific block, or...? What if you interrupt the process, e.g., having an NVMe in an external case that you just plug in once a month for a few minutes to use it as a huge flash drive, is that a problem?
What about the unused space, is a 4 TB drive used to transport 1 GB of stuff going to suffer anything from the unused space decaying?
It's all very unclear what any of this means in practice and how a user is supposed to manage it.
SSD firmware engineer here. I work on enterprise stuff, so ymmv on consumer grade internals.
Generally, the data refresh will all happen in the background when the system is powered (depending on the power state). Performance is probably throttled during those operations, so you just see a slightly slower copy while this is happening behind the scenes.
The unused space decaying is probably not an issue, since the internal filesystem data is typically stored on a more robust area of media (an SLC location) which is less susceptible to data loss over time.
As far as how a user is supposed to manage it, maybe do an fsck every month or something? Using an SSD like that is probably ok most of the time, but might not be super great as a cold storage backup.
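If it helps, a minimal sketch of that monthly check on Linux (the device and partition names are placeholders, and e2fsck-style flags are assumed; keep in mind fsck mostly reads metadata, so it won't touch every data block):
sudo umount /dev/sdX1
sudo fsck -fn /dev/sdX1    # forced, read-only check; exact flags vary by filesystem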
So say I have a 4TB USB SSD from a few years ago, that's been sitting unpowered in a drawer most of that time. How long would it need to be powered on (ballpark) for the full disk refresh to complete? Assume fully idle.
(As a note: I do have a 4TB USB SSD which did sit in a drawer without being touched for a couple of years. The data was all fine when I plugged it back in. Of course, this was a new drive with very low write cycles, stored in a climate-controlled place. An older, worn-out drive would probably have been an issue.) Just wondering how long I should keep it plugged in if I ever have a situation like that, so I can "reset the fade clock", so to speak.
the most basic solution that will work for every filesystem and every type of block device without even mounting anything, but won't actually check much except device-level checksums:
sudo pv -X /dev/sda
or even just:
sudo cat /dev/sda >/dev/null
and it's pretty inefficient if the device doesn't actually have much data, because it also reads (and discards) empty space.
for copy-on-write filesystems that store checksums along with the data, you can request proper integrity checks and also get the nicely formatted report about how well that went.
for btrfs:
sudo btrfs scrub start -B /
or zfs:
sudo zpool scrub -a -w
for classic (non-copy-on-write) filesystems that mostly consist of empty space I sometimes do this:
sudo tar -cf - / | cat >/dev/null
the `cat` and redirection to /dev/null is necessary because GNU tar contains an optimization that doesn't actually read anything when it detects /dev/null as the target.
Just as a note (and I checked that it's not the case with the GNU coreutils): on some systems, cp (and maybe cat) would mmap() the source file. When the output is the /dev/null driver, no read occurs, because of course its write function does nothing... So, using a pipe (or dd) may be a good idea in all cases (I did not check the current BSDs).
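For example, something like this (device name is a placeholder) forces plain read() calls regardless of what the destination does with the data:
sudo dd if=/dev/sda of=/dev/null bs=4M status=progress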
>Generally, the data refresh will all happen in the background when the system is powered (depending on the power state).
How does the SSD know when to run the refresh job? AFAIK SSDs don't have an internal clock, so they can't tell how long they've been powered off. Moreover, does doing a read generate some sort of telemetry to the controller indicating how strong/weak the signal is, thereby informing whether it should refresh? Or does it blindly refresh on some sort of timer?
Pretty much, but it depends a lot on the vendor and how much you spent on the drive. A common assumption about enterprise SSDs is that they're powered pretty much all the time, but are left in a low power state when not in use. So, data can still be refreshed on a timer, as long as it happens within the power budget.
There are several layers of data integrity that are increasingly expensive to run. Once the drive tries to read something that requires recovery, it marks that block as requiring a refresh and rewrites it in the background.
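If you're curious whether a drive has been quietly doing that kind of recovery, the standard SMART tools expose some rough, vendor-dependent counters (which fields are meaningful varies by model):
sudo smartctl -x /dev/sda        # SATA/SAS: full SMART and device statistics dump
sudo nvme smart-log /dev/nvme0   # NVMe: includes media and data integrity error counts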
So you need to do an fsck? My big question after reading this article (and others like it) is whether it is enough to just power up the device (for how long?), or if each byte actually needs to be read.
The case an average user is worried about is where they have an external SSD that they back stuff up to on a relatively infrequent schedule. In that situation, the question is whether just plugging it in and copying some stuff to it is enough to ensure that all the data on the drive is refreshed, or if there's some explicit kind of "maintenance" that needs to be done.
How long does the data refresh take, approx? Let's say I have an external portable SSD that I keep stored data on. Would plugging the drive into my computer and running a full read of all the data be enough?
A full read would do it, but I think the safer recommendation is to just use a small HDD for external storage. Anything else is just dealing with mitigating factors.
Thanks! I think you're right about just using an HDD, but for my portable SSD situation, after a full read of all blocks, how long would you leave the drive plugged in for? Does the refresh procedure typically take a while, or would it be completed in roughly the time it would take to read all blocks?
Keep in mind that when flash memory is read, you don't get back 0 or 1. You get back (roughly) a floating point value -- so you might get back 0.1, or 0.8. There's extensive code in SSD controllers to reassemble/error correct/compensate for that, and LDPC-ish encoding schemes.
Modern controllers have a good idea how healthy the flash is. They will move data around to compensate for weakness. They're doing far more to detect and correct errors than a file system ever will, at least at the single-device level.
It's hard to get away from the basic question, though -- when is the data going to go "poof!" and disappear?
Unless I am misunderstanding the communication protocol between the flash chip and the controller, there is no way for the controller to know that analogue value. It can only see the digital result.
Maybe as a debug feature some registers can be set up to adjust the threshold up and down and the same data reread many times to get an idea of how close certain bits are to flipping, but it certainly isn't normal practice for every read.
Typically unused empty space is a good thing, as it will allow drives to run in MLC or SLC mode instead of their native QLC. (At least, this seems to be the obvious implication from performance testing, given the better performance of SLC/MLC compared to QLC.) And the data remanence of SLC/MLC can be expected to be significantly better than QLC.
>as it will allow drives to run in MLC or SLC mode instead of their native QLC
That depends on the SSD controller implementation, specifically whether it proactively moves stuff from the SLC cache to the TLC/QLC area. I expect most controllers to do this, given that if they don't, the drive will quickly lose performance as it fills up. There's basically no reason not to proactively move stuff over.
Cheap DRAM-less controllers usually wait until the drive is almost full to start folding. And then they'll only be folding just enough to free up some space. Most benchmark results are consistent with this behavior.
> Also, the timing of their Nov. 13 announcement is pretty bad. There is already chatter that AI may be a bubble bigger than the dotcom bubble. For a company that doesn't have deep pockets, it would be prudent to take the back seat on this.
Unless Mozilla plans to spend millions on cloud GPUs to train their own models, there seems to be little danger of that. They're just building interfaces to existing weights somebody else developed. Their part of the work is just browser code and not in real danger from any AI bubble.
It could still be at risk as collateral damage. If the AI bubble pops, part of that would be actual costs being transmitted to users, which could lead to dramatically lower usage, which could lead to any AI integration becoming irrelevant. (Though I'd imagine the financial shocks to Mozilla would be much larger than just making some code and design irrelevant, if Mozilla is getting more financially tied to the stock price of AI-related companies?)
But yeah, Mozilla hasn't hinted at training up its own frontier model or anything ridiculous like that. I agree that it's downstream of that stuff.
If they just use 3rd party APIs/models and the AI bubble pops, the number of AI users in FF will not change.
The upstream might earn less, and some upstreams might fail, but once they have the code, switching to a competitor or a local model isn't a big deal.
That being said,
"This could've been a plugin" - actual AI vendors can absolutely just outcompete FF; nobody is going to switch to FF for slightly better AI integration, and if Google decides to do the same they will eat Mozilla's lunch yet again.
The bubble, if any, is an investment bubble. If somebody likes using LLMs for summaries, or generating pictures or such things, that's not going anywhere. Stable Diffusion and Llama are sticking around regardless of any economic developments.
So if somebody finds Mozilla's embedded LLM summary functionality useful, they're not going to suddenly change their mind just because some stock crashed.
The main danger I guess would be long term, if things crash at the point where they're almost useful but not quite there. Then Mozilla would be left with functionality that's not as good as it could be and with little hope of improvement, because they build on others' work and don't make their own models.
When you try to compress highly compressed or random data, the size expands.
At least the LTO tape drives I have used will adaptively disable compression when the compressed record would be larger.
As tape read and write speeds depend on data size, it is still worth the effort to try and opportunistically compress data on the drive.
As this can usually be done without stopping or slowing the tape, there really isn't much of a downside.
As for the compressed capacity, that is just 30+ years of marketing convention, which people just ignore, as it has always assumed your data was 2:1 compressible.
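You can see the "compressing the incompressible makes it bigger" effect with any general-purpose compressor; gzip here is just standing in for the drive's SLDC, and the exact numbers will vary:
head -c 100M /dev/urandom | gzip -c | wc -c   # slightly more than 104857600: random data expands
head -c 100M /dev/zero | gzip -c | wc -c      # roughly 100 KB: repetitive data shrinks enormously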
2.5:1 now apparently; showing my age, because I had to go look - last time I had anything to do with LTO it was still 2:1. Guess they got PiedPiper to update the SLDC spec ;).
Native is iirc 30TB - they quote compressed capacity but eh that's very much going to depend on what you are storing and how compressible it is.
And you'll have a rough idea of what it is you are going to be storing and how compressible it is if you're spending that kind of money.
It's marketing and a little skeezy to quote it and I bet they have some justification for why they arrived at 2.5:1 compression.
EDIT: Yeah, it's 30TB. It's been many years since I had anything to do with LTO, but they use a modified version of LZS called SLDC, so that's what they are assuming will get 2.5:1 on "random enterprise data that isn't already compressed". The 2.5 threw me as well, because that used to be 2:1, so either they improved SLDC or thought they could wing it - looks like that switched between LTO-5 and LTO-6.
File compression requires additional storage, memory and processing power. Why bother if the tape appliance already handles it? Data is unusable in compressed format and is hard to deduplicate. Also, often there is already compression at the storage array level, but the data is decompressed when read.
Where I think TUIs had a niche that GUIs don't quite reproduce is in the very particular way DOS TUIs processed input.
An old school DOS TUI reads keyboard input one character at a time from a buffer, doesn't clear the buffer in between screens, and is ideally laid out such that a good part of the input is guaranteed to be fixed for a given operation. They also were built without mouse usage.
So an operator can hammer out a sequence like "ArrowDown, ArrowDown, ENTER, Y, ENTER, John Smith, ENTER" and even if the system is too slow to keep up with the input, it still works perfectly.
Modern GUIs almost never make this work nearly as well. You need to reach for the mouse, input during delays gets lost, the UI may not be perfectly predictable, and sometimes the UI may even shift around while things are loading. Linux doesn't quite get there either; I find that the user experience on DOS was far better than with ncurses apps, which have all kinds of weirdness.
> I've been patiently waiting for their resurgence. Building embedded/mobile devices is their forte
Wouldn't most of those people have gone elsewhere by now? If you're a mobile device superstar, why would you stick around at Nokia once the mobile device part of it crashed and burned?
A company's just a legal structure, people change over time. And that was more than 10 years ago, and didn't they sell their mobile division to Microsoft?