Hacker News | mr_mitm's comments

How do you know this?


There isn't one. As far as I know, no one really knows for sure how they bypass all these paywalls. (Most credible theory I heard: They actually just pay for the subscriptions.)

Many sites, including Bloomberg, have evolved such that even archive.today doesn’t have the full text of any articles. They’re doing no giveaways whatsoever.

Ghostarchive does a decent job for the same sites in my experience: https://ghostarchive.org/

Update: hmm seems like they're involved in this whole thing too somehow, how strange:

https://news.ycombinator.com/item?id=46629646


That comment is weirdly confusing/confused. But if you try archiving any site on ghostarchive, or clicking on any existing ghostarchive links, it just says "site is down for maintenance".

For now I've given up on using any archiving sites until we can find a safe and reliable alternative.


Most paywalls let search engines read their content just fine, because they do want discoverability. They want to have their cake and eat it too.

There are a few publications that don't even do that, though, and archive.is is very good at bypassing them, so I imagine they use logins for those. But for the vast majority of sites that's currently not necessary.


You can't impersonate Google. Sites check the source IP, and Google's crawler IPs don't overlap with Google Cloud's.

Google isn't the only search engine in the world of course. It probably is pretty much the only one that matters in America but the world is not just America either.

It's the only one websites don't block. That's one reason it's so hard to make another search engine.

You can, for sites that can't afford the cost of keeping up-to-date with the Google IP list, without which they can lose timely indexing. That is many.

What do you mean by “afford the cost”? The list is free of charge (https://support.google.com/a/answer/10026322?hl=en-GB) and maintenance can be fully automated.
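
For what it's worth, the automated check could look roughly like the following. This is only a minimal sketch, assuming the list is a JSON document with a `prefixes` array of `ipv4Prefix`/`ipv6Prefix` entries. The sample ranges are reserved documentation prefixes rather than real Google ranges, and `load_networks` and `is_listed` are hypothetical helper names. A real setup would presumably download the file on a schedule instead of embedding it.

```python
import ipaddress
import json

# Hypothetical sample in the assumed shape of a published range file.
# These are reserved documentation prefixes, NOT real Google ranges.
SAMPLE_RANGES_JSON = """
{
  "prefixes": [
    {"ipv4Prefix": "192.0.2.0/24"},
    {"ipv6Prefix": "2001:db8::/32"}
  ]
}
"""


def load_networks(raw_json):
    """Parse the JSON document into a list of ip_network objects."""
    data = json.loads(raw_json)
    networks = []
    for entry in data.get("prefixes", []):
        # Each entry is assumed to carry exactly one of the two keys.
        prefix = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if prefix:
            networks.append(ipaddress.ip_network(prefix))
    return networks


def is_listed(ip, networks):
    """Return True if the address falls inside any of the networks."""
    addr = ipaddress.ip_address(ip)
    # Membership is False when the IP version and network version differ.
    return any(addr in net for net in networks)


networks = load_networks(SAMPLE_RANGES_JSON)
print(is_listed("192.0.2.17", networks))    # inside the sample IPv4 range
print(is_listed("198.51.100.5", networks))  # outside every sample range
```

The membership test itself is a few lines of standard library code. Whatever cost there is sits in fetching the list regularly and wiring the check into the web server.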

I mean cost of server setup and execution.

The server that is providing the content exists already. That's a sunk cost.

"setup and execution".

What serious operator of a service isn't budgeting time to implement and operate critical maintenance functions?

Me for one. Adding an auto-updating IP address blocker to my personal blog site would probably cost more than setting up the whole site did in the first place.

Have you actually priced it, or are you just guessing?

Are you doing regular patching? Automated restarts? Watching for security breaches? Or just praying it stays up forever?

Otherwise, respectfully, I would not classify you as a "serious operator." Your site could live or die, and it would be all the same to you. Or, you've handed it to a third party for management and they don't offer much in the way of resilience or stability.


We're talking about sites that make their living via subscriptions. They should have a great interest in blocking archive.is, which is, by the way, the only service that can reliably bypass many paywalls. Clearly whatever they're doing is not easily replicated.

> We're talking about sites that make their living via subscriptions.

Sorry, but I wasn't. I thought that was clear from "can't afford the cost of keeping up-to-date with the Google IP list".

> They should have a great interest in blocking archive.is

Agreed, and many should have a budget to suit. So I conclude archive.is has put a lot of effort and cost into its defence. And all for free to us, the users.


Then why hasn't anyone built a client-side browser addon that impersonates a suitable search engine?

They have. It's called bypass-paywalls-clean. It works pretty ok.

It just keeps getting banned from the addon catalogs because of complaints from media. The Firefox one was taken down by a French newspaper. So you have to sideload it, which is hard to do on Android.

Edit: it looks like even the GitHub repo was taken down now: https://github.com/iamadamdev/bypass-paywalls-firefox

But yes it exists. And it works for most sites. It's just hard to get it now.


It's on gitflic.ru now.

Hmm yeah but their adversaries did achieve their goal by pushing it away from the mainstream sites. Now we're into this situation of "how much do I trust this vague Russian site with my browsing activity".

At least the addon declares the sites it's for and ignores the rest but still I'm a lot less comfortable with it. It's more something I'd install in a container now, limiting its usefulness :(

In practice I just use archive.today now.


Yeah, it's unavoidable. Bypassing paywalls is not a good idea for tools that depend on distribution through browser add-on stores.

What's your problem with that theory?

Has people's ability to read messages and formulate sensible replies been going down of late? I see these kinds of meaningless replies more and more often these days.

Yes, there's a global intelligence crisis, due to TikTok, Instagram, et al.

Meaningless? It's a clear question.

You're accusing him of having a problem with it, which his comment does not imply.


In what way would studying black body radiation alter humanity? Oh just the basis for quantum mechanics and thus transistors, lasers, MRIs, photovoltaics, and more.

The point is, you don't know in advance. I admit it's a bit more far fetched with these experiments that are so far removed from everyday life, but they're still worthwhile.


Is anybody claiming it makes you more productive at writing code? I just find it more convenient and more comfortable.

Theoretically, isn't the fact that you find it more convenient and more comfortable likely to increase your productivity too?

Yes, the app could be compromised, or the OS, or the compiler of the app, or of the OS, or the OS of the compiler, or the CPU any of these things run on, etc. etc. None of that is relevant to the definition of E2EE.

It's relevant to how E2EE is described to users. Representing that it's not possible for anyone other than the sender or recipient to read messages is misleading and just incorrect in general.

A particularly relevant point is when it comes to government interception. E.g. it would be perfectly possible for a messaging app to have a "wiretap mode" that the vendor enables for users that are the subject of a relevant warrant.


Being able to self-host is a plus for me. And being able to support myself avoids being told "no" for a feature I want.


Wait, did it just end the session or was your account actually suspended or deactivated? "Kicked out" is a bit ambiguous.

I've seen the Bing chatbot get offended before and terminate the session on me, but it wasn't a ban on my account.


What I fear is a pollution of the open source space with tons of tailored apps that have a lot of overlap, but none of them get meaningful contributions because the maintainer will most likely respond with wontfix to almost everything (if they respond at all).


I build in the open, but what I build is just for me. If someone wants to fork it and modify it, they can go ahead - pretty much all of my stuff is MIT licensed by default.

But I'm not going to start adding features to my bespoke utility to fix someone else's problem.


Shrug, it's hard to have an open app where everyone wants to add/change something and not have it turn into a Turing machine that attempts to do everything.

Sometimes you just want an app that does X and Y, but not A, B and Z.


The "Your" shouldn't have been stripped from the title IMHO.


Depending on how you read the owner of an app subscription (the provider or the subscriber) maybe 'My' is better.


I believe it's a play on "Your margin is my opportunity".


It's a US thing

