Hacker News | impure's comments

Oh, so Google is finally releasing the VCS I hear so much about. Well, it's jj, not Piper, but it looks like jj will eventually replace Piper, and if half the things I've heard about Piper are true, it will be very successful.


It’s the front end, not the back end that hosts the repository.


jj is independent of Google.


It does require a google CLA.


What does "CLA" stand for?

edit: apparently Contributor License Agreement


CLAs are generally a statement certifying that the code you contribute is compatible with the project's license and legal requirements.

They don't want to accidentally bring GPL-3 or proprietary code into an Apache-licensed project, and my understanding is that this is a self-certifying statement that you won't, and that you have full rights to make the contributions.


it is literally impossible to contribute to jj without a Google account and a signed agreement with Google.


It's Brightbill!


Given that Tricolor and First Brands also went bankrupt recently, I'd say yes.


We already have diversification. You can rent a VPS from hundreds of possible companies. And people are very happy with them, it seems every month or two there’s a post here about how some company slashed their cloud bill by switching to a VPS. What we have here is a lock-in and marketing problem.


>You can rent a VPS from hundreds of possible companies. And people are very happy with them, it seems every month or two there’s a post here about how some company slashed their cloud bill by switching to a VPS.

Companies are using higher-level "PaaS" suite of services from AWS such as DynamoDB, RedShift, etc and not just the lower-level "IaaS" such as basic EC2 instances or pure containers. Same "lock-in" situation with using the higher-level services from MS Azure and Google Cloud.

For those dependent on high-level services, migrating to a VPS like Hetzner or self-hosting is not possible unless they re-invent the AWS stack by installing/babysitting a bunch of open-source software. It's going to be a lot more involved than just installing a PostgreSQL db instance on a VPS.


> It's going to be a lot more involved

Yes, and you can't escape that by outsourcing it. The complexity is still there, and it will still bite you when your outsourcer fails to manage it.


Same thing applies to AWS...


I’m not really making a point here as much as an observation, but if my stack that I manage atop VMs in a data center goes down, my customers are pissed at me. If AWS goes down along with half the Internet, my customers are completely sympathetic.


Maybe just for you, and only after they realize it's part of the ongoing AWS outage, but for most folks an outage is still their problem, and their SLA, regardless of whether it's upstream from them.


I disagree. I think most customers are much more sympathetic to an AWS outage than they are to a self-managed outage. Whether that ought to be the case or not is a different question.


But if your services are up when everyone on AWS is down you look like a wizard.


Unfortunately, people rarely notice when something is working, and the few who do will probably just assume you weren’t on AWS in the first place and move on with their day.


And every comment in those threads is about how AWS is web-scale and won't go down, while the VPS will supposedly manage one day of uptime a month.


Amazon offers VPS-like instances as well (EC2); were those affected? I think they weren't.


Our actual running instances were pretty much fine throughout, as was the RDS cluster, but we had no way to launch new instances (or auto-scale), and no way to invoke any of the other AWS services (IAM, SQS, Lambda, etc.). Also no CloudWatch logs/metrics for the duration, so limited visibility.

Overall not that bad for us, but if you had more high-level service dependencies, there would have been impact.


> While most operations are recovered, requests to launch new EC2 instances (or services that launch EC2 instances such as ECS) in the US-EAST-1 Region are still experiencing increased error rates.

> We continue to investigate the root cause for the network connectivity issues that are impacting AWS services such as DynamoDB, SQS, and Amazon Connect in the US-EAST-1 Region. We have identified that the issue originated from within the EC2 internal network.

So, kinda? Some global services depend on us-east-1...

> Global services or features that rely on US-EAST-1 endpoints such as IAM updates and DynamoDB Global tables may also be experiencing issues.

Basically, you know it's going to be a bumpy day when us-east-1 has an issue, because your ability to run across regions depends on what the issue is and what the impact is.


I looked at the paper, and how it's being reported is highly misleading. There were four different active training groups. One of the groups benefited from the training and one actually got worse. So, as a whole, phishing training only produced a 2% boost. The message, however, is not that phishing training is useless, only that it is useless if applied incorrectly.
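The averaging effect being described is easy to see with a quick sketch. The per-group numbers below are made up for illustration (the real figures are in the paper); the point is just how one helpful arm and one harmful arm can pool to a small overall effect:

```python
# Hypothetical per-group effects on phishing susceptibility, in
# percentage points of improvement. These numbers are invented;
# they only illustrate how mixed arm results can pool to ~2%.
group_effects = {
    "arm that helped":      10.0,
    "arm with no effect A":  0.0,
    "arm with no effect B":  0.0,
    "arm that backfired":   -2.0,
}

# Simple unweighted pooling across the four arms.
pooled = sum(group_effects.values()) / len(group_effects)
print(f"pooled effect: {pooled:.1f} percentage points")  # pooled effect: 2.0 percentage points
```

Reporting only the pooled number hides that one arm worked well and another actively hurt.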


I'm surprised they didn't lower it, given the attempts to politicize the Fed and the firing of the BLS chief. Also, this is quite a short-term outlook. Tariffs are likely to cause a stagflationary scenario, which could lower government revenue in the long term.


It does make you wonder what would cause a downgrade. The debates over the debt ceiling have certainly brought the U.S. closer to default than I would ever have thought. It's true that the U.S. can never run out of dollars, so in one sense it's not possible for a bondholder not to get paid back. But the political environment, the potential unreliability of previously iron-clad data, economic disruption from tariffs, and the behavior of the Federal Reserve all seem to make an unlikely event much more likely.


I actually have a game concept playing around with this idea. Sure, the AI is 'aligned', but what does that even mean? Because if you think about it, humans have been pretty terrible.


Absolutely. The reason people worry about AI alignment is because we already have millennia of experience with the intractability of human alignment. So the concern is, what if AI is as bad as we are, but more effective at it?


The tech billionaire answer: "Please don't let it be woke."

If your only option is to be as bad as we humans are, then at least try to be bad in a known, predictable way.


GitLab is open source, so you can self-host it, although the system requirements are quite high; you won't be able to host it on a two-euro VPS.

And I wouldn’t be that concerned about contributors. It’s only the very top projects that get any contributors. And even then most of the time contributors are not worth the hassle.


The only feature chaining me to Arc is the automatic PiP: when you switch between tabs or spaces, picture-in-picture activates automatically. It is so useful.


Well, if you have a better way to solve this that’s open I’m all ears. But what Cloudflare is doing is solving the real problem of AI bots. We’ve tried to solve this problem with IP blocking and user agents, but they do not work. And this is actually how other similar problems have been solved. Certificate authorities aren’t open and yet they work just fine. Attestation providers are also not open and they work just fine.


> Well, if you have a better way to solve this that’s open I’m all ears.

Regulation.

Make it illegal for a crawler to request the content of a webpage unless the website operator explicitly allows it via robots.txt. Institute a government agency tasked with enforcement. If you as a website operator can show that traffic came from bots, you open a complaint with the agency and they take care of shaking painful fines out of the offending companies. Force cloud hosts to keep books on who was using which IP addresses. Will it be a 100% fix? No. Will it have a massive chilling effect if done well? Absolutely.
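A crawler complying with an allowlist-by-default robots.txt regime like this could be sketched with Python's stdlib `urllib.robotparser`. The robots.txt content and bot names here are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site operator might publish under the
# proposed regime: everything not explicitly allowed is off limits.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: GoodBot
Allow: /public/
Disallow: /
"""

def may_fetch(user_agent: str, url: str) -> bool:
    """Return True only if robots.txt permits this user agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    parser.modified()  # mark the rules as loaded
    return parser.can_fetch(user_agent, url)

print(may_fetch("GoodBot", "https://example.com/public/page"))  # True
print(may_fetch("BadBot", "https://example.com/secret"))        # False
```

The hard part, as the thread goes on to discuss, is not the check itself but enforcing it against crawlers that simply skip it.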


The biggest issue right now seems to be people renting their residential IP addresses to scraper companies, who then distribute large scrapes across these mostly distinct IPs. These addresses are from all over the world, not just your own country, so we'll either need a World Government, or at least massive intergovernmental cooperation, for regulation to help.


I don't think we need a world government to make progress on that point.

The companies buying these services are buying them from other companies. Countries or larger blocs like the EU can exert significant pressure on such companies by declaring the use of such services illegal when interacting with websites hosted in the country or bloc, or by companies in them.


It just seems too easy to skirt around via middlemen. The EU (say) could prosecute an EU company directly doing this residential scraping, and it could probably keep tabs on a handful of bank accounts of known bad actors in other countries, and then investigate and prosecute EU companies transferring money to them. But how do you stop an EU company paying a Moldovan company (that has existed for 10 days) for "internet services", that pays a Brazilian company, that pays a Russian company to do the actual residential scraping? And then there's all the crypto channels and other quid pro quo payment possibilities.


Genuinely this isn't a tech specific or even novel problem. There is plenty of prior art when it comes to inhibiting unwanted behavior.

> But how do you stop an EU company paying a Moldovan company (that has existed for 10 days) for "internet services", that pays a Brazilian company, that pays a Russian company to do the actual residential scraping?

The same example could be made with money laundering, and yes, it's a real and sizable issue. Yet the majority of money is not laundered. How does the EU company make sure it will not be held liable, especially the people who made the decision? Maybe on a technical level the perfect crime is possible, and not getting caught is likely given a certain approach. But the uncertainty around it will dissuade many, though not all. The same goes for companies selling the services: you might think you have a foolproof way to circumvent the measures in play, but what if not, and the government comes knocking?


Your money laundering analogy is apt. I know very little about that topic, and I especially don't know how much money laundering is really out there (nor do governments), but I'm confident that a lot is. Do AML laws have a chilling effect on it? I think they must, since they surely increase the cost and risk, and similar legislation for scraping should have a similar effect. But AML is a pretty bad solution to money laundering, and I despair if AML-for-scraping is the best possible solution to scraping.


I'm not anti-government, but a technical solution that eliminates the problem is infinitely better than regulating around it.

The internet is too big and distributed to regulate. Nobody will agree on what the rules should be, and certain groups or countries will disagree in any case and refuse to enforce them.

Existing regulation rarely works, and enforcement is half-assed at best. Ransomware is regulated and illegal, but we see articles about major companies getting infected all the time.

I don't think registering with Cloudflare is the answer, but regulation definitely isn't the answer.


The problem is that a technical solution is impossible.


> Institute a government agency that is tasked with enforcement.

You're forgetting about the first W in WWW...


So what you're saying is that if I were to host a bit torrent tracker in Sweden then the US can't do anything about it?


Agreed. It might not be THE BEST solution, but it is a solution that appears to work well.

Centralization bad, yada yada. But if Cloudflare can get most major AI players to participate, then convince the major CDNs to also participate... ipso facto columbo oreo... standard.


Yep, that's why I am writing this now :)

You can see it in web vs. mobile apps.

Many people may not see a problem with walled gardens, but the reality is that we have much less innovation in mobile than on the web, because anyone can spin up a web server, versus having to publish an app in the App Store (Apple).


Are they? Until Let's Encrypt came along and democratised the CA scene, it was a hellhole. Web security depended on how deep your pockets were. One can argue that the same path is being laid in front of us until a Let's Encrypt comes along and democratises it. And since this is about attestation, how are we going to prevent gatekeepers from doing "selective attestation with arguable criteria"? How will we prevent political forces?


Certificate authorities don't block humans if they 'look' like a bot


AI poisoning is a better protection. Cloudflare is capable of serving stashes of bad data to AI bots as a protective barrier for its clients.


AI poisoning is going to get a lot of people killed, because the AI won't stop being used.


The current state of the art in AI poisoning is Nightshade[0] from the University of Chicago. It's meant to eventually be an add-on to their WebGlaze[1], an invite-only tool meant for artists to protect their art from AI mimicry.

Nobody is dying because artists are protecting their art.

[0] https://nightshade.cs.uchicago.edu/whatis.html

[1] https://glaze.cs.uchicago.edu/webglaze.html


By that logic, AI is already killing people. We can't presume that whatever can be found on the internet is reliable data, can we?


If science taught us anything, it's that no data is ever reliable. We are pretty sure about many things, and since it's the best available info we might as well use it, but as for "the internet can be wrong": any source can be wrong! I wouldn't even be surprised if the internet in aggregate (with the bot reading all of it) is right more often than individual authors of pretty much anything.


Yet we use it every day for police, military, and political targeting with economic and kinetic consequences.


You mean incompetent users of AI will get people killed. You don't get a free pass because you used a tool that sucked.


This is some next level blame shifting. Next you are going to steal motor oil and then complain that your customers got sick when you used it to cook their food.


Okay, let them


You don't think the AI companies will make an effort to detect and filter bad data for training? Do you suppose they are already doing this, knowing that data quality has an impact on model capabilities?


The current state of the art in AI poisoning is Nightshade[0] from the University of Chicago. It's meant to eventually be an add-on to their WebGlaze[1], an invite-only tool meant for artists to protect their art from AI mimicry.

If these companies add extra code to bypass artists trying to protect their intellectual property from mimicry, then that is an obvious and egregious copyright violation.

More likely, it will push these companies to actually pay content creators for the content they work on to be included in their models.

[0] https://nightshade.cs.uchicago.edu/whatis.html

[1] https://glaze.cs.uchicago.edu/webglaze.html


Seems like their poisoning is something that shouldn't be hard to detect and filter on. There is enough perturbation to create visual artifacts people can see. Steganography research is much further along in being undetectable. I would imagine that in order to disrupt training sufficiently, you could not have so few perturbations that they would go undetected.
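To make the detection idea concrete: perturbations strong enough to create visible artifacts also raise pixel-to-pixel variation, which even a trivial score can flag. This is a toy heuristic with invented data, not Nightshade's method or any real training pipeline's filter:

```python
# Toy high-frequency-energy detector for visibly perturbed images.
# Illustrative only: real filters would be far more sophisticated.

def hf_energy(img):
    """Mean absolute difference between horizontally adjacent pixels."""
    diffs = [abs(row[i + 1] - row[i])
             for row in img
             for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

smooth = [[10, 10, 11, 11], [10, 11, 11, 12]]  # clean-looking patch
noisy = [[10, 60, 10, 60], [60, 10, 60, 10]]   # heavily perturbed patch

THRESHOLD = 5.0  # would have to be tuned on real data
print(hf_energy(smooth) < THRESHOLD)  # True
print(hf_energy(noisy) > THRESHOLD)   # True
```

The commenter's point is exactly this tension: weaken the perturbation enough to evade such a score and it may no longer disrupt training.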


They will learn to pay for high-quality data instead of blindly relying on internet content.


I'm not sure if things are as fine as you say they are. Certificate authorities were practically unheard of outside of corporate websites (and even then mostly restricted to login pages) until Let's Encrypt normalized HTTPS. Without the openness of Let's Encrypt, we'd still be sharing our browser history and search queries with our ISPs for data mining. Attestation providers have so far refused to revoke attestation for known-vulnerable devices (because customers needing to replace thousands of devices would be an unacceptable business decision), making the entire market rather useless.

That said, what I am missing from these articles is an actual solution. Obviously we don't want Cloudflare to become an internet gatekeeper. It's a bad solution. But it's a bad solution to an even worse problem.

Alternatives do exist, even decentralised ones, in the form of remote attestation ("can't access this website without secure boot and a TPM and a known-good operating system"), paying for every single visit or for subscriptions to every site you visit (which leads to centralisation because nobody wants a subscription to just your blog), or self-hosted firewalls like Anubis that mostly rely on AI abuse being the result of lazy or cheap parties.

People drinking the AI Kool-Aid will tell you to just ignore the problem, pay for the extra costs, and scale up your servers, because it's *the future*, but ignoring problems is exactly why Cloudflare still exists. If ISPs hadn't ignored spoofing, DDoS attacks, botnets within their network, """residential proxies""", and other such malicious acts, Cloudflare would've been an Akamai competitor rather than a middle man to most of the internet.
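The Anubis-style approach mentioned above boils down to a proof-of-work gate: cheap for one human page view, expensive at scraper scale. A minimal sketch, where the difficulty, encoding, and challenge format are all assumptions rather than Anubis's actual protocol:

```python
import hashlib

DIFFICULTY = 3  # leading hex zeros required; assumed value for the sketch

def solve(challenge: str) -> int:
    """Client side: brute-force a nonce whose hash meets the difficulty."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * DIFFICULTY):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int) -> bool:
    """Server side: one hash to check the submitted nonce."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * DIFFICULTY)

nonce = solve("session-abc123")
print(verify("session-abc123", nonce))  # True
```

The asymmetry is the point: verification costs the server one hash, while solving costs the client thousands, which only stings when multiplied across millions of scraped pages.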

