
People look at the beta release of the software and interpret flaws in an early version as critical, fundamental problems.

I am pretty sure the new releases will contain features like better software license handling (e.g. 3 levels for types of licenses: permissive, copy-left, hardcore copy-left), a trust score for snippets, and possibly some validation of the code for some languages.



> People look at the beta release of the software and interpret flaws in an early version as critical, fundamental problems.

Maybe because they realize the flaws are fundamentally inherent in the very core of the product. They're using a GPT-3 derivative here. DNN models are not the right tool for this job.


Why wouldn't the following work for licenses? Train 3 models:

1. Only permissive licenses - Only include in the training set repos with permissive licenses - MIT, Apache.

2. Copy-left - Step 1 + GPL, excluding AGPL and other "hardcore copy-left licenses".

3. All - Include all code, even unlicensed and AGPL.

Users could then choose which version they prefer based on the profile of their project and their company. The majority of GitHub repos have a LICENSE file, so it doesn't seem implausible?
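
A rough sketch of the bucketing, assuming each repo's license has already been detected and normalized to an SPDX identifier (the detection itself is not shown). The function names and the exact grouping of licenses into tiers are hypothetical, just to illustrate the idea:

```python
# SPDX identifiers are real; which ones land in which tier is an assumption.
PERMISSIVE = {"MIT", "Apache-2.0"}
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only"}


def tiers_for(license_id):
    """Return the set of training tiers (1, 2, 3) a repo would belong to."""
    tiers = {3}  # tier 3 takes everything, including unlicensed and AGPL code
    if license_id in PERMISSIVE:
        tiers |= {1, 2}  # tier 2 is defined as "step 1 + GPL"
    elif license_id in COPYLEFT:
        tiers.add(2)
    return tiers


def build_training_sets(repos):
    """Group (repo_name, license_id) pairs into one list of names per tier."""
    sets = {1: [], 2: [], 3: []}
    for name, license_id in repos:
        for tier in sorted(tiers_for(license_id)):
            sets[tier].append(name)
    return sets
```

Anything not recognized as permissive or plain GPL (AGPL, unknown, or no license at all) falls through to tier 3 only, so the default is the most restrictive bucket.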


Almost all permissively licensed code still requires preserving copyright notices or other attribution. So where Copilot is creating copyright violations, restricting its training to MIT or Apache licensed code will not resolve the issue.


Does Microsoft actually have a repository of permissively licensed code though?

My guess would be that some significant portion of GitHub code published under a permissive license is actually licensed improperly.

Working that out at scale seems intractable, but maybe the training set doesn't need to be as big as I'm assuming.


There is still something to be said for making a reasonable effort. There are no guarantees in the world.


I'd be more optimistic if the beta were crafted with the idea that it might have issues. If so, it would likely have some way of gathering feedback on suggestions that was a little more nuanced than just accepted/rejected.


This could actually be interesting. If it turns out that copy-left based code completion is better than other options, it will create a strong incentive to spread it.



