
https://github.com/codeflash-ai/codeflash/

Codeflash optimizes any Python code for performance by using AI and verification.

We make human- and AI-written code faster by discovering better algorithms and fixing performance mistakes.


I've been running into cases in the wild where `copy.deepcopy()` was the performance bottleneck. After digging into it, I discovered that deepcopy can actually be slower than serializing and deserializing with pickle or JSON in many cases!

I wrote up my findings on why this happens and some practical alternatives that can give you significant performance improvements.

*TL;DR:* deepcopy's recursive approach and safety checks (like memoizing every object to handle cycles) create CPU and memory overhead that often isn't worth it. The post covers when to use alternatives like shallow copy + manual handling, pickle round-trips, or restructuring your code to avoid copying altogether.
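For anyone who wants to check this on their own data, here is a minimal sketch of the comparison (the `data` structure is just a hypothetical stand-in; exact numbers depend on object shape and Python version):

    import copy
    import pickle
    import timeit

    # Hypothetical nested structure standing in for real-world data.
    data = {"rows": [{"id": i, "tags": ["a", "b"], "meta": {"ok": True}}
                     for i in range(1000)]}

    t_deepcopy = timeit.timeit(lambda: copy.deepcopy(data), number=100)
    t_pickle = timeit.timeit(
        lambda: pickle.loads(pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)),
        number=100)

    print(f"deepcopy: {t_deepcopy:.3f}s, pickle round-trip: {t_pickle:.3f}s")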

Has anyone else run into this? Curious to hear about other performance gotchas you've discovered in commonly-used Python functions.


I tried contributing some AI-generated optimizations to core Scientific Python packages like NetworkX and scikit-image, but they were rejected because they were created by AI models.

There is a grey area: the new code might cause licensing issues if it derives from other licensed code. But does that mean no open-source project with an open license can accept code from AI codegen models? That seems excessive.

Have you contributed AI-generated code to open-source projects you don't own? Have you received any such pushback?


It looks like they were rejected because the project has licensing concerns about the generated code, not, strictly speaking, because they were 'created from AI models'.


What we discovered is that LLMs alone aren't really the answer when it comes to performance; the whole system around verifying correctness and performance is the key.


Hi, I am the creator of Codeflash. The idea is to automate everything a professional performance engineer would do. Codeflash analyzes your code, creates correctness tests and performance benchmarking tests, and executes them to figure out the behavior and performance of the code. Alongside that, it also profiles the original code and measures its line coverage. It then uses all of this information to generate multiple candidate optimizations and applies them one by one to see if the new code is indeed an optimization, i.e. correct and more performant. If so, it opens a Pull Request with all the info for the dev's review.
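As a rough mental model of that generate-and-verify loop (all names below are hypothetical sketches, not Codeflash's actual internals):

    import time

    def benchmark(fn, inputs, repeats=10):
        start = time.perf_counter()
        for _ in range(repeats):
            for x in inputs:
                fn(x)
        return time.perf_counter() - start

    def pick_best(original_fn, candidates, test_inputs):
        # Accept a candidate only if it matches the original's behavior
        # on every test input AND beats the best measured runtime so far.
        best_fn, best_time = original_fn, benchmark(original_fn, test_inputs)
        for candidate in candidates:
            if not all(candidate(x) == original_fn(x) for x in test_inputs):
                continue  # fails the correctness gate
            elapsed = benchmark(candidate, test_inputs)
            if elapsed < best_time:
                best_fn, best_time = candidate, elapsed
        return best_fn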


Codeflash is still new, but it is already very strong at optimizing general-purpose Python code, especially PyTorch, NumPy, or algorithmic code. If you want to try it out, try optimizing this sample repo: https://github.com/codeflash-ai/optimize-me
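To give a feel for what these optimizations look like, here is a hypothetical before/after of the kind of NumPy rewrite such a tool aims to find (not taken from the sample repo):

    import numpy as np

    def pairwise_distances_slow(points):
        # O(n^2) pure-Python loops with per-element NumPy calls.
        n = len(points)
        out = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                out[i, j] = np.sqrt(np.sum((points[i] - points[j]) ** 2))
        return out

    def pairwise_distances_fast(points):
        # Broadcasting computes all pairwise differences in one shot.
        diff = points[:, None, :] - points[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1))

    points = np.random.rand(200, 3)
    assert np.allclose(pairwise_distances_slow(points),
                       pairwise_distances_fast(points))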


I see a lot of interest in this field, which is amazing! I have been working on this problem for a while and have made a lot of progress. If you are interested in building a general-purpose code performance optimizer, I am hiring founding engineers. We are still in stealth so I can't say too much, but we already have great results with customers and are funded by some of the best investors in Silicon Valley. If you might be interested, please reach out at misra.saurabh1@gmail.com


This is super cool and useful. I know that Instabase, which also came out of MIT, got really popular within finance communities because it allowed for really fast and efficient compute through its own Python DSL. Good to see this as an open-source project that everyone can now use.


In the particular case of refactoring, I don't think this is a good idea. I can understand that multi-file system-reorganization refactorings shouldn't be done as incidental changes, but small enough refactorings can easily happen while working on a piece of code. That is because:

- You have the best mental model of a piece of code and its surrounding system while you are working on it, not when you read a ticket description of what should happen. This leads to better overall efficiency and less context switching.

- I think "leave things better than you found them" and "with every not-so-minor change, think about how you would best architect the system at this point in time" are great principles.

- I find that "refactoring sprints" never really work well. They tend to be inefficient and are rarely prioritized by management. As developers we have responsibility over the code, and keeping it in the best state possible is an implicit part of our profession.

Even though it pays to be focused, I think there is merit in exploring the codebase, as it helps surface new ideas and avenues of improvement.


I once bought a cheap but "smart" HP inkjet printer for home. I don't have much to print other than the yearly tax documents. I figured I would buy an extra ink cartridge and be good for two years, and I have never been more wrong. One ink cartridge didn't even last 100 pages (less than one tax return). The day I realized that my ink cartridge only held 0.2 ml (!!) of ink, I sold the printer and bought a Brother ink-tank printer. Now I don't have to worry about my printer extorting me.

