zanussbaum's comments

First embedding models trained from modern-bert-embed!


this was a huge inspiration for the post! i tried to highlight it in the blog but it might have gotten buried

there are a few things that i wasn't able to figure out how to get access to/i wasn't sure if they were possible. for example, a lot of Simon's article takes advantage of the warp scheduler and warp tiling.

i had a hard time finding information on whether that's even possible with my M2/metal and the general memory access patterns. it seems like CUDA does have better documentation in this regard


at least on my m2, the compiled kernel ends up using fast math anyways so using WGSL's fma didn't change anything about the actual kernel that gets run


inglor is probably referring to Strassen or Coppersmith–Winograd.
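For anyone who hasn't seen it, here is a rough sketch of Strassen in plain Python: it splits each matrix into four blocks and gets away with 7 block multiplications instead of 8, at the cost of a bunch of extra block additions and temporaries. The helper names (`mat_add`, `mat_sub`, `split`, `naive_matmul`) and the `cutoff` parameter are made up for illustration, and it assumes square matrices whose size is a power of two. It is not what any GPU kernel would actually run.

```python
def mat_add(a, b):
    # Elementwise sum of two equally sized matrices (lists of rows).
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    # Elementwise difference of two equally sized matrices.
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def naive_matmul(a, b):
    # Textbook O(n^3) product, used as the base case and as a reference.
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def split(m):
    # Split a square matrix into four quadrants: (m11, m12, m21, m22).
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def strassen(a, b, cutoff=2):
    # Assumes square inputs whose size is a power of two (hypothetical limit
    # for this sketch); falls back to the naive product at or below `cutoff`.
    n = len(a)
    if n <= cutoff:
        return naive_matmul(a, b)

    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)

    # The seven block products that replace the usual eight.
    m1 = strassen(mat_add(a11, a22), mat_add(b11, b22), cutoff)
    m2 = strassen(mat_add(a21, a22), b11, cutoff)
    m3 = strassen(a11, mat_sub(b12, b22), cutoff)
    m4 = strassen(a22, mat_sub(b21, b11), cutoff)
    m5 = strassen(mat_add(a11, a12), b22, cutoff)
    m6 = strassen(mat_sub(a21, a11), mat_add(b11, b12), cutoff)
    m7 = strassen(mat_sub(a12, a22), mat_add(b21, b22), cutoff)

    # Recombine into the four quadrants of the result.
    c11 = mat_add(mat_sub(mat_add(m1, m4), m5), m7)
    c12 = mat_add(m3, m5)
    c21 = mat_add(m2, m4)
    c22 = mat_add(mat_add(mat_sub(m1, m2), m3), m6)

    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

Every level of recursion allocates a pile of temporary blocks for the sums and differences, so the saved multiplication is paid for with extra additions and memory traffic.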


Last I checked, the extra memory operations really hurt in a lot of cases, especially for the more complex ones, but I'm no expert.


oh in that case it was because i didn't know about them :) something to try next!


thanks! and yes definitely not at CUDA levels :)


i tried using workgroup shared memory and found it slower than just recomputing everything in each thread, although i may have been doing something dumb

i'm excited to try subgroups though: https://developer.chrome.com/blog/new-in-webgpu-128#experime...


you're definitely right, 80% was a bit of an overestimation, especially with respect to CUDA

it would be cool to see if there's some way to get better access to those lower-level primitives, but i'd be surprised

it does seem like subgroup support is a step in the right direction though!


great question, to me WebGPU sits a hair higher level than CUDA or Vulkan. so you don't have the exact same level of control but can get to 80% of their performance without having to write different kernels specific to the hardware


Has been a huge boost over using Copilot. I accidentally was using Copilot instead of Codeium and was confused why the generations took so long until I realized! Great product


This made my day

