The only experience I have is firsthand: what my company is doing for our client base. We do continuous pretraining, plus the rest of the alignment training stack, on about 10B private tokens plus private customer data to produce private custom models for companies in the 500 to 3,000 employee range. We built and operate a single-rack cluster that cost mid six figures in order to do this.
These models get combined with RAG for highly specific technical doc authoring and other uses.