"Managers and engineers from Meta’s generative AI group and infrastructure team have started four war rooms to learn how DeepSeek works. Two of the mobilized
groups are trying to understand how High-Flyer lowered the cost of training and running DeepSeek. Meta wants to apply those techniques, a number of which a
technical paper from High-Flyer outlined, to Llama, one of the employees said. ...
A third Meta research group is trying to figure out what data High-Flyer might have used to train its models, according to one of the employees with direct
knowledge.
The fourth war room is considering new techniques for restructuring Meta’s models based on attributes of the DeepSeek models, they said. Meta is considering
launching a version of Llama that, like DeepSeek, would include numerous AI models, each trained to handle different tasks. That way, when a customer asks Llama
to handle a certain task, only some parts of the model would need to work on it. That could make the overall model faster and require less computing power to
operate."
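
The quoted passage does not name the technique, but a model made of many sub-models in which only some parts run per request resembles what is usually called mixture-of-experts routing. The following is a minimal sketch of that idea, not Meta's or DeepSeek's actual design. Every name here (`ToyMoELayer`, `route`, `forward`), the random weights, and the choice of top-k softmax gating are illustrative assumptions.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMoELayer:
    """Hypothetical toy layer: a router picks top_k of num_experts per input."""

    def __init__(self, num_experts, dim, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one scoring vector per expert (random here, learned in practice).
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each expert is a dim x dim linear map standing in for a sub-network.
        self.experts = [[[rng.gauss(0, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, x):
        """Return the indices of the top_k experts and their mixing weights."""
        scores = [dot(w, x) for w in self.router]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = ranked[:self.top_k]
        weights = softmax([scores[i] for i in chosen])
        return chosen, weights

    def forward(self, x):
        """Run only the chosen experts and blend their outputs."""
        chosen, weights = self.route(x)
        out = [0.0] * len(x)
        for idx, weight in zip(chosen, weights):
            expert_out = [dot(row, x) for row in self.experts[idx]]
            out = [o + weight * y for o, y in zip(out, expert_out)]
        return out, chosen


layer = ToyMoELayer(num_experts=8, dim=4, top_k=2)
output, used = layer.forward([1.0, 0.5, -0.3, 2.0])
print(f"{len(used)} of {len(layer.experts)} experts ran")
```

In this sketch the compute per input scales with `top_k` rather than with the total number of experts, which is the saving the passage alludes to: the model can hold many specialized parts while only a few do work on any given request.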