Apple has released OpenELM—Open-Source Efficient Language Models—on Hugging Face, a platform for sharing open-source AI models and code.
Apple has been quietly working on its own AI models in an effort to catch up to rivals Microsoft and Google. A major difference between the iPhone maker and other companies is Apple’s emphasis on privacy, which means running AI models locally, rather than in the cloud.
The company has given the clearest window yet into its plans, releasing OpenELM for others to use. The announcement was made by Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, Maxwell Horton, Yanzi Jin, Chenfan Sun, Iman Mirzadeh, Mahyar Najibi, Dmitry Belenko, Peter Zatloukal, and Mohammad Rastegari.
We introduce OpenELM, a family of Open-source Efficient Language Models. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy. We pretrained OpenELM models using the CoreNet library. We release both pretrained and instruction tuned models with 270M, 450M, 1.1B and 3B parameters.
Our pre-training dataset contains RefinedWeb, deduplicated PILE, a subset of RedPajama, and a subset of Dolma v1.6, totaling approximately 1.8 trillion tokens. Please check license agreements and terms of these datasets before using them.
The Apple researchers say OpenELM was trained on publicly available data sources, another departure from the status quo of undisclosed training corpora.
The release of OpenELM models aims to empower and enrich the open research community by providing access to state-of-the-art language models. Trained on publicly available datasets, these models are made available without any safety guarantees. Consequently, there exists the possibility of these models producing outputs that are inaccurate, harmful, biased, or objectionable in response to user prompts. Thus, it is imperative for users and developers to undertake thorough safety testing and implement appropriate filtering mechanisms tailored to their specific requirements.
Users and researchers interested in using OpenELM can learn more here.