Google Gemma¶
License
Gemma Terms of Use
Last modified: February 21, 2024
By using, reproducing, modifying, distributing, performing or displaying any portion or element of Gemma, Model Derivatives including via any Hosted Service, (each as defined below) (collectively, the "Gemma Services") or otherwise accepting the terms of this Agreement, you agree to be bound by this Agreement.
Section 1: DEFINITIONS 1.1 Definitions (a) "Agreement" or "Gemma Terms of Use" means these terms and conditions that govern the use, reproduction, Distribution or modification of the Gemma Services and any terms and conditions incorporated by reference.
(b) "Distribution" or "Distribute" means any transmission, publication, or other sharing of Gemma or Model Derivatives to a third party, including by providing or making Gemma or its functionality available as a hosted service via API, web access, or any other electronic or remote means ("Hosted Service").
(c) "Gemma" means the set of machine learning language models, trained model weights and parameters identified at ai.google.dev/gemma, regardless of the source that you obtained it from.
(d) "Google" means Google LLC.
(e) "Model Derivatives" means all (i) modifications to Gemma, (ii) works based on Gemma, or (iii) any other machine learning model which is created by transfer of patterns of the weights, parameters, operations, or Output of Gemma, to that model in order to cause that model to perform similarly to Gemma, including distillation methods that use intermediate data representations or methods based on the generation of synthetic data Outputs by Gemma for training that model. For clarity, Outputs are not deemed Model Derivatives.
(f) "Output" means the information content output of Gemma or a Model Derivative that results from operating or otherwise using Gemma or the Model Derivative, including via a Hosted Service.
1.2 As used in this Agreement, "including" means "including without limitation".
Section 2: ELIGIBILITY AND USAGE 2.1 Eligibility You represent and warrant that you have the legal capacity to enter into this Agreement (including being of sufficient age of consent). If you are accessing or using any of the Gemma Services for or on behalf of a legal entity, (a) you are entering into this Agreement on behalf of yourself and that legal entity, (b) you represent and warrant that you have the authority to act on behalf of and bind that entity to this Agreement and (c) references to "you" or "your" in the remainder of this Agreement refers to both you (as an individual) and that entity.
2.2 Use You may use, reproduce, modify, Distribute, perform or display any of the Gemma Services only in accordance with the terms of this Agreement, and must not violate (or encourage or permit anyone else to violate) any term of this Agreement.
Section 3: DISTRIBUTION AND RESTRICTIONS 3.1 Distribution and Redistribution You may reproduce or Distribute copies of Gemma or Model Derivatives if you meet all of the following conditions:
You must include the use restrictions referenced in Section 3.2 as an enforceable provision in any agreement (e.g., license agreement, terms of use, etc.) governing the use and/or distribution of Gemma or Model Derivatives and you must provide notice to subsequent users you Distribute to that Gemma or Model Derivatives are subject to the use restrictions in Section 3.2. You must provide all third party recipients of Gemma or Model Derivatives a copy of this Agreement. You must cause any modified files to carry prominent notices stating that you modified the files. All Distributions (other than through a Hosted Service) must be accompanied by a "Notice" text file that contains the following notice: "Gemma is provided under and subject to the Gemma Terms of Use found at ai.google.dev/gemma/terms". You may add your own intellectual property statement to your modifications and, except as set forth in this Section, may provide additional or different terms and conditions for use, reproduction, or Distribution of your modifications, or for any such Model Derivatives as a whole, provided your use, reproduction, modification, Distribution, performance, and display of Gemma otherwise complies with the terms and conditions of this Agreement. Any additional or different terms and conditions you impose must not conflict with the terms of this Agreement.
3.2 Use Restrictions You must not use any of the Gemma Services:
for the restricted uses set forth in the Gemma Prohibited Use Policy at ai.google.dev/gemma/prohibited_use_policy ("Prohibited Use Policy"), which is hereby incorporated by reference into this Agreement; or in violation of applicable laws and regulations. To the maximum extent permitted by law, Google reserves the right to restrict (remotely or otherwise) usage of any of the Gemma Services that Google reasonably believes are in violation of this Agreement.
3.3 Generated Output Google claims no rights in Outputs you generate using Gemma. You and your users are solely responsible for Outputs and their subsequent uses.
Section 4: ADDITIONAL PROVISIONS 4.1 Updates Google may update Gemma from time to time, and you must make reasonable efforts to use the latest version of Gemma.
4.2 Trademarks Nothing in this Agreement grants you any rights to use Google's trademarks, trade names, logos or to otherwise suggest endorsement or misrepresent the relationship between you and Google. Google reserves any rights not expressly granted herein.
4.3 DISCLAIMER OF WARRANTY UNLESS REQUIRED BY APPLICABLE LAW, THE GEMMA SERVICES, AND OUTPUTS, ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING, REPRODUCING, MODIFYING, PERFORMING, DISPLAYING OR OR DISTRIBUTING ANY OF THE GEMMA SERVICES OR OUTPUTS AND ASSUME ANY AND ALL RISKS ASSOCIATED WITH YOUR USE OR DISTRIBUTION OF ANY OF THE GEMMA SERVICES OR OUTPUTS AND YOUR EXERCISE OF RIGHTS AND PERMISSIONS UNDER THIS AGREEMENT.
4.4 LIMITATION OF LIABILITY TO THE FULLEST EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL THEORY, WHETHER IN TORT (INCLUDING NEGLIGENCE), PRODUCT LIABILITY, CONTRACT, OR OTHERWISE, UNLESS REQUIRED BY APPLICABLE LAW, SHALL GOOGLE OR ITS AFFILIATES BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, EXEMPLARY, CONSEQUENTIAL, OR PUNITIVE DAMAGES, OR LOST PROFITS OF ANY KIND ARISING FROM THIS AGREEMENT OR RELATED TO, ANY OF THE GEMMA SERVICES OR OUTPUTS EVEN IF GOOGLE OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
4.5 Term, Termination, and Survival The term of this Agreement will commence upon your acceptance of this Agreement (including acceptance by your use, modification, or Distribution, reproduction, performance or display of any portion or element of the Gemma Services) and will continue in full force and effect until terminated in accordance with the terms of this Agreement. Google may terminate this Agreement if you are in breach of any term of this Agreement. Upon termination of this Agreement, you must delete and cease use and Distribution of all copies of Gemma and Model Derivatives in your possession or control. Sections 1, 2.1, 3.3, 4.2 to 4.9 shall survive the termination of this Agreement.
4.6 Governing Law and Jurisdiction This Agreement will be governed by the laws of the State of California without regard to choice of law principles. The UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The state and federal courts of Santa Clara County, California shall have exclusive jurisdiction of any dispute arising out of this Agreement.
4.7 Severability If any provision of this Agreement is held to be invalid, illegal or unenforceable, the remaining provisions shall be unaffected thereby and remain valid as if such provision had not been set forth herein.
4.8 Entire Agreement This Agreement states all the terms agreed between the parties and supersedes all other agreements between the parties as of the date of acceptance relating to its subject matter.
4.9 No Waiver Google will not be treated as having waived any rights by not exercising (or delaying the exercise of) any rights under this Agreement.
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit a72c7f4d0a15 · 5.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit a72c7f4d0a15 · 5.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit b50d6c999e59 · 1.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit a72c7f4d0a15 · 5.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit af57093b878e · 5.2GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit a72c7f4d0a15 · 5.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit a72c7f4d0a15 · 5.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit af57093b878e · 5.2GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit a72c7f4d0a15 · 5.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit 430ed3535049 · 5.2GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit e1e4a2e0c8ee · 5.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit de923fde2f26 · 6.2GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit 2e51588baf45 · 6.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 8-bit 61d0f0df3637 · 9.1GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 2-bit 831e95226882 · 3.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 3-bit e4aea70c287e · 4.2GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 3-bit 92dd270cb673 · 4.6GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 3-bit 1fa1f8e1a003 · 4.9GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit bb7b8325814d · 5.2GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit d9c26a968eb4 · 5.5GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit 42630cbe71a2 · 6.2GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit 556abeb39bfd · 6.3GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 6-bit 5c7aded0b8bd · 7.2GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization unknown f689ad351c8d · 17GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit af57093b878e · 5.2GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit a7adb7322d8b · 5.7GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit fd6dc849a9f8 · 6.2GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit 938308342124 · 6.7GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 8-bit 1b7b1e5e2f98 · 9.1GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 2-bit e43e18a484ed · 3.7GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 3-bit f40e21d459ff · 4.2GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 3-bit ffd8ea66cf1f · 4.6GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 3-bit 1676d1f86165 · 4.9GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit 51854f1241b3 · 5.2GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit 5ec372e05241 · 5.5GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit df449b30e2e1 · 6.2GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit dc8f812e49c7 · 6.3GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 6-bit 9e7cfd3fab5b · 7.2GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization F16 1f1e9df10872 · 16GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit a72c7f4d0a15 · 5.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit a65ceace358f · 5.5GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit 800d99beee5d · 6.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit cdf1dca21036 · 6.5GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 8-bit f6dee485fe1a · 9.1GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 2-bit bbdb1a3dd39f · 3.5GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 3-bit d858885a94e0 · 4.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 3-bit e71bd8c16242 · 4.4GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 3-bit 567c802627ea · 4.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit 2c55699e04d0 · 5.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 4-bit 81c8c66b7df1 · 5.3GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit 31eb523ac3df · 6.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 5-bit adf00ced5519 · 6.1GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization 6-bit ad2f1a0dacb5 · 7.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 9B quantization F16 056b9586873a · 17GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 030ee63283b5 · 1.6GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 4003359bdf67 · 1.7GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 030ee63283b5 · 1.6GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit b50d6c999e59 · 1.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 79f47baf629d · 1.8GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit f2e0978e20a7 · 1.9GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit 2005cd5b9f9a · 2.1GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 8-bit 233b6c11a9a9 · 2.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 2-bit 462aa639add9 · 1.3GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 3-bit 1147ae8910c1 · 1.4GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 3-bit 0bb5a6a99256 · 1.5GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 3-bit a5e1a96f2947 · 1.6GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 1f62e695daaf · 1.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 073d0876d678 · 1.8GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit 466d55f10b72 · 1.9GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit 79fe7a4a6e26 · 2.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 6-bit 2934d89f4873 · 2.2GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization F16 1cca4a1914b8 · 4.5GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
84B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 4003359bdf67 · 1.7GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 2336b3481fc2 · 1.8GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit b82a85f35c76 · 1.9GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit af9d160f687e · 2.1GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 8-bit 40e5082de8d9 · 2.7GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 2-bit fa08f447fb74 · 1.3GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 3-bit 30edc7975a67 · 1.4GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 3-bit 3755aa666a62 · 1.5GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 3-bit fb3cb86ed3a0 · 1.6GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit e1f170c66fed · 1.7GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 14b9181dd33c · 1.8GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit dbbce6ea20a0 · 1.9GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit 75095004592a · 2.0GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 6-bit 5bd4a449bdbe · 2.2GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization F16 bced1b198253 · 4.5GB params {"repeat_penalty":1}
21B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 030ee63283b5 · 1.6GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 7eceb81ab6c2 · 1.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit 08119f9e759a · 1.8GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit c30cb73e47bb · 1.9GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 8-bit 040645156534 · 2.7GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 2-bit 8be00e58cf4f · 1.2GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 3-bit fa7ef5ec4918 · 1.3GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 3-bit 5bb1882722eb · 1.4GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 3-bit 198260e1ebc4 · 1.5GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 0c8c655954ea · 1.6GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 4-bit 1a26b1fdc8cf · 1.6GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit 86981b46f254 · 1.8GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 5-bit 8b4445dd0168 · 1.8GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization 6-bit 25735feffbe0 · 2.1GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
-
拉取模型
-
模型信息 (model)
Manifest Info Size model arch gemma parameters 3B quantization F16 fa8ddc50dcc2 · 5.0GB template <start_of_turn>user {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn> <start_of_turn>model {{ .Response }}<end_of_turn>
136B params {"penalize_newline":false,"repeat_penalty":1,"stop":["<start_of_turn>","<end_of_turn>"]}
109B
模型详情¶
模型页面:Gemma
此模型卡片对应 Gemma 模型的 7B 基础版本。您还可以访问 2B 基础模型、7B 指令模型 和 2B 指令模型 的模型卡片。
资源和技术文档:
使用条款:条款
作者:Google
模型信息¶
概要描述和简要定义输入与输出。
描述¶
Gemma 是 Google 推出的一系列轻量级、最新技术的开放模型,这些模型基于创建 Gemini 模型的同一研究和技术。它们是文本到文本的、仅解码器的大型语言模型,提供英语版本,具有开放的权重、预训练的变体和指令调优的变体。Gemma 模型非常适合执行各种文本生成任务,包括问答、摘要和推理。它们相对较小的体积使其能够部署在资源有限的环境中,如笔记本电脑、台式电脑或您自己的云基础设施,民主化地访问最先进的 AI 模型,并帮助促进每个人的创新。
上下文长度¶
模型在 8192 个令牌的上下文长度上进行训练。
使用¶
下面我们分享一些代码片段,帮助您快速开始运行模型。首先确保执行 pip install -U transformers
,然后从适合您使用场景的部分复制代码片段。
微调示例¶
您可以在 examples/
目录 下找到微调笔记本。我们提供:
- 在 UltraChat 数据集上使用 QLoRA 执行监督式微调(SFT)的脚本
- 在 TPU 设备上使用 FSDP 进行 SFT 的脚本
- 一个笔记本,您可以在免费的 Google Colab 实例上运行,以对英语引用数据集进行 SFT。您也可以在此处找到该笔记本的副本。
在 CPU 上运行模型¶
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b")
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
在单个/多个 GPU 上运行模型¶
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", device_map="auto")
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
使用不同精度在 GPU 上运行模型¶
- Using
torch.float16
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", device_map="auto", revision="float16")
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
- Using
torch.bfloat16
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", device_map="auto", torch_dtype=torch.bfloat16)
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
通过 bitsandbytes
的量化版本¶
- Using 8-bit precision (int8)
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", quantization_config=quantization_config)
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
- Using 4-bit precision
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", quantization_config=quantization_config)
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
其他优化¶
- Flash Attention 2
首先确保在您的环境中安装 flash-attn
,执行 pip install flash-attn
。
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.float16,
+ attn_implementation="flash_attention_2"
).to(0)
输入和输出¶
- 输入: 文本字符串,例如问题、提示或待总结的文档。
- 输出: 以英文生成的文本,作为对输入的响应,例如对问题的回答或文档的摘要。
模型数据¶
用于模型训练的数据及其处理方式。
训练数据集¶
这些模型是在包括多种来源的文本数据集上训练的,总共有 6 万亿个令牌。这里是主要组成部分:
- 网络文档:多样化的网页文本确保模型接触广泛的语言风格、主题和词汇。主要是英语内容。
- 代码:使模型接触代码有助于其学习编程语言的语法和模式,这改善了其生成代码或理解代码相关问题的能力。
- 数学:在数学文本上进行训练有助于模型学习逻辑推理、符号表示,并应对数学查询。
这些多样化的数据源的结合对于训练能够处理各种不同任务和文本格式的强大语言模型至关重要。
数据预处理¶
这里是应用于训练数据的主要数据清洗和过滤方法:
- CSAM 过滤:在数据准备过程的多个阶段应用了严格的 CSAM(儿童性虐待材料)过滤,以确保排除有害和非法内容。
- 敏感数据过滤:作为使 Gemma 预训练模型安全可靠的一部分,使用自动化技术从训练集中过滤出某些个人信息和其他敏感数据。
- 其他方法:根据内容质量和安全性进行过滤,符合我们的政策。
实施信息¶
有关模型内部的详细信息。
硬件¶
Gemma 使用最新一代的 张量处理单元 (TPU) 硬件 (TPUv5e) 进行训练。
训练大型语言模型需要大量的计算能力。TPU,专为机器学习中常见的矩阵操作设计,提供了该领域的几个优势:
- 性能:TPU 专门设计用来处理训练 LLMs 中涉及的大规模计算。它们可以显著加快训练速度,与 CPU 相比。
- 内存:TPU 通常配备大量高带宽内存,允许处理大型模型和批量大小的训练。这可以提升模型质量。
- 可扩展性:TPU Pods(大型 TPU 集群)为处理大型基础模型不断增长的复杂性提供了可扩展的解决方案。您可以将训练分布到多个 TPU 设备上,以实现更快、更高效的处理。
- 成本效益:在许多情况下,与基于 CPU 的基础设施相比,TPU 可以为训练大型模型提供更具成本效益的解决方案,特别是考虑到由于训练速度更快而节省的时间和资源。
- 这些优势与 Google 承诺的可持续运营 相一致。
软件¶
训练是使用 JAX 和 ML Pathways 进行的。
JAX 允许研究人员利用最新一代的硬件,包括 TPU,以更快、更高效地训练大型模型。
ML Pathways 是 Google 的最新努力,旨在构建能够跨多个任务泛化的人工智能系统。这对于 基础模型 特别适用,包括像这些的大型语言模型。
JAX 和 ML Pathways 是按照 关于 Gemini 模型家族的论文 中描述的方式使用的;"Jax 和 Pathways 的 ‘单一控制器’ 编程模型允许单个 Python 进程编排整个训练运行,极大地简化了开发工作流程。"
评估¶
模型评估指标和结果。
基准测试结果¶
这些模型针对大量不同的数据集和指标进行了评估,以涵盖文本生成的不同方面:
基准测试 | 指标 | 2B 参数 | 7B 参数 |
---|---|---|---|
MMLU | 5-shot, top-1 | 42.3 | 64.3 |
HellaSwag | 0-shot | 71.4 | 81.2 |
PIQA | 0-shot | 77.3 | 81.2 |
SocialIQA | 0-shot | 49.7 | 51.8 |
BooIQ | 0-shot | 69.4 | 83.2 |
WinoGrande | 部分分数 | 65.4 | 72.3 |
CommonsenseQA | 7-shot | 65.3 | 71.3 |
OpenBookQA | 47.8 | 52.8 | |
ARC-e | 73.2 | 81.5 | |
ARC-c | 42.1 | 53.2 | |
TriviaQA | 5-shot | 53.2 | 63.4 |
Natural Questions | 5-shot | 12.5 | 23 |
HumanEval | pass@1 | 22.0 | 32.3 |
MBPP | 3-shot | 29.2 | 44.4 |
GSM8K | maj@1 | 17.7 | 46.4 |
MATH | 4-shot | 11.8 | 24.3 |
AGIEval | 24.2 | 41.7 | |
BIG-Bench | 35.2 | 55.1 | |
平均值 | 45.0 | 56.9 |
伦理和安全性¶
伦理和安全性评估方法和结果。
评估方法¶
我们的评估方法包括对相关内容政策的结构化评估和内部红队测试。红队测试由多个不同团队进行,每个团队都有不同的目标和人工评估指标。这些模型针对与伦理和安全相关的多个不同类别进行了评估,包括:
- 文本到文本内容安全性:对涵盖安全政策的提示进行人工评估,包括儿童性虐待和剥削、骚扰、暴力和血腥、仇恨言论等。
- 文本到文本表征伤害:与相关的学术数据集进行基准测试,例如 WinoBias 和 BBQ 数据集。
- 记忆:自动评估对训练数据的记忆程度,包括个人身份信息曝光的风险。
- 大规模危害:对“危险能力”进行测试,例如化学、生物、放射性和核(CBRN)风险。
评估结果¶
伦理和安全性评估结果在符合 内部政策 的可接受阈值范围内,涵盖了儿童安全、内容安全、表征伤害、记忆、大规模危害等类别。除了稳健的内部评估外,还展示了诸如 BBQ、BOLD、Winogender、Winobias、RealToxicity 和 TruthfulQA 等知名安全基准的结果。
基准测试 | 指标 | 2B 参数 | 7B 参数 |
---|---|---|---|
RealToxicity | 平均值 | 6.86 | 7.90 |
BOLD | 45.57 | 49.08 | |
CrowS-Pairs | top-1 | 45.82 | 51.33 |
BBQ Ambig | 1-shot, top-1 | 62.58 | 92.54 |
BBQ Disambig | top-1 | 54.62 | 71.99 |
Winogender | top-1 | 51.25 | 54.17 |
TruthfulQA | 44.84 | 31.81 | |
Winobias 1_2 | 56.12 | 59.09 | |
Winobias 2_2 | 91.10 | 92.23 | |
Toxigen | 29.77 | 39.59 |
用途和限制¶
这些模型有一定的限制,用户应该注意。
预期用途¶
开放式大型语言模型(LLMs)在各行各业和领域都有广泛的应用。以下潜在用途列表并不全面。此列表的目的是提供关于可能的用例的上下文信息,这些用例是模型创建者在模型培训和开发过程中考虑的一部分。
- 内容创作与沟通
- 文本生成:这些模型可用于生成诸如诗歌、剧本、代码、营销文案和电子邮件草稿等创意文本格式。
- 聊天机器人和对话 AI:为客户服务、虚拟助手或交互式应用提供动力。
- 文本摘要:生成文本语料、研究论文或报告的简洁摘要。
- 研究与教育
- 自然语言处理(NLP)研究:这些模型可以作为研究人员进行 NLP 技术实验、开发算法并推动该领域发展的基础。
- 语言学习工具:支持交互式语言学习体验,帮助语法纠正或提供写作练习。
- 知识探索:通过生成摘要或回答特定主题的问题,协助研究人员探索大量文本。
限制¶
- 训练数据
- 训练数据的质量和多样性显著影响模型的能力。训练数据中的偏见或空白可能导致模型响应的局限性。
- 训练数据集的范围决定了模型能够有效处理的主题领域。
- 上下文和任务复杂性
- LLM 在可以用清晰提示和说明框架的任务上表现更好。开放式或高度复杂的任务可能具有挑战性。
- 上下文的提供量可以影响模型的性能(更长的上下文通常会产生更好的输出,但达到一定程度后会变得更差)。
- 语言歧义性和微妙之处
- 自然语言本质上是复杂的。LLM 可能难以理解微妙之处、讽刺或比喻语言。
- 事实准确性
- LLM 根据它们从训练数据集中学到的信息生成响应,但它们不是知识库。它们可能生成不正确或过时的事实陈述。
- 常识
- LLM 依赖语言中的统计模式。它们可能缺乏在某些情况下应用常识推理的能力。
道德考量和风险¶
大型语言模型(LLMs)的开发引发了几个道德关切。在创建一个开放模型时,我们仔细考虑了以下问题:
- 偏见和公平性
- 在大规模、现实世界的文本数据上训练的LLMs可能反映了嵌入在训练材料中的社会文化偏见。这些模型经过了仔细的审查,输入数据预处理描述和后续评估在此卡片中报告。
- 信息错误和滥用
- LLMs 可能被滥用来生成虚假、误导性或有害的文本。
- 提供了使用模型的负责任指南,请参阅 负责任生成AI工具包。
- 透明度和问责制:
- 本模型卡片总结了模型的架构、能力、限制和评估过程的详细信息。
- 负责任地开发一个开放模型为开发人员和研究人员提供了共享创新的机会,使LLM技术能够在AI生态系统中获得广泛应用。
已识别的风险和缓解措施:
- 偏见的持续存在:鼓励进行持续监控(使用评估指标、人工审查)和在模型训练、微调和其他用例中探索去偏置技术。
- 生成有害内容:内容安全机制和指南是必不可少的。开发人员被鼓励谨慎行事,并根据其特定产品政策和应用用例实施适当的内容安全保障措施。
- 用于恶意目的的滥用:技术限制和开发人员和最终用户教育可以帮助减少对LLMs的恶意应用。提供了教育资源和用于用户标记滥用的报告机制。Gemam模型的禁止使用情况在Gemma禁止使用政策中概述。
- 隐私侵犯:模型是在过滤掉PII(个人可识别信息)的数据上进行训练的。鼓励开发人员遵守隐私法规,采用隐私保护技术。
好处¶
发布时,这一系列模型提供了高性能的开放式大型语言模型实现,从头开始设计,以用于负责任的AI开发,与大小相似的其他开放式模型替代品相比。
使用本文档中描述的基准评估指标,这些模型已经显示出比其他规模相当的开放模型替代方案提供更优越的性能。