Popular repositories

  1. airllm (Public)

    🚀 Optimize memory for large language models, enabling 70B models on a 4GB GPU and 405B Llama3.1 on 8GB VRAM without compression techniques.

  2. yhinsson.github.io (Public)

    🚀 Optimize inference memory to run 70B language models on a 4GB GPU, and process 405B Llama3.1 with just 8GB VRAM.