juewang (Jue Wang) – Community Activity

New activity in codellama/CodeLlama-70b-Instruct-hf 8 months ago

Context length?

10

#2 opened 8 months ago by

turboderp

New activity in EleutherAI/neox-ckpt-pythia-12b-v1 11 months ago

Missing files?

#1 opened 11 months ago by

juewang

New activity in togethercomputer/LLaMA-2-7B-32K about 1 year ago

Correct the output dtype of rmsnorm_func

2

#13 opened about 1 year ago by

ag0

how to fine tune peft qlora and SFTTrainer?

12

#2 opened about 1 year ago by

NickyNicky

New activity in togethercomputer/RedPajama-INCITE-7B-Instruct about 1 year ago

Poor performance?

4

#6 opened about 1 year ago by

Fionn

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B over 1 year ago

Can you help me fine-tune this with LoRA? (Having an error)

1

#12 opened over 1 year ago by

AayushShah

What kind of machine would be suitable for this model (in amazon sagemaker)?

5

#7 opened over 1 year ago by

juusohugs

Will it be possible to run this on PC with 8 GeForce RTX 3060 with 8 Gb VRAM each?

2

#11 opened over 1 year ago by

ai2p

New activity in togethercomputer/GPT-JT-6B-v1 over 1 year ago

Any way to set the "stop, split by" when running the model locally?

4

#26 opened over 1 year ago by

johnnyracer

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B over 1 year ago

Issue with loading model to GPU when using pipeline

2

#5 opened over 1 year ago by

AlpYu-HubX

Is it a wrong prompt?

4

#8 opened over 1 year ago by

tatyanavidrevich

New activity in togethercomputer/GPT-JT-6B-v1 over 1 year ago

Feature requests and suggestions for V2

9

#4 opened almost 2 years ago by

zhangce

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B over 1 year ago

use accelerate to load model

1

#4 opened over 1 year ago by

adolf669

This model requires A LOT of resources... But how much? Trying to build a chatbot

9

#3 opened over 1 year ago by

joanfmendo

New activity in togethercomputer/GPT-JT-6B-v1 over 1 year ago

Generated Text have issues

10

#22 opened over 1 year ago by

asifahmed

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B over 1 year ago

Is UL2 used?

1

#2 opened over 1 year ago by

JunnanLi

New activity in togethercomputer/GPT-JT-6B-v1 over 1 year ago

Question-Answering over documents

3

#19 opened over 1 year ago by

tmishinev

Confused about bidirectional attention when implementing custom sampling loop

2

#25 opened over 1 year ago by

ericanthonymitchell

Model behavior during adaptation phase

2

#24 opened over 1 year ago by

jlli

Fine Tuning // Download Full Weights

2

#23 opened over 1 year ago by

idop11

PrefixLM finetuning details

1

#21 opened over 1 year ago by

jlli

New activity in togethercomputer/GPT-JT over 1 year ago

How to try it out? I provide WIP

3

#1 opened almost 2 years ago by

billy-ai

New activity in togethercomputer/GPT-JT-6B-v1 over 1 year ago

What is the fine tuning process of GPT-JT-6B-v1 Copied ? Any Docs available ?

5

#15 opened over 1 year ago by

MukeshSharma

Effect of UL2 training objective

1

#20 opened over 1 year ago by

malteos

Hardware requirements for inference?

6

#9 opened almost 2 years ago by

spartanml

How do you use the bidirectional aspect of the model?

11

#1 opened almost 2 years ago by

BigSalmon

Model license

1

#6 opened almost 2 years ago by

kristaller486

Complete noob question - cloned the repository, now what?

3

#17 opened over 1 year ago by

hansintheair

Will using FP32 be better than using FP16?

1

#18 opened over 1 year ago by

Zenwill

New activity in togethercomputer/GPT-JT almost 2 years ago

Generate parameters

2

#5 opened almost 2 years ago by

vonjack

New activity in togethercomputer/GPT-JT-6B-v1 almost 2 years ago

Model sans facts?

2

#10 opened almost 2 years ago by

spartanml

New activity in facebook/opt-66b about 2 years ago

OPT has `max_embedding_size` 2050

1

#3 opened about 2 years ago by

TimeRobber

