Is this the same model as the Jina API?

#14
by mjrdbds - opened

Thank you for this great model! This is as much a question about this model as about the Jina AI API.

I'm trying to reconcile the results of this model with the API: I get different results for the same image.

I'm using the default model, initialized as shown here, alongside the Jina AI API:

curl https://api.jina.ai/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <MY_TOKEN>" \
  -d '{
    "input": [
     {"url": "https://fastly.picsum.photos/id/84/1280/848.jpg?hmac=YFRYDI4UsfbeTzI8ZakNOR98wVU7a-9a2tGF542539s"}],
    "model": "jina-clip-v1",
    "encoding_type": "float"
  }'

from transformers import AutoModel
model = AutoModel.from_pretrained('jinaai/jina-clip-v1', trust_remote_code=True)
model.encode_image(["https://fastly.picsum.photos/id/84/1280/848.jpg?hmac=YFRYDI4UsfbeTzI8ZakNOR98wVU7a-9a2tGF542539s"])

These two vectors seem completely different. Is there any guarantee about which model is behind the API?

Jina AI org

Hey @mjrdbds ,

This is the same model; let us check what may be going on.

Jina AI org

Hey @mjrdbds ,

The discrepancy comes from the API not returning a normalized vector.

If you normalize the vector from the API, you will get the same result as the local one.
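
Something like the following should reproduce the check (a minimal sketch, assuming the API response follows the usual {"data": [{"embedding": [...]}]} layout; the <MY_TOKEN> placeholder and the comparison tolerance are just for illustration):

import numpy as np
import requests
from transformers import AutoModel

url = "https://fastly.picsum.photos/id/84/1280/848.jpg?hmac=YFRYDI4UsfbeTzI8ZakNOR98wVU7a-9a2tGF542539s"

# Embedding from the API (not L2-normalized at the time of this discussion)
resp = requests.post(
    "https://api.jina.ai/v1/embeddings",
    headers={"Authorization": "Bearer <MY_TOKEN>"},
    json={"model": "jina-clip-v1", "encoding_type": "float", "input": [{"url": url}]},
)
api_vec = np.asarray(resp.json()["data"][0]["embedding"])

# Embedding from the local model (L2-normalized)
model = AutoModel.from_pretrained("jinaai/jina-clip-v1", trust_remote_code=True)
local_vec = np.asarray(model.encode_image([url])[0])

# After normalizing the API vector, the two should (nearly) match
api_vec = api_vec / np.linalg.norm(api_vec)
print(np.allclose(api_vec, local_vec, atol=1e-3))  # expected: True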

Ah, this fixed it. Thank you, Joan, for such a quick response.

For the record: I was hoping to use the API here (and compensate you for this great model), but I couldn't get the rate limits raised quickly enough or find an easy way to contact support.

I'll work around it, but I do wish I could compensate you and save myself a few hours of infra work ;)

mjrdbds changed discussion status to closed
Jina AI org

@mjrdbds Thank you for your support! We're now enabling normalization on the API, so you can go ahead and use it and save some effort on your end.

Out of curiosity, what kind of rate limit are you hitting on the API, and what limit would you be satisfied with?

Jina AI org

Hello @mjrdbds ,

Just out of curiosity, what exactly are you running into? Is it the rate limit, and if so, how many requests per minute are you sending? Or is it the latency or throughput?

Sure. I would have wanted to embed about 10k docs of 1k tokens each in 10 seconds, so roughly 1M tokens per second. I'm not very picky about latency; I care about the 10-second turnaround for quick iteration on our search stack.

For now I'm going through Modal to do this instead, but I'd be happy to switch back to your API.
