kciwsnurb

@kciwsnurb@aussie.zone

Ask ChatGPT to pick a number between 1 and 100 (jlai.lu)

kciwsnurb, 2 months ago

The temperature scale, I think. You divide the logits by the temperature before feeding them to the softmax function. A larger (resp. smaller) temperature results in a higher (resp. lower) entropy distribution.
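
A minimal sketch of that scaling step, assuming NumPy is available. The function name `softmax_with_temperature` and the logit values are made up for illustration and are not taken from any particular model or library.

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Divide the logits by the temperature, then apply softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled = scaled - scaled.max()  # subtract the max for numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum()

# Hypothetical logits for three candidate outputs (illustrative values only).
logits = [2.0, 1.0, 0.1]

for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(f"T={temperature}: {np.round(probs, 3)}")
```

With these made-up logits, the smaller temperature should put more of the probability mass on the largest logit, and the larger temperature should flatten the distribution.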

kciwsnurb, 2 months ago

Each row in the figure is a probability distribution over possible outputs (x-axis labels). The more yellow, the more likely (see the colour map on the right). With a small temperature (e.g., last row), all the probability mass is on 42. This is a low entropy distribution because if you sample from it you’ll constantly get 42, so no randomness whatsoever (think of entropy as a measure of randomness/chaos). As temperature increases (rows closer to the first/topmost one), 42 is still the most likely output, but the probability mass gets dispersed to other possible outputs too (other outputs get a bit more yellow), resulting in higher entropy distributions. Sampling from such a distribution gives you more random outputs (42 would still be frequent, but you’d occasionally get 37 or others too). Hopefully this is clearer.
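
To make the entropy point concrete, here is a rough sketch, assuming NumPy. The four candidate outputs and their logits are invented for illustration (they are not the values behind the figure), and `softmax_with_temperature` and `entropy_bits` are hypothetical helper names.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Divide the logits by the temperature, then apply softmax."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled = scaled - scaled.max()  # subtract the max for numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum()

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    probs = np.asarray(probs, dtype=float)
    nonzero = probs[probs > 0]
    return float(-(nonzero * np.log2(nonzero)).sum())

# Hypothetical outputs and logits: 42 gets the largest logit by assumption.
outputs = np.array([37, 42, 57, 73])
logits = np.array([2.0, 5.0, 1.0, 1.5])

rng = np.random.default_rng(0)
for temperature in (2.0, 1.0, 0.1):
    probs = softmax_with_temperature(logits, temperature)
    samples = rng.choice(outputs, size=10, p=probs)
    print(f"T={temperature}: entropy={entropy_bits(probs):.3f} bits, "
          f"samples={samples.tolist()}")
```

Under these assumed logits, 42 stays the most likely output at every temperature, while the entropy shrinks towards zero as the temperature drops.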

Someone in another reply uses the word “creativity” to describe the effect of temperature scaling. The more commonly used term in the literature is “diversity”.

Hallucination is Inevitable: An Innate Limitation of Large Language Models (arxiv preprint) (arxiv.org)

Abstract:...

kciwsnurb, 4 months ago

two totally independent LLMs

How do you propose to get these independent LLMs? If both are trained using similar objectives, e.g., masked token prediction, then they won’t be independent.

Also, assuming independent LLMs could be obtained, how do you propose to compute this hallucination probability? Without knowing this probability, you can’t know how many verification LLMs are sufficient for your application, can you?
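
A back-of-the-envelope sketch of why that probability matters. It rests on strong assumptions that are not claimed to hold in practice: every model hallucinates independently with the same known probability, and an error only slips through if all of them hallucinate. The function name `models_needed` and the probabilities are hypothetical.

```python
def models_needed(p_hallucinate, target):
    """Smallest n such that p_hallucinate ** n <= target.

    Assumes (hypothetically) that each model hallucinates independently
    with probability p_hallucinate, so all n do so with probability p ** n.
    """
    if not (0.0 < p_hallucinate < 1.0 and 0.0 < target < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    n = 1
    while p_hallucinate ** n > target:
        n += 1
    return n

# Illustrative probabilities only; the real value is exactly what is unknown.
for p in (0.05, 0.2, 0.5):
    print(f"p={p}: need {models_needed(p, target=0.01)} independent models")
```

The answer swings a lot with the assumed probability, so without an estimate of it the required number of models cannot be pinned down.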

kciwsnurb, 4 months ago

Can you provide the source of a few of these completely different LLMs?

add even a small amount of change into an LLM […] radically alter the output

You mean perturbing the parameters of the LLM? That’s hardly surprising IMO. And I’m not sure it’s convincing enough to show independence, unless you have a source for this?

kciwsnurb, 4 months ago

You seem very certain about this approach, but you have given no sources so far. Can you back this up with actual research, or is this just based on your personal experience with chatgpt4?

kciwsnurb, 4 months ago

only in the podcasts I listen to

Yes, definitely. Many of my fellow NLP researchers would disagree with those researchers and philosophers (not sure why we should care about the latter’s opinions on LLMs).

it’s using tokens, which are more like concepts than words

You’re clearly not an expert so please stop spreading misinformation like this.

kciwsnurb, 4 months ago

come up with new and unexpected things that never existed before

I’m not sure this is possible if the tech is still primarily built by learning from data, which, by definition, already exists.
