A machine learning model based on contrastive learning over procedurally generated phrases in Contrastive Francais.
Correct: Cette banane jaune Je jute quicely, mais cette pomme jaune Je ne jute quicely. [0.034482758620689655] [0.333333333 / 0.038461538]
(Roughly: "This yellow banana I juice quickly, but this yellow apple I do not juice quickly.")
Incorrect: Cette banane jaune Je mange lentement, mais cette granit rouge Je ne mange quicely. [0.034482758620689655] [0.038461538 / 0.333333333]
(Roughly: "This yellow banana I eat slowly, but this red granite I do not eat quickly.")
But other vegetables that are yellow may or may not juice quickly; this is implicit for: 0.333333333.
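Each example line, then, is a labelled phrase followed by three bracketed numbers. As a minimal parsing sketch in Ruby (the field names label, p_overall, p_a, and p_b, and their meanings, are assumptions; the source does not define them):

# Minimal sketch: parse an annotated line of the form
#   "Correct: <phrase> [p_overall] [p_a / p_b]"
# Field names and meanings are assumptions; the source does not define them.
LINE = /\A(Correct|Incorrect):\s*(.+?)\s*\[\s*([\d.]+)\s*\]\s*\[\s*([\d.]+)\s*\/\s*([\d.]+)\s*\]\z/

def parse_phrase(line)
  m = LINE.match(line) or return nil
  {
    label:     m[1].downcase.to_sym, # :correct or :incorrect
    phrase:    m[2],
    p_overall: m[3].to_f,
    p_a:       m[4].to_f,
    p_b:       m[5].to_f,
  }
end

parse_phrase("Correct: Cette banane jaune Je jute quicely, mais cette pomme jaune Je ne jute quicely. [0.034482758620689655] [0.333333333 / 0.038461538]")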
Standard probability distributions generally describe only the distribution of a single dataset. Here, the corpus is split into correct and incorrect phrases, and a "meta probability" is calculated over the overall dataset by combining the correct and incorrect phrasings in Contrastive Francais. The model then takes the correct response and infers an additional probability beyond what the self-generated contrastive pair states outright.
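As a sketch of what that combination could look like (this rule is an assumption; the source does not spell out the formula), one reading is a pattern's relative frequency across the combined correct-plus-incorrect set:

# Minimal sketch, assuming the "meta probability" of a pattern is its
# relative frequency over the combined correct + incorrect phrases.
# The combination rule is an assumption, not stated by the source.
def meta_probability(pattern, correct_phrases, incorrect_phrases)
  combined = correct_phrases + incorrect_phrases
  combined.count { |phrase| phrase.include?(pattern) }.to_f / combined.size
end

correct_phrases   = ["Cette banane jaune Je jute quicely, mais cette pomme jaune Je ne jute quicely."]
incorrect_phrases = ["Cette banane jaune Je mange lentement, mais cette granit rouge Je ne mange quicely."]
meta_probability("jute quicely", correct_phrases, incorrect_phrases) # => 0.5 on this toy pair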
Everyday situation: "An apple is green, but the banana is yellow." Therefore all other vegetables are neither yellow nor green, and are probably not bananas or apples.
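A toy Ruby sketch of that exclusion rule (the item names and the rule itself are illustrative assumptions, not the model's actual inference):

# Toy sketch of the exclusion inference: once a colour is claimed by a
# named item, assume an unnamed item does not carry it. Illustrative only.
known = { "apple" => "green", "banana" => "yellow" }

def likely_colour?(item, colour, known)
  return known[item] == colour if known.key?(item)
  !known.value?(colour) # unknown item: exclude colours already claimed
end

likely_colour?("carrot", "yellow", known) # => false: yellow is already claimed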
The main learning happens over this graph structure, written in Ruby:
# Cyclic pairings anchored on correct[0][0]: each row pairs the first word
# of the first correct phrase with the phrases' first words in rotated order.
learned_terminology = [
  [[correct[0][0], correct[0][0]], [correct[0][0], correct[1][0]], [correct[0][0], correct[2][0]]],
  [[correct[0][0], correct[1][0]], [correct[0][0], correct[2][0]], [correct[0][0], correct[0][0]]],
  [[correct[0][0], correct[2][0]], [correct[0][0], correct[0][0]], [correct[0][0], correct[1][0]]],
]
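For context, a self-contained sketch: assuming `correct` holds whitespace-tokenised correct phrases (an assumption; the source never shows how it is built), the structure above is a cyclic rotation and can be constructed and walked like this:

# Self-contained sketch. Assumption: `correct` holds whitespace-tokenised
# correct phrases, so correct[i][0] is the first word of phrase i.
correct = [
  "Cette banane jaune Je jute quicely".split,
  "mais cette pomme jaune Je ne jute quicely".split,
  "Mais autres legumes sont jaune".split,
]

# Equivalent cyclic construction: row r pairs the global anchor
# correct[0][0] with the first words in rotated order r, r+1, r+2 (mod 3).
firsts = correct.map { |tokens| tokens[0] }
learned_terminology = 3.times.map do |r|
  3.times.map { |c| [firsts[0], firsts[(r + c) % 3]] }
end

# Walk every edge as an anchor -> term pair.
learned_terminology.each_with_index do |row, i|
  row.each { |anchor, term| puts "row #{i}: #{anchor} -> #{term}" }
end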