DottedAnt-Dooz/LanguageTest


AI Language Learning Experiment

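This project generates a synthetic language from a vocabulary of the 1000 most common English words and produces parallel English/synthetic sentence pairs for use as training samples. Generation is driven by a seed string, so the same seed should reproduce the same language.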
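To illustrate what "seeded" means here, the following is a minimal sketch of deterministic lexicon generation. It is not the code in language_experiment.py: the syllable inventories and the helper names (rng_from_seed, make_word, build_lexicon) are assumptions made for illustration only.

```python
import hashlib
import random

# Hypothetical syllable inventories; the real generator may build words differently.
ONSETS = ["b", "br", "g", "gr", "h", "qu", "t", "v", "z", "m"]
VOWELS = ["a", "e", "i", "o", "oo", "ie", "ou", "ua"]
CODAS = ["", "b", "l", "m", "rt", "sk", "t", "z"]


def rng_from_seed(seed: str) -> random.Random:
    """Derive a reproducible random generator from a seed string."""
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def make_word(rng: random.Random) -> str:
    """Build a pronounceable word from one to three random syllables."""
    syllables = rng.randint(1, 3)
    return "".join(
        rng.choice(ONSETS) + rng.choice(VOWELS) + rng.choice(CODAS)
        for _ in range(syllables)
    )


def build_lexicon(words: list[str], seed: str) -> dict[str, str]:
    """Map each English word to a unique synthetic word, in input order."""
    rng = rng_from_seed(seed)
    lexicon: dict[str, str] = {}
    used: set[str] = set()
    for word in words:
        candidate = make_word(rng)
        while candidate in used:  # retry on collision so the mapping stays invertible
            candidate = make_word(rng)
        used.add(candidate)
        lexicon[word] = candidate
    return lexicon
```

Because every random choice comes from a generator derived from the seed, calling build_lexicon twice with the same word list and seed returns the same mapping. Keeping the synthetic words unique is what makes translation in both directions possible.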

Included Files

  • data/english_1000.txt: 1000-word English base vocabulary.
  • language_experiment.py: generator + bidirectional translator CLI.
  • output/lexicon.csv: English to synthetic-language lexicon.
  • output/sentence_pairs.csv: 100 parallel sentence pairs.
  • output/manifest.json: seed + generation metadata.

Generate Language + 100 Sentence Pairs

python language_experiment.py generate --seed ai-language-seed-42 --sentence-count 100 --output-dir output

This produces:

  • output/lexicon.csv
  • output/sentence_pairs.csv
  • output/manifest.json
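The generated files can be inspected with the standard library. The sketch below deliberately makes no assumption about the CSV column names or the manifest keys, since this README does not document them; preview_csv and load_manifest are hypothetical helper names.

```python
import csv
import json


def preview_csv(path: str, limit: int = 3) -> list[list[str]]:
    """Return the first `limit` rows of a CSV file, header row included."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        # zip stops once range(limit) is exhausted, so no extra rows are read.
        return [row for _, row in zip(range(limit), reader)]


def load_manifest(path: str) -> dict:
    """Load the generation metadata as a plain dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, preview_csv("output/lexicon.csv") shows the header and the first two entries, and load_manifest("output/manifest.json") returns whatever metadata the generator recorded.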

Translate Text

English to synthetic:

python language_experiment.py translate --seed ai-language-seed-42 --direction en2new --text "We will read the story tonight."

Synthetic to English:

python language_experiment.py translate --seed ai-language-seed-42 --direction new2en --text "Griert broismiob tozgriask quacjoul hotqoo goovual."

You can also translate files:

python language_experiment.py translate --seed ai-language-seed-42 --direction en2new --input-file input.txt --output-file output.txt
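As a rough picture of what bidirectional translation involves, here is a minimal sketch of word-by-word lookup that keeps punctuation and leading capitalization, as in the example sentences above. It is an assumption about the general approach, not the project's implementation; the toy lexicon and the translate function are hypothetical.

```python
import re

# Split text into alternating runs of letters and non-letters.
TOKEN = re.compile(r"[A-Za-z]+|[^A-Za-z]+")


def translate(text: str, mapping: dict[str, str]) -> str:
    """Replace each known word using `mapping`; pass everything else through."""
    out = []
    for token in TOKEN.findall(text):
        target = mapping.get(token.lower()) if token.isalpha() else None
        if target is None:
            out.append(token)  # punctuation, spaces, or an unknown word
        elif token[0].isupper():
            out.append(target[:1].upper() + target[1:])  # keep leading capital
        else:
            out.append(target)
    return "".join(out)


# Toy lexicon with made-up entries, for illustration only.
en2new = {"we": "zam", "read": "lopi", "the": "ku", "story": "vetra"}
new2en = {new: en for en, new in en2new.items()}  # reverse direction

print(translate("We read the story.", en2new))   # Zam lopi ku vetra.
print(translate("Zam lopi ku vetra.", new2en))   # We read the story.
```

The reverse mapping is just the inverted dictionary, which only works when every synthetic word is unique. Words missing from the lexicon are left unchanged in this sketch; the actual CLI may handle them differently.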
