This project generates a seeded synthetic language from the 1000 most common English words and creates parallel English/synthetic sentence pairs as training samples.
data/english_1000.txt: 1000-word English base vocabulary.
language_experiment.py: generator + bidirectional translator CLI.
output/lexicon.csv: English to synthetic-language lexicon.
output/sentence_pairs.csv: 100 parallel sentence pairs.
output/manifest.json: seed + generation metadata.
python language_experiment.py generate --seed ai-language-seed-42 --sentence-count 100 --output-dir output

This produces:
output/lexicon.csv
output/sentence_pairs.csv
output/manifest.json
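
The internals of language_experiment.py are not shown in this README. As a rough illustration of what "seeded" generation can mean, the sketch below derives a random number generator from the seed string and assigns each English word a unique pseudo-word, so the same seed always yields the same lexicon. The function name build_lexicon and the syllable tables are hypothetical and chosen only for this example; the real generator may work differently.

```python
import hashlib
import random

# Hypothetical syllable inventory, invented for this sketch.
ONSETS = ["b", "g", "gr", "h", "qu", "t", "z"]
VOWELS = ["a", "o", "ie", "oo", "ua"]
CODAS = ["", "b", "l", "rt", "sk"]


def build_lexicon(words, seed):
    """Map each English word to a unique pseudo-word, repeatably for a given seed."""
    # Derive an integer seed from the seed string so the same string
    # always produces the same stream of random choices.
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))

    lexicon = {}
    used = set()
    for word in words:
        # Redraw until the pseudo-word is unused, so the mapping stays invertible.
        while True:
            syllable_count = rng.randint(1, 3)
            candidate = "".join(
                rng.choice(ONSETS) + rng.choice(VOWELS) + rng.choice(CODAS)
                for _ in range(syllable_count)
            )
            if candidate not in used:
                break
        used.add(candidate)
        lexicon[word] = candidate
    return lexicon


sample = build_lexicon(["we", "will", "read", "the", "story"], "ai-language-seed-42")
```

Keeping the pseudo-words unique matters for the reverse direction: if two English words shared one synthetic word, translating back to English would be ambiguous.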
English to synthetic:
python language_experiment.py translate --seed ai-language-seed-42 --direction en2new --text "We will read the story tonight."

Synthetic to English:
python language_experiment.py translate --seed ai-language-seed-42 --direction new2en --text "Griert broismiob tozgriask quacjoul hotqoo goovual."

You can also translate files:
python language_experiment.py translate --seed ai-language-seed-42 --direction en2new --input-file input.txt --output-file output.txt
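
The example above suggests that translation works word by word, keeping capitalization and punctuation. The following is a minimal sketch of that idea, assuming a one-to-one lexicon held in a dict. The translate function and the toy lexicon entries are invented for illustration and are not taken from output/lexicon.csv or from the actual CLI code.

```python
import re


def translate(text, lexicon):
    """Replace each word found in `lexicon`, keeping punctuation and leading capitals."""

    def swap(match):
        word = match.group(0)
        target = lexicon.get(word.lower())
        if target is None:
            # Assumption: words missing from the lexicon pass through unchanged.
            return word
        # Carry a leading capital over to the translated word.
        return target.capitalize() if word[0].isupper() else target

    # Only runs of letters and apostrophes are looked up; everything else is kept.
    return re.sub(r"[A-Za-z']+", swap, text)


# Toy lexicon invented for this sketch (not real project output).
en2new = {"we": "zan", "read": "moku", "the": "ti", "story": "balo"}
# The reverse direction is the inverted mapping, which requires unique values.
new2en = {new: en for en, new in en2new.items()}

encoded = translate("We read the story.", en2new)
decoded = translate(encoded, new2en)
print(encoded)  # Zan moku ti balo.
print(decoded)  # We read the story.
```

With a design like this, both directions share one function and differ only in which mapping is passed in, which mirrors the single translate command with its --direction flag.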