Applied AI Scientist and AI Architect working across enterprise speech AI, multilingual AI, LLM systems, agentic AI, knowledge graphs, retrieval-based AI, evaluation systems, and AI architecture.
- 👋 Hi, I’m Tatjana Chernenko
- 📫 Contact: tatjana.chernenko.work@gmail.com
- Applied AI Scientist and AI Architect with work spanning enterprise speech AI, multilingual AI, applied NLP, LLM systems, agentic AI, RAG, knowledge graphs, evaluation and benchmarking systems, terminology intelligence, and AI-ready data architecture.
- My public GitHub contains a selective set of personal, academic, and research-oriented technical artifacts. Most of my recent enterprise work is not public because of confidentiality and employer constraints.
- Speech AI, multilingual AI, and language technologies
- Applied NLP, terminology intelligence, and specialised-vocabulary handling
- LLM systems, GenAI
- Retrieval-augmented generation (RAG)
- Agentic AI workflows
- Knowledge graphs, knowledge-enhanced AI, and workflow automation
- Evaluation, benchmarking, and reliability-oriented AI quality systems
- Enterprise AI architecture, data foundations, and governance-aware AI execution
- LREC 2026: A Dataset for Evaluating ASR on Specialized Vocabulary
- US Patent: Semantic Domain Assignment Referencing Governance Domains and Term Databases
  T. Chernenko, B. Schork, M. DANEI. US Patent 12,518,105 (2026)
- US Patent Application: Adaptive Fidelity Pipeline for Minimizing Hallucinations and Skipped Content in Speech-to-Text Systems
  US Patent App. 250089US01 (2025)
- US Patent: System and Method Performing Terminology Disambiguation
  T. Chernenko, B. Schork, M. DANEI. US Patent 12,386,820 (2025)
- US Patent: Detection of Abbreviation and Mapping to Full Original Term
  T. Chernenko, A. Snitko, J. Scharnbacher, M. Vasiltschenko. US Patent 12,067,370 (2024)
- CHERTOY: Word Sense Induction for Web Search Result Clustering
  Academic NLP research project at the Institute for Computational Linguistics, Heidelberg University, based on the SemEval-2013 WSI task. Built an unsupervised word sense induction pipeline that clusters ambiguous web-search snippets into semantically coherent subtopic groups using sense2vec word representations, vector-mixture bag-of-words snippet embeddings, and MeanShift clustering. Evaluated 40 controlled experimental variants across preprocessing, embedding models, compositional representations, and clustering algorithms, improving pairwise clustering quality over the baseline.
  GitHub: CHERTOY System
- Natural Language Generation from Structured Inputs for Image Description Generation
  Academic research project at the Institute for Computational Linguistics, Heidelberg University, on structured-to-text generation for image description. Built an encoder-decoder architecture with a feed-forward encoder over normalized attribute vectors and an LSTM decoder for sequence generation. The system uses MS COCO, V-COCO, and COCO-a to model objects, actions, semantic roles, spatial relations, and descriptive attributes, and was assessed with both automatic and human evaluation.
  GitHub: Data-to-text Generation
- LexRank-based Text Summarization with Semantic Similarity Enhancements
  Research project on extractive summarization extending LexRank with semantic-similarity features to improve sentence ranking and summary quality in longer documents.
  GitHub: Text Summarization with LexRank
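
For readers unfamiliar with LexRank, the core ranking idea behind the last project can be sketched roughly as follows. This is a minimal illustration and not the code in the repository: the function names (`cosine_bow`, `lexrank_scores`), the threshold, and the damping value are all assumptions chosen for the example, and the bag-of-words cosine stands in for whatever semantic-similarity measure the project actually uses.

```python
import math
import re
from collections import Counter


def cosine_bow(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors (illustrative stand-in)."""
    ca = Counter(re.findall(r"\w+", a.lower()))
    cb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def lexrank_scores(sentences, similarity=cosine_bow, threshold=0.1,
                   damping=0.85, iterations=50):
    """Score sentences by centrality in a thresholded similarity graph.

    `similarity` is pluggable, so a semantic measure could replace the
    bag-of-words cosine. All default values here are assumptions.
    """
    n = len(sentences)
    if n == 0:
        return []
    # Connect two distinct sentences when their similarity reaches the threshold.
    adj = [
        [1.0 if i != j and similarity(sentences[i], sentences[j]) >= threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in adj]
    scores = [1.0 / n] * n
    # Power iteration in the style of PageRank. Sentences with no neighbours
    # pass on no score, so the scores need not sum to 1 in this sketch.
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                adj[j][i] * scores[j] / degree[j] for j in range(n) if degree[j]
            )
            for i in range(n)
        ]
    return scores


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "The cat lay on the mat.",
        "A cat sat on a mat in the sun.",
        "Stock prices fell sharply today.",
    ]
    for sentence, score in zip(docs, lexrank_scores(docs)):
        print(f"{score:.4f}  {sentence}")
```

An extractive summary would then keep the highest-scoring sentences in their original order. In this toy input the three cat sentences reinforce one another, while the unrelated sentence has no neighbours and receives only the baseline score.
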
Selected older repositories in areas such as predictive maintenance, anomaly detection, reinforcement learning, speech adaptation, and data augmentation remain available in the profile history as secondary technical artifacts.